feat: multi-provider routing + /v1/usage reporting endpoint #105
Merged
- Refactor agentic-server router for multi-provider support:
  - Route by X-Provider header, model prefix (anthropic/claude-3), or known model patterns (gpt-* → openai, claude-* → anthropic, llama-* → ollama)
  - Anthropic API support: x-api-key auth, system message extraction, response transform to OpenAI-compatible format
  - LLM_PROVIDERS env var for multi-provider standalone config
  - GET /v1/providers endpoint to list configured providers
  - Backward-compatible: single provider config still works
- Add POST /v1/usage endpoint for external usage reporting:
  - Python functions (LlamaParse, HuggingFace, etc.) can report inference usage for billing without proxying through the agentic server
  - Accepts model, provider, tokens, latency, status; fires metering INSERT
  - Returns 202 Accepted (fire-and-forget)
  - Validates model is required, defaults service to 'chat'
- 17 new integration tests covering all routing + usage scenarios
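
The routing rules listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names `resolveProvider`, `RouteInput`, and `RouteResult` are hypothetical, and the precedence (header first, then model prefix, then known patterns, then a configured fallback) is an assumption based on the order in which the commit message lists them.

```typescript
// Hypothetical sketch of provider resolution; all names here are assumptions.
type Provider = string;

interface RouteInput {
  headers: Record<string, string | undefined>; // header names assumed lower-cased
  model: string;
}

interface RouteResult {
  provider: Provider;
  model: string; // model name with any "provider/" prefix stripped
}

// Known model-name patterns taken from the commit message.
const MODEL_PATTERNS: Array<[RegExp, Provider]> = [
  [/^gpt-/, "openai"],
  [/^claude-/, "anthropic"],
  [/^llama-/, "ollama"],
];

function resolveProvider(input: RouteInput, fallback: Provider): RouteResult {
  // Split an optional "provider/model" prefix, e.g. "anthropic/claude-3".
  const slash = input.model.indexOf("/");
  const prefix = slash > 0 ? input.model.slice(0, slash) : undefined;
  const bareModel = slash > 0 ? input.model.slice(slash + 1) : input.model;

  // 1. An explicit X-Provider header wins (assumed precedence).
  const header = input.headers["x-provider"];
  if (header) return { provider: header.toLowerCase(), model: bareModel };

  // 2. A model prefix names the provider directly.
  if (prefix) return { provider: prefix.toLowerCase(), model: bareModel };

  // 3. Otherwise, match the model name against known patterns.
  for (const [pattern, provider] of MODEL_PATTERNS) {
    if (pattern.test(bareModel)) return { provider, model: bareModel };
  }

  // 4. Fall back to the single configured provider (backward-compatible path).
  return { provider: fallback, model: bareModel };
}
```

Treating every slash as a provider prefix is a simplification; a real router would likely check the prefix against the set of configured providers before stripping it.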
- All metering (compute/inference/storage) now resolves table names from MetaSchema module registration tables instead of hardcoding schema names
- Each meter uses a 60s TTL cache for resolved table references
- Falls back to well-known defaults when MetaSchema is unavailable
- fn-runtime/storage.ts: self-contained MetaSchema resolution for S3 metering
- inference-meter.ts: self-contained MetaSchema resolution for LLM metering
- storage-meter.ts in worker: delegates to UsageClient (same directory)
- Updated storage-meter tests to account for MetaSchema lookup queries

- Create packages/usage-loader/ as canonical MetaSchema table resolver
- UsageLoader class: resolves table names from metaschema_public with 60s TTL
- All metering (compute, inference, storage) imports from shared package
- job/worker/usage-client.ts is now a thin re-export for backward compat
- inference-meter.ts (agentic-server): delegates to UsageLoader
- storage-meter.ts (worker): delegates to UsageLoader
- compute-meter.ts (worker): delegates to UsageLoader
- fn-runtime/storage.ts: uses UsageLoader for createMeterCallback
- 119 integration + 25 unit tests passing

- Create packages/module-loader/ with TtlCache, ComputeModuleLoader, UsageLoader, BillingLoader, and unified ModuleLoader facade
- Multi-database support: all loaders accept databaseId for platform vs tenant
- Add GraphExecutionModuleConfig to resolve node_states, complete_node, fail_node table/function names from MetaSchema
- Remove hardcoded constructive_compute_public/private refs from compute-worker graph operations (markNodeRunning, completeGraphNode, failGraphExecution)
- Migrate compute-worker to import from shared package (cache, module-loader, billing, compute-log, discovery, invocation all delegate to module-loader)
- usage-loader is now a thin re-export from module-loader for backward compat
- 119 integration + 25 unit tests passing

- Update job/worker/src/graph-complete.ts to use ComputeModuleLoader for resolving complete_node and fail_node function names
- Remove last hardcoded constructive_compute_private refs
- Fix inline-nodes tests to handle MetaSchema resolution queries
- Add _resetGraphCompleteCache() for test isolation
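
The commits above describe resolving table names through a 60s TTL cache, falling back to defaults when MetaSchema is unavailable. The sketch below shows one way that pattern could look. It is an assumption-laden illustration: `SimpleTtlCache`, `resolveTable`, and `TableRef` are hypothetical names and do not reflect the real `TtlCache` or `UsageLoader` in the PR, and the lookup function stands in for whatever query the loaders actually run.

```typescript
// Hypothetical TTL cache; not the PR's TtlCache implementation.
class SimpleTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  // `now` is injectable so tests can control the clock.
  constructor(
    private readonly ttlMs: number,
    private readonly now: () => number = Date.now,
  ) {}

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  clear(): void {
    this.entries.clear(); // useful for test isolation
  }
}

interface TableRef {
  schema: string;
  table: string;
}

// Resolve a table reference via `lookup`, caching the result for the TTL.
// If the lookup fails, the fallback is returned and cached as well
// (an assumed design choice, so an unavailable catalog is not queried on every call).
async function resolveTable(
  cache: SimpleTtlCache<TableRef>,
  key: string,
  lookup: () => Promise<TableRef>,
  fallback: TableRef,
): Promise<TableRef> {
  const hit = cache.get(key);
  if (hit) return hit;
  try {
    const ref = await lookup();
    cache.set(key, ref);
    return ref;
  } catch {
    cache.set(key, fallback);
    return fallback;
  }
}

// Usage: a 60-second cache keyed by a (hypothetical) databaseId-scoped key.
const tableCache = new SimpleTtlCache<TableRef>(60_000);
```

Keying the cache by something like `databaseId + ":" + meterName` would be one way to keep platform and tenant resolutions separate, though the PR's actual keying scheme is not shown here.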
Summary
Adds multi-provider routing + /v1/usage reporting endpoint to the agentic server, and extracts the MetaSchema-based table resolution into a shared @constructive-io/usage-loader workspace package.

Multi-provider routing:
- X-Provider header, model prefix (anthropic/claude-3.5-sonnet), or known patterns (claude-* → anthropic, gpt-* → openai, llama-* → ollama)
- GET /v1/providers: lists configured providers
- POST /v1/usage: fire-and-forget endpoint for Python functions to report inference usage

@constructive-io/usage-loader package:
All three consumers (worker/compute-meter, agentic-server/inference-meter, fn-runtime/storage) now import from this shared package. job/worker/usage-client.ts is a thin re-export for backward compat.

Tests: 119 integration + 25 unit, all green.
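
For readers unfamiliar with the fire-and-forget pattern behind POST /v1/usage, here is a framework-free sketch of the validation and response logic. Everything in it is an assumption for illustration: the field names (`tokens`, `latencyMs`), the `handleUsageReport` and `Recorder` names, and the 400 status for a missing model are hypothetical. The PR only states that model is required, that service defaults to 'chat', and that the endpoint returns 202 Accepted.

```typescript
// Hypothetical payload shape; the real field names may differ.
interface UsageReport {
  model: string;
  provider?: string;
  service: string;
  tokens?: number;
  latencyMs?: number;
  status?: string;
}

// Stand-in for whatever performs the metering INSERT.
type Recorder = (report: UsageReport) => Promise<void>;

interface HandlerResult {
  status: number;
  body: Record<string, unknown>;
}

function handleUsageReport(payload: unknown, record: Recorder): HandlerResult {
  const p = (payload ?? {}) as Record<string, unknown>;

  // model is required; the 400 response code is an assumption.
  const model = p.model;
  if (typeof model !== "string" || model.trim() === "") {
    return { status: 400, body: { error: "model is required" } };
  }

  const service = p.service;
  const report: UsageReport = {
    model,
    provider: typeof p.provider === "string" ? p.provider : undefined,
    service: typeof service === "string" && service !== "" ? service : "chat",
    tokens: typeof p.tokens === "number" ? p.tokens : undefined,
    latencyMs: typeof p.latencyMs === "number" ? p.latencyMs : undefined,
    status: typeof p.status === "string" ? p.status : undefined,
  };

  // Fire-and-forget: start the write but do not await it, and swallow
  // failures so a metering error never affects the caller's response.
  void record(report).catch(() => {});

  return { status: 202, body: { accepted: true } };
}
```

Returning 202 before the write completes keeps reporting cheap for callers, at the cost that a failed INSERT is invisible to them; a production handler would presumably log such failures rather than discard them silently.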
Link to Devin session: https://app.devin.ai/sessions/b2291a8e333e445aa125a2efd1996206
Requested by: @pyramation