feat(platform): estimate CoPilot turn cost and require approval for high-cost requests #12877

Rushi-Balapure wants to merge 3 commits into
Conversation
…tion (Significant-Gravitas#12832)

## Problem

The CoPilot system prompt contains a `gh auth status` instruction in the E2B-specific `GitHub CLI` section, but models pattern-match to `connect_integration` from the **Tool Discovery Priority** section — which is where the actual decision to call an external service is made. Because the GitHub auth check lives in a separate, later section, it's not salient at the point of decision-making. This causes the model to call `connect_integration(provider='github')` even when `gh` is already authenticated via `GH_TOKEN`, unnecessarily prompting the user.

## Fix

Add a 3-line callout directly inside the **Tool Discovery Priority** section:

```
> 🔑 **GitHub exception:** Before calling `connect_integration` for GitHub,
> always run `gh auth status` first. If it shows `Logged in`, proceed
> directly with `gh`/`git` — no integration connection needed.
```

This places the rule at the exact location where the model decides which tool path to take, preventing the miss.

## Why this works

- **Placement over repetition**: the existing instruction isn't wrong — it's just in the wrong spot relative to where the decision is made.
- **Negative framing**: explicitly saying "before calling `connect_integration`" directly intercepts the incorrect reflex.
- **Minimal change**: 4 lines added, zero removed.

Co-authored-by: Toran Bruce Richards <22963551+Torantulino@users.noreply.github.com>
Prevent accidental expensive CoPilot turns by estimating request cost before execution and requiring a short-lived server-validated approval token when thresholds are exceeded.
## Walkthrough

This PR implements cost estimation and user approval workflows for CoPilot chat turns. It adds backend cost-calculation logic, signed time-limited approval tokens, configuration settings, and a frontend confirmation dialog to gate execution of requests exceeding a user-configurable cost threshold.

## Changes
## Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant Frontend
    participant Backend as Backend API
    participant CostSvc as Cost Service
    participant DB as Approval Token
    User->>Frontend: Submit message + request
    Frontend->>Backend: POST /sessions/{id}/estimate
    Backend->>CostSvc: estimate_copilot_turn_cost(...)
    CostSvc->>CostSvc: Resolve path/model, calculate tokens & cost
    CostSvc-->>Backend: CoPilotTurnCostEstimate
    alt Confirmation Required
        Backend->>DB: generate_cost_approval_token(...)
        DB-->>Backend: signed token
        Backend-->>Frontend: {requires_confirmation: true, approval_token, ...}
        Frontend->>User: Show CostConfirmationDialog
        alt User Confirms
            User->>Frontend: Click "Continue"
            Frontend->>Backend: POST /sessions/{id}/stream + approval_token
            Backend->>DB: validate_cost_approval_token(...)
            DB-->>Backend: valid
            Backend->>Backend: Process message + enqueue task
            Backend-->>Frontend: Stream response (200)
        else User Cancels
            User->>Frontend: Click "Cancel"
            Frontend-->>User: Dismiss dialog
        end
    else No Confirmation Needed
        Backend-->>Frontend: {requires_confirmation: false, ...}
        Frontend->>Backend: POST /sessions/{id}/stream
        Backend->>Backend: Process message + enqueue task
        Backend-->>Frontend: Stream response (200)
    end
```
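The approval-token round trip in the diagram can be sketched as a signed, time-limited token binding a user/session/message fingerprint. This is a minimal illustrative model: the signing key, the payload keys (`u`/`s`/`h`/`e`), and the 600-second default TTL are assumptions taken from the review discussion, not the PR's actual code.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-signing-key"  # the real implementation derives this from server config


def _fingerprint(user_id: str, session_id: str, message: str) -> str:
    # Bind the approval to the exact request so a changed message voids it.
    raw = json.dumps([user_id, session_id, message], separators=(",", ":"))
    return hashlib.sha256(raw.encode()).hexdigest()


def generate_token(user_id: str, session_id: str, message: str, ttl: int = 600) -> str:
    payload = {
        "u": user_id,
        "s": session_id,
        "h": _fingerprint(user_id, session_id, message),
        "e": int(time.time()) + ttl,  # expiry as unix seconds
    }
    body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"


def validate_token(token: str, user_id: str, session_id: str, message: str) -> bool:
    try:
        body, sig = token.split(".", 1)
    except ValueError:
        return False  # malformed token without a '.' separator
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # signature tampered
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload["e"] < time.time():
        return False  # expired
    return payload["h"] == _fingerprint(user_id, session_id, message)
```

Validation rejects any divergence from the estimated request, which is exactly why the fingerprint inputs matter in the review comments below.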
## Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

The changes span multiple architectural layers with heterogeneous edits: cryptographic token logic requiring careful security review, complex cost-estimation heuristics with LaunchDarkly integration, async approval-based state management across frontend and backend, and tight integration points across multiple modules. The variety of concerns (security, business logic, state coordination) and the density of new logic in several files elevate the cognitive load.

Possibly related PRs
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
This pull request has conflicts with the base branch; please resolve them so we can evaluate the pull request.
## Codecov Report

❌ Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
## dev #12877 +/- ##
==========================================
- Coverage 65.14% 65.05% -0.09%
==========================================
Files 1831 1835 +4
Lines 135945 136266 +321
Branches 14534 14581 +47
==========================================
+ Hits 88555 88650 +95
- Misses 44670 44896 +226
Partials 2720 2720
Flags with carried forward coverage won't be shown.
Actionable comments posted: 7
🧹 Nitpick comments (6)
autogpt_platform/frontend/src/app/(platform)/copilot/components/CostConfirmationDialog/CostConfirmationDialog.tsx (1)
7-14: Reuse the generated estimate type instead of duplicating it.
`CostEstimate` mirrors a subset of the backend `CoPilotTurnCostEstimate` response. Import the generated model and `Pick<>` the displayed fields so this stays in sync after OpenAPI regeneration.

♻️ Suggested type alignment

```diff
+import type { CoPilotTurnCostEstimate } from "@/app/api/__generated__/models/coPilotTurnCostEstimate";
 import { Button } from "@/components/atoms/Button/Button";
 import { Text } from "@/components/atoms/Text/Text";
 import { Dialog } from "@/components/molecules/Dialog/Dialog";
-type CostEstimate = {
-  resolved_path: "baseline" | "sdk";
-  resolved_model: string;
-  estimated_llm_calls: number;
-  estimated_cost_usd: number;
-  confirmation_threshold_usd: number;
-  rationale: string;
-};
+type CostEstimate = Pick<
+  CoPilotTurnCostEstimate,
+  | "resolved_path"
+  | "resolved_model"
+  | "estimated_llm_calls"
+  | "estimated_cost_usd"
+  | "confirmation_threshold_usd"
+  | "rationale"
+>;
```

Based on learnings, prefer using generated OpenAPI types from `@/app/api/__generated__/` for payloads defined in `openapi.json`; use inline interfaces only for SSE-stream-only payloads not exposed via OpenAPI.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/frontend/src/app/(platform)/copilot/components/CostConfirmationDialog/CostConfirmationDialog.tsx` around lines 7-14: Replace the duplicated inline CostEstimate type with a Pick from the generated OpenAPI type: import CoPilotTurnCostEstimate from the generated models (from "@/app/api/__generated__/...") and define CostEstimate as Pick<CoPilotTurnCostEstimate, 'resolved_path' | 'resolved_model' | 'estimated_llm_calls' | 'estimated_cost_usd' | 'confirmation_threshold_usd' | 'rationale'> so the component uses the canonical backend shape and stays in sync after OpenAPI regeneration; update any usages of CostEstimate in CostConfirmationDialog.tsx accordingly.

autogpt_platform/backend/backend/copilot/cost_approval.py (2)
138-140: Minor: tighten `_urlsafe_encode_json` payload type.

The annotation `dict[str, str | int]` is narrower than what is actually written — `payload["h"]` is a hex string, `payload["e"]` is `int`, but if `build_cost_approval_fingerprint`'s return type ever changes (e.g. to return bytes) the type checker will not catch it. Use `dict[str, object]` or, better, introduce a small Pydantic / `TypedDict` for the token payload (`u`, `s`, `h`, `e`) so `generate` and `validate` share the exact shape and the `isinstance(payload, dict)` + `payload.get(...)` checks in `validate_cost_approval_token` can be replaced by a single model-parse step with typed access.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/cost_approval.py` around lines 138 - 140, The _urlsafe_encode_json payload annotation is too narrow; update it to accept the actual token payload shape (either change to dict[str, object] or better, define a TypedDict/Pydantic model for the token with keys u,s,h,e) and use that type for _urlsafe_encode_json and the generator/validator functions; implement the model/TypedDict and replace the ad-hoc isinstance/dict.get checks in validate_cost_approval_token with a single model parse so build_cost_approval_fingerprint, the generate function, _urlsafe_encode_json, and validate_cost_approval_token share the exact typed payload shape.
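The `TypedDict` approach this comment suggests could look like the sketch below. Key meanings are inferred from the review; the validator shape is illustrative, not the module's actual code.

```python
from typing import Optional, TypedDict


class ApprovalPayload(TypedDict):
    u: str  # user_id
    s: str  # session_id
    h: str  # request fingerprint (hex digest)
    e: int  # expiry (unix seconds)


def parse_payload(obj: object) -> Optional[ApprovalPayload]:
    # One typed parse step instead of scattered isinstance/.get() checks.
    if not isinstance(obj, dict):
        return None
    if not (
        isinstance(obj.get("u"), str)
        and isinstance(obj.get("s"), str)
        and isinstance(obj.get("h"), str)
        and isinstance(obj.get("e"), int)
    ):
        return None
    return {"u": obj["u"], "s": obj["s"], "h": obj["h"], "e": obj["e"]}
```

A Pydantic model would give the same guarantee with richer error reporting; `TypedDict` keeps the dependency footprint at zero.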
21-49: `async` on pure CPU functions is misleading.

`generate_cost_approval_token` and `validate_cost_approval_token` do not `await` — they are synchronous HMAC/JSON code. Marking them `async` forces callers to pay the `await` overhead and wrongly suggests I/O. Either drop `async` (callers will need a trivial update) or keep `async` but plan for a real async key-fetch path (e.g., pulling the signing key from Redis / Secrets Manager in the recommendation above — which would genuinely be async).

Also applies to: 52-94
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/backend/backend/copilot/cost_approval.py` around lines 21-49: Both generate_cost_approval_token and validate_cost_approval_token are marked async but perform only synchronous CPU-bound work (JSON/HMAC) and do not await anything; remove the async keyword from both function definitions (and their corresponding async annotations/returns if present) so they become regular synchronous functions, then update any call sites that currently "await" them to call directly; alternatively, if you intend to fetch signing keys asynchronously later, replace the direct key access in _sign_payload/_verify_payload with an async key-fetch hook and keep the functions async — choose one approach and apply it consistently across generate_cost_approval_token, validate_cost_approval_token, _sign_payload, and _verify_payload and their callers.

autogpt_platform/backend/backend/copilot/cost_estimation.py (1)
226-232: Redundant final branch in `_model_pricing_per_million`.

The `"sonnet"` branch and the fallback both return `(3.0, 15.0)`, so the default case silently prices every unknown model at Sonnet rates. If that's intentional (unknown → assume cheapest supported tier), add a one-line comment; otherwise this will understate cost for future Opus/Haiku-3.5/etc. model IDs that don't contain `"opus"` or `"sonnet"`. Consider pulling rates from a single source of truth (the same price table used by the billing/credit system) so this doesn't drift.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/backend/backend/copilot/cost_estimation.py` around lines 226-232: The _model_pricing_per_million function currently returns (3.0, 15.0) for both the "sonnet" branch and the final fallback, which silently classifies unknown models as Sonnet-tier; either make this explicit or change it to a safer behavior: update _model_pricing_per_million to (a) reference the canonical pricing table used by billing/credits instead of hardcoded literals, and (b) either add a one-line comment stating "unknown models default to Sonnet rates" if that is intended, or return a conservative default / raise/log an error for unknown model_name values so future Opus/Haiku/etc. IDs aren't underpriced; locate and modify the function _model_pricing_per_million to implement the chosen fix.

autogpt_platform/backend/backend/copilot/cost_estimation_test.py (1)
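A sketch of the explicit-fallback option described above, with an inline comment marking the intent. The rates and table layout are illustrative placeholders, not the canonical billing table.

```python
# (input, output) USD per million tokens; values are illustrative placeholders,
# ideally sourced from the same price table the billing/credit system uses.
_PRICES = {
    "opus": (15.0, 75.0),
    "sonnet": (3.0, 15.0),
}


def model_pricing_per_million(model_name: str) -> tuple[float, float]:
    for tier, rates in _PRICES.items():
        if tier in model_name.lower():
            return rates
    # Unknown models deliberately default to Sonnet rates; stating this here
    # (rather than an unlabeled fallthrough) makes the pricing policy reviewable.
    return _PRICES["sonnet"]
```

With the policy written down, a future Haiku or Opus variant that slips past the substring match is at least a documented decision rather than a silent one.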
37-109: Add coverage for tampering vectors not exercised by these four tests.

Current tests cover valid round-trip, message tampering, expiry, and malformed input. To fully exercise `validate_cost_approval_token`'s branches, consider adding:

- `user_id`/`session_id` mismatch between generation and validation.
- Signature tampering (flip a char in the signature half).
- `is_user_message`/`context`/`file_ids`/`mode`/`model` divergence (each is fingerprinted; regression if any field is accidentally dropped from `build_cost_approval_fingerprint`).
- Empty/whitespace token and token missing the `.` separator (exercises the `split(".", 1)` `ValueError` path distinctly from the current malformed-but-dotted case).

These would lock in the security guarantees of the helper against accidental refactors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/backend/backend/copilot/cost_estimation_test.py` around lines 37-109: Add tests exercising more tampering vectors for validate_cost_approval_token: create tokens via generate_cost_approval_token then assert validation fails when (1) user_id or session_id are changed at validation (user/session mismatch), (2) the signature is tampered by flipping a character in the signature half of the token (signature tampering), (3) any fingerprinted field differs at validation (is_user_message, context, file_ids, mode, model) — write one test per field to ensure build_cost_approval_fingerprint dependencies are covered, and (4) malformed-token edge cases: an empty/whitespace token and a token missing the '.' separator (to hit the split ValueError branch). Use the existing test patterns and the functions generate_cost_approval_token and validate_cost_approval_token to implement these assertions.

autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotStream.ts (1)
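The signature-flip and malformed-token cases can be exercised against a toy HMAC token like the one below. This is a stand-in scheme so the pattern is runnable here, not the actual `cost_approval` helpers.

```python
import base64
import hashlib
import hmac
import json
import time

KEY = b"test-key"


def make_token(payload: dict) -> str:
    body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    return body + "." + hmac.new(KEY, body.encode(), hashlib.sha256).hexdigest()


def check_token(token: str) -> bool:
    try:
        body, sig = token.split(".", 1)
    except ValueError:
        return False  # empty token or missing '.' separator
    good = hmac.new(KEY, body.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, good)


def flip_last_char(token: str) -> str:
    # Flip one hex char in the signature half to simulate tampering.
    last = token[-1]
    return token[:-1] + ("0" if last != "0" else "1")


tok = make_token({"u": "u1", "e": int(time.time()) + 60})
assert check_token(tok)                   # untouched token validates
assert not check_token(flip_last_char(tok))  # one flipped signature char fails
assert not check_token("")                   # empty token
assert not check_token("no-separator")       # missing '.' separator
```

The same shape applies per fingerprinted field: generate with one value, validate with another, and assert rejection.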
41-54: Import the estimate response type from generated models instead of redeclaring it.
`CoPilotCostEstimate` mirrors the backend `CoPilotTurnCostEstimate` Pydantic model (note: backend type is `CoPilotTurnCostEstimate`, this is `CoPilotCostEstimate` — the names already drift). Once the estimate endpoint is consumed via the generated client, import the generated model type so field additions/renames on the backend surface as a compile-time break here instead of silent `undefined`. Also pick whichever casing the generated TS model uses for `"launchdarkly"`/`"default"` and `"baseline" | "sdk"` rather than maintaining a hand-written union.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/frontend/src/app/`(platform)/copilot/useCopilotStream.ts around lines 41 - 54, Replace the local CoPilotCostEstimate interface in useCopilotStream.ts with the generated TypeScript model from the API client (the backend model is CoPilotTurnCostEstimate) by importing that generated type and updating all usages of CoPilotCostEstimate to the imported model; remove the hand-written union/casing and adopt whatever casing and enum values the generated model exposes so future backend changes surface as compile-time type errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autogpt_platform/backend/backend/api/features/chat/routes.py`:
- Around line 905-929: The approval token is validated but not consumed,
allowing reuse within the approval window; modify the flow around
estimate.requires_confirmation/validate_cost_approval_token to atomically mark
the token as consumed before saving/enqueueing the request: after
validate_cost_approval_token returns true (for request.approval_token, user_id,
session_id, message, etc.), perform an atomic consume operation (e.g., Redis SET
NX with a key derived from the token and a short TTL or store a nonce/hash) and
if the consume fails treat the token as already-used and raise
HTTPException(403) with an appropriate message; ensure the consume+check is done
before any persistence/enqueue steps so concurrent requests cannot reuse the
same token.
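The atomic consume step this prompt suggests can be sketched with Redis `SET NX` semantics. `FakeRedis` below is an in-memory stand-in so the sketch runs without a server, and the key prefix is hypothetical.

```python
import hashlib


class FakeRedis:
    """In-memory stand-in for the real Redis client; only models SET NX."""

    def __init__(self):
        self._store = {}

    def set(self, key, value, nx=False, ex=None):
        if nx and key in self._store:
            return None  # SET NX fails when the key already exists
        self._store[key] = value  # a real client would also apply the ex= TTL
        return True


def consume_approval_token(redis, token: str, ttl: int = 600) -> bool:
    # Key on a hash of the token so the raw token never hits Redis;
    # TTL matches the approval window so keys expire on their own.
    key = "copilot:approval:used:" + hashlib.sha256(token.encode()).hexdigest()
    # SET NX succeeds only for the first caller; any later call is a replay.
    return redis.set(key, "1", nx=True, ex=ttl) is True
```

Calling this after `validate_cost_approval_token` but before persistence/enqueue closes the replay window: the first request consumes the token atomically and concurrent duplicates see the NX failure.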
In `@autogpt_platform/backend/backend/copilot/cost_approval.py`:
- Line 18: _current implementation creates a per-process _ephemeral_secret at
import which causes cross-worker token validation failures and silently reuses
unrelated secrets; change this by (1) introducing a dedicated configuration key
(e.g., COST_APPROVAL_SIGNING_KEY / cost_approval_signing_key) and use it for
signing/validation in cost_approval.py instead of multiplexing ENCRYPTION_KEY,
UNSUBSCRIBE_SECRET_KEY, or SUPABASE_SERVICE_ROLE_KEY, (2) if none of those env
vars are set, log a clear [SECURITY] warning (matching the cache.py warning
pattern) when generating or validating tokens, and (3) avoid per-process
fallbacks — either require the dedicated key or derive a stable deployment-wide
fallback from a shared store/bootstrap secret (e.g., Redis or other config
provider) rather than secrets.token_bytes so tokens remain valid across workers;
update token generation/validation paths that reference _ephemeral_secret to use
the new key logic.
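A minimal sketch of the suggested dedicated-key resolution. The env var name `COST_APPROVAL_SIGNING_KEY` comes from the review comment and the warning format mirrors the described `[SECURITY]` pattern; both are assumptions, not the merged implementation.

```python
import logging
import os
import secrets

logger = logging.getLogger(__name__)


def resolve_signing_key() -> bytes:
    # Prefer the dedicated key so approval tokens validate across all workers.
    key = os.environ.get("COST_APPROVAL_SIGNING_KEY")
    if key:
        return key.encode()
    # Per-process fallback: tokens signed here will NOT validate in sibling
    # workers, which is exactly the failure mode the review warns about.
    logger.warning(
        "[SECURITY] COST_APPROVAL_SIGNING_KEY is not set; falling back to a "
        "per-process ephemeral key. Tokens will not validate across workers."
    )
    return secrets.token_bytes(32)
```

The review's stronger option replaces the `secrets.token_bytes` fallback with a deployment-wide secret fetched from a shared store, so every worker signs and verifies with the same key.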
In `@autogpt_platform/backend/backend/copilot/cost_estimation_test.py`:
- Around line 1-109: The test file is named incorrectly—it's exercising
backend.copilot.cost_approval (generate_cost_approval_token,
validate_cost_approval_token) but is called cost_estimation_test.py; rename the
test file to cost_approval_test.py so it colocates and clearly maps to
cost_approval.py, update any test imports if they reference the old filename,
and ensure no other tests use cost_estimation_test.py so the reserved name
remains available for tests that target cost_estimation.py in the future.
In `@autogpt_platform/backend/backend/copilot/cost_estimation.py`:
- Around line 107-125: The current resolve_cost_confirmation_threshold function
misattributes the source when LaunchDarkly returns a value equal to the config
fallback; change get_feature_flag_value to return an explicit signal about
whether the returned value came from LaunchDarkly or was the fallback (e.g.,
return (raw_value, from_flag: bool) or (raw_value, source:
"launchdarkly"|"default")), then update resolve_cost_confirmation_threshold to
use that signal to set the returned source (keep using
Flag.COPILOT_COST_CONFIRMATION_THRESHOLD_USD.value and
config.copilot_cost_confirmation_threshold_usd to compute numeric fallback and
threshold but rely on the new second return value from get_feature_flag_value to
decide between "launchdarkly" and "default").
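The two-value return described above can be sketched as follows. The signature is illustrative: the real `get_feature_flag_value` resolves a flag for a user, whereas this toy version takes a pre-resolved value to keep the attribution logic visible.

```python
from typing import Optional, Tuple


def get_feature_flag_value(flag_value: Optional[float], fallback: float) -> Tuple[float, str]:
    """Return (value, source) so callers can attribute the threshold correctly
    even when the flag value happens to equal the config fallback."""
    if flag_value is not None:
        return flag_value, "launchdarkly"
    return fallback, "default"
```

The point of the tuple: when LaunchDarkly returns a number equal to the fallback, comparing values alone misattributes the source, while the explicit second element cannot.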
- Around line 150-158: The estimate vs approval can diverge if the session
accrues messages after estimating, so update the fingerprint to include session
identity + history version: modify build_cost_approval_fingerprint to accept and
incorporate session.session_id (or equivalent) and a
history_version/message_count (e.g., len(session.messages) or a monotonic
version) so any change in conversation invalidates the prior approval; keep
_estimate_history_tokens' slicing logic (last 30) in sync with any bounding used
when building the fingerprint so both use the same window/length semantics and
ensure the same symbols (_estimate_history_tokens and
build_cost_approval_fingerprint) are updated together.
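A hedged sketch of the fingerprint change this prompt describes, using the message count as the history version. Field names and the JSON layout are assumptions for illustration.

```python
import hashlib
import json


def build_cost_approval_fingerprint(message: str, session_id: str, message_count: int) -> str:
    # Including session identity and a history version means any message
    # appended after estimation changes the fingerprint and voids the approval.
    raw = json.dumps(
        {"m": message, "sid": session_id, "v": message_count},
        separators=(",", ":"),
        sort_keys=True,
    )
    return hashlib.sha256(raw.encode()).hexdigest()
```

If the estimator bounds history to the last 30 messages, the version fed here must use the same window so the estimate and the approval always describe the same conversation state.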
- Around line 7-14: cost_estimation.py is importing module-private helpers
(_resolve_baseline_model and _resolve_model_and_multiplier); change the
dependency to public APIs by either (A) renaming and exporting the existing
helpers in their modules (backend.copilot.baseline.service: expose
resolve_baseline_model or get_effective_baseline_model;
backend.copilot.sdk.service: expose resolve_model_and_multiplier or
get_effective_sdk_model) and update the imports in cost_estimation.py to those
public names, or (B) add new thin public wrapper helpers in those modules (e.g.,
get_effective_baseline_model(mode) and get_effective_sdk_model(model,
session_id)) and import those from cost_estimation.py instead; ensure you update
any calls in cost_estimation.py that reference _resolve_baseline_model and
_resolve_model_and_multiplier to the new public function names.
In `@autogpt_platform/frontend/src/app/`(platform)/copilot/useCopilotStream.ts:
- Around line 106-144: fetchCostEstimate is missing an AbortSignal so in-flight
POSTs can complete after a session switch/unmount and mutate stale state; modify
the logic to create an AbortController tied to the current sessionId, pass
controller.signal into fetch inside fetchCostEstimate, and abort that controller
both in the session-switch effect and the unmount cleanup (where
costApprovalResolverRef and pendingCostEstimateRef are cleared). Ensure
pendingCostEstimateRef and costApprovalResolverRef are set/cleared only for the
matching sessionId (use sessionIdRef.current === sid guard like
handleWakeResync) so resolved promises/state updates do not apply to a different
session if aborting is not possible.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 378f69d2-b70f-4b9e-b495-0394a0bd23cb
📒 Files selected for processing (12)
- autogpt_platform/backend/backend/api/features/chat/routes.py
- autogpt_platform/backend/backend/api/features/chat/routes_test.py
- autogpt_platform/backend/backend/copilot/config.py
- autogpt_platform/backend/backend/copilot/cost_approval.py
- autogpt_platform/backend/backend/copilot/cost_estimation.py
- autogpt_platform/backend/backend/copilot/cost_estimation_test.py
- autogpt_platform/backend/backend/util/feature_flag.py
- autogpt_platform/frontend/src/app/(platform)/copilot/CopilotPage.tsx
- autogpt_platform/frontend/src/app/(platform)/copilot/__tests__/CopilotPage.test.tsx
- autogpt_platform/frontend/src/app/(platform)/copilot/components/CostConfirmationDialog/CostConfirmationDialog.tsx
- autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotPage.ts
- autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotStream.ts
```python
if estimate.requires_confirmation:
    if not request.approval_token:
        raise HTTPException(
            status_code=403,
            detail=(
                "This task requires cost confirmation before execution. "
                "Request an estimate and confirm to continue."
            ),
        )
    approved = await validate_cost_approval_token(
        token=request.approval_token,
        user_id=user_id,
        session_id=session_id,
        message=request.message,
        is_user_message=request.is_user_message,
        context=request.context,
        file_ids=request.file_ids,
        mode=request.mode,
        model=request.model,
    )
    if not approved:
        raise HTTPException(
            status_code=403,
            detail="Cost confirmation token is invalid or expired.",
        )
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Inspect the cost approval and route code for one-time token consumption.
# Expected: a stateful consume/nonce check (for example Redis SET NX / stored token ID)
# should exist on the stream approval path before append/enqueue.
rg -nP 'validate_cost_approval_token|generate_cost_approval_token|consume|nonce|jti|used|replay|setnx|set\(|delete\(' \
autogpt_platform/backend/backend/copilot/cost_approval.py \
autogpt_platform/backend/backend/api/features/chat/routes.py

Repository: Significant-Gravitas/AutoGPT
Length of output: 2056
🏁 Script executed:
#!/bin/bash
# Get the implementation of both approval token functions
echo "=== COST APPROVAL FUNCTIONS ==="
cat -n autogpt_platform/backend/backend/copilot/cost_approval.py
echo ""
echo "=== TOKEN VALIDATION IN ROUTES (lines 905-930) ==="
sed -n '905,930p' autogpt_platform/backend/backend/api/features/chat/routes.py

Repository: Significant-Gravitas/AutoGPT
Length of output: 6279
🏁 Script executed:
#!/bin/bash
# Check for any Redis operations on approval tokens
rg -A 5 -B 5 'approval.*redis|redis.*approval|approval.*token.*delete|approval.*token.*nonce' \
autogpt_platform/backend/backend/copilot/ \
autogpt_platform/backend/backend/api/features/chat/

Repository: Significant-Gravitas/AutoGPT
Length of output: 54
🏁 Script executed:
#!/bin/bash
# Search for token consumption or one-time use patterns in backend
fd -e py autogpt_platform/backend/backend/copilot/ autogpt_platform/backend/backend/api/features/chat/ \
| xargs rg -l 'jti|nonce|one.time|one_time|consumed|consume_token' 2>/dev/null

Repository: Significant-Gravitas/AutoGPT
Length of output: 3347
🏁 Script executed:
#!/bin/bash
# Search for dedup/duplicate-post lock duration and approval token TTL
rg -n 'dedup|duplicate|idempotent|ttl.*approval|approval.*ttl|600' \
autogpt_platform/backend/backend/api/features/chat/routes.py \
autogpt_platform/backend/backend/copilot/cost_approval.py \
autogpt_platform/backend/backend/api/features/chat/

Repository: Significant-Gravitas/AutoGPT
Length of output: 7638
🏁 Script executed:
#!/bin/bash
# Find where approval tokens are generated/called - check default TTL
rg -B 3 -A 3 'generate_cost_approval_token' autogpt_platform/backend/backend/api/features/chat/routes.py

Repository: Significant-Gravitas/AutoGPT
Length of output: 587
🏁 Script executed:
#!/bin/bash
# Search for Redis dedup or request deduplication logic
rg -n 'duplicate|dedup|idempotent|seen.*request|request.*seen|reques.*hash' \
autogpt_platform/backend/backend/api/features/chat/routes.py | head -20

Repository: Significant-Gravitas/AutoGPT
Length of output: 1135
🏁 Script executed:
#!/bin/bash
# Find the config class definition with copilot_cost_approval_ttl_seconds
rg -n 'copilot_cost_approval_ttl_seconds|class.*Config|ChatConfig' \
autogpt_platform/backend/backend/copilot/ \
| grep -E '(copilot_cost_approval|ChatConfig|class)' | head -20

Repository: Significant-Gravitas/AutoGPT
Length of output: 2250
🏁 Script executed:
#!/bin/bash
# Search for ChatConfig definition
fd -e py autogpt_platform/backend/backend/copilot/ | xargs grep -l 'class ChatConfig'

Repository: Significant-Gravitas/AutoGPT
Length of output: 543
🏁 Script executed:
#!/bin/bash
# Find config.py files in copilot
find autogpt_platform/backend/backend/copilot -name '*config*' -type f

Repository: Significant-Gravitas/AutoGPT
Length of output: 303
🏁 Script executed:
#!/bin/bash
# Read the config.py file to find copilot_cost_approval_ttl_seconds
cat -n autogpt_platform/backend/backend/copilot/config.py | head -100

Repository: Significant-Gravitas/AutoGPT
Length of output: 4711
🏁 Script executed:
#!/bin/bash
# Search for the approval TTL value
rg -n 'copilot_cost_approval_ttl_seconds|approval.*ttl' autogpt_platform/backend/backend/copilot/config.py

Repository: Significant-Gravitas/AutoGPT
Length of output: 127
🏁 Script executed:
#!/bin/bash
# Get the full definition of copilot_cost_approval_ttl_seconds
sed -n '129,135p' autogpt_platform/backend/backend/copilot/config.py

Repository: Significant-Gravitas/AutoGPT
Length of output: 267
🏁 Script executed:
#!/bin/bash
# Read message_dedup.py to understand the dedup lock mechanism
cat -n autogpt_platform/backend/backend/copilot/message_dedup.py

Repository: Significant-Gravitas/AutoGPT
Length of output: 2973
Add one-time consumption of approval tokens before enqueueing.
The token validation checks signature, expiry, and request fingerprint, but does not mark the token as consumed. Within the 600-second approval window, the same token can be reused to execute the identical high-cost request after the dedup lock expires. Add an atomic consume step (e.g., Redis SET NX with a nonce/hash) before saving and enqueueing the request, and reject already-consumed tokens.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@autogpt_platform/backend/backend/api/features/chat/routes.py` around lines
905 - 929, The approval token is validated but not consumed, allowing reuse
within the approval window; modify the flow around
estimate.requires_confirmation/validate_cost_approval_token to atomically mark
the token as consumed before saving/enqueueing the request: after
validate_cost_approval_token returns true (for request.approval_token, user_id,
session_id, message, etc.), perform an atomic consume operation (e.g., Redis SET
NX with a key derived from the token and a short TTL or store a nonce/hash) and
if the consume fails treat the token as already-used and raise
HTTPException(403) with an appropriate message; ensure the consume+check is done
before any persistence/enqueue steps so concurrent requests cannot reuse the
same token.
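The atomic consume step the reviewer describes can be sketched as follows. This is a minimal illustration, not the PR's code: `consume_approval_token` is a hypothetical helper, and the module-level dict stands in for Redis (in production, the check-and-set would be a single `redis.set(key, "1", nx=True, ex=ttl)` call, which is atomic across workers).

```python
import hashlib

# Stand-in for Redis; production would use redis.set(key, "1", nx=True, ex=ttl),
# which performs the check-and-set atomically across processes.
_consumed: dict[str, bool] = {}


def consume_approval_token(token: str) -> bool:
    """Mark a token as consumed; False means it was already used."""
    key = "copilot:approval:consumed:" + hashlib.sha256(token.encode()).hexdigest()
    if key in _consumed:  # Redis SET NX would return a falsy result here
        return False
    _consumed[key] = True
    return True
```

A caller would run this after signature validation but before persisting or enqueueing the request, and raise HTTP 403 when it returns `False`, so a concurrent replay of the same token cannot execute the turn twice.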
```python
CopilotLlmModel = Literal["standard", "advanced"]
```

```python
settings = Settings()
_ephemeral_secret = secrets.token_bytes(32)
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Confirm these secrets are actually populated in every deployed env (.env.default, compose, docs).
rg -nP 'encryption_key|unsubscribe_secret_key|supabase_service_role_key|cost_approval' autogpt_platform/backend -g '!**/cost_approval.py' -C1
rg -nP 'ENCRYPTION_KEY|UNSUBSCRIBE_SECRET_KEY|SUPABASE_SERVICE_ROLE_KEY' autogpt_platform -g '*.env*' -g 'docker-compose*.yml' -C1
# Check worker count in production config to confirm multi-worker risk.
rg -nP 'workers\s*[=:]' autogpt_platform -g '*.py' -g '*.toml' -g '*.yml' -C1
```

Length of output: 12569
🏁 Script executed:

```shell
cat -n autogpt_platform/backend/backend/copilot/cost_approval.py
```

Length of output: 5268
🏁 Script executed:

```shell
# Check for tests that might exercise the ephemeral fallback
rg -nA5 'ephemeral|all three|unset.*secret' autogpt_platform/backend/backend/copilot --type py
# Check if startup validates secrets
rg -nA5 'encryption_key|unsubscribe_secret_key|supabase_service_role_key' autogpt_platform/backend/backend/util/settings.py -B2 -A2
# Check deployment docs
find autogpt_platform -name '*.md' -o -name 'README*' | head -20 | xargs grep -l 'ENCRYPTION_KEY\|secret' 2>/dev/null | head -5
```

Length of output: 33655
🏁 Script executed:

```shell
# Check if there's startup validation that would fail if secrets are missing
rg -nB3 -A3 'Settings()' autogpt_platform/backend/backend/util/settings.py | head -40
# Check if there's any environment validation at app startup
rg -nA5 'def.*init|startup|validate' autogpt_platform/backend/backend/api/main.py -C2 | head -50
# Check if tests actually trigger the ephemeral fallback path
rg -nA10 'test.*approval.*token' autogpt_platform/backend/backend/copilot/cost_estimation_test.py | head -60
```

Length of output: 2977
🏁 Script executed:

```shell
# Look for where the FastAPI app is initialized and if there's startup validation
fd -t f -name 'app.py' -o -name 'server.py' autogpt_platform/backend | head -5
# Check what the default/common deployment scenarios are
rg -nA5 'if.*encryption_key|if.*unsubscribe_secret_key|if not.*secret' autogpt_platform/backend/backend/util -C1
# Verify that .env.default is actually used and consulted in normal deployments
find autogpt_platform/backend -name 'README*' -o -name 'docker-compose*' -o -name 'deploy*' | xargs cat 2>/dev/null | grep -i 'env\|secret' | head -20
```

Length of output: 1600
🏁 Script executed:

```shell
# Find the main app initialization
find autogpt_platform/backend -name 'main.py' -type f
# Check if there's any secret validation in __init__ or setup
rg -n 'def __init__|def startup|logger.warning.*encryption_key' autogpt_platform/backend/backend/util --type py -A3
# Look for the pattern of how cache.py handles missing encryption_key as a model
cat -n autogpt_platform/backend/backend/util/cache.py | sed -n '449,460p'
```

Length of output: 13004
Silent fallback to per-process ephemeral secret creates multi-worker validation failures and lacks warning precedent.
When none of `encryption_key`, `unsubscribe_secret_key`, or `supabase_service_role_key` is configured, `_ephemeral_secret` (line 18) provides a per-process signing key generated at module import time. This breaks token validation across workers: a cost-approval token signed by worker A using its ephemeral secret cannot be validated by worker B using a different one, producing intermittent 403 `approval_required` / invalid-token errors.
While `.env.default` does populate `ENCRYPTION_KEY` and `UNSUBSCRIBE_SECRET_KEY`, making this fallback unlikely in standard deployments, the design lacks the safety precedent established in `cache.py` (lines 449–456), which logs a `[SECURITY]` warning when `encryption_key` is missing. Additionally, reusing three unrelated secrets (`encryption_key` for data, `unsubscribe_secret_key` for email HMAC, `supabase_service_role_key` for external bearer auth) as HMAC signing material violates cryptographic key separation, even though collision risk is low for distinct message domains.
Recommendations:

- Add an explicit warning at token generation/validation if all three secrets are unset, matching the `cache.py` pattern.
- Prefer a dedicated `cost_approval_signing_key` over multiplexing unrelated secrets.
- If a fallback is necessary, derive a stable per-deployment key (e.g., from Redis or a shared bootstrap secret) rather than a per-process one.
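Even if a dedicated key is not introduced, an existing deployment secret can be turned into a domain-separated signing key instead of being reused raw. A minimal HKDF-style sketch (RFC 5869, single output block); `master_secret`, the salt, and the helper name are all assumptions for illustration, not names from the repo:

```python
import hashlib
import hmac


def derive_signing_key(
    master_secret: bytes, domain: str = "copilot-cost-approval"
) -> bytes:
    """HKDF extract+expand (RFC 5869) with a fixed domain label, so one
    deployment secret can back multiple features without key reuse."""
    # Extract: PRK = HMAC-SHA256(salt, input keying material)
    prk = hmac.new(b"cost-approval-salt", master_secret, hashlib.sha256).digest()
    # Expand (first block only): T(1) = HMAC-SHA256(PRK, info || 0x01)
    return hmac.new(prk, domain.encode() + b"\x01", hashlib.sha256).digest()
```

Because the derivation is deterministic, every worker that shares the same master secret derives the same 32-byte signing key, while a different `domain` label yields an unrelated key.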
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@autogpt_platform/backend/backend/copilot/cost_approval.py` at line 18,
_current implementation creates a per-process _ephemeral_secret at import which
causes cross-worker token validation failures and silently reuses unrelated
secrets; change this by (1) introducing a dedicated configuration key (e.g.,
COST_APPROVAL_SIGNING_KEY / cost_approval_signing_key) and use it for
signing/validation in cost_approval.py instead of multiplexing ENCRYPTION_KEY,
UNSUBSCRIBE_SECRET_KEY, or SUPABASE_SERVICE_ROLE_KEY, (2) if none of those env
vars are set, log a clear [SECURITY] warning (matching the cache.py warning
pattern) when generating or validating tokens, and (3) avoid per-process
fallbacks — either require the dedicated key or derive a stable deployment-wide
fallback from a shared store/bootstrap secret (e.g., Redis or other config
provider) rather than secrets.token_bytes so tokens remain valid across workers;
update token generation/validation paths that reference _ephemeral_secret to use
the new key logic.
| """Tests for CoPilot cost-estimation helpers.""" | ||
|
|
||
| import pytest | ||
|
|
||
| from backend.copilot import cost_approval | ||
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_approval_token_round_trip_valid(): | ||
| token = await cost_approval.generate_cost_approval_token( | ||
| user_id="user-1", | ||
| session_id="sess-1", | ||
| message="Analyze this task", | ||
| is_user_message=True, | ||
| context={"url": "https://example.com", "content": "ctx"}, | ||
| file_ids=["11111111-1111-1111-1111-111111111111"], | ||
| mode="extended_thinking", | ||
| model="advanced", | ||
| ttl_seconds=600, | ||
| ) | ||
|
|
||
| is_valid = await cost_approval.validate_cost_approval_token( | ||
| token=token, | ||
| user_id="user-1", | ||
| session_id="sess-1", | ||
| message="Analyze this task", | ||
| is_user_message=True, | ||
| context={"url": "https://example.com", "content": "ctx"}, | ||
| file_ids=["11111111-1111-1111-1111-111111111111"], | ||
| mode="extended_thinking", | ||
| model="advanced", | ||
| ) | ||
|
|
||
| assert is_valid is True | ||
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_approval_token_invalid_when_message_changes(): | ||
| token = await cost_approval.generate_cost_approval_token( | ||
| user_id="user-1", | ||
| session_id="sess-1", | ||
| message="Original message", | ||
| is_user_message=True, | ||
| context=None, | ||
| file_ids=None, | ||
| mode=None, | ||
| model=None, | ||
| ttl_seconds=600, | ||
| ) | ||
|
|
||
| is_valid = await cost_approval.validate_cost_approval_token( | ||
| token=token, | ||
| user_id="user-1", | ||
| session_id="sess-1", | ||
| message="Tampered message", | ||
| is_user_message=True, | ||
| context=None, | ||
| file_ids=None, | ||
| mode=None, | ||
| model=None, | ||
| ) | ||
|
|
||
| assert is_valid is False | ||
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_approval_token_invalid_when_expired(): | ||
| token = await cost_approval.generate_cost_approval_token( | ||
| user_id="user-1", | ||
| session_id="sess-1", | ||
| message="Original message", | ||
| is_user_message=True, | ||
| context=None, | ||
| file_ids=None, | ||
| mode=None, | ||
| model=None, | ||
| ttl_seconds=-1, | ||
| ) | ||
|
|
||
| is_valid = await cost_approval.validate_cost_approval_token( | ||
| token=token, | ||
| user_id="user-1", | ||
| session_id="sess-1", | ||
| message="Original message", | ||
| is_user_message=True, | ||
| context=None, | ||
| file_ids=None, | ||
| mode=None, | ||
| model=None, | ||
| ) | ||
|
|
||
| assert is_valid is False | ||
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_approval_token_invalid_for_malformed_payload(): | ||
| is_valid = await cost_approval.validate_cost_approval_token( | ||
| token="not-a-valid-token.payload", | ||
| user_id="user-1", | ||
| session_id="sess-1", | ||
| message="Original message", | ||
| is_user_message=True, | ||
| context=None, | ||
| file_ids=None, | ||
| mode=None, | ||
| model=None, | ||
| ) | ||
|
|
||
| assert is_valid is False |
🛠️ Refactor suggestion | 🟠 Major
Test file name does not match the module under test.
This file exclusively tests backend.copilot.cost_approval (generate_cost_approval_token / validate_cost_approval_token), but the filename cost_estimation_test.py pairs it with cost_estimation.py. Per the colocation convention, rename this file to cost_approval_test.py and keep cost_estimation_test.py reserved for tests of cost_estimation.py. This will also prevent future confusion when a real test module for cost_estimation.py is added.
As per coding guidelines: "Colocate test files with source files using *_test.py naming convention".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@autogpt_platform/backend/backend/copilot/cost_estimation_test.py` around
lines 1 - 109, The test file is named incorrectly—it's exercising
backend.copilot.cost_approval (generate_cost_approval_token,
validate_cost_approval_token) but is called cost_estimation_test.py; rename the
test file to cost_approval_test.py so it colocates and clearly maps to
cost_approval.py, update any test imports if they reference the old filename,
and ensure no other tests use cost_estimation_test.py so the reserved name
remains available for tests that target cost_estimation.py in the future.
```python
from backend.copilot.baseline.service import _resolve_baseline_model
from backend.copilot.config import ChatConfig, CopilotLlmModel, CopilotMode
from backend.copilot.executor.processor import (
    resolve_effective_mode,
    resolve_use_sdk_for_mode,
)
from backend.copilot.model import ChatSession
from backend.copilot.sdk.service import _resolve_model_and_multiplier
```
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Confirm these are truly private in their source modules (no other non-test import sites).
rg -nP '\b_resolve_baseline_model\b' autogpt_platform/backend -C2
rg -nP '\b_resolve_model_and_multiplier\b' autogpt_platform/backend -C2
```

Length of output: 6285
Importing private helpers from sibling modules (_resolve_baseline_model, _resolve_model_and_multiplier).
Both symbols are _-prefixed, signalling module-private internals. Importing them into cost_estimation.py from backend.copilot.baseline.service and backend.copilot.sdk.service couples this module to implementation details that the owning modules are free to refactor without notice. Either promote these to public names (resolve_baseline_model, resolve_model_and_multiplier) in their source modules, or expose dedicated public helpers (get_effective_baseline_model(mode), get_effective_sdk_model(model, session_id)) that cost_estimation.py can depend on.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@autogpt_platform/backend/backend/copilot/cost_estimation.py` around lines 7 -
14, cost_estimation.py is importing module-private helpers
(_resolve_baseline_model and _resolve_model_and_multiplier); change the
dependency to public APIs by either (A) renaming and exporting the existing
helpers in their modules (backend.copilot.baseline.service: expose
resolve_baseline_model or get_effective_baseline_model;
backend.copilot.sdk.service: expose resolve_model_and_multiplier or
get_effective_sdk_model) and update the imports in cost_estimation.py to those
public names, or (B) add new thin public wrapper helpers in those modules (e.g.,
get_effective_baseline_model(mode) and get_effective_sdk_model(model,
session_id)) and import those from cost_estimation.py instead; ensure you update
any calls in cost_estimation.py that reference _resolve_baseline_model and
_resolve_model_and_multiplier to the new public function names.
```python
async def resolve_cost_confirmation_threshold(
    user_id: str | None,
) -> tuple[float, Literal["launchdarkly", "default"]]:
    """Resolve threshold from LaunchDarkly with config fallback."""
    fallback = max(0.0, float(config.copilot_cost_confirmation_threshold_usd))
    if not user_id:
        return fallback, "default"
    raw_value = await get_feature_flag_value(
        Flag.COPILOT_COST_CONFIRMATION_THRESHOLD_USD.value,
        user_id,
        fallback,
    )
    try:
        threshold = max(0.0, float(raw_value))
    except (TypeError, ValueError):
        return fallback, "default"
    if threshold == fallback:
        return threshold, "default"
    return threshold, "launchdarkly"
```
threshold_source misattribution when LaunchDarkly returns a value equal to the config fallback.
The equality check `if threshold == fallback: return threshold, "default"` conflates two different states:

- LaunchDarkly had no rule matching this user; `get_feature_flag_value` returned the `fallback` default → correctly `"default"`.
- LaunchDarkly does have a rule, and the targeted value happens to equal `config.copilot_cost_confirmation_threshold_usd` (e.g. both are `10.0`) → should be `"launchdarkly"` but is reported as `"default"`.
This affects the response field exposed to the frontend and anyone logging it for analytics ("which users are on LD-targeted thresholds?"). If get_feature_flag_value can distinguish "no variation" vs. "variation returned", prefer that signal. Otherwise, document the known ambiguity in a comment so it isn't treated as a bug later.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@autogpt_platform/backend/backend/copilot/cost_estimation.py` around lines 107
- 125, The current resolve_cost_confirmation_threshold function misattributes
the source when LaunchDarkly returns a value equal to the config fallback;
change get_feature_flag_value to return an explicit signal about whether the
returned value came from LaunchDarkly or was the fallback (e.g., return
(raw_value, from_flag: bool) or (raw_value, source: "launchdarkly"|"default")),
then update resolve_cost_confirmation_threshold to use that signal to set the
returned source (keep using Flag.COPILOT_COST_CONFIRMATION_THRESHOLD_USD.value
and config.copilot_cost_confirmation_threshold_usd to compute numeric fallback
and threshold but rely on the new second return value from
get_feature_flag_value to decide between "launchdarkly" and "default").
```python
def _estimate_history_tokens(session: ChatSession) -> int:
    messages = [
        {"role": m.role, "content": m.content or ""}
        for m in session.messages[-30:]
        if m.role in {"system", "user", "assistant", "tool"}
    ]
    if not messages:
        return 0
    return estimate_token_count(messages)
```
Fingerprint inputs and estimator history should stay in sync.
_estimate_history_tokens counts the last 30 messages of session.messages, but the cost-approval fingerprint (build_cost_approval_fingerprint in cost_approval.py) only binds message, is_user_message, context, file_ids, mode, model — not the session's prior history. That is intentional (you can't easily re-bind a conversation's state), but be aware: between estimate and confirm, the session can accrue new messages (e.g., a second tab), the re-run-time history is larger, and the real cost may exceed the approved estimate. Consider either:
- documenting this gap in the module docstring / API docs, or
- binding `session_id` + a message-count/version into the fingerprint so a noticeably different conversation forces re-estimation.
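The second option can be sketched as hashing a history version alongside the request fields. The field set below is an assumption for illustration and does not reproduce the PR's actual `build_cost_approval_fingerprint` signature:

```python
import hashlib
import json


def build_fingerprint(
    user_id: str,
    session_id: str,
    message: str,
    history_len: int,  # e.g. len(session.messages) captured at estimate time
) -> str:
    """Hash the request plus a history version, so any message added to
    the session between estimate and confirm invalidates the approval."""
    payload = json.dumps(
        {
            "user_id": user_id,
            "session_id": session_id,
            "message": message,
            "history_len": history_len,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode()).hexdigest()
```

Canonical JSON (sorted keys, fixed separators) keeps the fingerprint stable across workers; any change in `history_len` produces a different digest and forces re-estimation.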
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@autogpt_platform/backend/backend/copilot/cost_estimation.py` around lines 150
- 158, The estimate vs approval can diverge if the session accrues messages
after estimating, so update the fingerprint to include session identity +
history version: modify build_cost_approval_fingerprint to accept and
incorporate session.session_id (or equivalent) and a
history_version/message_count (e.g., len(session.messages) or a monotonic
version) so any change in conversation invalidates the prior approval; keep
_estimate_history_tokens' slicing logic (last 30) in sync with any bounding used
when building the fingerprint so both use the same window/length semantics and
ensure the same symbols (_estimate_history_tokens and
build_cost_approval_fingerprint) are updated together.
```typescript
async function fetchCostEstimate(
  sid: string,
  requestBody: {
    message: string;
    is_user_message: boolean;
    context: null;
    file_ids: string[] | null;
    mode: CopilotMode | null;
    model: CopilotLlmModel | null;
  },
  headers: Record<string, string>,
): Promise<CoPilotCostEstimate> {
  const response = await fetch(
    `${environment.getAGPTServerBaseUrl()}/api/chat/sessions/${sid}/estimate`,
    {
      method: "POST",
      headers: {
        ...headers,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(requestBody),
    },
  );

  if (!response.ok) {
    const detail = await response.text();
    throw new Error(detail || "Failed to estimate request cost");
  }
  return (await response.json()) as CoPilotCostEstimate;
}

useEffect(() => {
  return () => {
    const resolver = costApprovalResolverRef.current;
    costApprovalResolverRef.current = null;
    pendingCostEstimateRef.current = null;
    resolver?.(null);
  };
}, []);
```
In-flight fetchCostEstimate is not cancelled on session switch / unmount.
fetchCostEstimate has no AbortSignal. If the user switches sessions (or the component unmounts) while the estimate POST is in flight, the promise still resolves against the old session and the subsequent setPendingCostEstimate / waitForCostApproval mutates state for a session the UI no longer owns. The unmount cleanup only resolves the approval promise (costApprovalResolverRef); it does not abort the HTTP request. Pass an AbortController.signal into fetch and abort it from the session-switch effect and the unmount cleanup, or guard the post-fetch state writes with a session-id check (similar to the sessionIdRef.current !== sid pattern already used in handleWakeResync).
🛠️ Sketch

```diff
-async function fetchCostEstimate(
-  sid: string,
-  requestBody: { ... },
-  headers: Record<string, string>,
-): Promise<CoPilotCostEstimate> {
+async function fetchCostEstimate(
+  sid: string,
+  requestBody: { ... },
+  headers: Record<string, string>,
+  signal?: AbortSignal,
+): Promise<CoPilotCostEstimate> {
   const response = await fetch(
     `${environment.getAGPTServerBaseUrl()}/api/chat/sessions/${sid}/estimate`,
-    { method: "POST", headers: { ...headers, "Content-Type": "application/json" }, body: JSON.stringify(requestBody) },
+    { method: "POST", headers: { ...headers, "Content-Type": "application/json" }, body: JSON.stringify(requestBody), signal },
   );
   ...
 }
```

Thread an AbortController tied to the current sessionId and abort from the session-switch effect and the unmount cleanup.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotStream.ts`
around lines 106 - 144, fetchCostEstimate is missing an AbortSignal so in-flight
POSTs can complete after a session switch/unmount and mutate stale state; modify
the logic to create an AbortController tied to the current sessionId, pass
controller.signal into fetch inside fetchCostEstimate, and abort that controller
both in the session-switch effect and the unmount cleanup (where
costApprovalResolverRef and pendingCostEstimateRef are cleared). Ensure
pendingCostEstimateRef and costApprovalResolverRef are set/cleared only for the
matching sessionId (use sessionIdRef.current === sid guard like
handleWakeResync) so resolved promises/state updates do not apply to a different
session if aborting is not possible.
`self._active_tasks_lock = threading.Lock()` in `__init__` (added in #12877 to make `_cleanup_completed_tasks` thread-safe) holds a `_thread.lock` that the forkserver/spawn start method cannot serialize. With it set eagerly, `Process(target=self.execute_run_command).start()` from `AppProcess.start()` raises `TypeError: cannot pickle '_thread.lock' object` and `poetry run app` aborts at startup before the REST server binds. Move the lock to a lazy `@property _active_tasks_lock` so the parent process never holds a real `threading.Lock` instance — the lock is materialized inside the forked child the first time `_cleanup_completed_tasks` runs, where pickling is no longer in play. This mirrors the existing lazy-init pattern already used for the ThreadPoolExecutor, RabbitMQ clients, and consumer threads in this class.
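The lazy-property pattern described above can be sketched as follows. The class name and attribute names mirror the description for illustration only; the real `ExecutionManager` in the repo carries more state than shown here:

```python
import threading


class ExecutionManager:
    """Sketch of the lazy-lock pattern; illustrative, not the PR's class."""

    def __init__(self) -> None:
        # Hold no real lock here: _thread.lock objects cannot be pickled
        # when spawn/forkserver serializes this instance for Process().
        self._active_tasks_lock_instance = None

    @property
    def _active_tasks_lock(self) -> threading.Lock:
        # Materialized lazily in the child process on first use, where
        # pickling is no longer in play (mirrors the lazy ThreadPoolExecutor).
        if self._active_tasks_lock_instance is None:
            self._active_tasks_lock_instance = threading.Lock()
        return self._active_tasks_lock_instance
```

Before the first property access the instance holds only `None`, so it serializes cleanly; after first access, repeated reads return the same lock object within that process.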
Why / What / How

Why

CoPilot requests can unexpectedly become expensive (large prompts, high-token contexts, expensive model paths), and users currently have no guardrail before execution. This creates a risk of accidental high-cost turns.

What

This PR introduces a cost-safety flow:

How

Changes 🏗️

- `approval_required`, missing token, and expired/invalid token.

Checklist 📋

For code changes:

- `approval_required` and estimated cost

For configuration changes:

- `.env.default` is updated or already compatible with my changes
- `docker-compose.yml` is updated or already compatible with my changes

Configuration changes (if applicable):