
feat(platform): Add AllQuiet alert integration alongside Discord alerts #11234

Draft
ntindle wants to merge 19 commits into dev from ntindle/systemallquietalerts

Conversation

@ntindle
Member

@ntindle ntindle commented Oct 21, 2025

  • Added system_alert method to NotificationManager that sends both Discord and AllQuiet alerts
  • Implemented correlation IDs for all system alerts to prevent duplicate incidents:
    • Late executions: Based on threshold and execution types (stable across runs)
    • Block errors: Based on MD5 hash of affected block IDs and date
    • Balance alerts: Based on user ID
    • Retry failures: Based on function, context, and error type
  • Updated all alert locations to use NotificationManager.system_alert() method
  • Added AllQuiet webhook URL configuration in settings
  • Maintained backward compatibility with existing Discord alerts
  • AllQuiet alerts gated behind allquiet-alerts LaunchDarkly feature flag (killswitch)
  • Discord failure no longer prevents AllQuiet from firing (and vice versa)

AllQuiet alerts are only sent when correlation_id is provided, ensuring
controlled rollout. Severity levels (critical/warning/minor) and extra
attributes provide better incident management and debugging context.
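
For reviewers, the correlation ID scheme described above could look roughly like the sketch below. The function names, prefixes, and ID formats are assumptions made for illustration and are not the code in this PR; only the inputs (threshold and execution types, MD5 of block IDs plus date, user ID, function/context/error type) come from the description.

```python
import hashlib
from datetime import date


def block_error_correlation_id(block_ids: list[str], day: date) -> str:
    """Hypothetical: hash the sorted block IDs so input order does not change the ID."""
    joined = ",".join(sorted(block_ids))
    # MD5 is used only as a short fingerprint here, not for security.
    digest = hashlib.md5(joined.encode("utf-8"), usedforsecurity=False).hexdigest()[:12]
    return f"block_errors_{digest}_{day.isoformat()}"


def late_execution_correlation_id(threshold_seconds: int, execution_types: list[str]) -> str:
    """Hypothetical: depends only on the threshold and the set of execution types,
    so repeated monitor runs map to the same incident."""
    types = "_".join(sorted(execution_types))
    return f"late_execution_{threshold_seconds}s_{types}"


def balance_correlation_id(user_id: str) -> str:
    """Hypothetical: one incident per user."""
    return f"low_balance_{user_id}"


def retry_correlation_id(func_name: str, context: str, error: Exception) -> str:
    """Hypothetical: function, context, and error type identify the failure."""
    return f"retry_failure_{func_name}_{context}_{type(error).__name__}"
```

The point of the sketch is that none of the IDs include volatile values such as counts or timestamps finer than a day, so the same underlying problem keeps producing the same ID.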

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

Changes 🏗️

  • NotificationManager.system_alert() sends to both Discord and AllQuiet with error isolation
  • Stable correlation IDs for deduplication (hashed block IDs, threshold-based late execution IDs)
  • AllQuietAlert model with severity, status, correlation_id, channel, and extra_attributes
  • billing.py migrated from discord_system_alert to system_alert
  • _extract_clean_title() helper for stripping markdown/emoji from alert titles
  • allquiet-alerts LaunchDarkly feature flag as killswitch (default: on)
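
As a rough illustration of the title-cleaning helper listed above, the sketch below takes the first line of the alert content and strips Discord markdown and emoji. The function name, the emoji list, and the 100-character limit are taken from or modeled on the snippet reviewed later in this thread; the actual `_extract_clean_title()` may differ.

```python
import re

# Emoji seen in the reviewed snippet; the real helper may cover more.
_ALERT_EMOJI = ("🚨", "⚠️", "❌", "✅", "📊")


def extract_clean_title_sketch(content: str, max_length: int = 100) -> str:
    """Hypothetical stand-in for _extract_clean_title()."""
    # Use the first line of the alert as the title.
    first_line = content.split("\n", 1)[0]
    # Drop Discord bold markers.
    title = first_line.replace("**", "")
    # Drop the known alert emoji.
    for emoji in _ALERT_EMOJI:
        title = title.replace(emoji, "")
    # Collapse leftover whitespace and cap the length.
    title = re.sub(r"\s+", " ", title).strip()
    return title[:max_length]
```

For example, `extract_clean_title_sketch("🚨 **Low balance**\nUser details...")` yields `"Low balance"`.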

Checklist 📋

For code changes:

  • I have clearly listed my changes in the PR description
  • I have made a test plan
  • I have tested my changes according to the test plan:
    • Send a test alert by triggering the code via an admin page (the trigger code is not in this PR) and confirm it creates an AllQuiet alert

For configuration changes:

  • .env.default is updated or already compatible with my changes
  • docker-compose.yml is updated or already compatible with my changes
  • I have included a list of my configuration changes in the PR description (under Changes)
Platform deployment changes needed
  • Add new secret configuration variable ALLQUIET_WEBHOOK_URL to the infra repo
  • Create allquiet-alerts boolean feature flag in LaunchDarkly (default: true) — this is the killswitch

Note

Medium Risk
Touches cross-cutting alerting paths (executor billing alerts, monitoring, retry infrastructure) and adds a new outbound webhook; failures could drop or spam incidents if correlation IDs/severity are mis-set, though Discord behavior is preserved.

Overview
System alerting is upgraded from Discord-only to a unified NotificationManager.system_alert that always posts to Discord and optionally posts an AllQuiet incident when a correlation_id is provided, with normalized severity/status, derived titles, and extra attributes.

Low-balance/insufficient-funds, late execution, block error-rate, and retry-threshold alerts are migrated to use system_alert and now emit correlation IDs to dedupe incidents and attach debugging metadata. The PR also adds AllQuiet webhook support via a new allquiet_webhook_url secret and updates affected tests to assert system_alert usage and correlation IDs.

Written by Cursor Bugbot for commit 07f5913. This will update automatically on new commits.

ntindle and others added 2 commits October 21, 2025 15:31
- Added system_alert method to NotificationManager that sends both Discord and AllQuiet alerts
- Implemented correlation IDs for all system alerts to prevent duplicate incidents:
  - Late executions: Based on threshold, count, and affected users
  - Block errors: Based on affected blocks and date
  - Balance alerts: Based on user ID
  - Retry failures: Based on function, context, and error type
- Updated all alert locations to use NotificationManager.system_alert() method
- Added AllQuiet webhook URL configuration in settings
- Maintained backward compatibility with existing Discord alerts

AllQuiet alerts are only sent when correlation_id is provided, ensuring
controlled rollout. Severity levels (critical/warning/minor) and extra
attributes provide better incident management and debugging context.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Applied isort for import sorting
- Applied black for code formatting
- Applied ruff for linting
- Fixed type annotations for severity parameter in retry.py
- All formatters and linters now pass successfully

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@netlify

netlify Bot commented Oct 21, 2025

Deploy Preview for auto-gpt-docs-dev canceled.

Name Link
🔨 Latest commit 85548a4
🔍 Latest deploy log https://app.netlify.com/projects/auto-gpt-docs-dev/deploys/694446a18e903f00087dcb08

@netlify

netlify Bot commented Oct 21, 2025

Deploy Preview for auto-gpt-docs canceled.

Name Link
🔨 Latest commit 85548a4
🔍 Latest deploy log https://app.netlify.com/projects/auto-gpt-docs/deploys/694446a198f4ea0008c9f898

@github-actions github-actions Bot added the conflicts Automatically applied to PRs with merge conflicts label Oct 21, 2025
@github-actions
Contributor

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

@github-actions github-actions Bot added platform/backend AutoGPT Platform - Back end size/l labels Oct 21, 2025
@coderabbitai
Contributor

coderabbitai Bot commented Oct 21, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 52b13933-daf1-42fb-8418-077695518269

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Walkthrough

This pull request refactors the alerting system across multiple monitoring and notification modules to replace single-channel Discord alerts with a multi-channel system that supports both Discord and AllQuiet webhooks. The changes introduce structured alert payloads with correlation_id, severity, status, and extra_attributes metadata throughout the monitoring infrastructure.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Notification System Core: autogpt_platform/backend/backend/notifications/notifications.py, autogpt_platform/backend/backend/util/metrics.py, autogpt_platform/backend/backend/util/settings.py | Introduced new system_alert() and allquiet_system_alert() public methods in NotificationManager that coordinate Discord and AllQuiet alerts. Added AllQuietAlert Pydantic model with structured metadata and new send_allquiet_alert() async function. Added allquiet_webhook_url configuration field to Secrets. |
| Executor Alert Logic: autogpt_platform/backend/backend/executor/manager.py, autogpt_platform/backend/backend/executor/manager_low_balance_test.py | Updated insufficient funds and low balance alert dispatch from discord_system_alert() to new system_alert() interface with correlation_id, severity, status, and enriched extra_attributes (user_id, email, balance, cost, shortfall, agent_name). |
| Monitoring System Alerts: autogpt_platform/backend/backend/monitoring/block_error_monitor.py, autogpt_platform/backend/backend/monitoring/late_execution_monitor.py | Replaced direct Discord calls with structured system_alert() calls. Introduced correlation_id generation based on monitoring context. Enhanced payloads include severity, status, and extra_attributes with monitoring-specific metadata (affected blocks, dates, thresholds, execution counts). |
| Retry Infrastructure Alerts: autogpt_platform/backend/backend/util/retry.py, autogpt_platform/backend/backend/util/retry_test.py | Expanded send_rate_limited_discord_alert() signature to accept correlation_id, severity, and extra_attributes. Updated retry failure alerts with error_type extraction and structured metadata. Updated test mocks from discord_system_alert to system_alert. |

Sequence Diagram(s)

sequenceDiagram
    participant Caller as Monitoring/Executor
    participant Manager as NotificationManager
    participant AllQuiet as AllQuiet Webhook
    participant Discord as Discord API

    Caller->>Manager: system_alert(content, channel, correlation_id, severity, status, extra_attributes)
    activate Manager
    
    Manager->>Manager: Create AllQuietAlert from content & metadata
    Manager->>AllQuiet: send_allquiet_alert(alert)
    activate AllQuiet
    AllQuiet-->>Manager: POST response
    deactivate AllQuiet
    note over Manager: Log if AllQuiet fails, continue flow
    
    Manager->>Discord: discord_send_alert(content, channel)
    activate Discord
    Discord-->>Manager: response
    deactivate Discord
    
    Manager-->>Caller: Alert dispatched
    deactivate Manager

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

Review effort 4/5

Suggested reviewers

  • Pwuts
  • Bentlybro

Poem

🐰 Alert systems hop in new directions,
Discord and AllQuiet join in collections,
Correlation threads tie metadata tight,
Multi-channel flows shine extra bright! 📢✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 65.22% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The PR title accurately describes the main change: adding AllQuiet alert integration alongside Discord alerts, which is the core feature across all modified files. |
| Description check | ✅ Passed | The PR description comprehensively outlines changes related to the changeset, including the new system_alert method, correlation ID implementation, AllQuiet integration, and migration of alert locations. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@AutoGPT-Agent
Contributor

Thanks for this PR adding AllQuiet alert integration alongside Discord alerts. The implementation looks solid with correlation IDs to prevent duplicate incidents, and the code changes are well-structured.

However, I need to flag that the PR checklist is completely unchecked. Before we can merge this PR, please complete the checklist in the PR description:

  • Check that you've clearly listed your changes (which you have in the PR description)
  • Confirm you have a test plan
  • Verify you've tested according to the plan
  • For the configuration changes (adding allquiet_webhook_url), make sure .env.default is updated and that you've listed this configuration change in the Changes section

Once you've addressed the checklist items, this PR should be ready for approval.

@deepsource-io

deepsource-io Bot commented Oct 21, 2025

Here's the code health analysis summary for commits 4c47441..85548a4. View details on DeepSource ↗.

Analysis Summary

| Analyzer | Status | Summary | Link |
|---|---|---|---|
| JavaScript | ✅ Success | | View Check ↗ |
| Python | ✅ Success | ❗ 30 occurrences introduced, 🎯 26 occurrences resolved | View Check ↗ |

💡 If you’re a repository administrator, you can configure the quality gates from the settings.

@github-actions
Contributor

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

@github-actions github-actions Bot removed the conflicts Automatically applied to PRs with merge conflicts label Oct 22, 2025
@AutoGPT-Agent
Contributor

Thank you for implementing the AllQuiet alert integration alongside Discord alerts! The changes look well-structured and consistent across the codebase.

Your implementation of correlation IDs for different alert types is thorough, and I appreciate the backward compatibility with existing Discord alerts.

Before this can be merged:

  • Please complete the PR checklist. Currently, none of the items are checked off.

    • At minimum, the code changes checklist items need to be completed
    • Please include a test plan showing how you've verified the AllQuiet integration works
  • You've added the allquiet_webhook_url configuration item correctly in settings.py, but this should be noted in the configuration changes section of your PR description.

The code implementation itself looks good, but the PR process requirements need to be addressed before this can be merged. Please update the PR description with the completed checklist and any additional configuration changes information.

Comment on lines +443 to +457
    # Send AllQuiet alert if correlation_id is provided
    if correlation_id:
        # Extract title from content (first line or first sentence)
        lines = content.split("\n")
        title = lines[0] if lines else content[:100]
        # Remove Discord formatting from title
        title = (
            title.replace("**", "")
            .replace("🚨", "")
            .replace("⚠️", "")
            .replace("❌", "")
            .replace("✅", "")
            .replace("📊", "")
            .strip()
        )
Member Author

What do you think about always sending, even if there's no correlation ID? (It could cause duplicate notifications.)

Contributor

Kept AllQuiet gated on correlation_id so Discord still always sends, but AllQuiet only opens incidents when we have a stable dedupe key. Verified in 6ca0ad7.

Comment on lines +77 to +79
    async with httpx.AsyncClient() as client:
        response = await client.post(hook_url, json=alert.model_dump())
        response.raise_for_status()
Member Author

@claude replace this with Requests() like we do elsewhere in the codebase. Search for Requests() to make sure you know what I mean.

Contributor

@claude claude Bot Oct 22, 2025

Claude finished @ntindle's task —— View job


✅ Done! I've replaced the httpx implementation with the Requests() helper.

Changes Made:

  • Replaced httpx.AsyncClient() with Requests() helper from backend.util.request
  • The Requests() helper provides built-in SSRF protection and automatic retry logic for transient failures
  • Simplified the code by removing the context manager since Requests() handles session management internally

Todo List:

  • Search for Requests() usage in the codebase to understand the pattern
  • Read the current implementation in metrics.py
  • Replace the HTTP request with Requests() pattern
  • Commit and push changes

The change maintains the same functionality while using the consistent HTTP request pattern established in the codebase.
| Branch: ntindle/systemallquietalerts

Contributor

Handled in 60cdcba: send_allquiet_alert now uses Requests().post(...) like the rest of the backend.

title=title[:100], # Limit title length
description=content,
correlation_id=correlation_id,
extra_attributes=extra_attributes or {},
Member Author

@claude AllQuiet should take in the channel too and add it as an attribute to AllQuietAlert, since we already know what it is. Then we can filter better in AllQuiet.

Contributor

@claude claude Bot Oct 22, 2025

Claude finished @ntindle's task —— View job


✅ Done! I've added the channel information as an attribute to AllQuiet alerts.

Changes Made:

  • Modified the system_alert method in /autogpt_platform/backend/backend/notifications/notifications.py
  • The channel value (e.g., "platform" or "product") is now included in the extra_attributes sent to AllQuiet
  • This enables better filtering and categorization of alerts in the AllQuiet dashboard

Todo List:

  • Read the current implementation in notifications.py
  • Identify how channel information flows through the system
  • Update system_alert method to pass channel to send_allquiet_alert
  • Update send_allquiet_alert to include channel in AllQuiet alert attributes
  • Commit and push changes

The implementation preserves backward compatibility while adding the channel information for better alert management in AllQuiet.
| Branch: ntindle/systemallquietalerts

Contributor

Handled in 5485a96: channel is now passed as a first-class field on AllQuietAlert, so it can be filtered directly in AllQuiet.

Uses the existing Requests() helper from backend.util.request for consistency
with the rest of the codebase. The Requests() helper includes built-in SSRF
protection and retry logic.

Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
@AutoGPT-Agent
Contributor

Thanks for implementing this AllQuiet alert integration! This will definitely help improve our incident management capabilities.

However, there are a few things that need to be addressed before we can merge this PR:

  1. The PR checklist isn't filled out. Please complete the checklist items, particularly:

    • Clearly listing your changes in the PR description (the "Changes" section is currently empty)
    • Making and documenting a test plan for how you've verified this functionality
  2. Since you're adding a new configuration option (allquiet_webhook_url), please:

    • Confirm if .env.default needs updating
    • Mention the configuration changes in your PR description
  3. It would be helpful to include some details about:

    • How you've tested this integration
    • Any considerations for rolling this out in production
    • Whether existing alerts might be affected

The code changes themselves look good! The implementation with correlation IDs for deduplication and the extra context attributes will be very helpful for incident management.

This allows for better filtering and categorization of alerts in AllQuiet.
The channel (platform/product) is now included in the extra_attributes
sent to AllQuiet, enabling teams to filter alerts by their source channel.

Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
@AutoGPT-Agent
Contributor

Thank you for your PR implementing AllQuiet alerts alongside Discord alerts. The implementation looks well-structured with correlation IDs and severity levels to improve incident management.

However, there are a couple of items that need to be addressed before this can be merged:

  1. The PR checklist is incomplete - none of the checkboxes have been checked. Please fill out the checklist to confirm you've tested your changes according to a test plan.

  2. You've added a new configuration field allquiet_webhook_url to the Secrets class, but there's no mention of this in the Changes section of your PR description. Please:

    • Update the PR description to list this configuration change
    • Confirm that .env.default is compatible with this change
    • Confirm that docker-compose.yml is compatible with this change

Once you've addressed these items, the PR should be ready for another review.

)

# Add channel to extra attributes for better filtering in AllQuiet
all_attributes = extra_attributes or {}
Member Author

@claude let's make the Discord channel (product or platform) a first-class member of the base model AllQuietAlert rather than an extra attribute

Contributor

@claude claude Bot Oct 22, 2025

Claude finished @ntindle's task —— View job


✅ Done! I've made the Discord channel a first-class member of the AllQuietAlert model.

Changes Made:

  • Added channel as a direct field in the AllQuietAlert model in /autogpt_platform/backend/backend/util/metrics.py:62
  • Updated system_alert method in /autogpt_platform/backend/backend/notifications/notifications.py:465 to pass channel directly instead of in extra_attributes
  • This enables better filtering and categorization of alerts in the AllQuiet dashboard

Todo List:

  • Read the AllQuietAlert model definition in metrics.py
  • Add channel as a first-class field to the AllQuietAlert model
  • Update send_allquiet_alert function to use channel field directly
  • Remove channel from extra_attributes in system_alert method
  • Verify changes with git diff
  • Commit and push changes

The channel value (e.g., "platform" or "product") is now a proper field in the AllQuiet alert structure, making it easier to filter and manage alerts based on their source channel.
| Branch: ntindle/systemallquietalerts

Contributor

Handled in 5485a96: channel now lives on the base AllQuietAlert model instead of being tucked into extra_attributes.
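
To make the shape of that change concrete, here is a rough sketch of an alert payload with channel as its own field. The PR's AllQuietAlert is a Pydantic model; this sketch uses a standard-library dataclass as a stand-in so it runs without dependencies. The class name, the `title` and `description` fields, the defaults, and the allowed `status` values are assumptions; the severity levels come from the PR description.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Literal, Optional


@dataclass
class AllQuietAlertSketch:
    """Hypothetical stand-in for the AllQuietAlert model."""

    title: str
    description: str
    severity: Literal["critical", "warning", "minor"]
    status: Literal["open", "resolved"]  # assumed values, not confirmed by the PR
    correlation_id: Optional[str] = None
    # First-class field rather than a key inside extra_attributes.
    channel: Optional[str] = None
    extra_attributes: dict[str, Any] = field(default_factory=dict)


alert = AllQuietAlertSketch(
    title="Low balance",
    description="User balance dropped below the threshold",
    severity="warning",
    status="open",
    correlation_id="low_balance_u1",  # hypothetical ID format
    channel="platform",
)
payload = asdict(alert)  # dict that could be sent as the webhook JSON body
```

Keeping channel at the top level means filters on the receiving side do not need to look inside extra_attributes.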

- Added channel as a first-class field in AllQuietAlert model
- Updated system_alert to pass channel directly instead of in extra_attributes
- Enables better filtering and categorization in AllQuiet dashboard

Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
@AutoGPT-Agent
Contributor

Thank you for adding the AllQuiet alert integration! This looks like a valuable addition to provide better incident management alongside the existing Discord alerts.

However, before we can approve this PR:

  1. Please complete the PR checklist by checking all the applicable boxes. For code changes, you need to:

    • Make sure you have clearly listed your changes in the PR description (which you've done well)
    • Create a test plan
    • Confirm you've tested your changes according to the test plan
  2. For configuration changes, since you're adding a new allquiet_webhook_url setting, please:

    • Confirm that .env.default and docker-compose.yml are updated or compatible
    • Include the configuration changes in the PR description under the 'Changes' section

The code implementation looks thorough, with correlation IDs for different alert types and proper severity levels. Once you complete the checklist items, this PR should be ready for approval.

@github-actions
Contributor

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

@CLAassistant

CLAassistant commented May 11, 2026

CLA assistant check
All committers have signed the CLA.


Labels

conflicts Automatically applied to PRs with merge conflicts platform/backend AutoGPT Platform - Back end Review effort 3/5 size/xl

Projects

Status: 🆕 Needs initial review


5 participants