Telemetry, documentation, and edge case handling for extension upgrade #7840

Description

@wbreza

Background & Motivation

No per-extension telemetry exists for upgrades — we can't track success rates, promotion adoption, or failure patterns. Help text doesn't document the new default registry or promotion behavior. Edge cases (delisted extensions, network failures mid-batch) aren't handled gracefully and can crash the batch loop or corrupt config.

Related: #6235 | Parent EPIC: #5767

User Stories

  • As an extension author, I want telemetry on promotion adoption rates so I know when to deprecate dev listings.
  • As a team lead, I want accurate documentation of default behavior for onboarding guides.
  • As a developer, I want upgrade to handle delisted extensions gracefully without crashing.

Implementation Tasks

Task 1: Per-extension and migration telemetry events

Emit structured telemetry per extension upgrade using existing OpenTelemetry/Application Insights infrastructure:

  • Per-upgrade event attributes:

    • extension.name — extension identifier
    • extension.version.from — installed version before upgrade
    • extension.version.to — target version after upgrade
    • extension.source — source used for upgrade
    • extension.upgrade.duration_ms — time taken for this extension
    • extension.upgrade.outcome: one of success, failure, skipped, promoted
  • Promotion-specific event (extension.migrate):

    • extension.source.from — original source (e.g., "dev")
    • extension.source.to — new source (e.g., "azd")
    • Classified distinctly from regular upgrades for separate dashboarding
  • Use existing patterns: tracing.SetUsageAttributes() and span attributes following the extension.installed field pattern in internal/tracing/

  • Add per-extension telemetry span/attributes in upgrade action (cmd/extension.go)

  • Emit distinct extension.migrate event for promotions

  • Add telemetry fields to UpgradeResult for duration tracking

  • Unit tests verifying telemetry event emission (verify event fields are populated)

Task 2: CLI help text and documentation updates

Update command help text and documentation to reflect new behavior:

  • Update Short and Long descriptions in cmd/extension.go for:

    • upgrade subcommand — document that it uses the stored source by default
    • upgrade --all — document batch behavior with continue-on-error
    • --source flag — document that it overrides automatic resolution and persists the new source
  • Document default registry fallback behavior (empty/missing Source → main registry)

  • Document auto-promotion behavior (dev → main when extension exists in both)

  • Update snapshot tests: UPDATE_SNAPSHOTS=true go test ./cmd -run 'TestFigSpec|TestUsage'

  • Update Short/Long help text in cmd/extension.go for upgrade command

  • Update --source flag description

  • Run UPDATE_SNAPSHOTS=true go test ./cmd -run 'TestFigSpec|TestUsage' to update snapshots

  • Verify golangci-lint passes on updated files
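
The source-selection rules that the help text needs to document can be written down as a small decision function. This is a sketch under stated assumptions: `ResolveSource`, `Resolution`, and the registry constants are hypothetical names, "azd" is assumed to be the main registry name based on the example in Task 1, and persisting the source after a promotion is an assumption (the issue only states persistence for --source).

```go
package main

import "fmt"

const (
	mainRegistry = "azd" // assumed main registry name, from the Task 1 example
	devRegistry  = "dev"
)

// Resolution describes which source an upgrade would use.
type Resolution struct {
	Source   string // source to upgrade from
	Promoted bool   // true when a dev install moves to the main registry
	Persist  bool   // true when the chosen source should be written to config
}

// ResolveSource is a hypothetical helper that encodes the documented rules:
//  1. --source (override) wins over automatic resolution and is persisted.
//  2. An empty or missing stored source falls back to the main registry.
//  3. A dev install is promoted when the extension exists in both registries.
//  4. Otherwise the stored source is used unchanged.
func ResolveSource(stored, override string, inDev, inMain bool) Resolution {
	if override != "" {
		return Resolution{Source: override, Persist: true}
	}
	if stored == "" {
		return Resolution{Source: mainRegistry}
	}
	if stored == devRegistry && inDev && inMain {
		// Persisting here is an assumption, not stated in the issue.
		return Resolution{Source: mainRegistry, Promoted: true, Persist: true}
	}
	return Resolution{Source: stored}
}

func main() {
	fmt.Printf("%+v\n", ResolveSource("", "", false, true))
	fmt.Printf("%+v\n", ResolveSource("dev", "", true, true))
	fmt.Printf("%+v\n", ResolveSource("dev", "custom", true, true))
}
```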

Task 3: Edge case handling

Handle delisted extensions, network failures, and config corruption gracefully:

  • Delisted extensions: When FindExtensions returns 0 matches for an installed extension:

    • Report (-) Skipped: with message "extension not found in any configured registry"
    • Suggest running azd extension uninstall <id> if the user wants to clean up
    • Continue batch — don't abort
  • Network failures: Distinguish transport errors from "not found" responses:

    • Transport errors (DNS, timeout, TLS): (x) Failed: with "check network connectivity" suggestion
    • 404/not-found: (-) Skipped: with "extension no longer available" message
    • Both continue the batch
  • Atomic config writes: Use write-to-temp-then-rename for ~/.azure/config.json updates:

    • Write updated config to a temp file in the same directory
    • Rename the temp file over the target path (atomic on POSIX filesystems when both paths are on the same volume)
    • Prevents corruption if the process is interrupted mid-write
    • Implement in pkg/config/ if not already present
  • Handle 0-match results from FindExtensions with skip status and suggestion

  • Distinguish transport errors from not-found in error handling

  • Implement atomic config write pattern (write-temp-then-rename) in pkg/config/

  • Unit tests for delisted extension handling (mock 0 results)

  • Unit tests for network failure handling (mock transport error vs. 404)

Acceptance Criteria

  • Each extension upgrade emits a telemetry event with name, version from/to, source, outcome
  • Promotion events are classified distinctly from regular upgrades in telemetry
  • azd extension upgrade --help documents default registry behavior and auto-promotion
  • Snapshot tests updated for new help text
  • Delisted extensions show (-) Skipped: with clear message, batch continues
  • Network errors show (x) Failed: with retry/connectivity suggestion, batch continues
  • Config file writes use atomic pattern (write-temp-then-rename)

Testing Expectations

  • Unit tests for telemetry event emission (verify event fields)
  • Snapshot tests for updated help text
  • Unit tests for delisted extension handling (mock 0 results from FindExtensions)
  • Unit tests for network failure handling (mock transport error)
  • go test ./cmd -run 'TestFigSpec|TestUsage' passes with updated snapshots
  • golangci-lint run ./... passes

Key Files

File | Action
cmd/extension.go | Modify (telemetry, help text, edge cases)
internal/tracing/ | Reference/Modify (telemetry infrastructure)
pkg/extensions/manager.go | Modify (edge case handling in FindExtensions callers)
pkg/config/ | Modify (atomic config writes)
Command snapshot files | Update via UPDATE_SNAPSHOTS=true

Dependencies

  • Epic 1 (Registry Resolution and Source Management) — telemetry uses resolver outcomes
  • Epic 2 (Batch Upgrade Orchestration) — telemetry and edge cases build on the UpgradeResult type and continue-on-error loop
