Background & Motivation
No per-extension telemetry exists for upgrades — we can't track success rates, promotion adoption, or failure patterns. Help text doesn't document the new default registry or promotion behavior. Edge cases (delisted extensions, network failures mid-batch) aren't handled gracefully and can crash the batch loop or corrupt config.
Related: #6235 | Parent EPIC: #5767
User Stories
- As an extension author, I want telemetry on promotion adoption rates so I know when to deprecate dev listings.
- As a team lead, I want accurate documentation of default behavior for onboarding guides.
- As a developer, I want upgrade to handle delisted extensions gracefully without crashing.
Testing Expectations
- Unit tests for telemetry event emission (verify event fields)
- Snapshot tests for updated help text
- Unit tests for delisted extension handling (mock 0 results from `FindExtensions`)
- Unit tests for network failure handling (mock transport error)
- `go test ./cmd -run 'TestFigSpec|TestUsage'` passes with updated snapshots
- `golangci-lint run ./...` passes
Key Files
| File | Action |
|---|---|
| `cmd/extension.go` | Modify (telemetry, help text, edge cases) |
| `internal/tracing/` | Reference/Modify (telemetry infrastructure) |
| `pkg/extensions/manager.go` | Modify (edge case handling in `FindExtensions` callers) |
| `pkg/config/` | Modify (atomic config writes) |
| Command snapshot files | Update via `UPDATE_SNAPSHOTS=true` |
Dependencies
- Epic 1 (Registry Resolution and Source Management): telemetry uses resolver outcomes
- Epic 2 (Batch Upgrade Orchestration): telemetry and edge cases build on the `UpgradeResult` type and continue-on-error loop
Implementation Tasks
Task 1: Per-extension and migration telemetry events
Emit structured telemetry per extension upgrade using existing OpenTelemetry/Application Insights infrastructure:
Per-upgrade event attributes:
- `extension.name`: extension identifier
- `extension.version.from`: installed version before upgrade
- `extension.version.to`: target version after upgrade
- `extension.source`: source used for upgrade
- `extension.upgrade.duration_ms`: time taken for this extension
- `extension.upgrade.outcome`: `success`, `failure`, `skipped`, `promoted`

Promotion-specific event (`extension.migrate`):
- `extension.source.from`: original source (e.g., `"dev"`)
- `extension.source.to`: new source (e.g., `"azd"`)

Use existing patterns: `tracing.SetUsageAttributes()` and span attributes following the `extension.installed` field pattern in `internal/tracing/`.

- [ ] Add per-extension telemetry span/attributes in upgrade action (`cmd/extension.go`)
- [ ] Emit distinct `extension.migrate` event for promotions
- [ ] Add telemetry fields to `UpgradeResult` for duration tracking
- [ ] Unit tests verifying telemetry event emission (verify event fields are populated)
Task 2: CLI help text and documentation updates
Update command help text and documentation to reflect new behavior:
- Update `Short` and `Long` descriptions in `cmd/extension.go` for:
  - `upgrade` subcommand: document that it uses the stored source by default
  - `upgrade --all`: document batch behavior with continue-on-error
  - `--source` flag: document that it overrides automatic resolution and persists the new source
- Document default registry fallback behavior (empty/missing `Source` → main registry)
- Document auto-promotion behavior (dev → main when extension exists in both)
- Update snapshot tests: `UPDATE_SNAPSHOTS=true go test ./cmd -run 'TestFigSpec|TestUsage'`

- [ ] Update `Short`/`Long` help text in `cmd/extension.go` for `upgrade` command
- [ ] Update `--source` flag description
- [ ] Run `UPDATE_SNAPSHOTS=true go test ./cmd -run 'TestFigSpec|TestUsage'` to update snapshots
- [ ] Verify `golangci-lint` passes on updated files
Task 3: Edge case handling
Handle delisted extensions, network failures, and config corruption gracefully:
- Delisted extensions: When `FindExtensions` returns 0 matches for an installed extension:
  - Report `(-) Skipped:` with message "extension not found in any configured registry"
  - Suggest `azd extension uninstall <id>` if the user wants to clean up
- Network failures: Distinguish transport errors from "not found" responses:
  - Transport error: `(x) Failed:` with "check network connectivity" suggestion
  - Not found: `(-) Skipped:` with "extension no longer available" message
- Atomic config writes: Use write-to-temp-then-rename for `~/.azure/config.json` updates:
  - Implement in `pkg/config/` if not already present

- [ ] Handle 0-match results from `FindExtensions` with skip status and suggestion
- [ ] Distinguish transport errors from not-found in error handling
- [ ] Implement atomic config write pattern (write-temp-then-rename) in `pkg/config/`
- [ ] Unit tests for delisted extension handling (mock 0 results)
- [ ] Unit tests for network failure handling (mock transport error vs. 404)
Acceptance Criteria
- `azd extension upgrade --help` documents default registry behavior and auto-promotion
- Delisted extensions: `(-) Skipped:` with clear message, batch continues
- Network failures: `(x) Failed:` with retry/connectivity suggestion, batch continues
References
- `docs/extension-upgrade-all/product-reqs.md`
- `docs/extension-upgrade-all/architecture.md`
- `azd extension upgrade --all` behavior #6235