[build] Add shared DownloadFileWithRetry target and use it for SDK downloads#11647
Merged
Fixes transient `MSB3923 ... ResponseEnded` build failures when downloading SDK components (emulator, build-tools, NDK, JDK, etc.) from `dl.google.com` and `aka.ms`.

Example failure: https://dev.azure.com/dnceng-public/public/_build/results?buildId=1461495

Root cause: MSBuild's built-in `<DownloadFile>` task has `Retries` and `RetryDelayMilliseconds`, but its internal `IsRetriable` check only retries when a `HttpRequestException` wraps an inner `IOException`. The `ResponseEnded` failure that flaky CDNs produce is thrown as a top-level `HttpIOException` (added in .NET 8), so `Retries` does NOT cover it - the task errors out on the first mid-stream disconnect with no retry attempt.

New `build-tools/scripts/DownloadFileWithRetry.targets` exposes:

* `$(DownloadFileWithRetryFile)` - absolute path to itself
* `DownloadOneFileWithRetry` - target that wraps `<DownloadFile>` in three outer attempts (first two with `ContinueOnError="WarnAndContinue"`, third lets the error propagate) and optionally verifies SHA-256.

When the caller supplies `_DownloadSha256` and a cached file already exists with the wrong hash, the target deletes it (and any `_DownloadCleanupOnMismatch` siblings) up-front so the download attempts actually re-fetch in the same build, rather than failing at the post-download verify step. `openjdk.targets` uses this to invalidate a cached `.sha256sum.txt` if the archive ever fails verification.

Callers build a `_DownloadFile` item group in their target body with per-file params as `AdditionalProperties` metadata, then invoke:

    <MSBuild Projects="@(_DownloadFile->'$(DownloadFileWithRetryFile)')"
             Targets="DownloadOneFileWithRetry"
             BuildInParallel="true" />

The projection turns each item into a distinct `(project, properties)` build request, so MSBuild fans them out across worker nodes. For multi-file callers (aapt2, androidsdk) this means downloads happen in parallel (verified locally: 3 build-tools zips, ~200 MB, in 3.6 s).

Two MSBuild semantics forced the `<MSBuild>`-task dispatch (instead of a simpler `<CallTarget>`):

1. Items added inside a target body are NOT visible to a target invoked via `<CallTarget>` from that same body, so the shared target could not see a `FilesToDownload` collection assembled by the caller.
2. A target executes at most once per build context, so a `<CallTarget>` could not be used for openjdk's two-phase hash-file-then-archive flow.

Migrated all five SDK download targets to the shared helper:

* src/binutils/binutils.targets
* src/bundletool/bundletool.targets
* src/aapt2/aapt2.targets
* src/openjdk/openjdk.targets (two-phase: .sha256sum.txt then archive)
* src/androidsdk/androidsdk.targets

Verified end-to-end with real downloads:

* bundletool single file: downloads successfully
* aapt2 (3 files): downloads in parallel (~3.6 s)
* cache-hit re-run: target-skips fast (~0.4 s)
* corrupted cache + correct expected hash: detected, deleted, re-fetched
* wrong expected hash: errors and deletes both file and cleanup sibling

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
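For readers who have not seen the shared file, the "three outer attempts" shape described in this commit message could look roughly like the sketch below. This is not the contents of `DownloadFileWithRetry.targets`. The target name `DownloadOneFileWithRetrySketch` and the properties `_DownloadUrl`, `_DownloadFolder`, `_DownloadFileName`, and `_DownloadPath` are hypothetical names chosen for illustration; only `_DownloadSha256` comes from the text above. The up-front deletion of a cached file with the wrong hash is left out.

```xml
<Project>
  <!--
    Illustrative sketch only, not the real shared target.
    Hypothetical inputs: _DownloadUrl, _DownloadFolder, _DownloadFileName.
    _DownloadSha256 is the optional expected hash named in the commit message.
  -->
  <Target Name="DownloadOneFileWithRetrySketch">
    <PropertyGroup>
      <_DownloadPath>$([System.IO.Path]::Combine('$(_DownloadFolder)', '$(_DownloadFileName)'))</_DownloadPath>
    </PropertyGroup>

    <!-- Attempts 1 and 2: a failure is downgraded to a warning and execution continues. -->
    <DownloadFile
        SourceUrl="$(_DownloadUrl)"
        DestinationFolder="$(_DownloadFolder)"
        DestinationFileName="$(_DownloadFileName)"
        Condition=" !Exists('$(_DownloadPath)') "
        ContinueOnError="WarnAndContinue" />
    <DownloadFile
        SourceUrl="$(_DownloadUrl)"
        DestinationFolder="$(_DownloadFolder)"
        DestinationFileName="$(_DownloadFileName)"
        Condition=" !Exists('$(_DownloadPath)') "
        ContinueOnError="WarnAndContinue" />

    <!-- Attempt 3: no ContinueOnError, so a failure here fails the build. -->
    <DownloadFile
        SourceUrl="$(_DownloadUrl)"
        DestinationFolder="$(_DownloadFolder)"
        DestinationFileName="$(_DownloadFileName)"
        Condition=" !Exists('$(_DownloadPath)') " />

    <!-- Optional integrity check, skipped when no expected hash was supplied. -->
    <VerifyFileHash
        File="$(_DownloadPath)"
        Hash="$(_DownloadSha256)"
        Algorithm="SHA256"
        Condition=" '$(_DownloadSha256)' != '' " />
  </Target>
</Project>
```

The sketch decides whether to retry by checking whether the destination file exists. That assumes a failed attempt leaves no file behind, which the text above does not confirm; the real target may use a different signal.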
Pull request overview
This PR introduces a shared MSBuild targets file to wrap <DownloadFile> with additional outer retry attempts (and optional SHA-256 verification/self-heal) to reduce transient CDN download failures during toolchain/SDK setup.
Changes:
- Add `build-tools/scripts/DownloadFileWithRetry.targets` with `DownloadOneFileWithRetry` target (outer retries + optional SHA-256 verify and cleanup).
- Migrate several tool/SDK download targets (binutils, bundletool, aapt2, openjdk, androidsdk) to dispatch downloads via `<MSBuild>` using per-item `AdditionalProperties`.
- Enable parallel download dispatch for multi-file scenarios via `BuildInParallel="true"`.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| build-tools/scripts/DownloadFileWithRetry.targets | New shared retry + SHA-256 verification wrapper around <DownloadFile>. |
| src/openjdk/openjdk.targets | Switch OpenJDK download to two-phase hash-then-archive flow using the shared helper. |
| src/bundletool/bundletool.targets | Use shared helper for bundletool download + SHA-256 verification. |
| src/binutils/binutils.targets | Use shared helper for binutils archive download + SHA-256 verification. |
| src/androidsdk/androidsdk.targets | Use shared helper for SDK package downloads and attempt parallelization. |
| src/aapt2/aapt2.targets | Use shared helper for build-tools zip downloads and attempt parallelization. |
Reviewer caught that `_DownloadBuildTools` and `_DownloadAndroidSdkPackages` had `Outputs="...%(_X.Identity)"`, which is target batching: the target body runs once per item, so each `<MSBuild BuildInParallel="true"/>` call only saw one item to dispatch. Effectively serial.

Switch to item-transform syntax `Outputs="@(_X->'...')"` so the body runs exactly once and the inner `<MSBuild>` call fans out the full item set across worker nodes.

Verified parallelism is now real: worker nodes emit byte counts with different culture formatting (e.g. `58699878` from one node vs `58,699,878` from another), confirming separate-node execution.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
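The difference between the two `Outputs` forms can be seen in a small standalone project. This is a sketch for illustration, not code from the PR: the item name `_Pkg`, the `cache/` prefix, and both target names are hypothetical.

```xml
<Project>
  <ItemGroup>
    <!-- Hypothetical items standing in for the files to download. -->
    <_Pkg Include="a.zip;b.zip;c.zip" />
  </ItemGroup>

  <!--
    %(_Pkg.Identity) in Outputs triggers target batching: the body is expected
    to run once per item, and @(_Pkg) inside the body holds only that one item.
  -->
  <Target Name="Batched" Outputs="cache/%(_Pkg.Identity)">
    <Message Importance="high" Text="Batched body sees: @(_Pkg)" />
  </Target>

  <!--
    An item transform yields one combined output list with no batching:
    the body is expected to run once and @(_Pkg) holds all three items.
  -->
  <Target Name="Transformed" Outputs="@(_Pkg->'cache/%(Identity)')">
    <Message Importance="high" Text="Transformed body sees: @(_Pkg)" />
  </Target>
</Project>
```

Saved under any file name and run with `dotnet msbuild <file> -t:Batched` and then `-t:Transformed`, the first form should log once per item and the second should log a single line listing every item. That matches the serial versus fan-out behavior described in the commit message.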
simonrozsival approved these changes on Jun 15, 2026
Fixes transient `MSB3923 ... ResponseEnded` build failures when downloading SDK components (emulator, build-tools, NDK, JDK, etc.) from `dl.google.com` and `aka.ms`.

Example failure: dnceng-public build 1461495

Root cause

MSBuild's built-in `<DownloadFile>` task has `Retries` and `RetryDelayMilliseconds`, but its internal `IsRetriable` check only retries when a `HttpRequestException` wraps an inner `IOException`. The `ResponseEnded` failure that flaky CDNs produce is thrown as a top-level `HttpIOException` (added in .NET 8), so `Retries` does not cover it: the task errors out on the first mid-stream disconnect with no retry attempt. (Confirmed by reading the failing build's log: zero "Retrying" messages.)

Change

New `build-tools/scripts/DownloadFileWithRetry.targets` exposes:

- `$(DownloadFileWithRetryFile)`: absolute path to itself
- `DownloadOneFileWithRetry`: target that wraps `<DownloadFile>` in three outer attempts (first two with `ContinueOnError="WarnAndContinue"`, third lets the error propagate) and optionally verifies SHA-256.

When the caller supplies `_DownloadSha256` and a cached file already exists with the wrong hash, the target deletes it (and any `_DownloadCleanupOnMismatch` siblings) up-front so the download attempts actually re-fetch in the same build, rather than failing at the post-download verify step. `openjdk.targets` uses this to invalidate a cached `.sha256sum.txt` if the archive ever fails verification.

Callers build a `_DownloadFile` item group in their target body with per-file params as `AdditionalProperties` metadata, then invoke `<MSBuild Projects="@(_DownloadFile->'$(DownloadFileWithRetryFile)')" Targets="DownloadOneFileWithRetry" BuildInParallel="true" />`.

The projection turns each item into a distinct `(project, properties)` build request, so MSBuild fans them out across worker nodes. For multi-file callers (`aapt2`, `androidsdk`) this means downloads happen in parallel.

Why `<MSBuild>` instead of `<CallTarget>`

Two MSBuild semantics forced this:

1. Items added inside a target body are not visible to a target invoked via `<CallTarget>` from that same body, so the shared target could not see a collection assembled by the caller.
2. A target executes at most once per build context, so `<CallTarget>` could not be used for `openjdk`'s two-phase hash-file-then-archive flow.

Migrated callers

- `src/binutils/binutils.targets`
- `src/bundletool/bundletool.targets`
- `src/aapt2/aapt2.targets`
- `src/openjdk/openjdk.targets` (two-phase: `.sha256sum.txt` then archive)
- `src/androidsdk/androidsdk.targets` (~30 SDK packages, now in parallel)

Verified locally

- `bundletool` single file
- `aapt2` (3 files, ~200 MB)
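The caller-side pattern from the description (a `_DownloadFile` item group whose per-file parameters travel as `AdditionalProperties` metadata, followed by one `<MSBuild>` call) might look roughly like the sketch below. It is an assumption-laden illustration, not one of the migrated targets: the import path, the target name `_DownloadExampleTools`, the property names `_DownloadUrl` and `_ExampleToolASha256`, the item names, and the `example.invalid` URLs are all hypothetical. `_DownloadFile`, `_DownloadSha256`, `$(DownloadFileWithRetryFile)`, and `DownloadOneFileWithRetry` come from the PR text.

```xml
<Project>
  <!-- Assumed relative location of the shared file, which defines $(DownloadFileWithRetryFile). -->
  <Import Project="$(MSBuildThisFileDirectory)..\..\build-tools\scripts\DownloadFileWithRetry.targets" />

  <Target Name="_DownloadExampleTools">
    <ItemGroup>
      <!-- One item per file; AdditionalProperties carries that file's parameters. -->
      <_DownloadFile Include="tool-a.zip">
        <AdditionalProperties>_DownloadUrl=https://example.invalid/tool-a.zip;_DownloadSha256=$(_ExampleToolASha256)</AdditionalProperties>
      </_DownloadFile>
      <_DownloadFile Include="tool-b.zip">
        <AdditionalProperties>_DownloadUrl=https://example.invalid/tool-b.zip</AdditionalProperties>
      </_DownloadFile>
    </ItemGroup>

    <!--
      The transform rewrites each item's identity to the shared project path
      while keeping its metadata, so every item becomes its own build request.
    -->
    <MSBuild
        Projects="@(_DownloadFile->'$(DownloadFileWithRetryFile)')"
        Targets="DownloadOneFileWithRetry"
        BuildInParallel="true" />
  </Target>
</Project>
```

Because the `<MSBuild>` task reads `AdditionalProperties` from each project item, the two requests differ only in their global properties. That is what lets MSBuild treat them as separate builds of the same file and schedule them on different nodes when a multi-node build is enabled.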