Add split GPU text encoder cache#9310

Open
Jacid23 wants to merge 1 commit into invoke-ai:main from Jacid23:codex/dual-gpu-text-encoder

Jacid23 commented Jun 28, 2026

Summary

  • Add an optional split-GPU text encoder mode for systems with multiple CUDA GPUs.
  • When enabled, selected text encoders are loaded on the secondary CUDA device while the main generation model stays on the primary execution device.
  • Add active load/unload sync endpoints so turning the toggle off releases the secondary GPU cache instead of leaving the encoder resident.
  • Add compact hardware/cache status in the UI and a model-cache sleep timer setting for idle cleanup.
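A minimal sketch of the device-selection rule described in the summary, not the code in this PR. The function name and parameters are hypothetical. In a real backend the device count would typically come from `torch.cuda.device_count()`; here it is passed in so the logic can be read and tested without a GPU.

```python
def pick_text_encoder_device(
    execution_device: str,
    cuda_device_count: int,
    split_enabled: bool,
) -> str:
    """Return the device string a text encoder should be loaded on.

    Hypothetical helper: falls back to the execution device unless split
    mode is on, the execution device is CUDA, and a second CUDA device exists.
    """
    if not split_enabled or cuda_device_count < 2:
        return execution_device
    if not execution_device.startswith("cuda"):
        # CPU (or other) execution devices are left untouched.
        return execution_device
    # A bare "cuda" string refers to the default device; treated as index 0 here.
    _, _, index_text = execution_device.partition(":")
    primary_index = int(index_text) if index_text else 0
    # Choose the first CUDA index that is not the primary one.
    secondary_index = next(
        i for i in range(cuda_device_count) if i != primary_index
    )
    return f"cuda:{secondary_index}"
```

With two GPUs and the toggle on, `pick_text_encoder_device("cuda:0", 2, True)` selects `cuda:1`; with one GPU or the toggle off, the encoder stays on the execution device.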

Why

On single-device cache paths, loading a text encoder can force the denoise model to unload and reload. On dual-GPU systems, keeping text encoders resident on the secondary CUDA device avoids that churn, so repeated generations no longer pay for the reload.

Behavior

  • The UI control is only useful when at least two CUDA devices are available.
  • Disabling the toggle actively drops the split-GPU text encoder cache so that GPU can be used elsewhere.
  • CPU offload behavior is not changed.
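The toggle-off and idle-cleanup behavior can be sketched as a small cache, assuming a simple design rather than the one in this PR. The class name, method names, and injected clock are hypothetical. A real implementation would also need to release device memory after dropping entries (for example via `torch.cuda.empty_cache()`), which is omitted here.

```python
import time
from typing import Callable, Dict


class SplitEncoderCache:
    """Hypothetical cache for text encoders held on a secondary GPU."""

    def __init__(
        self,
        idle_timeout_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._idle_timeout_s = idle_timeout_s
        self._clock = clock
        self._entries: Dict[str, object] = {}
        self._last_used = clock()
        self.enabled = False

    def __len__(self) -> int:
        return len(self._entries)

    def set_enabled(self, enabled: bool) -> None:
        """Flip the toggle; turning it off actively drops resident encoders."""
        self.enabled = enabled
        if not enabled:
            self.clear()

    def get_or_load(self, key: str, loader: Callable[[], object]) -> object:
        """Return a cached encoder, loading it on first use while enabled."""
        self._last_used = self._clock()
        if not self.enabled:
            # Split mode off: load without keeping the encoder resident.
            return loader()
        if key not in self._entries:
            self._entries[key] = loader()
        return self._entries[key]

    def clear(self) -> int:
        """Drop every cached encoder and report how many were released."""
        dropped = len(self._entries)
        self._entries.clear()
        return dropped

    def sweep(self) -> int:
        """Drop the cache if it has been idle for at least the sleep timeout."""
        if self._clock() - self._last_used >= self._idle_timeout_s:
            return self.clear()
        return 0
```

Calling `sweep()` from a periodic task approximates the sleep timer, while `set_enabled(False)` mirrors the active unload when the toggle is switched off.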

Verification

  • pnpm lint:prettier
  • pnpm lint:tsc
  • pnpm lint:knip
  • OpenAPI schema generated output matches checked-in openapi.json
  • Typegen output is stable after regeneration

Notes

This branch was prepared from upstream/main and squashed to one focused commit. It does not include local fork/runtime update scripts, batch-specific files, or unrelated compatibility work.

github-actions bot added the api, python, Root, backend, services, and frontend labels Jun 28, 2026
Jacid23 marked this pull request as ready for review June 28, 2026 03:04