
docs(rfc): add sandbox resource requirements proposal #1360

Open
elezar wants to merge 1 commit into NVIDIA:main from elezar:1338-sandbox-resource-requirements/elezar

Conversation

@elezar (Member) commented May 13, 2026

Summary

Add RFC 0004 proposing a typed sandbox resource requirements model for CPU, memory, GPUs, and future resource domains. The RFC separates portable resource requirements from driver/platform-specific configuration and realization, and includes concrete realization examples for Kubernetes, Docker, Podman, and VM drivers.

Related Issue

Related to #1338 and #1340.

Changes

  • Adds rfc/0004-sandbox-resource-requirements/README.md.
  • Proposes SandboxSpec.resource_requirements with compute, device, dataset, and extension domains.
  • Reserves JSON-formatted CLI input for --driver-config-json, mapped to SandboxTemplate.driver_config.
  • Explicitly avoids exposing JSON-formatted portable resource request flags.
  • Documents how CPU/memory and GPU requests map to Kubernetes resources, CDI device injection, and VM device assignment.
  • Captures conflict handling between portable resource requirements and SandboxTemplate.resources passthrough.

Testing

  • mise run pre-commit passes
  • Unit tests added/updated (not applicable; RFC-only change)
  • E2E tests added/updated (not applicable; RFC-only change)

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)

@elezar requested review from a team, derekwaynecarr, maxamillion, and mrunalp as code owners May 13, 2026 14:54
Signed-off-by: Evan Lezar <elezar@nvidia.com>
@elezar force-pushed the 1338-sandbox-resource-requirements/elezar branch from 96945aa to 1aafb69 May 13, 2026 16:29
@drew mentioned this pull request May 14, 2026
Comment on lines +97 to +100
The CLI should not expose a JSON flag for `resource_requirements`. Common
portable requests should use typed flags such as CPU, memory, and GPU-count
flags, and SDK/API callers should use the typed protobuf messages directly.
JSON-formatted CLI input is reserved for driver-specific configuration.
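Following this guidance, common portable requests would be expressed as typed flags rather than JSON. A hedged sketch of what such an invocation might look like (the flag names are illustrative assumptions, not defined by the RFC):

```shell
# Hypothetical typed flags for portable resource requests;
# exact flag names are assumptions, not taken from the RFC.
openshell sandbox create \
  --cpus 2 \
  --memory 4Gi \
  --gpus 1
```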
Collaborator

This seems right to me. NemoClaw team needs basic mem/cpu requests, so I started to implement this here, #1376.

@drew drew added the rfc label May 14, 2026
repeated GenericResourceRequirement extensions = 100;
}

message ComputeResourceRequirements {
@derekwaynecarr (Collaborator) commented May 14, 2026
I’m good with this.

We may want ephemeral storage in the future, but I'm happy to defer that for now.

I am not aware of any use case right now that would demand hugepages.

Right now, PID limiting in Kubernetes is a node-level pod setting enforced per-pod via cgroups, if we wanted to expose a `pids.max`. Ultimately our PID limiting needs to be cgroup-enforced, but I think making that settable can come later.


| Driver | Realization |
|---|---|
| Kubernetes | Populate pod container `resources.requests.cpu`, `resources.limits.cpu`, `resources.requests.memory`, and `resources.limits.memory`. |
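In Kubernetes terms, that realization amounts to populating the container's `resources` stanza. A minimal sketch, assuming a 2-CPU / 4Gi request with matching limits (values are illustrative assumptions, not taken from the RFC):

```yaml
# Pod container fragment the Kubernetes driver would populate from
# portable CPU/memory requests; values are illustrative.
resources:
  requests:
    cpu: "2"
    memory: 4Gi
  limits:
    cpu: "2"
    memory: 4Gi
```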
Collaborator

If and when we separate the proxy into its own pod, we may need to revisit this slightly, either giving the proxy a fixed resource overhead or a separate resource configuration. We can explore that when we get there; this RFC improves on the current model either way.

@derekwaynecarr (Collaborator) left a comment

This is a good next step. I wanted to let @mrunalp take a look as well, but this is an LGTM from me.

@drew moved this from Todo to In progress in OpenShell Roadmap May 15, 2026
@drew (Collaborator) left a comment

LGTM

Comment on lines +515 to +518
```shell
openshell sandbox create \
--driver-config-json '{"kubernetes.openshell.ai":{"nodeSelector":{"accelerator":"nvidia"}}}'
```
Collaborator

This will be really useful across all our drivers.


Labels: rfc
Projects: OpenShell Roadmap (Status: In progress)
3 participants