Fix environment variable forwarding to ray runtime env by listar2000 · Pull Request #265 · rllm-org/rllm

listar2000 · 2025-10-24T04:06:55Z

What is this PR about?

Fix issue #262

Specifically, this PR now correctly overrides (instead of dropping) the defaults in PPO_RAY_RUNTIME_ENV (in ray_runtime_env.py) when certain environment variables are exported from a script. We also add a default of VLLM_USE_V1 = 1, since we now pin verl = 0.5.0 (which uses a version of vllm that embraces the V1 engine).
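To illustrate the "override instead of drop" behavior, here is a minimal sketch. The helper name `merge_env_vars` and the trimmed-down defaults are hypothetical and only for illustration; the actual logic lives in ray_runtime_env.py and may differ.

```python
# Hypothetical, trimmed-down defaults (the real dict is PPO_RAY_RUNTIME_ENV["env_vars"]).
DEFAULT_ENV_VARS = {
    "TOKENIZERS_PARALLELISM": "true",
    "VLLM_USE_V1": "1",
}


def merge_env_vars(defaults, overrides):
    """Return a new dict: defaults first, then user-exported values on top."""
    merged = dict(defaults)   # copy so the module-level defaults are not mutated
    merged.update(overrides)  # user-exported values win on conflicts
    return merged


# A user exporting VLLM_USE_V1=0 overrides that one key and keeps the other defaults.
user_exported = {"VLLM_USE_V1": "0"}
merged = merge_env_vars(DEFAULT_ENV_VARS, user_exported)
```

The point of the sketch is that untouched defaults (here TOKENIZERS_PARALLELISM) survive the merge, while an explicitly exported value takes precedence.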

Following the suggestion of @kylemontgomery1, for a more complete fix, we also forward any driver environment variable with one of the following prefixes:

Inference Engines: VLLM_, SGL_, SGLANG_
HuggingFace Libraries: HF_, TOKENIZERS_, DATASETS_
Training Frameworks: TORCH_, PYTORCH_, DEEPSPEED_, MEGATRON_
CUDA/NCCL: NCCL_, CUDA_, CUBLAS_, CUDNN_, NV_, NVIDIA_

We further let the user set a flag, RLLM_EXCLUDE, to exclude any prefix or particular variable from the above forwarding. As an example:

export RLLM_EXCLUDE="VLLM*,CUDA*,NCCL_IB_DISABLE"
# Excludes all VLLM_*, all CUDA_*, and the specific NCCL_IB_DISABLE variable
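A rough sketch of how such an exclusion filter could work is below. It assumes a token ending in `*` is treated as a prefix and anything else as an exact variable name; the function names `parse_exclusions` and `forwardable_env_vars` and the shortened prefix tuple are hypothetical, not the module's actual API.

```python
import os

# Shortened, illustrative subset of the forwarded prefixes listed above.
FORWARD_PREFIXES = ("VLLM_", "HF_", "NCCL_", "CUDA_")


def parse_exclusions(raw):
    """Split a comma-separated RLLM_EXCLUDE value into (prefixes, exact names)."""
    prefixes, names = [], set()
    for token in raw.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate empty entries such as "A,,B"
        if token.endswith("*"):
            prefixes.append(token[:-1])  # "VLLM*" -> prefix "VLLM"
        else:
            names.add(token)             # exact variable name
    return tuple(prefixes), names


def forwardable_env_vars(environ):
    """Keep variables matching a forwarded prefix unless RLLM_EXCLUDE rules them out."""
    prefixes, names = parse_exclusions(environ.get("RLLM_EXCLUDE", ""))
    return {
        key: value
        for key, value in environ.items()
        if key.startswith(FORWARD_PREFIXES)
        and not key.startswith(prefixes)  # an empty tuple never matches
        and key not in names
    }


# Usage against the real driver environment.
forwarded = forwardable_env_vars(os.environ)
```

Taking the environment as a parameter rather than reading os.environ inside the function keeps the filter easy to unit test with a plain dict.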

Corresponding tests and documentation for the ray_runtime_env module are also added.

This PR also makes sure that ray_init_settings can be properly passed into train_workflow_pipeline (so that it is now consistent with train_agent_ppo).

kylemontgomery1 · 2025-10-24T04:11:27Z

@listar2000 For a complete fix, maybe we can forward the relevant variables from the driver process (in order to ensure env variables not already in PPO_RAY_RUNTIME_ENV get forwarded).

import os

PPO_RAY_RUNTIME_ENV = {
    "env_vars": {
        "TOKENIZERS_PARALLELISM": "true",
        "NCCL_DEBUG": "WARN",
        "VLLM_LOGGING_LEVEL": "WARN",
        "VLLM_ALLOW_RUNTIME_LORA_UPDATING": "true",
        "CUDA_DEVICE_MAX_CONNECTIONS": "1",
        "VLLM_USE_V1": "1",
    },
    "worker_process_setup_hook": "rllm.patches.verl_patch_hook.setup",
}

FORWARD_PREFIXES = (
    "VLLM_", "SGL_", "SGLANG_", 
    "HF_", "TOKENIZERS_", "DATASETS_",
    "TORCH_", "PYTORCH_", "DEEPSPEED_", "MEGATRON_", 
    "NCCL_", "CUDA_", "CUBLAS_", "CUDNN_", "NV_", "NVIDIA_",
)

def get_ppo_ray_runtime_env():
    env = PPO_RAY_RUNTIME_ENV["env_vars"].copy()
    forwarded = {
        k: v for k, v in os.environ.items()
        if any(k.startswith(p) for p in FORWARD_PREFIXES)
    }
    env.update(forwarded)
    return {
        "env_vars": env,
        "worker_process_setup_hook": PPO_RAY_RUNTIME_ENV["worker_process_setup_hook"],
    }

listar2000 · 2025-10-24T04:16:52Z

The idea LGTM, but I wonder whether it's possible that this will introduce unwanted side effects (e.g. the user is setting some driver variables not intended for ray, or even for rLLM)?

listar2000 · 2025-10-24T05:54:02Z

@kylemontgomery1
Bear with me 😂 for making this PR heavier. My recent commit integrates your suggestion, while also adding an RLLM_EXCLUDE flag that lets the user decide which env vars to exclude (either by prefix matching, e.g. VLLM*, or by naming a particular variable directly).

Since the logic is more complicated (while IMO more robust now), I've added some tests and the documentation for this module as well.

kylemontgomery1 · 2025-10-24T19:43:58Z

Looks good to me. Thanks!

fix issue rllm-org#262

6c3b570

listar2000 added 2 commits October 24, 2025 00:24

add forward logic and test

066080c

add docs for the rllm exclude logic

5929253

listar2000 changed the title ~~Fix issue #262 (problem with ray_runtime_env)~~ Fix environment variable forwarding to ray runtime env Oct 24, 2025

jeffreysijuntan merged commit bd57a54 into rllm-org:main Oct 24, 2025
1 check passed

listar2000 deleted the fix-ray-runtime-env branch October 24, 2025 20:56

kylemontgomery1 mentioned this pull request Oct 25, 2025

ValueError: Using V1 AsyncLLMEngine, but envs.VLLM_USE_V1=False. #262

Closed

jeffreysijuntan merged 3 commits into rllm-org:main from listar2000:fix-ray-runtime-env
