
Neuron integration #3935

Merged
michaelbenayoun merged 18 commits into huggingface:main from michaelbenayoun:neuron_integration
Feb 26, 2026

Conversation

@michaelbenayoun (Member) commented Feb 24, 2026

What does this PR do?

Add support for AWS Trainium chips (Neuron Cores).

Missing integrations:

  • src/accelerate/hooks.py
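For readers unfamiliar with how such a device integration typically surfaces, here is a minimal, hypothetical sketch of backend-priority logic that prefers Neuron cores when they are detected. The function name and availability flags are assumptions for illustration only; they are not accelerate's actual API.

```python
# Hypothetical sketch, NOT accelerate's real implementation: illustrates
# the kind of backend-selection order a Trainium integration implies,
# where Neuron cores take priority over CUDA, with a CPU fallback.

def select_backend(neuron_available: bool, cuda_available: bool) -> str:
    """Return a device backend string, preferring Neuron, then CUDA, then CPU."""
    if neuron_available:
        return "neuron"  # AWS Trainium (Neuron Cores)
    if cuda_available:
        return "cuda"
    return "cpu"
```

In the real library, availability would be probed from the installed runtime rather than passed in as booleans; the flags here just keep the sketch self-contained.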

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@SunMarc (Member) left a comment

Thanks for the integration! Can you fix the merge conflict?

@michaelbenayoun (Member, Author)

Done

@michaelbenayoun michaelbenayoun merged commit 3bffed5 into huggingface:main Feb 26, 2026
14 of 25 checks passed
@michaelbenayoun michaelbenayoun deleted the neuron_integration branch February 26, 2026 15:31
@BrownianNotion commented

@michaelbenayoun awesome work! I'm not sure I understand the relationship between accelerate's Trainium support and optimum-neuron; would you be able to help me out?

Is the idea that optimum-neuron is a higher-level wrapper, acting as the Trainium drop-in replacement for TRL, while accelerate only handles parallelism and allows finer-grained control of the training loop? What made Trainium sufficiently different that it received its own library, rather than being part of TRL? Are the current plans to keep the two separate, or will they be unified in the future?

Thank you so much!

@michaelbenayoun (Member, Author)

Trainium now has native support in PyTorch, which was not the case before, when support was built on top of torch-xla. This is the main motivation behind moving from a standalone library (optimum-neuron) to integration inside the various libraries we have at HF.


4 participants