Fix device KeyError in tied_params_map #3403
Conversation
@SunMarc @faaany @zucchini-nlp: at the moment this PR just adds an if condition to avoid stepping into the KeyError. However, I am not sure why this situation happens. I'm afraid that I might not have addressed the actual issue and only fixed the symptom. Can someone suggest a better fix, or explain why this fix would be the correct one?
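For context, the guard pattern the comment describes looks roughly like the sketch below. This is an illustrative, hypothetical reconstruction, not accelerate's actual code: `tied_params_map` is assumed to be a nested mapping from a tensor's data pointer to a per-device cache, and the fix is to check that the `device` key exists before indexing, so a device missing from the inner dict returns a miss instead of raising `KeyError`.

```python
# Hypothetical sketch of the guard described in the PR. The function name
# and the shape of tied_params_map are assumptions for illustration only.

def lookup_tied_param(tied_params_map, data_ptr, device):
    """Return the cached tied value for (data_ptr, device), or None on a miss."""
    if (
        tied_params_map is not None
        and data_ptr in tied_params_map
        and device in tied_params_map[data_ptr]  # the added guard: avoids KeyError
    ):
        return tied_params_map[data_ptr][device]
    return None


tied = {12345: {"cuda:0": "tensor-on-cuda"}}
print(lookup_tied_param(tied, 12345, "cuda:0"))  # -> tensor-on-cuda
print(lookup_tied_param(tied, 12345, "xpu:0"))   # -> None instead of KeyError
```

Without the `device in ...` check, the second lookup would raise `KeyError: 'xpu:0'`, which is the symptom reported in #3402.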
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@SunMarc @muellerzr: #3448 is a better way to fix the reported #3402, but that fix is XPU-specific (as is #3402, to be fair). I do worry that the issue might still exist for accelerators other than CUDA and XPU, whose behavior was aligned by #3448. I don't have a way to verify that, however. So I am rebasing this PR and leaving it to maintainers and users/developers of non-CUDA/XPU devices to take it from here if needed. A better approach, though, might be to align the behavior of those accelerators on the PyTorch side and then make a fix similar to #3448.
SunMarc
left a comment
Thanks for the report. Let's merge this nevertheless just to be more careful
Fixes: huggingface#3402
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Fixes: #3402
#3448 is a better way to fix the reported #3402, but that fix is XPU-specific (as is #3402, to be fair). I do worry that the issue might still exist for accelerators other than CUDA and XPU, whose behavior was aligned by #3448. I don't have a way to verify that, however. So I am rebasing this PR and leaving it to maintainers and users/developers of non-CUDA/XPU devices to take it from here if needed. A better approach, though, might be to align the behavior of those accelerators on the PyTorch side and then make a fix similar to #3448.
CC: @SunMarc @faaany @zucchini-nlp