Updated test_graph_optims and test_graph_scaling_fused_optimizers to use new OptimizerInfo infrastructure #125127
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125127
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (3 Unrelated Failures)
As of commit 041afac with merge base 1a28f73.
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Thanks for your thorough look at the kwargs. For RMSprop, feel free to add a maximize with no weight_decay option here:
`supports_param_groups: bool = True,`
For Adam/W, it's okay to not have amsgrad alone--we test it sufficiently with the "capturable, amsgrad" and the "amsgrad" described inputs.
This way we can use the helper function to get the kwargs instead of needing a separate dictionary.
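A minimal sketch of what such an RMSprop entry could look like, assuming the `OptimizerInput` helper from `torch.testing._internal.common_optimizers`; the `desc` string and the exact placement inside `optim_inputs_func_rmsprop` are assumptions, not the file's verbatim contents:

```python
from torch.testing._internal.common_optimizers import OptimizerInput

# Sketch only: an RMSprop input with maximize enabled and no weight_decay,
# as suggested in the review. The desc string is an assumption.
maximize_no_weight_decay = OptimizerInput(
    params=None,
    kwargs={"maximize": True},
    desc="maximize",
)
```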
Thank you for the review! To add a maximize with no weight_decay option in RMSprop, should I just edit the snippet you linked?
Yes
Yes, if the compiled optimizer test fails again, you may need to add a new entry to the code snippet linked here.
I believe
Yep, looks like you need to add an entry to the code snippet I linked above as you added new configs.
Should I just add this to the snippet you linked above?
That looks reasonable--does that pass the test?
It looks like it works.
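For readers following along, a rough sketch of what "adding a new entry" could mean, assuming the compiled optimizer test keeps a table of expected values keyed by test configuration; the table name, key, and count below are hypothetical placeholders, not the actual contents of that test file:

```python
# Hypothetical expected-value table in the compiled optimizer test suite;
# the real file defines its own names, keys, and values.
EXPECTED_KERNEL_COUNTS = {
    # A new optimizer configuration would need its own entry here.
    "rmsprop_maximize_cuda": 2,  # placeholder count
}
```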
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot revert -m "Broken trunk" -c nosignal
@pytorchbot successfully started a revert job. Check the current status here.
…izers to use new OptimizerInfo infrastructure (#125127)"
This reverts commit cf35a59. Reverted #125127 on behalf of https://github.com/DanilBaibak due to broken trunk (see the comment on #125127).
@jayanthd04 your PR has been successfully reverted.
This PR was reopened (likely due to being reverted), so your approval was removed. Please request another review.
Hi @jayanthd04! Sorry, I need to revert your PR because it broke CUDA tests. Here you can find more details.
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
…rs (#133749)
Fixes #123451. This is a rework of a reverted pull request, #125127. The test failure is fixed.
Pull Request resolved: #133749
Approved by: https://github.com/janeyx99
This PR is meant to address issue #123451. More specifically, the `test_graph_optims` and `test_graph_scaling_fused_optimizers` functions in `test_cuda.py` have been updated so that they now use the new OptimizerInfo infrastructure.

Lintrunner passed:
Tests passed:

Both functions have been moved to the newly created TestCase class `TestCudaOptims`. The tests are mostly the same, except that the `@optims` decorator at the top of each function implicitly calls it with each of the optimizers listed in the decorator, instead of explicitly iterating over the optimizers with a for loop; a minimal sketch of this pattern is shown below.
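A minimal sketch of this pattern, assuming the OptimizerInfo helpers in `torch.testing._internal.common_optimizers`; the real tests filter `optim_db` and contain the CUDA graph capture logic, which is omitted here:

```python
import torch
from torch.testing._internal.common_device_type import instantiate_device_type_tests
from torch.testing._internal.common_optimizers import optim_db, optims
from torch.testing._internal.common_utils import TestCase, run_tests


class TestCudaOptims(TestCase):
    # @optims generates one test per (OptimizerInfo, dtype) combination,
    # replacing an explicit for loop over optimizer classes.
    @optims(optim_db, dtypes=[torch.float32])
    def test_graph_optims(self, device, dtype, optim_info):
        optim_cls = optim_info.optim_cls
        for optim_input in optim_info.optim_inputs_func(device=device):
            params = [torch.zeros(2, device=device, dtype=dtype, requires_grad=True)]
            opt = optim_cls(params, **optim_input.kwargs)
            # ... the real test captures a CUDA graph and steps the optimizer here.


instantiate_device_type_tests(TestCudaOptims, globals(), only_for="cuda")

if __name__ == "__main__":
    run_tests()
```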
I was unable to use `_get_optim_inputs_including_global_cliquey_kwargs` to get all kwargs for each of the optimizers, since some of the kwargs used in the original `test_graph_optims` function are not returned by the new OptimizerInfo infrastructure. More specifically, for the `torch.optim.rmsprop.RMSprop` optimizer, the following kwargs are not returned whenever `_get_optim_inputs_including_global_cliquey_kwargs` is called:

I ran into the same issue for `test_graph_scaling_fused_optimizers`: for the `torch.optim.adamw.AdamW` optimizer, whenever `optim_info.optim_inputs_func(device=device)` was called, the following kwarg was not returned:

Due to this issue, I resorted to using a dictionary to store the kwargs for each of the optimizers (a sketch of this fallback follows below); I am aware that this is less than ideal. I was wondering whether I should use the OptimizerInfo infrastructure to get all the kwargs regardless of the fact that it lacks some kwargs.
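For illustration, a minimal sketch of that fallback: a hand-written dictionary mapping each optimizer class to the kwargs configurations the tests need. The concrete entries here are placeholders, not the PR's actual configurations:

```python
import torch

# Placeholder per-optimizer kwargs table; the PR's real table lists the
# configurations that the OptimizerInfo inputs did not cover.
OPTIMIZER_KWARGS = {
    torch.optim.RMSprop: [
        {"maximize": True},                      # maximize with no weight_decay
        {"weight_decay": 0.1, "centered": True},
    ],
    torch.optim.AdamW: [
        {"amsgrad": True},
        {"amsgrad": True, "capturable": True},
    ],
}

for optim_cls, kwargs_list in OPTIMIZER_KWARGS.items():
    for kwargs in kwargs_list:
        params = [torch.zeros(2, requires_grad=True)]
        optim_cls(params, **kwargs)  # each configuration constructs cleanly
```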
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang