pass `*args, **kwargs` to `Optimizer.zero_grad` #4026

njzjz · 2024-03-01T22:44:53Z

Checklist before submitting

Did you read the contributor guide?
Did you update the docs?
Did you write any tests to validate this change?
Did you update the CHANGELOG, if this change affects users?

Description

PyTorch 1.7 added `set_to_none` to `Optimizer.zero_grad` (see pytorch/pytorch@c515881). Horovod needs to be compatible with PyTorch 1.5, so I pass `*args, **kwargs` to `zero_grad` in this PR, which is safe in both old and new PyTorch versions.
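For illustration, here is a minimal sketch of the forwarding pattern described above. It does not import PyTorch or Horovod and is not the actual diff: `LegacyOptimizer`, `ModernOptimizer`, and `make_forwarding` are hypothetical stand-ins, assumed only to mimic a `zero_grad` without and with the `set_to_none` argument.

```python
class LegacyOptimizer:
    """Hypothetical stand-in for an optimizer whose zero_grad takes no arguments."""

    def __init__(self):
        self.calls = []

    def zero_grad(self):
        self.calls.append({})


class ModernOptimizer:
    """Hypothetical stand-in for an optimizer whose zero_grad accepts set_to_none."""

    def __init__(self):
        self.calls = []

    def zero_grad(self, set_to_none=False):
        self.calls.append({"set_to_none": set_to_none})


def make_forwarding(base_cls):
    """Build a subclass that forwards whatever the caller passes to zero_grad."""

    class Forwarding(base_cls):
        def zero_grad(self, *args, **kwargs):
            # Forward the arguments unchanged, so the wrapper does not need
            # to know which signature the base class exposes.
            return super().zero_grad(*args, **kwargs)

    return Forwarding


legacy = make_forwarding(LegacyOptimizer)()
legacy.zero_grad()  # no arguments: works with the old signature

modern = make_forwarding(ModernOptimizer)()
modern.zero_grad(set_to_none=True)  # keyword reaches the new signature
modern.zero_grad()                  # default behaviour is untouched
```

Note that forwarding only helps when the caller's arguments match what the underlying `zero_grad` accepts: in this sketch, `legacy.zero_grad(set_to_none=True)` would still raise `TypeError`, just as it would without the wrapper.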

Review process to land

All tests and other checks must succeed.
At least one member of the technical steering committee must review and approve.
If any member of the technical steering committee requests changes, they must be addressed.

PyTorch 1.7 added `set_to_none` to `Optimizer.zero_grad` (see pytorch/pytorch@c515881). Horovod needs to be compatible with PyTorch 1.5, so I pass `*args, **kwargs` to `zero_grad` in this PR, which is safe in both old and new PyTorch versions.

Signed-off-by: Jinzhe Zeng <[email protected]>

github-actions · 2024-03-02T00:11:08Z

Unit Test Results

568 files (-156), 568 suites (-156), duration 6h 43m 3s ⏱️ (-1h 23m 4s)
887 tests (±0): 692 ✅ (-76), 195 💤 (+76), 0 ❌ (±0)
12 738 runs (-3 471): 8 641 ✅ (-2 712), 4 097 💤 (-759), 0 ❌ (±0)

Results for commit 8fb5874. ± Comparison against base commit 9f88e1d.

This pull request skips 76 tests.

test.parallel.test_mxnet1.MX1Tests ‑ test_gluon_trainer
test.parallel.test_mxnet1.MX1Tests ‑ test_gpu_required
test.parallel.test_mxnet1.MX1Tests ‑ test_horovod_allreduce_cpu_gpu_error
test.parallel.test_mxnet1.MX1Tests ‑ test_horovod_grouped_allgather_cpu_gpu_error
test.parallel.test_mxnet1.MX1Tests ‑ test_horovod_grouped_allreduce_cpu_gpu_error
test.parallel.test_tensorflow.TensorFlowTests ‑ test_gpu_required
test.parallel.test_tensorflow.TensorFlowTests ‑ test_horovod_allgather_fused_gpu
test.parallel.test_tensorflow.TensorFlowTests ‑ test_horovod_allgather_gpu
test.parallel.test_tensorflow.TensorFlowTests ‑ test_horovod_allgather_grad_gpu
test.parallel.test_tensorflow.TensorFlowTests ‑ test_horovod_allgather_variable_size_fused_gpu
…

github-actions · 2024-03-02T00:12:04Z

Unit Test Results (with flaky tests)

568 files (-320), 568 suites (-320), duration 6h 43m 3s ⏱️ (-2h 15m 38s)
887 tests (±0): 692 ✅ (-76), 195 💤 (+76), 0 ❌ (±0)
12 738 runs (-7 501): 8 641 ✅ (-5 148), 4 097 💤 (-2 353), 0 ❌ (±0)

Results for commit 8fb5874. ± Comparison against base commit 9f88e1d.

This pull request skips 76 tests.

test.parallel.test_mxnet1.MX1Tests ‑ test_gluon_trainer
test.parallel.test_mxnet1.MX1Tests ‑ test_gpu_required
test.parallel.test_mxnet1.MX1Tests ‑ test_horovod_allreduce_cpu_gpu_error
test.parallel.test_mxnet1.MX1Tests ‑ test_horovod_grouped_allgather_cpu_gpu_error
test.parallel.test_mxnet1.MX1Tests ‑ test_horovod_grouped_allreduce_cpu_gpu_error
test.parallel.test_tensorflow.TensorFlowTests ‑ test_gpu_required
test.parallel.test_tensorflow.TensorFlowTests ‑ test_horovod_allgather_fused_gpu
test.parallel.test_tensorflow.TensorFlowTests ‑ test_horovod_allgather_gpu
test.parallel.test_tensorflow.TensorFlowTests ‑ test_horovod_allgather_grad_gpu
test.parallel.test_tensorflow.TensorFlowTests ‑ test_horovod_allgather_variable_size_fused_gpu
…

njzjz force-pushed the patch-3 branch from c8d6c87 to 8fb5874 · March 1, 2024 22:45
