ddp ERROR #71
Comments
Hey @liyingjie1991, are you using torch compile while training? I haven't personally tested training with this configuration, but I would expect it to work, since training uses static shapes. The generate step during evaluation probably won't work, though: Transformers uses a dynamic k/v cache there, so the shapes are dynamic. If you're using torch compile, could you try disabling it for evaluation?
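One way to read the suggestion above, as a minimal sketch rather than this project's actual training code: compile the model only for the training step and keep the uncompiled (eager) model for evaluation and generation. The helper name `split_train_eval` and its `compile_fn` parameter are hypothetical and do not come from the repository.

```python
from typing import Any, Callable, Optional, Tuple


def split_train_eval(
    model: Any,
    compile_fn: Optional[Callable[[Any], Any]] = None,
) -> Tuple[Any, Any]:
    """Return (train_model, eval_model).

    Hypothetical helper: train_model is the compiled wrapper when a
    compile function is given, eval_model is always the original
    eager model, so dynamic-shape generation never runs through the
    compiler.
    """
    # Compile only if a compile function was supplied.
    train_model = compile_fn(model) if compile_fn is not None else model
    # Evaluation keeps the uncompiled model.
    eval_model = model
    return train_model, eval_model
```

With PyTorch 2.x this could be called as `train_model, eval_model = split_train_eval(model, compile_fn=torch.compile)`. The module returned by `torch.compile` wraps the original one and shares its parameters, so weights updated through `train_model` are visible through `eval_model`.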
Hi, when I run the training code, I get the following error. Can you give me some advice?
`  File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 163, in time_wrapper
    r = func(*args, **kwargs)
  File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/output_graph.py", line 675, in call_user_compiler
    raise BackendCompilerFailed(self.compiler_fn, e) from e
torch._dynamo.exc.BackendCompilerFailed: compile_fn raised TypeError: _convert_frame_assert() missing 1 required positional argument: 'hooks'
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
    torch._dynamo.config.suppress_errors = True
Traceback (most recent call last):
  File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/output_graph.py", line 670, in call_user_compiler
    compiled_fn = compiler_fn(gm, self.fake_example_inputs())
  File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/backends/distributed.py", line 203, in compile_fn
    return self.backend_compile_fn(gm, example_inputs)
TypeError: _convert_frame_assert() missing 1 required positional argument: 'hooks'`
Version:
torch: '2.0.1+cu117'
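The error message itself suggests falling back to eager mode. A minimal sketch of applying that setting near the top of the training script is below. The function name `enable_eager_fallback` is hypothetical. This only hides the compile failure and runs the affected code uncompiled; it does not address the underlying `TypeError`.

```python
try:
    # torch._dynamo ships with PyTorch 2.x; the import fails if torch is absent.
    import torch._dynamo as dynamo
except ImportError:
    dynamo = None


def enable_eager_fallback() -> bool:
    """Ask TorchDynamo to fall back to eager mode when compilation fails.

    Returns True if the setting was applied, False if torch._dynamo
    could not be imported.
    """
    if dynamo is None:
        return False
    # Setting quoted in the error message above.
    dynamo.config.suppress_errors = True
    return True
```

Calling `enable_eager_fallback()` before the first compiled forward pass should be enough, since compilation happens on the first call.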