With GPU enabled, tensorflow freezes unless I force "Discounted Monte-Carlo returns." to CPU #55
Comments
Thank you for investigating this! Computing returns on the CPU seems reasonable anyway. Would you like to create a PR with this change? I don't know of a more efficient way to debug deadlocks in TensorFlow; personally, I usually disable code until I can locate the problematic part.
I don't really understand why forcing this particular part of the graph onto CPU fixes (or masks) the issue. So I'll not be able to explain what this PR does ;)
Haha, okay. It seems like there is a problem when running the
With GPU enabled, TensorFlow freezes unless I force "Discounted Monte-Carlo returns." to CPU. Adding with tf.device("/cpu") inside discounted_return(reward, length, discount) seems to address the issue.
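For anyone who wants to check that pinning the op to CPU does not change the numbers, here is a rough plain-Python sketch of what a length-masked discounted Monte-Carlo return computes. It is not the project's TensorFlow code. The name discounted_return_reference is hypothetical, and the meaning of the arguments (reward as one list of rewards per episode, length as the number of valid steps per episode) is an assumption based on the signature above.

```python
def discounted_return_reference(reward, length, discount):
    """Plain-Python reference for masked discounted returns.

    Assumed meaning of the arguments (not confirmed by the project):
      reward:   list of per-episode reward lists, padded to equal length
      length:   number of valid (non-padded) steps in each episode
      discount: discount factor applied per time step
    """
    returns = []
    for episode, valid in zip(reward, length):
        out = [0.0] * len(episode)
        running = 0.0
        # Walk backwards so each step accumulates the discounted future reward.
        for t in reversed(range(len(episode))):
            r = episode[t] if t < valid else 0.0  # ignore padded steps
            running = r + discount * running
            out[t] = running
        returns.append(out)
    return returns


example = discounted_return_reference(
    reward=[[1.0, 1.0, 1.0], [1.0, 1.0, 9.0]],
    length=[3, 2],
    discount=0.5,
)
print(example)  # [[1.75, 1.5, 1.0], [1.5, 1.0, 0.0]]
```

Running the TensorFlow op on the same small input, once on CPU and once on GPU (if the GPU run completes at all), and comparing against these values would at least confirm that the workaround only moves the computation and does not alter it.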
I've seen this with TF 1.7, 1.11, 1.12, CUDA 8, 9, 10 and CUDA compute capability from 5.2 to 7.5.
I'm not sure how to debug TensorFlow when it quietly freezes (or crashes). I tried TensorFlow Debugger, but it doesn't really show where the freeze happens and also has gRPC issues. GDB shows that the process is in the following place, but with so many threads it is hard to tell whether this has any relevance:
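Not a fix, but one more thing that could be tried alongside GDB: Python's standard faulthandler module can dump the Python stack of every thread if a call has not returned after a timeout. The sketch below is only an illustration. run_with_watchdog and simulated_hang are hypothetical names, and the sleep stands in for whatever call freezes. It shows Python frames only, so it would narrow down which Python-level call is stuck, not which C++ thread inside TensorFlow is deadlocked.

```python
import faulthandler
import tempfile
import time


def run_with_watchdog(fn, timeout_s, log_file):
    """Run fn(); if it has not returned after timeout_s seconds, dump the
    Python traceback of all threads to log_file. The process keeps running."""
    faulthandler.dump_traceback_later(timeout_s, repeat=False, file=log_file)
    try:
        return fn()
    finally:
        # Cancel the pending dump if fn() returned before the timeout.
        faulthandler.cancel_dump_traceback_later()


def simulated_hang():
    # Hypothetical stand-in for a call that freezes (e.g. a training step).
    time.sleep(0.5)
    return "done"


# faulthandler writes to the file descriptor, so a real file is needed.
with tempfile.TemporaryFile(mode="w+") as log:
    result = run_with_watchdog(simulated_hang, timeout_s=0.1, log_file=log)
    log.seek(0)
    dump = log.read()

print(dump)
```

In a real run the timeout would be set well above the normal duration of one step, and the file would be a regular log file so the dump survives if the process has to be killed.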