Hi, thank you very much for your project and your effort!
Do you have any idea why train.py doesn't work? I have TensorFlow 1.4.0. Thank you very much!
pci bus id: 0000:00:0a.0, compute capability: 6.0)
2018-07-16 10:10:26.461140: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:0b.0, compute capability: 6.0)
2018-07-16 10:10:26.461146: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:0c.0, compute capability: 6.0)
2018-07-16 10:10:26.461151: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:3) -> (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:0d.0, compute capability: 6.0)
restore and continue training!
244.201 sec
520.449 sec
Model saved in file: s3://bucket-7601/models/model.ckpt-1001
523.704 sec
Model saved in file: s3://bucket-7601/models/model.ckpt-1152
Traceback (most recent call last):
File "train/train_tf.py", line 147, in
log_dir=args.log_path, start_lr=args.learning_rate, wd=args.weight_decay, kp=args.keep_prob)
File "train/train_tf.py", line 125, in run_training
coord.join(threads)
File "/usr/local/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/usr/local/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/queue_runner_impl.py", line 238, in _run
enqueue_callable()
File "/usr/local/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1231, in _single_operation_run
target_list_as_strings, status, None)
File "/usr/local/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.DataLossError: truncated record at 2904972900
[[Node: input/ReaderReadV2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/device:CPU:0"](input/TFRecordReaderV2, input/input_producer)]]
Hello, has this problem been solved? I have the same problem.
> Hello, has this problem been solved? I have the same problem.

Sorry, I haven't solved the problem so far.
Hi, how do you use multiple GPUs?
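The `DataLossError: truncated record at 2904972900` in the traceback is raised by the TFRecord reader when the file ends before a complete record can be read at that byte offset. One way to narrow this down is to walk the record framing of the input file directly. The following is only a minimal sketch, not part of this project: the function name `scan_tfrecord_framing` is hypothetical, it assumes the TFRecord file is available on a local disk, and it relies on the standard TFRecord layout (8-byte little-endian length, 4-byte length CRC, payload, 4-byte payload CRC). It does not verify the CRCs, so it can only detect truncation, not other corruption.

```python
import os
import struct


def scan_tfrecord_framing(path):
    """Walk the record framing of a local TFRecord file.

    Returns (complete_records, truncated_offset), where truncated_offset is
    the byte offset of the first record that runs past the end of the file,
    or None if every record is complete. CRCs are NOT checked.
    """
    size = os.path.getsize(path)
    count = 0
    offset = 0
    with open(path, "rb") as f:
        while offset < size:
            # Header: uint64 payload length + uint32 CRC of the length.
            header = f.read(12)
            if len(header) < 12:
                return count, offset
            (length,) = struct.unpack("<Q", header[:8])
            # Payload is followed by a uint32 CRC of the payload.
            end = offset + 12 + length + 4
            if end > size:
                return count, offset
            f.seek(end)
            offset = end
            count += 1
    return count, None
```

Calling `scan_tfrecord_framing("train.tfrecords")` (a placeholder file name) returns the number of complete records and, if the file is cut short, the offset where the incomplete record starts. If that offset matches the one in the error message, the file was probably written or copied incompletely and may need to be regenerated.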