Error when training is done #14

wxx07 · 2019-10-15T08:42:19Z

Hi, I used the following command to train a fluorescence task with pretrained transformer weights (I used this config for a test run):
tape with model=transformer tasks=fluorescence gpu.device=0 load_from=pretrained_models/transformer_weights.h5 num_epochs=1 steps_per_epoch=10

When the training finished, I got an error. The traceback is as follows:

Traceback (most recent calls WITHOUT Sacred internals):
  File "/work01/home/wxxie/project/biolang/tape/tape/__main__.py", line 330, in main
    train_metrics = train_graph.run_for_n_steps(_config['steps_per_epoch'], epoch_num=epoch)
  File "/work01/home/wxxie/conda-env/tape/lib/python3.6/site-packages/rinokeras/core/v1x/train/RinokerasGraph.py", line 196, in run_for_n_steps
    self.run('default')
  File "/work01/home/wxxie/conda-env/tape/lib/python3.6/site-packages/rinokeras/core/v1x/train/RinokerasGraph.py", line 132, in __exit__
    self.progress_bar.__exit__()
TypeError: __exit__() missing 3 required positional arguments: 'exc_type', 'exc_value', and 'traceback'

It seems to me the error comes from rinokeras but I am new to it and I have installed the required version of rinokeras. Could you help?

Thanks

Bruce

fishguysword · 2019-11-14T02:38:34Z

same problem

rmrao · 2019-11-14T02:50:19Z

Strange. Plausibly tqdm was updated and we don't handle the versions correctly. For a quick and dirty solution, you can just edit the RinokerasGraph.py file on line 132, replacing self.progress_bar.__exit__() with self.progress_bar.__exit__(None, None, None).
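To illustrate why the edit above helps, here is a minimal sketch in plain Python. It does not use rinokeras or tqdm: `ProgressBar` and `Graph` are hypothetical stand-ins, and the only assumption carried over from the traceback is that the progress bar's `__exit__` requires the three standard context-manager arguments (`exc_type`, `exc_value`, `traceback`).

```python
class ProgressBar:
    """Hypothetical stand-in for a progress bar whose __exit__
    requires the three standard context-manager arguments."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.closed = True
        return False  # do not swallow exceptions


def call_exit_without_args(bar):
    """Reproduce the failing call and return the error message, if any."""
    try:
        bar.__exit__()  # same shape as the call on line 132
    except TypeError as err:
        return str(err)
    return None


class Graph:
    """Hypothetical wrapper that owns a progress bar, loosely
    mirroring the structure suggested by the traceback."""

    def __init__(self):
        self.progress_bar = ProgressBar()

    def __enter__(self):
        self.progress_bar.__enter__()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Forward whatever the interpreter passed in; on a clean
        # exit these are all None, which matches the quick fix.
        return self.progress_bar.__exit__(exc_type, exc_value, traceback)


if __name__ == "__main__":
    print(call_exit_without_args(ProgressBar()))
    with Graph() as graph:
        pass
    print(graph.progress_bar.closed)
```

Passing `(None, None, None)` is what Python itself passes to `__exit__` when a `with` block finishes without an exception, so the quick fix amounts to telling the progress bar that nothing went wrong. Forwarding the wrapper's own arguments, as in the sketch, is one possible alternative; whether it suits rinokeras is not something this thread confirms.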

I'm not sure if/when this will make it up into rinokeras itself - that library is mostly deprecated as everyone who was contributing to it has switched to pytorch :). If I have time in the coming weeks I'll see if I can push a fix.

We will also be releasing a pytorch version of TAPE. Ideally we'll try to have that out by NeurIPS 2019 (early December). This version will have a lot fewer dependencies, so hopefully will be easier to work with.

fishguysword · 2019-11-14T05:43:44Z

Solved! Thank you for your advice!

Kevin-chen-sheng · 2019-12-18T11:32:14Z

I reinstalled rinokeras, but it still didn't work

rmrao · 2019-12-18T17:09:52Z

@Kevin-chen-sheng This seems like you don't have the right tensorflow version. Tensorflow's distributed API has changed radically over the last several versions, and it's not possible for us to support all of them.

Kevin-chen-sheng · 2019-12-18T17:15:28Z

Could you please tell me the tensorflow-gpu version and the corresponding cuda version you use? My cuda version is 9.2.148, and I have tried tensorflow-gpu versions from 1.12 to 1.6 many times, but none of them can be installed, as they seem to conflict with other dependencies. The python version is 3.7.

rmrao · 2019-12-18T20:27:47Z

tensorflow-gpu 1.12 or 1.13 should work. Please look at directions for installing tensorflow - this is not a problem with tape. Tensorflow's pip binaries do not support versions 1.12 and 1.13 with python 3.7.
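For anyone unsure which versions they actually have, a small sketch like the following may help. It only reports the interpreter version and the installed tensorflow version (if any) and compares the latter against the 1.12/1.13 recommendation above. The helper names and the prefix list are illustrative assumptions, not part of tape.

```python
import sys

# Taken from the recommendation above; adjust if the maintainers
# state otherwise. This tuple is an assumption of the sketch.
RECOMMENDED_TF_PREFIXES = ("1.12.", "1.13.")


def installed_tensorflow_version():
    """Return the installed tensorflow version string, or None if
    tensorflow cannot be imported in this environment."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return getattr(tf, "__version__", None)


def is_recommended(tf_version):
    """True if the version string is a 1.12.x or 1.13.x release."""
    if tf_version is None:
        return False
    # Append a dot so that "1.12" matches but "1.120" does not.
    return (tf_version + ".").startswith(RECOMMENDED_TF_PREFIXES)


if __name__ == "__main__":
    print("python:", ".".join(str(part) for part in sys.version_info[:3]))
    version = installed_tensorflow_version()
    print("tensorflow:", version)
    print("matches 1.12/1.13 recommendation:", is_recommended(version))
```

If the reported Python version is 3.7 and the pip install of tensorflow 1.12 or 1.13 fails, creating a separate environment with an older Python is the direction the reply above points to.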
