Using Seq2rel function with cuda #387

Open
jjyhh1223 opened this issue Apr 18, 2023 · 3 comments

jjyhh1223 commented Apr 18, 2023

Here's how I pass the device argument to Seq2Rel:

from seq2rel import Seq2Rel
from seq2rel.common import util
model = 'model.tar.gz'
kwargs = {'cuda_device': 1}
seq2rel = Seq2Rel(model, **kwargs)

and got this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-15-d951d9bfd1c5> in <cell line: 2>()
      1 kwargs = {'cuda_device': 1}
----> 2 seq2rel = Seq2Rel(model, **kwargs)

12 frames
/content/drive/MyDrive/seq2rel/seq2rel/seq2rel.py in __init__(self, pretrained_model_name_or_path, **kwargs)
     86         if "overrides" in kwargs:
     87             overrides.update(kwargs.pop("overrides"))
---> 88         archive = load_archive(pretrained_model_name_or_path, overrides=overrides, **kwargs)
     89         self._predictor = Predictor.from_archive(archive, predictor_name="seq2seq")
     90 

/usr/local/lib/python3.9/dist-packages/allennlp/models/archival.py in load_archive(archive_file, cuda_device, overrides, weights_file)
    233             config.duplicate(), serialization_dir
    234         )
--> 235         model = _load_model(config.duplicate(), weights_path, serialization_dir, cuda_device)
    236 
    237         # Load meta.

/usr/local/lib/python3.9/dist-packages/allennlp/models/archival.py in _load_model(config, weights_path, serialization_dir, cuda_device)
    277 
    278 def _load_model(config, weights_path, serialization_dir, cuda_device):
--> 279     return Model.load(
    280         config,
    281         weights_file=weights_path,

/usr/local/lib/python3.9/dist-packages/allennlp/models/model.py in load(cls, config, serialization_dir, weights_file, cuda_device)
    436             # get_model_class method, that recurses whenever it finds a from_archive model type.
    437             model_class = Model
--> 438         return model_class._load(config, serialization_dir, weights_file, cuda_device)
    439 
    440     def extend_embedder_vocab(self, embedding_sources_mapping: Dict[str, str] = None) -> None:

/usr/local/lib/python3.9/dist-packages/allennlp/models/model.py in _load(cls, config, serialization_dir, weights_file, cuda_device)
    341         # in sync with the weights
    342         if cuda_device >= 0:
--> 343             model.cuda(cuda_device)
    344         else:
    345             model.cpu()

/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py in cuda(self, device)
    686             Module: self
    687         """
--> 688         return self._apply(lambda t: t.cuda(device))
    689 
    690     def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:

/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    576     def _apply(self, fn):
    577         for module in self.children():
--> 578             module._apply(fn)
    579 
    580         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    576     def _apply(self, fn):
    577         for module in self.children():
--> 578             module._apply(fn)
    579 
    580         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    576     def _apply(self, fn):
    577         for module in self.children():
--> 578             module._apply(fn)
    579 
    580         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    576     def _apply(self, fn):
    577         for module in self.children():
--> 578             module._apply(fn)
    579 
    580         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    576     def _apply(self, fn):
    577         for module in self.children():
--> 578             module._apply(fn)
    579 
    580         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    599             # `with torch.no_grad():`
    600             with torch.no_grad():
--> 601                 param_applied = fn(param)
    602             should_use_set_data = compute_should_use_set_data(param, param_applied)
    603             if should_use_set_data:

/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py in <lambda>(t)
    686             Module: self
    687         """
--> 688         return self._apply(lambda t: t.cuda(device))
    689 
    690     def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:

RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

What is the correct way of using Seq2Rel with CUDA? Thank you!

JohnGiorgi (Owner) commented Apr 18, 2023

seq2rel is just PyTorch, really, so this suggests a problem with your environment. Does 'cuda_device': 1 exist? (Can you see it when you run nvidia-smi?) Any chance you meant 'cuda_device': 0?
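
For reference, a quick way to check which device ordinals actually exist before constructing Seq2Rel (a minimal sketch using plain PyTorch; ordinals are zero-based, so a single-GPU machine only exposes device 0):

import torch

print(torch.cuda.is_available())    # True if PyTorch can see any GPU at all
print(torch.cuda.device_count())    # e.g. 1 means only 'cuda_device': 0 is valid
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))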


jjyhh1223 commented May 8, 2023

I tried doing the following:

from seq2rel import Seq2Rel
from seq2rel.common import util
model = 'model.tar.gz'
kwargs = {'cuda_device': 0}
seq2rel = Seq2Rel(model, **kwargs)

It worked, but I'm not sure whether the model is actually loaded on CUDA, because running the examples seems very slow (much slower than validation during training). Other than this, is there any way to run inference quickly with Seq2Rel?


JohnGiorgi commented May 8, 2023

To see if the GPU is being utilized, you can run nvidia-smi on the machine this code is running on and check whether GPU utilization is >0%.
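
If you prefer to check from Python, here is a small sketch. It pokes at the predictor's internal _model attribute, which is an AllenNLP implementation detail rather than a documented API, so treat it as a debugging aid only:

import torch

# Assumes `seq2rel` was built with {'cuda_device': 0} as above.
model = seq2rel._predictor._model          # internal attribute, not a public API
print(next(model.parameters()).device)     # expect something like cuda:0
print(torch.cuda.memory_allocated(0))      # > 0 once the weights live on GPU 0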

In general, inference speed will depend on quite a few arguments that you can control with the config or kwargs. max_length and beam_size have the largest effect, so I would try reducing either to see if you get a speed boost without hurting performance. use_amp will really help if your GPU supports mixed precision (and can actually slow things down if it doesn't). The larger the batch_size the better; for inference, use the largest size that doesn't cause OOM errors.
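
As an illustration only, these settings would typically be passed through the overrides dict that Seq2Rel forwards to load_archive. The key paths below are hypothetical; check the config.json inside your model.tar.gz for the real names and nesting:

from seq2rel import Seq2Rel

# Hypothetical override paths shown for illustration only.
kwargs = {
    "cuda_device": 0,
    "overrides": {
        "model.beam_search.beam_size": 1,   # greedy decoding, fastest
        "model.beam_search.max_steps": 96,  # shorter maximum output length
    },
}
seq2rel = Seq2Rel("model.tar.gz", **kwargs)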

The Seq2Rel interface is meant to be a convenience. If you are trying to evaluate on a bunch of examples, it might make more sense to use allennlp evaluate. You can see how to use that command here. The advice above on setting parameters for fastest inference is the same.
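
For instance, something along these lines (flag names follow the standard allennlp evaluate CLI; the data path and the --include-package name are assumptions for illustration):

allennlp evaluate model.tar.gz path/to/test_data.tsv \
    --cuda-device 0 \
    --batch-size 32 \
    --output-file metrics.json \
    --include-package seq2rel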
