52 gpu utilization #58

Merged
merged 6 commits into from
Jun 26, 2024

Collaborator

@giorgiapitteri giorgiapitteri commented Jun 20, 2024

PR related to the issue #52

GPU utilization now reaches a maximum of 70%. It still drops to 10% for some operations.

Changes made:

  • The whole dataset is loaded onto the GPU (CUDA device).
  • Covariance operations are performed on the GPU.
  • The batch size is increased to 128.

@giorgiapitteri changed the title from "52 gpu utilization" to "DRAFT: 52 gpu utilization" on Jun 20, 2024
 class Flow(torch.nn.Module):
     """Base implementation of a flow model"""

     # Export mode determines whether the log_prob or the sample function is exported to onnx
     export_modes = Literal["log_prob", "sample"]
     export: export_modes = "log_prob"
-    device = "cpu"
+    device = "cuda"
Collaborator Author

Is this field "device" used somewhere?

Collaborator

@fariedabuzaid fariedabuzaid Jun 20, 2024

The class variable acts as a default value in case someone accesses the device before calling to():

obj = Flow(...)
obj.device

Since strings are immutable, the class variable cannot be modified in place through an instance. Instead, an instance variable "device" is created when someone sets the device on the object. As soon as that happens, the instance variable takes precedence over the class variable.

In case you are not familiar with class variables, have a look at this post
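
The lookup behavior described above can be illustrated with a minimal sketch. The class FlowSketch is a hypothetical plain-Python stand-in, not the project's Flow class (which derives from torch.nn.Module); it only shows how a class-level default is shadowed by an instance attribute on first assignment.

```python
class FlowSketch:
    """Hypothetical stand-in for Flow; only the attribute lookup matters here."""

    # Class variable: a shared default that every instance can read.
    device = "cpu"


obj = FlowSketch()

# No instance attribute exists yet, so the lookup falls back to the class.
default_device = obj.device
had_instance_attr = "device" in vars(obj)

# Assigning through the instance creates an instance attribute
# that shadows the class variable for this object only.
obj.device = "cuda"
instance_device = obj.device

# The class variable itself, and other instances, are unaffected.
class_device = FlowSketch.device
other_device = FlowSketch().device
```

Here obj.device reads "cpu" before the assignment and "cuda" afterwards, while FlowSketch.device and any freshly created instance still report "cpu".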

Anyway, this assumes that the initial location is always cpu. It is probably better to have it as another instantiation argument with default value "cpu". Then you can change the value in the config file.
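
A minimal sketch of that suggestion, assuming a hypothetical constructor signature and a plain dict standing in for the config file (neither is taken from the veriflow code base):

```python
class FlowWithDeviceArg:
    """Hypothetical stand-in for Flow that takes the device at construction."""

    def __init__(self, device: str = "cpu"):
        # Instance attribute set once at construction; "cpu" stays the default.
        self.device = device


# Assumed config layout, for illustration only.
config = {"flow": {"device": "cuda"}}

# Without a config entry the default applies.
default_flow = FlowWithDeviceArg()

# With a config entry the value is passed through as a keyword argument.
configured_flow = FlowWithDeviceArg(**config["flow"])
```

With this layout the initial location is no longer hard-coded in the class body, and switching between "cpu" and "cuda" only requires editing the config entry.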

Collaborator Author

Thanks for the explanation. I was indeed wondering why we then need to pass it as an argument to the fit method. I now understand that it is the variable we also set in the config when creating the Flow object, if I am not mistaken.

Collaborator

@fariedabuzaid fariedabuzaid left a comment

Looks good. I only noticed two potentially unnecessary casts to torch.Tensor, which might cause some unnecessary copying.

src/veriflow/flows.py (outdated, resolved)
src/veriflow/flows.py (outdated, resolved)
@giorgiapitteri changed the title from "DRAFT: 52 gpu utilization" to "52 gpu utilization" on Jun 25, 2024
@giorgiapitteri giorgiapitteri marked this pull request as ready for review June 25, 2024 07:01
@giorgiapitteri giorgiapitteri mentioned this pull request Jun 25, 2024
3 tasks
Collaborator

@fariedabuzaid fariedabuzaid left a comment

Thanks, great improvement!

Summary:

  • On MNIST3 with a batch size of 128, GPU utilization is:
    • Mostly between 50% and 70%
    • Some sudden, short drops to 1% (after an epoch?)
  • Hence, problem (mostly) solved

@fariedabuzaid fariedabuzaid merged commit bdd121f into main Jun 26, 2024
7 checks passed
@fariedabuzaid fariedabuzaid deleted the 52_gpu_utilization branch June 26, 2024 17:12