DataLoader is very slow when using SubjectsDataset #941
Comments
Hi, @ivezakis. You are using one process to load 8 images, so it will be 8 times slower. This is expected. To make it faster, you should use a
Hi @fepegar, in fact I am using the maximum number of workers for my machine in the dataloader, num_workers = 12. Sorry, the code I provided did not reflect that. Please consider re-opening this. The difference is rather large in my experience. For a batch size of 8, it is over 40 times slower. Picture attached. Edit: I also tried it with a batch size of one, and it's 6.8 seconds vs 3.6.
Yes, I have run into the same problem. It is very, very slow (at least 30 times the actual model training time), but I don't have a good way to resolve it. Have you found a good method?
hi
Can you please provide a minimal, reproducible example?
@romainVala I've also noticed that behavior. For example, in a DGX with 40 cores, my code was fastest using only 12.
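Since more workers is not always faster, it may help to measure a few worker counts rather than assume the maximum is best. Below is a minimal sketch using only the standard library. The names `sweep_worker_counts`, `fastest`, and `make_loader` are hypothetical helpers made up for this example and are not part of TorchIO or PyTorch.

```python
import time
from typing import Callable, Dict, Iterable, Sequence


def sweep_worker_counts(
    make_loader: Callable[[int], Iterable],
    worker_counts: Sequence[int],
    max_batches: int = 20,
) -> Dict[int, float]:
    """Time the first `max_batches` batches for each worker count.

    `make_loader` is a hypothetical factory: given a worker count,
    it returns a fresh iterable (for example, a DataLoader).
    """
    timings = {}
    for n in worker_counts:
        loader = make_loader(n)
        start = time.perf_counter()
        for i, _ in enumerate(loader):
            if i + 1 >= max_batches:
                break
        # Includes worker start-up time, which is part of the cost.
        timings[n] = time.perf_counter() - start
    return timings


def fastest(timings: Dict[int, float]) -> int:
    """Return the worker count with the smallest elapsed time."""
    return min(timings, key=timings.get)
```

With PyTorch, `make_loader` could be something like `lambda n: torch.utils.data.DataLoader(dataset, batch_size=8, num_workers=n)`, where `dataset` is your own SubjectsDataset. Timings depend heavily on the machine and on disk caching, so it is worth repeating the sweep more than once.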
Yes, after I increased the num_workers of the Queue to 16, preparing the dataloader got faster. By the way, I found that the transforms I use influence the speed: when I removed RandomAffine(degrees=20), the load time was cut in half.
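To check which transform dominates the loading time, one option is to time each transform on a single sample outside the dataloader. This is only a rough sketch with the standard library. `time_transforms` is a hypothetical helper name, and the only assumption is that each transform is a callable that takes a sample.

```python
import time
from typing import Any, Callable, Dict


def time_transforms(
    sample: Any,
    transforms: Dict[str, Callable[[Any], Any]],
    repeats: int = 3,
) -> Dict[str, float]:
    """Return the mean seconds per call for each named transform."""
    results = {}
    for name, transform in transforms.items():
        start = time.perf_counter()
        for _ in range(repeats):
            transform(sample)
        results[name] = (time.perf_counter() - start) / repeats
    return results
```

For example, `time_transforms(subject, {"affine": tio.RandomAffine(degrees=20)})` would report the average cost of that transform on one subject, assuming `subject` is a loaded `tio.Subject` and `tio` is the imported `torchio` module.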
Is there an existing issue for this?
Problem summary
When using SubjectsDataset with a PyTorch DataLoader, iterating over the dataloader is far slower than expected. Naturally, this slows training down as well.
Iterating over the SubjectsDataset directly, however, is significantly faster.
In my experience, iterating over the SubjectsDataset starts within a few seconds (<10), while the dataloader takes more than a minute to yield its first batch.
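To put numbers on this comparison, a small timing helper can separate the time until the first item from the total time. This is a sketch only and not the original reproduction code. `time_iteration` is a hypothetical name, and the helper works on any iterable.

```python
import time
from typing import Iterable, Tuple


def time_iteration(iterable: Iterable, max_items: int = 10) -> Tuple[float, float, int]:
    """Return (seconds until first item, total seconds, items consumed)."""
    start = time.perf_counter()
    first = None
    count = 0
    for _ in iterable:
        count += 1
        if first is None:
            # Start-up cost: for a DataLoader this includes spawning workers.
            first = time.perf_counter() - start
        if count >= max_items:
            break
    total = time.perf_counter() - start
    if first is None:
        first = total  # the iterable was empty
    return first, total, count
```

Calling it once on the dataset and once on the loader built from it gives two directly comparable triples. Note that with a batch size of 8, each item from the loader holds 8 subjects, so the per-item times should be compared with that in mind.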
Code for reproduction
Actual outcome
Iterating over the loader is much slower than iterating over the subjects.
Error messages
No response
Expected outcome
Performance should be similar.
System info