starting visit in a directory with 200K files is slow #19473
Comments
Do they have to start it there?
I think we could be smarter about this. I think we may be calling I'd bet other software doesn't take long to launch in such a context.
Why do we scan at startup? Why not only when the user requests File->Open or any other such command that needs the file list?
Great question and, honestly, I don't know for sure that we do scan on startup. I think we do somewhere early though and, in line with your thoughts, I think we do it unnecessarily. We should do it as you describe and, perhaps, with some limits on what we try to scan.
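To make the "only scan when asked, and with some limits" idea concrete, here is a minimal sketch in Python. It is not VisIt code: the class name `LazyFileList` and the `max_entries` cap are hypothetical, chosen only to illustrate deferring the directory read until something actually needs the file list.

```python
import os
from itertools import islice


class LazyFileList:
    """Hypothetical helper: defer the directory scan until first use."""

    def __init__(self, path, max_entries=50000):
        self.path = path
        self.max_entries = max_entries  # assumed cap, not a VisIt setting
        self._names = None              # nothing is read at construction time
        self.truncated = False

    def names(self):
        # The scan happens here, on first request (e.g. a File->Open dialog),
        # and reads at most max_entries + 1 entries from the directory.
        if self._names is None:
            with os.scandir(self.path) as entries:
                batch = [e.name for e in islice(entries, self.max_entries + 1)]
            self.truncated = len(batch) > self.max_entries
            self._names = sorted(batch[: self.max_entries])
        return self._names
```

Constructing the object costs nothing regardless of directory size; `truncated` tells the caller that the cap was hit so the UI could say so rather than silently showing a partial list.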
FYI, there are these notes from a 2015 tutorial series.
Good find!
The user reports that starting in another dir and then browsing to that dir takes about 10 seconds to list the files. Starting VisIt in that dir takes about 20 min to start up. I wonder if there are other checks going on.
Still concerned that this user seems to be working in a way that suggests they think this approach is "normal". I thought all of our codes now are putting
I did some quick tests on my laptop with a dir with 20,000
Is there any config setting the user could have that may exacerbate issues?
I added a
Didn't see any attempts to scan file contents. But, on macOS, we may be able to get regular file vs. directory entry back from a From ChatGPT... In general, when using The
The Using
I think the issue is that Lustre doesn't support the
Here is what we could do. In cases where the size of a directory is above K entries (or K entries falling into some common pattern naming), we could opt to assume all entries are regular files and avoid the
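A rough sketch of that threshold heuristic, in Python rather than VisIt's own code. The original sentence is cut off before naming the call to avoid, so this assumes it is a per-entry `stat`-style type check; the function name `classify_entries` and the default `threshold` are hypothetical.

```python
import os
import stat


def classify_entries(path, threshold=10000):
    """Return (name, kind) pairs; skip per-entry stat in very large dirs."""
    names = os.listdir(path)
    if len(names) > threshold:
        # Above the threshold: assume everything is a regular file and
        # make no per-entry metadata calls at all.
        return [(name, "file") for name in names]

    result = []
    for name in names:
        try:
            mode = os.stat(os.path.join(path, name)).st_mode
        except OSError:
            # Entry vanished or is a dangling link; do not guess its type.
            result.append((name, "unknown"))
            continue
        result.append((name, "dir" if stat.S_ISDIR(mode) else "file"))
    return result
```

The trade-off is that subdirectories inside an over-threshold directory would be reported as files, so anything that navigates into them would need to check the type on demand.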
So, I am wrong. I just tested the above
I think this case was on GPFS, which might not be able to handle the metadata ops well.
Well, as I mentioned above, I tested both GPFS (IBM Spectrum Scale now) and Lustre on a dir 10% the size (in terms of inodes anyways) and saw nothing approaching even 10 seconds' worth of delay.
Was this on SCF by any chance?
Yes
Ok, I'll perform some similar tests on SCF next Tuesday.
@cyrush do we know what version of VisIt this was?
Ok, I created a dir on |
Ok, I've taken a closer look with I created a dir with 100,000 temp symlinks (random names) all pointing to a single, 30 MB, Silo file. The situation with VisIt doing However, 100,000 I do not think the delay in VisIt startup is related to mdserver getting the current file list. From what I can tell, the mdserver starts quickly but the GUI splash screen seems to hang after printing a message The timings files all show very small values for each activity and then end with
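For anyone wanting to reproduce a test directory like the one described (many randomly named symlinks pointing at one file), the following Python sketch may help. It is an assumption about how such a directory could be built and timed, not the commands actually used above; `make_symlink_dir` and `time_listing` are hypothetical helper names.

```python
import os
import time
import uuid


def make_symlink_dir(target, directory, count):
    """Create `count` randomly named symlinks in `directory`, all -> target."""
    os.makedirs(directory, exist_ok=True)
    for _ in range(count):
        os.symlink(target, os.path.join(directory, uuid.uuid4().hex))


def time_listing(directory):
    """Time reading the names separately from stat-ing every entry."""
    t0 = time.perf_counter()
    names = os.listdir(directory)
    t1 = time.perf_counter()
    for name in names:
        os.stat(os.path.join(directory, name))  # follows the symlink
    t2 = time.perf_counter()
    return len(names), t1 - t0, t2 - t1
```

Calling `make_symlink_dir("/path/to/one.silo", "/path/to/testdir", 100000)` followed by `time_listing("/path/to/testdir")` (both paths are placeholders) separates the cost of reading names from the cost of the per-entry metadata calls, which is the distinction the thread is trying to pin down.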
Describe the bug
A user reports that when VisIt is started in a directory with 200K files on a GPFS file system, the GUI hangs at launch and takes quite some time to come up.
Pretty sure it's listing those files, and there may not be much we can do to avoid that cost.