ReproZip tends to pack all fonts #360
Comments
Files that are only stat'ed (their content is not read) don't get packed, so it is probable that the program actually read all those files. I can see that being a problem size-wise, and you might want to remove them from the configuration file. Unfortunately, ReproZip can't really tell the difference between files "opened and needed" and files "opened just to look". Special code could be written for the case of fonts, but only if we can figure out which ones actually get used...
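For anyone who wants to prune the list by hand, here is a minimal sketch of that kind of filtering. It assumes you already have the paths from the other_files section of config.yml as a list of strings; the function name split_font_paths, the suffix set, and the example paths are made up for illustration and are not part of ReproZip.

```python
from pathlib import PurePosixPath

# Assumed set of font file extensions; adjust for your system.
FONT_SUFFIXES = {".ttf", ".otf", ".ttc", ".pfb", ".afm"}


def split_font_paths(paths, keep=()):
    """Split paths into (kept, dropped_fonts).

    Paths listed in `keep` are never dropped, so a font that the
    experiment really uses can stay in the pack.
    """
    keep = set(keep)
    kept, dropped = [], []
    for path in paths:
        is_font = PurePosixPath(path).suffix.lower() in FONT_SUFFIXES
        if is_font and path not in keep:
            dropped.append(path)
        else:
            kept.append(path)
    return kept, dropped


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    example = [
        "/usr/share/fonts/example/Used.ttf",
        "/usr/share/fonts/example/Unused.ttf",
        "/home/user/model/input.csv",
    ]
    kept, dropped = split_font_paths(example, keep=[example[0]])
    print("keep:", kept)
    print("drop:", dropped)
```

Reviewing the dropped list before editing the configuration file seems safer than deleting entries blindly, since a font the program really needs would otherwise go missing from the pack.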
I can reproduce it here: matplotlib looks at all the fonts, and they actually get read. I am not sure there is anything we can do about it 😅 Maybe something based on the specific parts of the files that are read, or we could hook whatever library is used to enumerate the fonts? Thank you for reporting!
Could it be possible to filter specific files from the rpz? E.g. everything in
The many font files might also be the issue why the created graphs are unusable... and I noticed a few copyrighted fonts there, too, which I shouldn't upload as part of an rpz to a repo.
The file
Without the font cache,
With the font cache and a manual configuration of the default font only the selected font (
Maybe the recommendation is simply to run the trace twice and define a default font if you use matplotlib?
@remram44 It would be great if you could reproduce my findings. Not sure what reprozip can do better... reducing the size of the rpz is not really my issue, but the understandability of the
tl;dr By running
Touching/reading all the installed fonts to build a font cache blows up the number of files listed and packed by reproducibility-tracing tools, e.g. ReproZip (see VIDA-NYU/reprozip#360)
This might be something ReproZip could look for, e.g. if
Or even check if the trace writes anything to
This is likely to catch things that write there every time, so we can't recommend that the user run the experiment a second time for all those files. There could be a warning though, the same as the one we have for files that are read then overwritten.
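To illustrate the idea of such a warning, here is a rough sketch (not ReproZip's code). It assumes the trace has been reduced to (path, mode) pairs with mode "r" or "w", and that the caller supplies the cache directories to watch; the function name, the record format, and the example paths are all hypothetical.

```python
def find_cache_writes(accesses, cache_prefixes):
    """Return the sorted paths under any of `cache_prefixes` that were written.

    `accesses` is an iterable of (path, mode) pairs, where mode is "r"
    for a read and "w" for a write. This record format is a
    simplification for illustration, not ReproZip's trace schema.
    """
    # Normalize prefixes so "/cache/x" does not also match "/cache/xy".
    prefixes = tuple(p.rstrip("/") + "/" for p in cache_prefixes)
    written = {
        path
        for path, mode in accesses
        if mode == "w" and path.startswith(prefixes)
    }
    return sorted(written)


if __name__ == "__main__":
    # Hypothetical trace and cache directory, for illustration only.
    trace = [
        ("/usr/share/fonts/example/A.ttf", "r"),
        ("/home/user/.cache/example/fontlist.json", "w"),
        ("/home/user/results/plot.png", "w"),
    ]
    for path in find_cache_writes(trace, ["/home/user/.cache/example"]):
        print("warning: the run wrote to a cache directory:", path)
```

Since a program may write to its cache on every run, this would only be good for a hint to the user, not for an automatic decision about what to pack.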
I am trying to pack a Python 2-based model, and see a lot of font-related files in my config.yml.
Note: I have not packed or unzipped yet!
More information below. A "Don't worry about it, storage for fonts is cheap" might be a perfectly fine answer to this issue.
Snippet of packages:
Snippet of other_files:
I'm roughly doing the following:
I am running this in a virtual environment. If that might shake things up, I'm happy to extend the script above to also do that.