A fast, robust, cross-platform Python utility that frees space on a data volume by replacing duplicate files with hard links.
- Python 3.x
- Modules:
  - pandas
  - tqdm
- At least 512 MB of free RAM for optimal performance; 1 GB is recommended
myhardlinker.py <path-to-the-directory>
myhardlinker.py C:\test
- this command recursively scans and deduplicates files in C:\test
myhardlinker.py .
- this command recursively scans and deduplicates files in the current directory
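
For readers who want a rough idea of what "scan and deduplicate" means, here is a minimal sketch of hash-based hardlinking using only the standard library. It is not the code of myhardlinker.py (which relies on pandas and tqdm); the function names `file_digest` and `dedupe`, the choice of SHA-256, and the `.hardlink-tmp` suffix are all assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe(root):
    """Hardlink duplicate regular files under root; return how many were replaced."""
    # Group files by (size, digest); files in one group have identical content.
    groups = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                groups[(os.path.getsize(path), file_digest(path))].append(path)

    replaced = 0
    for paths in groups.values():
        source = paths[0]  # the first file found is kept as the link source
        for dup in paths[1:]:
            if os.path.samefile(source, dup):
                continue  # already hardlinked to the source
            # Create the link under a temporary name, then swap it in,
            # so the duplicate is never missing if the link step fails.
            tmp = dup + ".hardlink-tmp"
            os.link(source, tmp)
            os.replace(tmp, dup)
            replaced += 1
    return replaced
```

Calling `dedupe(".")` would process the current directory. Note that hard links only work within a single volume, so this sketch assumes everything under `root` lives on the same file system.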
File systems limit how many hard links can point to a single file. For example, NTFS allows at most 1023 links per file, and ext4 allows 65000. Currently, if this limit is exceeded for a unique source file, the duplicates beyond the limit are left untouched.
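
The behaviour described above can be sketched as a link-count check before each hardlink. This is only an illustration under stated assumptions: `LINK_LIMITS`, `can_add_link`, and `link_if_possible` are hypothetical names, and the limits are hard-coded from the figures in this README rather than queried from the operating system.

```python
import os

# Assumed per-file link limits, copied from the text above (not queried from the OS).
LINK_LIMITS = {"ntfs": 1023, "ext4": 65000}


def can_add_link(source, limit):
    """Return True if source currently has fewer than `limit` hard links."""
    return os.stat(source).st_nlink < limit


def link_if_possible(source, dup, limit):
    """Replace dup with a hardlink to source unless the link limit is reached."""
    if not can_add_link(source, limit):
        return False  # limit reached: leave the duplicate untouched
    tmp = dup + ".hardlink-tmp"  # hypothetical temporary name
    os.link(source, tmp)
    os.replace(tmp, dup)
    return True
```

For example, `link_if_possible(src, dup, LINK_LIMITS["ext4"])` returns `False` and does nothing once `src` already has 65000 links.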
- Improve RAM usage efficiency
- Work around the file system link limits by using additional unique files with the same hash
- Ability to resume the script from where it left off after a pause or stop
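
One possible shape for the planned link-limit workaround, offered as a sketch rather than a committed design: split the files that share a hash into groups no larger than the limit, and keep one real copy per group as the link source. The helper name `plan_link_groups` is hypothetical, and the sketch assumes each file starts with a link count of 1.

```python
def plan_link_groups(paths, limit):
    """Split identical files into groups of at most `limit` paths.

    The first path of each group is kept as the source; the remaining
    paths in that group would be replaced with hardlinks to it, so no
    source ever ends up with more than `limit` links.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [paths[i:i + limit] for i in range(0, len(paths), limit)]
```

With a limit of 1023, a set of 2500 identical files would be planned as three groups, leaving three real copies on disk instead of one.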