valuxin/myhardlinker


myhardlinker

A cross-platform, fast, and robust Python utility that frees space on a data volume by replacing duplicate files with hard links.

Requirements

  • Python 3.x
  • Modules: pandas, tqdm
  • At least 512 MB of free RAM; 1 GB is recommended for optimal performance

Usage

myhardlinker.py <path-to-the-directory>

myhardlinker.py C:\test - recursively scans C:\test and deduplicates the files in it

myhardlinker.py . - recursively scans the current directory and deduplicates the files in it
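
To give a rough idea of what "scan and deduplicate" involves, here is a minimal sketch of the scanning half. It is not the script's actual code: the function names file_digest and find_duplicates are hypothetical, and the choice of SHA-256 and of pre-grouping by file size are assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group regular files under root by (size, digest); keep groups of 2+."""
    # Cheap first pass: only files of equal size can be duplicates.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Second pass: hash only the files whose size is shared with another file.
    groups = defaultdict(list)
    for size, paths in by_size.items():
        if size == 0 or len(paths) < 2:
            continue  # empty file or unique size: nothing to save
        for path in paths:
            groups[(size, file_digest(path))].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Each returned list holds paths with identical size and digest; the first path can serve as the link source and the others as candidates for replacement.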


Limitations

File systems limit how many hard links can point to a single file. For example, NTFS allows at most 1023 links per file and ext4 allows 65000. Currently, if this limit is exceeded for one unique source file, the duplicates beyond the limit are left untouched.
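
The behaviour described above can be sketched as follows. This is an illustration under assumptions, not the script's real implementation: the function name replace_with_hardlink, the max_links parameter, and the temporary file suffix are all hypothetical.

```python
import os


def replace_with_hardlink(source, duplicate, max_links):
    """Replace duplicate with a hard link to source.

    Returns True if the duplicate was replaced, False if it was left
    untouched (already linked, or the source reached max_links).
    """
    if os.path.samefile(source, duplicate):
        return False  # already the same file, nothing to do
    if os.stat(source).st_nlink >= max_links:
        return False  # link limit reached: leave the duplicate as it is
    temp = duplicate + ".hardlink-tmp"  # hypothetical temporary name
    os.link(source, temp)        # create the new link next to the duplicate
    os.replace(temp, duplicate)  # then swap it in over the old copy
    return True
```

Creating the link under a temporary name and then renaming it over the duplicate means the original copy is only dropped once the new link exists. A caller would pass the limit of the target file system as max_links, for example 1023 on NTFS.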

To Do

  • Improve RAM usage efficiency
  • Work around the file system link limits by using more than one unique source file for the same hash
  • Add the ability to resume the script from where it left off after a pause or stop
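
One possible shape for the link-limit workaround listed above is to split each group of identical files into batches, each with its own source file. The sketch below only plans the batches and does not touch the disk; plan_links and max_links are hypothetical names, and it assumes every file in the group starts with a link count of 1.

```python
def plan_links(paths, max_links):
    """Split one group of identical files into (source, duplicates) batches.

    Each batch keeps its first file as the source and links at most
    max_links - 1 other files to it, so no source exceeds max_links.
    """
    if max_links < 2:
        raise ValueError("max_links must be at least 2 to allow any linking")
    batches = []
    for start in range(0, len(paths), max_links):
        batch = paths[start:start + max_links]
        batches.append((batch[0], batch[1:]))
    return batches
```

With five identical files and a limit of 3, this plan keeps two physical copies instead of one, which still frees the space of three files.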
