Skip to content

A command line application that finds duplicate files and removes them. Duplicate files can also be replaced with symbolic links or hard links.

Notifications You must be signed in to change notification settings

samuelkai/Dedup

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Dedup

A command line application that finds duplicate files and removes them. Duplicate files can also be replaced with symbolic links or hard links.

The application can be built using CMake. Example:

mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build .

The following help text for the application can be accessed with dedup --help:

 - find and delete duplicate files
Usage:
  /home/samuel/koodi/gradu/Dedup/build/dedup [OPTION...] path1 [path2] [path3]...
By default, the user is prompted to select which duplicates to keep.
Symbolic links, extra hard links and empty files are ignored.

 Action options:
  -d, --delete     Without prompting, delete duplicate files, keeping only
                   one file in each set of duplicates. Files in paths given
                   earlier in the argument list have higher precedence to be kept.
                   If there are duplicates that were found in the same given
                   path, the file with the earliest modification time is kept.
                   Must be specified twice in order to avoid accidental use.
  -k, --hardlink   Without prompting, keep only one file in each set of
                   duplicates and replace the others with hard links to the one
                   kept. Files in paths given earlier in the argument list have
                   higher precedence to be the target of the link (whose
                   metadata is kept). If there are duplicates that were found in the
                   same given path, the file with the earliest modification
                   time is the target.
  -l, --list       List found duplicates.
  -s, --summarize  Print only a summary of found duplicates.
  -y, --symlink    Without prompting, keep only one file in each set of
                   duplicates and replace the others with symlinks to the one kept.
                   Files in paths given earlier in the argument list have
                   higher precedence to be the target of the link. If there are
                   duplicates that were found in the same given path, the file
                   with the earliest modification time is the target.

 Other options:
  -a, --hash N   Hash digest size in bytes, valid values are 1, 2, 4, 8
                 (default: 8)
  -b, --bytes N  Number of bytes from the beginning of each file that are
                 used in hash calculation. 0 means that the whole file is hashed.
                 (default: 4096)
  -h, --help     Print this help
  -n, --no-hash  In the initial comparison step, use file contents instead of
                 hash digests. Doesn't affect the result of the program.
                 Mutually exclusive with the argument 'two'. Implies the argument
                 'vector', and is mutually exclusive with it.
  -r, --recurse  Search the paths for duplicates recursively
  -t, --two      Use two layers of unordered maps to store the candidates for
                 deduplication. Doesn't affect the result of the program.
                 Mutually exclusive with the arguments 'no-hash' and 'vector'.
  -v, --vector   Use a vector instead of an unordered map to store the
                 candidates for deduplication. Doesn't affect the result of the
                 program. Mutually exclusive with the arguments 'no-hash' and
                 'two'.

About

A command line application that finds duplicate files and removes them. Duplicate files can also be replaced with symbolic links or hard links.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published