I have reorganized my backups. In the process, I wanted to do some cleaning up, and rmlint proved to be a practical tool for removing duplicates.
Tip: Make a backup beforehand to ensure you don’t lose important data. Most backup tools like restic deduplicate anyway, so the cleanup is for your own benefit, not to save space in backups!
https://github.com/sahib/rmlint
https://rmlint.readthedocs.io/en/master/#
For macOS:
brew install rmlint
And for Linux (Debian/Ubuntu):
sudo apt-get install rmlint
Then simply run rmlint in the desired directory. Without any additional arguments or options, rmlint compares files by the hash of their content and ignores the file name and date. Two files will be created: an rmlint.sh Bash script and an rmlint.json file.
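To make "compares by hash, not by name or date" concrete, here is a small sketch using the standard cksum tool. This is only an illustration of the idea: cksum stands in for whatever hashing rmlint uses internally, and the file names are made up.

```shell
# Throwaway directory with two identical files and one different file
demo=$(mktemp -d)
printf 'same bytes\n' > "$demo/holiday.jpg"
printf 'same bytes\n' > "$demo/IMG_0001.jpg"
printf 'different bytes\n' > "$demo/notes.txt"

# Give one copy an old modification date; its content stays the same
touch -t 202001010000 "$demo/IMG_0001.jpg"

# Reading from stdin makes cksum print only checksum and size, no file name
sum_a=$(cksum < "$demo/holiday.jpg")
sum_b=$(cksum < "$demo/IMG_0001.jpg")
sum_c=$(cksum < "$demo/notes.txt")

[ "$sum_a" = "$sum_b" ] && echo "holiday.jpg and IMG_0001.jpg: same content"
[ "$sum_a" != "$sum_c" ] && echo "notes.txt: different content"
```

The two "photos" count as duplicates even though their names and dates differ, which is the behaviour described above.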
The JSON file contains the found duplicates, as does the Bash script. The Bash script also includes all the commands to clean up the duplicates. If you are satisfied with the contents of rmlint.sh, you can execute this script. However, you should definitely review the script first, as rmlint does not always clean up as desired. The option "--merge-directories" is also helpful, and the tutorial in the official documentation is very good and should be read and understood.
For the first test run, it is recommended to start with rmlint in a manageable folder and not directly in large directories with many duplicates.
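A manageable folder can also be one you create yourself. The sketch below builds a throwaway sandbox with one known duplicate pair and runs rmlint there if it is installed; make_sandbox and the file names are invented for this example.

```shell
# make_sandbox: create a temp folder with two identical files and one unique file
make_sandbox() {
    dir=$(mktemp -d)
    printf 'duplicate payload\n' > "$dir/a.txt"
    cp "$dir/a.txt" "$dir/b.txt"
    printf 'unique payload\n' > "$dir/c.txt"
    echo "$dir"
}

sandbox=$(make_sandbox)

if command -v rmlint >/dev/null 2>&1; then
    # rmlint only scans here; rmlint.sh and rmlint.json land in the sandbox
    (cd "$sandbox" && rmlint)
else
    echo "rmlint not installed; sandbox left at $sandbox"
fi
```

Since you know in advance that a.txt and b.txt are the only duplicates, you can compare that against what rmlint.sh proposes before trusting it on real data.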
Good luck with your cleanup!
Photo by Christina Radevich on Unsplash