Clean Up Duplicates on Linux and macOS

While reorganizing my backups, I also wanted to do some cleaning up, and rmlint proved to be a practical tool for finding and removing duplicates.

Tip: Make a backup beforehand to ensure you don’t lose important data. Most backup tools like restic deduplicate anyway, so the cleanup is for your own benefit, not to save space in backups!

https://github.com/sahib/rmlint

https://rmlint.readthedocs.io/en/master/#

For macOS:

brew install rmlint

And for Linux (Debian, Ubuntu, and other apt-based distributions):

sudo apt-get install rmlint

Then simply run rmlint in the desired directory. Without any additional arguments or options, rmlint identifies duplicates by comparing file hashes; it does not consider the file name or date. Two files are created in the current working directory: an rmlint.sh Bash script and an rmlint.json file.
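To make the idea of "same hash, different name" concrete, here is a minimal sketch using standard tools. It is not how rmlint works internally and is no substitute for it. The function name list_duplicates is made up for this illustration, and sha256sum is assumed to be available (on macOS, shasum -a 256 should be the closest equivalent).

```shell
#!/usr/bin/env bash
# Illustration only: group files by content hash, ignoring names and dates.
# This is NOT rmlint's implementation; list_duplicates is a hypothetical helper.
list_duplicates() {
  local dir="${1:-.}"
  # Hash every regular file, sort so equal hashes end up next to each other,
  # then print only the hashes that occur more than once.
  find "$dir" -type f -exec sha256sum {} + \
    | sort \
    | awk '{ count[$1]++; lines[$1] = lines[$1] $0 "\n" }
           END { for (h in count) if (count[h] > 1) printf "%s\n", lines[h] }'
}
```

Calling list_duplicates on a folder prints one group per hash, separated by blank lines. Nothing is deleted; the sketch only lists candidates.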

The JSON file lists the duplicates found, as does the Bash script. The Bash script also includes all the commands to clean up the duplicates. If you are satisfied with the contents of rmlint.sh, you can execute it. However, you should definitely review the script first, as rmlint does not always clean up as desired. The option “--merge-directories” is also helpful: it reports entire duplicate directories as a whole instead of listing every file in them individually. The tutorial in the official documentation is very good and should be read and understood.

For a first test run, start with rmlint in a small, manageable folder rather than directly in large directories with many duplicates.
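One way to get such a manageable folder is to build a throwaway one with known duplicates. The following is only a sketch; the helper name make_sandbox and all file names in it are invented for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a small temporary folder containing one known
# duplicate pair and one unique file, and print the folder's path.
make_sandbox() {
  local dir
  dir="$(mktemp -d)" || return 1
  mkdir -p "$dir/photos" "$dir/backup"
  printf 'holiday picture\n' > "$dir/photos/beach.jpg"
  printf 'unique notes\n'    > "$dir/photos/notes.txt"
  # Same content under a different name and folder: a true duplicate.
  cp "$dir/photos/beach.jpg" "$dir/backup/IMG_0001.jpg"
  printf '%s\n' "$dir"
}
```

After sandbox="$(make_sandbox)", change into "$sandbox" and run rmlint there. The generated rmlint.sh should then mention only beach.jpg and IMG_0001.jpg as duplicates, which makes it easy to check that the tool behaves as you expect before you point it at real data.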

Good luck with your cleanup!

Photo by Christina Radevich on Unsplash

