Finding and deleting duplicates
- David Moore
- Apr 12, 2021
- 3 min read
In this day and age having duplicates of files is not the disaster it used to be.
In the old days memory / storage was so limited and expensive that we were encouraged to aggressively trim the fat...and this included having more than one copy of a file...a photo...a whatever.
Of course this often resulted in the wrong things being deleted and forever lost.
Now it is common to keep three copies of your data, especially if you are following the 3-2-1 backup rule, which states "3 copies of your data, in 2 different formats and 1 of them off-site".
All of my blog posts on backups are here FYI:
These days though, memory and storage are cheap and plentiful.
For a long time I have been telling people to NOT bother finding old files and deleting them.
Your time is far more valuable than the cost of more storage.
Most people have so much spare room on their computer that finding a few small files and deleting them is of absolutely no benefit to the machine's behaviour. It's just a waste of your time.
This is especially true when copying lots of files, photos etc. to a new machine or backing them up in some way.
And it is especially, especially true if you are paying someone to do this for you.
But, and here comes a big "but", sometimes you discover that you have a "duplicates" issue and that your creeping diligence (and reluctance to delete things) means you really do have more copies of files than you need.
Sure, you probably still have plenty of room to get on with things without needing to delete the redundancies, but sometimes these things need tidying up anyway.
I got to this point with my photos recently.
I had about 150GB of photos of which maybe 8GB were duplicates.
It doesn't sound like much, but it was a lot of photos.
This had come about because I'd got my hands on holiday photos and the like from my wife.
I'd moved all the photos into logically named folders but also left copies of them in the folders the way my wife had arranged them.
Now it sounds like I have a pretty good handle on my problem.
But that's only because I ran a Duplicate Finder program over my Pictures folder.
I used this one: https://www.digitalvolcano.co.uk/duplicatecleaner.html
Before then I only had a suspicion that there may be quite a few duplicates.
I had in fact been quite diligent in recent years with naming files and photos and reconciling things...but obviously not quite diligent enough.
As you might expect, if you Google around this subject you'll find a lot of options. Most of them are far from good.
Because this used to be a common problem, and many people still think it is, the bad guys often use "Duplicate Finding" as a means to sneak viruses and other bad things onto your computer.
So be wary when hunting down one that suits you.
When you do find one you like and trust, make sure you perform some dry runs and ensure that nothing important is being swept up in the fervour of tidying in bulk.
These tools should have various ways for you to scan, review, back up, reverse out and investigate the ramifications of what they've found.
In the one I used, from Digital Volcano, it is possible to pick and choose a selection of folders to scan, folders to "protect", folders to exclude and much more.
I did a lot of playing and testing to ensure that I kept the versions of files I wanted and only deleted the ones I didn't.
This may sound weird. If the files were proper duplicates then why would I prefer one over another?
Well, for example, files can have exactly the same content or appearance but be named differently.
In the case with my wife's holiday photos, I'd given the files new names based on location etc.
So I only wanted to delete the duplicates which had the less useful name on them.
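For the curious, here is a rough Python sketch of the idea. It is not how Duplicate Cleaner works internally (I don't know that); it simply assumes that two files are "duplicates" when their contents are byte-for-byte identical, whatever they are called. The function names and the idea of a single folder to keep (`keep_dir`) are mine, made up for illustration. It only produces a list of candidates, so it is a dry run and deletes nothing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are identical, ignoring names."""
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for p in same_size:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def plan_deletions(groups: list[list[Path]], keep_dir: Path) -> list[Path]:
    """Dry run: list the copies to remove, keeping any copy under keep_dir.

    keep_dir is a hypothetical "protected" folder, e.g. the one holding
    the nicely renamed photos. Nothing is deleted here.
    """
    to_delete = []
    for group in groups:
        kept = [p for p in group if keep_dir in p.parents]
        if not kept:
            continue  # no copy in the protected folder, so leave the group alone
        to_delete.extend(p for p in group if p not in kept)
    return to_delete
```

You would call `find_duplicates(Path("Pictures"))`, pass the result to `plan_deletions` along with the folder whose copies you want to keep, and then read through the returned list by eye before deleting anything. Comparing file sizes first is just a shortcut so that only files that could possibly match get hashed.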
It's not as straightforward as it sounds and you may well be thinking (once again) that life is too short for this kind of stuffing around. Only you can make that call for you.
In the end, safe in the knowledge that I had good backups, I got to a point where I was able to save myself a lot of time by deleting the "right" duplicates found by the program.
Even if you don't end up deleting anything, these duplicate finding tools can be very useful for assessing just what your problems may be.
You can always park the deleting and do some manual tidying up before you go ahead and actually delete things.
Good luck, have fun.
David