A friend lost all their photos from their phone; luckily they had been on a MicroSD card and photorec recovered them all for me, and renrot renamed them all (and fixed their timestamps) based on their EXIF data.
So what I now have is a directory of images, but of course the phone had created thumbnails and the recovery has recovered them as well, and I'd like to delete them.
As a human it's easy to do: find two images that "look" the same, and delete the smaller one. But there are hundreds so this human isn't up for the job.
Any suggestions as to how to do this in software?
Just deleting all the smaller images isn't enough as there do seem to be some small images for which there isn't a larger counterpart (I'm guessing maybe they came from the front facing camera or some other source).
On 07/09/17 10:43, Mark Rogers wrote:
A friend lost all their photos from their phone; luckily they had been on a MicroSD card and photorec recovered them all for me, and renrot renamed them all (and fixed their timestamps) based on their EXIF data.
So what I now have is a directory of images, but of course the phone had created thumbnails and the recovery has recovered them as well, and I'd like to delete them.
Two obvious solutions. List all files by file size (is there an option in ls, or use ls -al then sort?) Delete files smaller than a certain size, or move files smaller than a certain size to a different directory and then delete ones that are not thumbnails. Alternatively, do the same graphically in a file manager, sort by size, pick a small size, then go through each and manually delete the thumbnails.
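For the "move the small ones aside" idea, here is a minimal Python sketch. The function name move_small_files and the byte threshold are made up for illustration; you'd have to pick the threshold by looking at your own files. It only moves files, it never deletes anything, so the small images can still be pruned by hand afterwards.

```python
import os
import shutil

def move_small_files(src_dir, dest_dir, max_bytes):
    """Move regular files of at most max_bytes from src_dir into dest_dir.

    Returns the names of the files that were moved, in sorted order.
    """
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        # Skip directories (including dest_dir if it lives inside src_dir).
        if os.path.isfile(path) and os.path.getsize(path) <= max_bytes:
            shutil.move(path, os.path.join(dest_dir, name))
            moved.append(name)
    return moved
```

Something like move_small_files("recovered", "recovered/maybe-thumbnails", 50000) would be a typical call, with both directory names and the 50000 figure being placeholders.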
The other solution. Use a tool to read the EXIF data of all files (is it exiftool?). Work out the image sizes. Work out what sizes your thumbnails are. Write a script to delete these files, or move them somewhere else where they can be manually pruned.
As a human it's easy to do: find two images that "look" the same, and delete the smaller one. But there are hundreds so this human isn't up for the job.
Any suggestions as to how to do this in software?
Just deleting all the smaller images isn't enough as there do seem to be some small images for which there isn't a larger counterpart (I'm guessing maybe they came from the front facing camera or some other source).
Well, that's the problem, isn't it? You can either spend time programming a solution that may work for most of them, or you can spend time manually deleting.
Good luck.
Steve
On Thu, 2017-09-07 at 11:04 +0100, steve-ALUG@hst.me.uk wrote:
Two obvious solutions. List all files by file size (is there an option in ls, or use ls -al then sort?)
ls -S
Or
ls -rS
If you want them the other way round
Or
ls -lS
If you want a full entry.
On Thu, 2017-09-07 at 11:04 +0100, steve-ALUG@hst.me.uk wrote:
The other solution. Use a tool to read the EXIF data of all files (is it exiftool?). Work out the image sizes. Work out what sizes your thumbnails are. Write a script to delete these files, or move them somewhere else where they can be manually pruned.
Won't full size images and their thumbnails have close (identical?) creation dates in the EXIF data? You could use that to identify matching images.
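A rough Python sketch of the timestamp idea, in case it helps. It assumes you have already pulled a creation timestamp and a file size out of each file by some other means (exiftool or similar) and have them as (filename, timestamp, size) tuples; the function name thumbnails_by_timestamp is hypothetical. Whether thumbnails really do carry the same EXIF date as their originals is exactly the open question above, so treat the output as candidates to check, not a delete list.

```python
from collections import defaultdict

def thumbnails_by_timestamp(records):
    """Return names of files that share a timestamp with a larger file.

    records: iterable of (filename, exif_timestamp, size_in_bytes) tuples,
    assumed to have been extracted beforehand.
    """
    groups = defaultdict(list)
    for name, stamp, size in records:
        groups[stamp].append((size, name))

    candidates = []
    for members in groups.values():
        if len(members) < 2:
            continue  # no counterpart with this timestamp, leave it alone
        members.sort()
        # Everything except the largest file in the group is a candidate.
        candidates.extend(name for _size, name in members[:-1])
    return sorted(candidates)
```

One caveat: two genuine photos taken within the same second would also land in one group, so comparing the sizes within a group before deleting anything seems wise.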
Other than doing some kind of Google-esque image recognition, I don't think there's any way of matching them by appearance!
On 07/09/17 10:43, Mark Rogers wrote:
A friend lost all their photos from their phone; luckily they had been on a MicroSD card and photorec recovered them all for me, and renrot renamed them all (and fixed their timestamps) based on their EXIF data.
So what I now have is a directory of images, but of course the phone had created thumbnails and the recovery has recovered them as well, and I'd like to delete them.
As a human it's easy to do: find two images that "look" the same, and delete the smaller one. But there are hundreds so this human isn't up for the job.
Any suggestions as to how to do this in software?
Just deleting all the smaller images isn't enough as there do seem to be some small images for which there isn't a larger counterpart (I'm guessing maybe they came from the front facing camera or some other source).
Well, sorting by size sorts the first problem, but not the second.
If there's a way of sorting them so that the small and large versions are together, then a two-pane approach, with the files sorted by size on one side and sorted by pairing on the other, would make the job easier.
Sadly, I don't know of a tool to do it automatically.
Cheers, Laurie.
On 7 September 2017 at 11:07, Laurie Brown laurie@brownowl.com wrote:
Well, sorting by size sorts the first problem, but not the second.
If there's a way of sorting them so that the small and large versions are together, then a two-pane approach, with the files sorted by size on one side and sorted by pairing on the other, would make the job easier.
Really, what I think I need is some kind of tool that could create an image fingerprint based on what the image "looks" like, so that I can sort by similarity. That's led me to some interesting websites but no actual tools yet. But "apt search similar images" led me to findimagedupes, which looks promising; its man page says that what it does is:
To calculate an image fingerprint:
1) Read image.
2) Resample to 160x160 to standardize size.
3) Grayscale by reducing saturation.
4) Blur a lot to get rid of noise.
5) Normalize to spread out intensity as much as possible.
6) Equalize to make image as contrasty as possible.
7) Resample again down to 16x16.
8) Reduce to 1bpp.
9) The fingerprint is this raw image data.
The default similarity threshold is 90%, which is too loose for me as it also picks up two similar but distinct photos taken together, but at a 100% target similarity it seems spot on.
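To make the percentages concrete: a 16x16 image at 1bpp is 256 bits, so one plausible way to score two fingerprints is to count how many of those bits agree. The Python below is only a sketch of that idea, with the fingerprints held as plain integers; it is not taken from findimagedupes itself, and the name similarity_percent is made up.

```python
def similarity_percent(fp_a, fp_b, bits=256):
    """Percentage of matching bits between two fingerprints held as ints.

    bits is the fingerprint length; 256 corresponds to 16x16 at 1bpp.
    """
    # XOR leaves a 1 wherever the two fingerprints disagree.
    differing = bin(fp_a ^ fp_b).count("1")
    return 100.0 * (bits - differing) / bits
```

On that scale a 90% threshold tolerates up to 25 differing bits out of 256, which would explain why two near-identical shots can match, while 100% demands that every bit agrees.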
I still need to script something to use the results though...
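A first stab at that script, as a Python sketch. It assumes the output has one set of matching files per line with the names separated by whitespace, and that none of the file names contain spaces (true of the renamed files here, but an assumption all the same). The function name files_to_delete is my own. It only returns a list; nothing is deleted.

```python
import os

def files_to_delete(dupes_output):
    """Given text with one set of matching files per line, keep the
    largest file of each set and return the rest.

    Assumes whitespace-separated paths with no spaces in the names.
    """
    doomed = []
    for line in dupes_output.splitlines():
        paths = line.split()
        if len(paths) < 2:
            continue  # nothing to compare against
        # Keep the biggest file on disk; the others are presumed thumbnails.
        keeper = max(paths, key=os.path.getsize)
        doomed.extend(p for p in paths if p != keeper)
    return doomed
```

Printing the returned list and eyeballing it before passing anything to os.remove seems the safest way to use it.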
Mark