On 18-Jun-10 09:42:47, Mark Rogers wrote:
On 18/06/10 10:36, MJ Ray wrote:
I'd be looking at scripting something with find, its -exec option, md5sum, sort, mv and maybe ln if I want to keep the filenames, but I've not looked at fslint.
The more I think about it the more I am thinking that I should go down this route. It will be a one-off process, so if it takes overnight to run, it doesn't really matter, and that way I can control what it does.
However, if a standard tool (like fslint) *can* do it, then I think it's the kind of tool I should learn to use. After all, I could write a script to find the files without using "find", but it's a good tool for that job and I'm glad I have learnt to use it.
--
Possibly useful in locating the duplicates (indeed using 'find') may be:
for i in `find . -type f -print` ; do ls -lgG --block-size=1 --time-style=long-iso "$i" | awk -v F="$i" '{S=$3};{print S " " $5 " " F}' ; done | sort -n
(note the back-quotes around the "find ... ").
This will produce a listing of all the files in or below the current directory, with their size in bytes in first position, followed by timestamp (hh:mm) in case that is useful, followed by the full pathname of the file, and sorted by increasing order of file size.
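One caveat: the for-loop above splits the output of find on whitespace, so it will trip over any filename containing spaces. A possible alternative, assuming GNU find (its -printf option is not available in every find), gets the same three columns from find directly. The function name list_by_size is only an illustrative choice of mine, so treat this as a sketch rather than a tested recipe for your setup:

```shell
# list_by_size [DIR]: print "size hh:mm path" for every regular file
# in or below DIR (default: current directory), smallest file first.
# Assumes GNU find: %s = size in bytes, %TH:%TM = modification time
# as hour:minute, %p = path as found.
list_by_size() {
    find "${1:-.}" -type f -printf '%s %TH:%TM %p\n' | sort -n
}
```

Running "list_by_size ." in the top directory of the image collection should then give the same kind of listing as the loop, with filenames containing spaces kept intact.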
Thus duplicate files (i.e. with identical binary content) will have identical file sizes and so will be listed adjacent to each other (with the possible exception of other files which happen to have exactly the same file size -- though this is unlikely for image files). Equal size is only a necessary condition, so before deleting anything it is worth confirming each candidate pair with cmp or md5sum.
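To confirm which same-sized files really are identical, something along the lines MJ Ray suggested (find, md5sum, sort) could be sketched as follows. This assumes GNU coreutils, since it relies on the -w and --all-repeated options of GNU uniq; the name find_dupes is just my own label for it:

```shell
# find_dupes [DIR]: print groups of files with identical MD5 checksums,
# one "checksum  path" line per file, groups separated by a blank line.
# Assumes GNU md5sum and GNU uniq (-w32 compares only the 32-character
# checksum; --all-repeated=separate prints every member of each group).
find_dupes() {
    find "${1:-.}" -type f -exec md5sum {} + \
        | sort \
        | uniq -w32 --all-repeated=separate
}
```

Files that appear in the same group have the same checksum; files with a unique checksum are not printed at all. It only reports, so what to do with each group (mv, rm, ln) remains under your control.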
If you know that all the image files have the same (e.g. ".jpg") extension, then the 'find' part could be replaced by
`find . -name '*.jpg' -print`
(or equivalent for other extensions). Note the necessity to put the "*.jpg" within ordinary single quotes (in addition to the back-quotes round the whole thing), since this is a wildcard pattern which must be passed as-is to find. Without the quotes the shell would expand it first, against whatever .jpg files happen to be in the current directory.
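To see the effect of the quotes, here is a small demonstration. The helper show_args is purely hypothetical (it simply prints each argument it is given, in brackets), and the two .jpg files are created in a scratch directory just for the purpose:

```shell
# show_args: print each argument received, one per line, in brackets.
show_args() { printf '[%s]\n' "$@"; }

(
    # Work in a throwaway directory containing two .jpg files.
    cd "$(mktemp -d)" || exit 1
    touch a.jpg b.jpg
    show_args -name *.jpg     # unquoted: the shell expands the pattern to a.jpg b.jpg
    show_args -name '*.jpg'   # quoted: the pattern *.jpg arrives untouched
)
```

In the first call the command receives three arguments (-name, a.jpg, b.jpg), which is not what find expects after -name; in the second it receives the pattern itself.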
--------------------------------------------------------------------
E-Mail: (Ted Harding) Ted.Harding@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 18-Jun-10                                       Time: 11:31:47
------------------------------ XFMail ------------------------------