On Tue, 2009-11-24 at 17:09 +0000, Mark Rogers wrote:
> Does anyone make use of any tools which can search for duplicate files and (assuming they're on the same filesystem) replace the duplicates with hard links to the first copy of the file?
> There seem to be a few scripts around but since it'll be messing with my files I'd rather go by recommendation if I can. (The plan is to free up some space on a server which has lots of duplicate images.)
> Also, any "gotchas" that should put me off trying this would be appreciated.
The 'gotcha' is that hard-linked files are not copy-on-write: if a program opens the file under one of its names and rewrites the contents in place (or appends to the file), the change is made to the single underlying file and is therefore visible under all of its names. As long as that is what you expect to happen, all is well.
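To make that concrete, here is a small Python sketch of the behaviour. The file names and the function name `demo_hard_link_sharing` are made up for illustration, and it assumes a filesystem that supports hard links. It also shows the related case where a program writes a new file and renames it over the old name, which breaks the link rather than changing the shared contents.

```python
import os
import tempfile


def demo_hard_link_sharing():
    """Return what each name sees after an append and after a rename-over."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")  # hypothetical file names
        b = os.path.join(d, "b.txt")

        with open(a, "w") as f:
            f.write("original\n")
        os.link(a, b)  # b is now a second name for the same inode

        # Appending through one name changes the single underlying file.
        with open(b, "a") as f:
            f.write("appended\n")
        with open(a) as f:
            seen_via_a = f.read()
        same_inode = os.stat(a).st_ino == os.stat(b).st_ino

        # Writing a new file and renaming it over 'a' gives 'a' a new inode,
        # so 'b' keeps the old contents and the link is broken.
        tmp = a + ".tmp"
        with open(tmp, "w") as f:
            f.write("replaced\n")
        os.replace(tmp, a)
        with open(b) as f:
            seen_via_b = f.read()

        return seen_via_a, same_inode, seen_via_b
```

Calling `demo_hard_link_sharing()` shows the appended line when reading through the other name, while the later rename-over leaves the second name pointing at the old data.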
When I am happy that this will not cause a problem, I use a home-grown program. I tried attaching it, but that got this message held for approval, so instead, if you are interested, it is available at: http://pelvoux.gotadsl.co.uk/dupfind.c
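For anyone who only wants the general idea, the usual approach can be sketched in Python roughly as below. This is not dupfind.c and makes no claim to work the same way; the names `file_digest` and `link_duplicates` and the `.dedup-tmp` suffix are invented for illustration. It assumes that grouping by device and size, then by SHA-256, then confirming with a byte-for-byte comparison is good enough, and it defaults to a dry run that changes nothing.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large images fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def link_duplicates(root, dry_run=True):
    """Return (kept, duplicate) pairs; hard-link them unless dry_run."""
    # Only files on the same device with the same size can be candidates.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            st = os.stat(path)
            if st.st_size > 0:
                by_size[(st.st_dev, st.st_size)].append(path)

    pairs = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        first_seen = {}  # digest -> first path with that content
        for path in sorted(paths):
            keep = first_seen.setdefault(file_digest(path), path)
            if keep == path or os.path.samefile(keep, path):
                continue  # first copy, or already hard-linked
            if not filecmp.cmp(keep, path, shallow=False):
                continue  # digest matched but bytes differ; leave alone
            pairs.append((keep, path))
            if not dry_run:
                # Link under a temporary name, then rename over the
                # duplicate, so the path never goes missing.
                tmp = path + ".dedup-tmp"  # hypothetical suffix
                os.link(keep, tmp)
                os.replace(tmp, path)
    return pairs
```

Running `link_duplicates("/some/dir")` first and reading the returned pairs before trying `dry_run=False` seems the cautious way to use something like this. Note that the surviving link keeps the owner, permissions and timestamps of the first copy, which may or may not matter for your images.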
Regards, Steve.