As it says really. I need some advice on an rsync mechanism I use to routinely backup my desktop to my NAS.
The command I have used for some time is:
/usr/bin/rsync -rLptgoDvz --stats / --exclude-from=rsync-excludes /home/mick nas:/home/mick-backup
("/" indicates line break above and is not part of the command line)
That has worked fine for some long time, but I have just upgraded my desktop disk from 1TB to 2TB. In the process of doing so I moved the old disk to a new SATA port so that I could just copy across internally all my data from the old disk to the new. In doing so (using something like "cp -r /old-disk/micks-stuff /new-disk/micks-stuff") like a pillock I forgot to add the -p switch so all my files now have a current date/time as last modified. Guess what? My latest rsync now seems to be updating ALL my backed up files (over 800 GB of videos/photos/music etc) to the NAS. This will take some time.
I thought that rsync used a rather more sophisticated algorithm (including checksums for example) than simply a check of last modified time stamp. In trying to reduce the file transfer I have tried modifying the rsync command by taking out the -t switch, but of course that doesn't work, all that happens is that the target file date/time remains unmodified after the copy.
So, short of actually letting the rsync run to completion, can anyone suggest a better approach (and yes, I know I could have saved myself a lot of pain by being more careful earlier....)
Cheers
Mick
---------------------------------------------------------------------
blog: baldric.net gpg fingerprint: FC23 3338 F664 5E66 876B 72C0 0A1F E60B 5BAD D312
---------------------------------------------------------------------
I think you need a u in the options.
-u, --update skip files that are newer on the receiver
It might also be an idea to use --dry-run to see what it's going to do.
Nev
On Fri, 15 Feb 2013 16:05:54 +0000 nev young nev@nevilley.org.uk allegedly wrote:
I think you need a u in the options. -u, --update skip files that are newer on the receiver it might also be an idea to use --dry-run to see what it's going to do.
Nev
Nev
Thanks for that, but I don't think it will work. The problem I have is that all the files on the /sender/ are now newer than the original backups on the NAS.
I may just have to recopy all the source files from disk to disk using the -p switch to cp so that the file time stamps are maintained. That way the NAS won't see all the source files as being newer next time I kick off the rsync.
Mick
On 15/02/13 19:29, mick wrote:
Nev
Thanks for that, but I don't think it will work. The problem I have is that all the files on the /sender/ are now newer than the original backups on the NAS.
I may just have to recopy all the source files from disk to disk using the -p switch to cp so that the file time stamps are maintained. That way the NAS won't see all the source files as being newer next time I kick off the rsync.
I'm not sure why you didn't use rsync to copy them back rather than cp ?
Anyway, the answer to your original question is that the default behaviour for rsync is to only compute file differences via the checksums if the size or modification time differs between source and destination. So it's not actually recopying everything (though if things like permissions have changed on the source files they will now overwrite those on the target, and naturally the timestamps will be updated). Previously, when you ran your backup, it would have just skipped anything whose size and timestamp matched the target.
The other thing you could do is use rsync to copy over the top, resetting the timestamps back. This would be quicker than using cp as it won't have to copy the whole file over; however, you'd need to set some options to tell it to ignore the newer timestamps on the target.
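One possible reading of "copy over the top", sketched on throwaway directories: run rsync from the old disk onto the new copy with --size-only, so that files of equal size count as up to date and only their attributes are corrected. The choice of --size-only is an assumption here (no option was named above), the paths under the temporary directory are hypothetical stand-ins, and GNU touch and stat are assumed.

```shell
# Sketch only: hypothetical stand-ins for the two disks, created under a temp dir.
work=$(mktemp -d)
mkdir -p "$work/old-disk/micks-stuff" "$work/new-disk"
echo "demo" > "$work/old-disk/micks-stuff/clip.mp4"
touch -d '2007-06-01 12:00:00' "$work/old-disk/micks-stuff/clip.mp4"

# Reproduce the mistake: cp -r without -p gives the copy a fresh timestamp.
cp -r "$work/old-disk/micks-stuff" "$work/new-disk/micks-stuff"

# Copy over the top. With --size-only, same-size files are treated as
# up to date, so rsync only fixes their timestamps and other attributes.
if command -v rsync >/dev/null 2>&1; then
    rsync -a --size-only "$work/old-disk/micks-stuff/" "$work/new-disk/micks-stuff/"
else
    echo "rsync not installed; skipping the reset"
fi

old_time=$(stat -c %Y "$work/old-disk/micks-stuff/clip.mp4")
new_time=$(stat -c %Y "$work/new-disk/micks-stuff/clip.mp4")
echo "old: $old_time  new: $new_time"
rm -rf "$work"
```

Worth trying with --dry-run first on real data. Note that --size-only would also skip a file that changed without changing size, so it suits a one-off repair rather than a routine backup.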
On Sat, 16 Feb 2013 11:23:52 +0000 Wayne Stallwood ALUGlist@digimatic.co.uk allegedly wrote:
I'm not sure why you didn't use rsync to copy them back rather than cp ?
Because I'm stupid and didn't think of that. :-)
I automatically equate rsync (along with rcp) with remote copying and cp with local copying. So my reflex action was to use cp between the two local internal disks and then to later run my rsync backup when some of the local files had changed (mainly email). I then saw that rsync was apparently re-copying /all/ my local files to the NAS so killed it.
Anyway, the answer to your original question is that the default behaviour for rsync is to only compute file differences via the checksums if the size or modification time differs between source and destination. So it's not actually recopying everything (though if things like permissions have changed on the source files they will now overwrite those on the target, and naturally the timestamps will be updated). Previously, when you ran your backup, it would have just skipped anything whose size and timestamp matched the target.
That is what puzzled me and prompted my question. The rsync certainly /looked/ as if it was recopying everything. The process was taking a long time.
The other thing you could do is use rsync to copy over the top, resetting the timestamps back. This would be quicker than using cp as it won't have to copy the whole file over; however, you'd need to set some options to tell it to ignore the newer timestamps on the target.
Thanks for the advice.
Mick
On 15/02/13 12:29, mick wrote:
internally all my data from the old disk to the new. In doing so (using something like "cp -r /old-disk/micks-stuff /new-disk/micks-stuff") like a pillock I forgot to add the -p switch so all my files now have a current date/time as last modified. Guess what? My latest rsync now
For that sort of copying I recommend using the '-a' option aka '--archive'. Even better might be to use a streaming tar 'tar -c... | tar -x ...' or similar with cpio or just rsync. Just read the manuals and be careful of the gotchas such as symlinks, sparse files, very long file names, device files, ...
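The difference between a plain recursive copy and an archive copy can be seen on a throwaway directory. This is a minimal sketch assuming GNU cp, touch and stat; the file name and the 2007 date are made up for illustration.

```shell
# Sketch only: compare modification times after "cp -r" and "cp -a".
work=$(mktemp -d)
mkdir "$work/old-disk"
echo "demo" > "$work/old-disk/clip.mp4"
touch -d '2007-06-01 12:00:00' "$work/old-disk/clip.mp4"   # pretend this is an old rip

cp -r "$work/old-disk" "$work/new-plain"     # timestamps become the time of the copy
cp -a "$work/old-disk" "$work/new-archive"   # timestamps, modes and links preserved

# Modification times in seconds since the epoch.
orig=$(stat -c %Y "$work/old-disk/clip.mp4")
plain=$(stat -c %Y "$work/new-plain/clip.mp4")
archive=$(stat -c %Y "$work/new-archive/clip.mp4")
echo "original: $orig  cp -r: $plain  cp -a: $archive"
rm -rf "$work"
```

Only the "cp -a" copy should report the same modification time as the original.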
On 15/02/13 19:29, mick wrote:
I may just have to recopy all the source files from disk to disk using the -p switch to cp so that the file time stamps are maintained. That way the NAS won't see all the source files as being newer next time I kick off the rsync.
You just need to fix the timestamps so take a look at 'find' and 'touch'. For 'find' look at the '-exec' option and '{}'. For 'touch' look at '--reference=', ' --no-dereference', and '--no-create' options. The rest is left as an exercise for the reader. :)
Mike.
On Sun, 17 Feb 2013 11:38:16 +0000 Michael Dorrington michael.dorrington@gmail.com allegedly wrote:
On 15/02/13 12:29, mick wrote:
internally all my data from the old disk to the new. In doing so (using something like "cp -r /old-disk/micks-stuff /new-disk/micks-stuff") like a pillock I forgot to add the -p switch so all my files now have a current date/time as last modified. Guess what? My latest rsync now
For that sort of copying I recommend using the '-a' option aka '--archive'. Even better might be to use a streaming tar 'tar -c... | tar -x ...' or similar with cpio or just rsync. Just read the manuals and be careful of the gotchas such as symlinks, sparse files, very long file names, device files, ...
cpio? obviously an old unix hand....
On 15/02/13 19:29, mick wrote:
I may just have to recopy all the source files from disk to disk using the -p switch to cp so that the file time stamps are maintained. That way the NAS won't see all the source files as being newer next time I kick off the rsync.
You just need to fix the timestamps so take a look at 'find' and 'touch'. For 'find' look at the '-exec' option and '{}'. For 'touch' look at '--reference=', ' --no-dereference', and '--no-create' options. The rest is left as an exercise for the reader. :)
Actually after Wayne's earlier response (and a bit more thought) I did consider just touching all the NAS files so that they had a later timestamp than my (new) desktop source. I think that should have solved the rsync update problem. But what I actually ended up doing was re-copying from the old disk to new (which didn't take all that long). What I didn't say in my earlier email (because it wasn't relevant) was that I have an anoracky reason of my own for wanting to keep the original file modification times.
I have a very large collection of MP4 video files I have ripped from my DVDs. This collection dates back to 2007 when I first started watching such videos on a PSP on my daily commute to and from London. Over the years since, I have updated the PSP to an Archos 4.3, and Archos 7.0, and latterly a Samsung Galaxy tab. The original PSP was very finicky about the format of the files, and of course had a small screen. Now whilst my latest device can still happily display my old 320x240 and 368x208 rips, it can handle much higher resolutions encoded in H264. So my later files have different encoding standards.
Now that I have more time on my hands I keep promising myself that I will re-encode the older files originally ripped for the PSP. The file modification time helps me identify those files pretty quickly.
Of course, whether I now bother to do that given that my commuting days are over is another matter.
Thanks
Mick
On 17 February 2013 13:26, mick mbm@rlogin.net wrote:
On Sun, 17 Feb 2013 11:38:16 +0000
Michael Dorrington michael.dorrington@gmail.com allegedly wrote:
You just need to fix the timestamps so take a look at 'find' and 'touch'. For 'find' look at the '-exec' option and '{}'. For 'touch' look at '--reference=', ' --no-dereference', and '--no-create' options. The rest is left as an exercise for the reader. :)
What I didn't say in my earlier email (because it wasn't relevant) was that I have an anoracky reason of my own for wanting to keep the original file modification times.
Actually, Michael's suggestion would have solved this for you. You can tell touch to change the timestamp of file A to match that of file B. Given that the paths of A (the copied file) & B (the original file) will be pretty similar, using "find" to find any files on the old disk which haven't changed since you did the copy, and touch (via -exec) to have the copied file's timestamp updated to match the original file's, the process should have been quite quick even across the network. Quite elegant, in fact.
On Mon, 18 Feb 2013 09:05:10 +0000 Mark Rogers mark@quarella.co.uk allegedly wrote:
Actually, Michael's suggestion would have solved this for you. You can tell touch to change the timestamp of file A to match that of file B. Given that the paths of A (the copied file) & B (the original file) will be pretty similar, using "find" to find any files on the old disk which haven't changed since you did the copy, and touch (via -exec) to have the copied file's timestamp updated to match the original file's, the process should have been quite quick even across the network. Quite elegant, in fact.
OK - I surrender. Whilst I am familiar enough with find to use formulations such as :
find . "*.o" -exec rm {} \;
and I can see that something like
find /old-disk/micks-stuff -exec touch -c -r {} \;
would pass the timestamps of the file(s) found on "old-disk" to touch for application to the new files, I don't see how I can get touch to recurse through "new-disk" to apply those modification times.
A naive formulation such as:
find /old-disk/micks-stuff -exec touch -c -r {} \; /new-disk/micks-stuff
is just syntactically wrong.
So - what am I missing?
Cheers
Mick
On 19 Feb 12:37, mick wrote:
So - what am I missing?
That by now you could have just done: rsync -avP /old-disk/micks-stuff /new-place
And have had done with it?
On Tue, 19 Feb 2013 14:53:07 +0000 Brett Parker iDunno@sommitrealweird.co.uk allegedly wrote:
On 19 Feb 12:37, mick wrote:
So - what am I missing?
That by now you could have just done: rsync -avP /old-disk/micks-stuff /new-place
And have had done with it?
Heh!
Well, the original rsync problem is long solved. :-)
What I am asking now is what I am missing in the "find -exec touch" approach.
Cheers
Mick
On Tue, 19 Feb 2013 12:37:58 +0000, mbm@rlogin.net said:
find . "*.o" -exec rm {} \;
You may have omitted a '-name' there. Either way, simpler is:
find . -name "*.o" -delete
On Tue, 19 Feb 2013 19:17:46 +0000 Keith Edmunds kae@midnighthax.com allegedly wrote:
On Tue, 19 Feb 2013 12:37:58 +0000, mbm@rlogin.net said:
find . "*.o" -exec rm {} \;
You may have omitted a '-name' there.
Yes - simple typo.
Either way, simpler is:
find . -name "*.o" -delete
Again, yes, but that does not demonstrate the file name substitution in the use of the "{}" string which was the object of Michael's (and Mark's) suggestion. It is that usage with touch -r that I was asking about.
Cheers
Mick
On 20 February 2013 10:12, mick mbm@rlogin.net wrote:
Again, yes, but that does not demonstrate the file name substitution in the use of the "{}" string which was the object of Michael's (and Mark's) suggestion. It is that usage with touch -r that I was asking about.
Something like:
cd /old-disk/micks-stuff
find . -exec echo touch "/new-place/{}" --reference="{}" --no-dereference --no-create \;
(untested).
Note I have put an echo in there so you can see what it would do, take the echo out to run it for real.
Mark
On Wed, 20 Feb 2013 10:37:06 +0000 Mark Rogers mark@quarella.co.uk allegedly wrote:
On 20 February 2013 10:12, mick mbm@rlogin.net wrote:
Again, yes, but that does not demonstrate the file name substitution in the use of the "{}" string which was the object of Michael's (and Mark's) suggestion. It is that usage with touch -r that I was asking about.
Something like:
cd /old-disk/micks-stuff
find . -exec echo touch "/new-place/{}" --reference="{}" --no-dereference --no-create \;
(untested).
Ahah! Thank you Mark. Almost, but not quite, perfect. Once I could see what you were doing it was easy enough to correct to:
cd /old-disk/micks-stuff
find . -exec touch /new-disk/micks-stuff/{} --reference={} --no-dereference --no-create \;
(i.e. take out the additional escaped quotes)
Many thanks again.
Mick
On 20 February 2013 17:13, mick mbm@rlogin.net wrote:
(i.e. take out the additional escaped quotes)
My concern would be filenames that contain spaces. If it didn't work with the quotes I'm sure someone else can tell me how I should have quoted it though.
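On the spaces question: find runs the -exec command directly, without a shell, and each substituted {} arrives as part of a single argument, so names with spaces should survive without extra quoting. The sketch below checks that on throwaway directories. It assumes GNU find (which substitutes {} even when it is embedded in a longer argument, something POSIX does not require) plus GNU touch and stat; the directory layout is a hypothetical stand-in for the two disks.

```shell
# Sketch only: hypothetical stand-ins for the old and new disks.
work=$(mktemp -d)
old="$work/old-disk/micks-stuff"
new="$work/new-disk/micks-stuff"
mkdir -p "$old/my videos" "$work/new-disk"
echo "demo" > "$old/my videos/holiday clip.mp4"
touch -d '2007-06-01 12:00:00' "$old/my videos/holiday clip.mp4"

cp -r "$old" "$new"    # copy without -p: the new files get fresh timestamps

# For every path on the old disk, set the matching path on the new disk
# to the old file's timestamps. --no-create avoids creating stray files.
( cd "$old" && find . -exec touch --no-create --no-dereference --reference={} "$new"/{} \; )

old_time=$(stat -c %Y "$old/my videos/holiday clip.mp4")
new_time=$(stat -c %Y "$new/my videos/holiday clip.mp4")
echo "old: $old_time  new: $new_time"
rm -rf "$work"
```

Only the shell variable is quoted. The {} itself is left bare because the shell does not treat it specially and find fills it in after the shell has finished splitting the command line.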
-c will do it, from the man:
-c, --checksum skip based on checksum, not mod-time & size
But you'll have to do this every time from now on and it takes a while as it has to checksum each file. Quicker than copying it again though.
Maybe do that once to bring everything in sync, then run something to update the timestamps in bulk?
Neil
On Sat, 16 Feb 2013 13:04:18 +0000 Neil Sedger neil@moley.org.uk allegedly wrote:
-c will do it, from the man:
-c, --checksum skip based on checksum, not mod-time & size
But you'll have to do this every time from now on and it takes a while as it has to checksum each file. Quicker than copying it again though.
Neil
Thanks for this. It (along with other comments) prompted a thorough re-read of the man page. I can see that this might help, but there appears to be a downside in the computation of a 128 bit MD5 hash for each file. That might be quick enough on my desktop, but I fear that the NAS might struggle a bit and the rsync might then take longer than the straight copy would have done. I might experiment some time later.
I did, however, note that the --size-only option might have helped me because that would check only the file sizes and not the modification time. Since the files haven't changed in size, then no re-transfer would be necessary.
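The effect of --size-only can be previewed without moving any data by combining --dry-run with --itemize-changes. The sketch below is a local stand-in for the desktop-to-NAS case, with the option set trimmed to -rt for brevity. The directory names are hypothetical, GNU touch is assumed, and the demo is skipped if rsync is not installed.

```shell
# Sketch only: identical file contents, different modification times.
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
echo "same bytes" > "$work/src/clip.mp4"                # source: timestamp is "now"
cp "$work/src/clip.mp4" "$work/dst/clip.mp4"
touch -d '2007-06-01 12:00:00' "$work/dst/clip.mp4"     # backup: original timestamp

if command -v rsync >/dev/null 2>&1; then
    # Default quick check: a differing mtime marks the file for transfer.
    default_check=$(rsync -rt --dry-run --itemize-changes "$work/src/" "$work/dst/" | grep 'clip.mp4')
    # --size-only: equal size means no transfer, only an attribute update.
    size_only=$(rsync -rt --size-only --dry-run --itemize-changes "$work/src/" "$work/dst/" | grep 'clip.mp4')
    echo "default:     $default_check"
    echo "--size-only: $size_only"
else
    echo "rsync not installed; skipping"
fi
rm -rf "$work"
```

In the itemized output a leading ">" marks a file that would be transferred, while a leading "." marks one that would only have attributes such as its timestamp updated. Because -t is in effect, a real run with --size-only would push the source's newer timestamps onto the backup copies.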
Overall lesson. Don't assume that what I think I know is correct. And read the manual properly. Oh, and don't cock things up in the first place.
Thanks again to all who replied.
Mick