On Wed, 13 Feb 2013 23:05:23 +0000 steve-ALUG@hst.me.uk allegedly wrote:
To get an entire website, my thought would be wget, e.g.
http://www.linuxjournal.com/content/downloading-entire-web-site-wget
If you download someone's website, you're almost certainly falling foul of copyright rules at the very least, and possibly the (er) Computer Misuse Act (or whatever it's called). If you try to access part of a website that the owner didn't intend you to access (e.g. a hidden part of the site that they didn't want anyone to see, and they haven't given you permission), then you're probably breaking the law.
wget respects the robots.txt exclusion standard when it retrieves a site recursively, so as long as you leave that behaviour switched on you should be OK scraping the site for off-line viewing. After all, most websites get indexed by search engines, and the site owner must expect the site to be viewed. (But IANAL either!)
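For what it's worth, here is a rough sketch of the kind of robots.txt check I mean, using only Python's standard library (urllib.robotparser). It is not how wget does it internally, just an illustration of the exclusion standard. The robots.txt content, the example.org URLs and the allowed() helper are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented purely for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.org is a placeholder host, not a site from this thread.
    for url in (
        "http://example.org/index.html",
        "http://example.org/private/notes.html",
    ):
        print(url, allowed(ROBOTS_TXT, "Wget", url))
```

A well-behaved crawler would skip any URL for which this returns False. Bear in mind that robots.txt only states the owner's wishes for crawlers; passing the check says nothing about copyright.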
Mick
---------------------------------------------------------------------
blog: baldric.net
gpg fingerprint: FC23 3338 F664 5E66 876B 72C0 0A1F E60B 5BAD D312
---------------------------------------------------------------------