On 08 Jul 12:33, Richard Parsons wrote:
On Thu, 8 Jul 2010, Brett Parker wrote:
but I'm now bemused as to why the echo of the first line...
echo '1,$n' | edbrowse http://yahoo.com/
...would work just as well.
Or:
edbrowse <<EOF
e http://yahoo.com/
1,$n
EOF
Which removes the echo entirely, and makes it obvious what to run...
Yes, the "$n" is a literal string to be passed to edbrowse. "$" means the last line of the file and so "1,$n" means "print the lines from the first line to the last line including the line numbers".
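For anyone unfamiliar with ed's addressing, the same 1,$ range can be tried outside edbrowse; here is a minimal sketch using sed, which shares ed's address syntax (/tmp/demo.txt is just a scratch file for illustration):

```shell
# Create a small scratch file.
printf 'alpha\nbeta\ngamma\n' > /tmp/demo.txt

# ed-style addresses: 1 = first line, $ = last line,
# so '1,$p' means "print every line" -- the p analogue of edbrowse's 1,$n.
sed -n '1,$p' /tmp/demo.txt
# prints:
# alpha
# beta
# gamma

# ed's n command prints with line numbers; cat -n gives a similar view.
cat -n /tmp/demo.txt
```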
I like your final version best with the EOF. It's very clear, thanks. I'm quite new to bash scripting.
By the way, I'm very excited to have found a browser that is easily scriptable. Suddenly it seems a breeze to write a shell script that will web scrape. Has everyone always been able to do this easily with other programs and I just never noticed?
I must admit I tend to use Python's mechanize module for it instead, and not worry about JavaScript at all... But then, I tend to assume that people aren't building non-accessible websites, and when they are, I tend to not want to use 'em ;)
(Oh, and in that example I should really have put a \ in front of the $, or quoted the delimiter as <<'EOF', since in an unquoted here-document the shell will try to expand $n as a variable :)
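The quoting behaviour is easy to see with cat; a quick sketch (the variable n=42 is set purely for illustration):

```shell
n=42   # stand-in variable, purely for illustration

# Unquoted delimiter: the shell expands $n inside the here-document.
cat <<EOF
1,$n
EOF
# prints: 1,42

# A backslash keeps the dollar literal:
cat <<EOF
1,\$n
EOF
# prints: 1,$n

# Quoting the delimiter turns off expansion for the whole body:
cat <<'EOF'
1,$n
EOF
# prints: 1,$n
```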
I have, before, scripted a fair few things using urllib + urllib2 and HTMLParser from the Python standard library, but that's for when you need something with very few dependencies... and the newer httplib is quite nice. By far, though, mechanize's Browser 'just works' for scripting.
The usual shell type way would be to use one of:

wget -O - http://that.place/
w3m -dump_source http://that.place/
w3m -dump http://that.place/
lynx -dump http://that.place/
The first two will actually just dump the HTML; the last two do slightly more interesting things :)
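The source-vs-rendered distinction can be seen without a network by pointing the tools at a local file; since w3m/lynx may not be installed everywhere, the sed line below is only a crude stand-in for "rendering" (it just strips tags):

```shell
# A tiny local page so no network is needed.
cat > /tmp/page.html <<'EOF'
<html><body><h1>Hello</h1><p>scrape me</p></body></html>
EOF

# What wget -O - or w3m -dump_source would give you: the raw HTML.
cat /tmp/page.html

# What w3m -dump / lynx -dump give you is rendered text; as a very
# crude stand-in, strip the tags with sed (real rendering does much more).
sed -e 's/<[^>]*>/ /g' /tmp/page.html
# output contains "Hello" and "scrape me" but no tags
```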
But it does look like edbrowse is quite sensible in its handling of JavaScript, so I'll make a note of it. Thanks.
Cheers,