iDunno@sommitrealweird.co.uk wrote:
I have scripted a fair few things before using urllib + urllib2 and HTMLParser from the Python standard library, but that's for when you need something with very few dependencies... and the newer httplib is quite nice. By far, though, mechanize's Browser 'just works' for scripting.
The usual shell-type way would be to use one of:

    wget -O - http://that.place/
    w3m -dump_source http://that.place/
    w3m -dump http://that.place/
    lynx -dump http://that.place/
The first two will actually just dump the HTML. The last two do slightly more interesting things :)
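For anyone who would rather stay in Python, here is a rough sketch of the "just dump the HTML" case, along the lines of the wget -O - command. It assumes Python 3, where urllib and urllib2 were merged into urllib.request. The function name dump_source and the UTF-8 fallback are choices made for this example, not anything from the tools above.

```python
import urllib.request


def dump_source(url, timeout=10):
    """Return the raw source behind url as text, roughly like `wget -O - url`."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Use the charset the server declares; fall back to UTF-8 (an
        # assumption of this sketch) when it declares none.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling dump_source("http://that.place/") should hand back the page source as a string. Nothing is rendered, so this matches the first two commands rather than the last two.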
But, it does look like edbrowse is quite sensible in its handling of javascript, so I'll make a note of it. Thanks.
Another thing to make note of, for people who need a middle ground between a whole browser and HTMLParser, would be BeautifulSoup, which is wonderful for scraping web content, especially if you're used to using HTMLParser. I say middle ground because, while it's not in the standard library, it covers lots of bases like bad markup and so on, without the overhead of running a browser.
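To show what the plain HTMLParser end of that middle ground looks like, here is a minimal sketch that pulls link targets out of a page. It assumes Python 3, where the module lives at html.parser. LinkCollector and extract_links are names invented for the example. BeautifulSoup would likely be shorter and, as said, more forgiving of bad markup.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; HTMLParser lowercases
        # tag and attribute names before calling this method.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return the href values of all anchors in html, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Feeding it the text of a fetched page gives a list of hrefs exactly as written in the markup. Relative links are not resolved against the page URL.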
> Programs like Firefox and Evolution are clearly marvellous achievements, but what a shame that the rise of the GUI necessitates the fall of interoperability between programs.
Out of interest, the rise of the GUI means no such thing. Two of the large desktop suites, GNOME and KDE, have had scripting interfaces for a while to allow users and apps to communicate with each other.
In fact, before KDE and its apps were completely ruined, I used the superlatively useful DCOP for years to make desktop and development apps do whatever I wished. That included scraping content derived using Konqueror; running an automatic spider for broken links on a website when an in-place edit over ssh had just been done with Kate; changing all sorts of things when my bluetooth-enabled handset was near the machine; managing playlists; pausing torrents and stuff if youtube was opened (on a very old machine with a bad connection); alerting me (really, sending a contact called "alert" via bluetooth) when something was finished; shutting down the machine when a DVD finished playing; etc. Pretty much every KDE app was scriptable (although some of that stuff will have involved me working around any limitations). In fact, it was the principal thing that made that DE so much better than anything else I have used before or since.
Of course, DCOP had its own problems, which is part of the reason for KDE moving on from it, and KDE itself is far from being recommendable again, but there is a healthy number of weirdos like myself who like to be able to do that kind of thing, so I can't see it disappearing soon.