Thanks slef for the page on Syndication Tools in the ALUG Wiki.¹
I was just wondering, why is it that certain HTML produced by LiveJournal² (especially images) breaks Planet ALUG/Mabloss³? I'm assuming it's because LiveJournal produces bad code in the RSS feed. Is this the case?
What's the best approach to fix this?
1) Syndication Tools - http://www.alug.org.uk/contrib/?SyndicationTools
2) LiveJournal - http://www.livejournal.com
3) MaBloss - http://mjr.towers.org.uk/mabloss.html
On Thu, Nov 04, 2004 at 07:04:32PM +0000, Ben Francis wrote:
> Thanks slef for the page on Syndication Tools in the ALUG Wiki.¹
> I was just wondering, why is it that certain HTML produced by LiveJournal² (especially images) breaks Planet ALUG/Mabloss³? I'm assuming it's because LiveJournal produces bad code in the RSS feed. Is this the case?
Generally, it's because the markup, possibly entered by the evil user of LiveJournal, is not XHTML compliant. Certainly the Planet ALUG code does only work with XHTML compliant markup in the RSS feed, otherwise it escapes all entities. It's not "broken", it's just dealing with the input in the easiest way possible.
It could be that people are using <img src="somewhere"> instead of <img src="somewhere" alt="something" />.
> What's the best approach to fix this?
Make the world learn how to post clean XHTML.
Cheers,
On 2004-11-04 18:14:12 +0000 Brett Parker <iDunno@sommitrealweird.co.uk> wrote:
> [...] Certainly the Planet ALUG code does only work with XHTML compliant markup in the RSS feed, otherwise it escapes all entities.
That's why. It checks the description field and passes stuff starting with a < to an XML parser (SSAX). If it gets an error back, it decides it was plain text. http://ssax.sf.net/
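The check described above can be sketched roughly as follows. This is Python rather than the Scheme that MaBloss actually uses, with the standard library's xml.etree.ElementTree standing in for SSAX. The function name render_description and the dummy <div> wrapper are assumptions for illustration, not taken from the MaBloss sources.

```python
import html
import xml.etree.ElementTree as ET


def render_description(description: str) -> str:
    """Pass well-formed markup through; escape everything else as text.

    Hypothetical helper, not the actual schycyroll code.
    """
    text = description.strip()
    if not text.startswith("<"):
        # Does not look like markup at all: treat as plain text.
        return html.escape(text)
    try:
        # Assumption: wrap in a dummy root so that several sibling
        # elements still form one well-formed document.
        ET.fromstring(f"<div>{text}</div>")
    except ET.ParseError:
        # The parser reported an error, so decide it was plain text.
        return html.escape(text)
    return text


if __name__ == "__main__":
    print(render_description('<img src="somewhere">'))
    print(render_description('<img src="somewhere" alt="something" />'))
```

With this sketch the unclosed <img src="somewhere"> comes out entity-escaped, while the self-closed form passes through untouched. It is the missing " /" that trips an XML parser; the alt attribute matters for validity but not for well-formedness.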
A possible fix is to optionally use a more robust HTML parser. Maybe Neil Van Dyke's HtmlPrag. http://www.neilvandyke.org/htmlprag/
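To give a feel for what a more forgiving parser buys you, here is a minimal sketch in Python. It uses the standard library's html.parser as a stand-in for HtmlPrag (which is Scheme and has a different interface) and re-emits tag soup as well-formed markup. The names XhtmlRepairer, repair and VOID_TAGS are made up for this example, and the void-tag list is deliberately incomplete.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: a small, incomplete set of elements that never have content.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class XhtmlRepairer(HTMLParser):
    """Re-emit whatever the lenient parser sees as well-formed markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # output fragments
        self.open_tags = []  # stack of elements still awaiting a close tag

    @staticmethod
    def _attrs(attrs):
        # Valueless attributes (e.g. "disabled") get their own name as value.
        return "".join(
            f' {name}="{escape(name if value is None else value)}"'
            for name, value in attrs
        )

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            self.out.append(f"<{tag}{self._attrs(attrs)} />")
        else:
            self.out.append(f"<{tag}{self._attrs(attrs)}>")
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        self.out.append(f"<{tag}{self._attrs(attrs)} />")

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray close tag: drop it
        # Close everything opened since the matching start tag.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def repair(markup: str) -> str:
    parser = XhtmlRepairer()
    parser.feed(markup)
    parser.close()
    # Close anything the author left open at the end.
    while parser.open_tags:
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)
```

For example, repair('<p>Hi<img src="somewhere">') gives '<p>Hi<img src="somewhere" /></p>', which a strict XML parser would then accept. Comments are silently dropped and attribute names are not validated, so treat this as an illustration of the idea rather than a drop-in fix.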
The MaBloss sources are at http://mjr.towers.org.uk/mabloss.html (tar.gz) or http://www.mjr.dsl.pipex.com/ (Arch). I'll do another release as soon as I get time, as there have been some essential fixes to schycyroll, the script that drives Planet ALUG.
> It's not "broken", it's just dealing with the input in the easiest way possible.
Want to know why? See http://c2.com/cgi/wiki?DoTheSimplestThingThatCouldPossiblyWork
> > What's the best approach to fix this?
> Make the world learn how to post clean XHTML.
I know you think learning lisp is difficult, but even so!