Jonathan McDowell noodles@earth.li wrote:
Secondly, it seems to validate just fine for http://www.andrewsavory.com/blog/index.rdf with http://feedvalidator.org/
Should escaped HTML go in description or content:encoded? It's not something I looked at too deeply when writing it, as both seem to happen in the wild (and even link and title can be tricky too).
[...] I fed Andrew's blog to Planet to see what it did and it appears to do the right thing. So it looks like a schycyroll issue. I don't see why it should need to parse HTML to cope with this, but I haven't looked at its code at all.
The XML parser doesn't like the missing </li> in the Links item.
I think Planet just prints the unescaped HTML straight into its output, so I don't see how it can reliably produce a valid page. (Schycyroll doesn't yet produce a valid page either, but the current bug stopping it has a reasonably easy fix.)
Trying to regex out all the stupid things people do with HTML isn't feasible. Parsing it, merging it and then serialising should give reliably valid pages and enable some other features. The downside is that the encoded HTML needs to be parsed. The XML parser is used anyway, so schycyroll just blindly tries that on the description (hoping most people use 1999 XHTML by now) and falls back to the text "parser" if the attempt fails.
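Roughly, the idea is something like this (a Python-ish sketch of the "try the XML parser, fall back to text" step, not the actual schycyroll code; the function name and the dummy <div> wrapper are just for illustration):

    import xml.etree.ElementTree as ET

    def parse_description(description_html):
        # description_html is the entry's description after the feed
        # itself has been parsed, i.e. a string of HTML markup.
        try:
            # Wrap in a dummy element so a fragment with several (or
            # zero) top-level elements still parses as one document.
            return ET.fromstring("<div>" + description_html + "</div>")
        except ET.ParseError:
            # Tag soup (e.g. a missing </li>): give up on the XML
            # parser and fall back to treating it as plain text.
            return None

Serialising the parsed tree back out (or escaping the whole string in the fallback case) is what makes the merged page reliably valid.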
Hope that explains why I want it to parse things.