HTML and character entities in RSS

Richard Gaskin ambassador at fourthworld.com
Fri Feb 26 20:52:02 EST 2010


Looking for standards among RSS feeds is like looking for standards in 
Windows GUI designs:  everyone knows they're published somewhere, but no 
one takes the time to read 'em. ;)

I've been parsing a bunch of RSS files, and man, what a wild west of 
weirdness it is.

For example, most of the RSS specs I've read suggest that all data is 
plain text, with HTML allowable only when marked as CDATA.

But I've seen feeds that do that backwards, and some that have some 
strings containing character entities flagged as CDATA with others 
containing entities that aren't flagged -- in the same feed!
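
As a rough illustration of the two encodings described above, here is a 
minimal Python sketch. The feed fragment and the function name are made 
up for illustration, not taken from any real feed. It assumes the feed 
is well-formed XML, which many real feeds are not. Under that 
assumption, a standard XML parser returns the same string whether the 
markup was wrapped in CDATA or entity-escaped, so any entities left 
after parsing belong to the HTML layer and can be decoded in a second 
pass.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical feed fragment: the same markup encoded two ways.
FEED = """<rss version="2.0"><channel>
<item><description><![CDATA[<p>Fish &amp; chips</p>]]></description></item>
<item><description>&lt;p&gt;Fish &amp;amp; chips&lt;/p&gt;</description></item>
</channel></rss>"""


def descriptions(feed_xml):
    """Return each item's description after XML-level decoding only."""
    root = ET.fromstring(feed_xml)
    return [item.findtext("description") for item in root.iter("item")]


descs = descriptions(FEED)

# Both items decode to the same HTML string at the XML layer.
same_after_xml = descs[0] == descs[1]

# Second pass: decode the HTML-level entities that remain.
# Tags are left in place; stripping them is a separate step.
decoded = [html.unescape(d) for d in descs]
```

This does not help with feeds that are double-escaped or not 
well-formed; those would still need case-by-case handling.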

By what rule should I know when to translate data from character 
entities back to plain old ASCII?

Browsers seem to handle the mish-mash rather well; wish I were as 
graceful at handling all the inconsistencies I'm finding.

--
  Richard Gaskin
  Fourth World
  Rev training and consulting: http://www.fourthworld.com
  Webzine for Rev developers: http://www.revjournal.com
  revJournal blog: http://revjournal.com/blog.irv



More information about the use-livecode mailing list