Problems downloading accented characters from a web page

Sarah Reichelt sarah.reichelt at gmail.com
Sat May 20 05:51:15 EDT 2006


> > I have a routine that downloads a web page and extracts certain text.
> > This works fine except when the characters are accented. I'm not sure
> > how well the characters will transfer in the email, but I'll try to
> > give an example:
> >
> > Accented e (é) - I never could remember which was an acute and which
> > was a grave but it's numToChar(142). On the web page viewed in a
> > browser and checking the source, it looks perfect. When I download
> > that page into Rev, the é becomes "√©" i.e. square root &
> > copyright, charToNum 195 & 169.
> >
> > I've tried using ISOtoMac and uniDecode and the 2 combined in various
> > ways, but I can't get it to give me the correct accented e.
> >
> > Any ideas?
> >
> > TIA,
> > Sarah
>
> It is UTF-8.
> Try:
> unidecode(uniencode(numToChar(195) & numToChar(169),"utf8"))
> it returns "é"

Thanks Devin & Thierry, that was it. Doing the decode/encode with utf8
solved the problem completely. Now I check the web page headers for
"utf8" and apply this fix if so.

Many thanks,
Sarah
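
For anyone reading this in the archive, here is a rough sketch of what is happening at the byte level. It is written in Python rather than Rev script, purely as an illustration. The helper names charset_from_content_type and decode_page are hypothetical, and falling back to ISO-8859-1 when the page declares no charset is an assumption, not something stated in the thread.

```python
import re


def charset_from_content_type(content_type):
    """Return the lowercased charset named in a Content-Type value, or None."""
    match = re.search(r'charset\s*=\s*"?([\w.:-]+)', content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def decode_page(raw, content_type):
    """Decode downloaded bytes, applying the UTF-8 fix only when declared."""
    charset = charset_from_content_type(content_type)
    if charset in ("utf-8", "utf8"):
        return raw.decode("utf-8")
    # Assumption: pages that declare no UTF-8 charset are treated as ISO-8859-1.
    return raw.decode("iso-8859-1")


# The two bytes reported in the thread (charToNum 195 and 169).
raw = bytes([195, 169])

# Read one byte per character as MacRoman: square root + copyright.
as_mac_roman = raw.decode("mac_roman")

# Read as a single UTF-8 sequence: the intended accented e.
as_utf8 = raw.decode("utf-8")

# Back in MacRoman, that character is the single byte 142.
mac_byte = as_utf8.encode("mac_roman")
```

The two bytes 195 and 169 are one UTF-8 sequence for a single character, which is why treating them as two separate MacRoman characters shows a square root and a copyright sign. In Rev, the one-liner quoted above does the same conversion.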



More information about the use-livecode mailing list