A peculiar character substitution problem with URL
Jonathan Lynch
jonathandlynch at gmail.com
Tue Aug 20 07:24:46 EDT 2013
Well, Dave :)
This worked perfectly. Thank you for taking the time to look into this - it
really helped me. It also taught me that I need to learn more about
uniEncode and uniDecode.
Again - a huge thanks!
Take care,
Jonathan
On Mon, Aug 19, 2013 at 1:12 PM, Dave Cragg <dave.cragg at lacscentre.co.uk> wrote:
>
> On 19 Aug 2013, at 17:04, Jonathan Lynch <jonathandlynch at gmail.com> wrote:
>
> > This is just the strangest thing. On some websites - but not all - trying
> > to get the html of that website using "get url" or "put url" is causing
> > some characters to be substituted.
> >
> > These are not obscure unicode characters. They seem to be characters in
> > the upper ANSI set.
> >
> > For example, on this web page:
> > http://emergency.cdc.gov/disasters/wildfires/facts.asp
> >
> > If I use the following code:
> >
> > put URL "http://emergency.cdc.gov/disasters/wildfires/facts.asp" into field 1
> >
> > The right single quote character --> ’ <-- (which is character number 146)
> > gets converted into ’
> >
> >
> > I do not understand why ’ becomes ’
> >
>
> Jonathan,
>
> The page source for the url indicates the page is encoded as UTF-8. This
> is from the 'head' section of the page.
>
> <meta http-equiv="content-type" content="text/html; charset=utf-8" />
>
> So it looks like it may be 'obscure unicode characters'. :-)
>
> What happens when you do something like this:
>
> put URL "http://emergency.cdc.gov/disasters/wildfires/facts.asp" into tTemp
> put uniDecode(uniEncode(tTemp, "UTF8")) into field 1
>
> Cheers
> Dave Cragg
--
Do all things with love
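
For anyone finding this thread later, here is a rough sketch of what seems to
be going on at the byte level. It is written in Python rather than LiveCode,
purely so it can be run anywhere, and it assumes the display encoding is
Windows-1252 (cp1252), which is where "character number 146" points; on a
Mac the misread characters would be different. The variable names are made
up for the illustration. As far as I understand it, uniEncode(tTemp, "UTF8")
reads the downloaded bytes as UTF-8, and uniDecode then converts the result
back to the platform's native character set, which corresponds to the last
step here.

```python
# U+2019 RIGHT SINGLE QUOTATION MARK, the character discussed in the thread.
quote = "\u2019"

# Assumption: the native encoding is Windows-1252, where this character
# is the single byte 146 (0x92).
cp1252_bytes = quote.encode("cp1252")

# A UTF-8 page sends the same character as three bytes: 0xE2 0x80 0x99.
utf8_bytes = quote.encode("utf-8")

# Treating each of those three bytes as its own cp1252 character
# yields three unrelated characters instead of one quote mark.
misread = utf8_bytes.decode("cp1252")

# Decoding the bytes as UTF-8 recovers the original character.
fixed = utf8_bytes.decode("utf-8")

print(list(cp1252_bytes), list(utf8_bytes), len(misread), fixed == quote)
```

So the character is not being substituted on the server; the three UTF-8
bytes are simply being shown one by one until they are decoded as UTF-8.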