Unicode revisited, this time with htmlText
Alex Rice
alex at mindlube.com
Sat Nov 22 00:58:17 EST 2003
On Nov 21, 2003, at 5:03 PM, tuviah snyder wrote:
> Well that's the way you specify unicode characters in the HTML spec. Any
> other way would have byte order issues associated with it, and would
> require binary data be embedded into HTML which is supposed to be plain
> text.
Not true in practice. The encoding of HTML can be specified in the HTTP
Content-Type header from the web server, or in a META tag in the HTML
itself. Read this article that was posted to improve-rev recently:
<http://www.joelonsoftware.com/articles/Unicode.html>
Here is a section from that article that talks about this issue:
"""
For a web page, the original idea was that the web server would return
a similar Content-Type http header along with the web page itself --
not in the HTML itself, but as one of the response headers that are
sent before the HTML page.
This causes problems. Suppose you have a big web server with lots of
sites and hundreds of pages contributed by lots of people in lots of
different languages and all using whatever encoding their copy of
Microsoft FrontPage saw fit to generate. The web server itself wouldn't
really know what encoding each file was written in, so it couldn't send
the Content-Type header.
It would be convenient if you could put the Content-Type of the HTML
file right in the HTML file itself, using some kind of special tag. Of
course this drove purists crazy... how can you read the HTML file until
you know what encoding it's in?! Luckily, almost every encoding in
common use does the same thing with characters between 32 and 127, so
you can always get this far on the HTML page without starting to use
funny letters:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
But that meta tag really has to be the very first thing in the <head>
section because as soon as the web browser sees this tag it's going to
stop parsing the page and start over after reinterpreting the whole
page using the encoding you specified.
"""
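To make the two mechanisms concrete, here is a rough Python sketch of how a client might pick the encoding: use the charset from the Content-Type header if there is one, otherwise scan the first bytes of the page for a META charset, otherwise fall back to a default. The function names, the 1024-byte scan limit, and the utf-8 default are my own assumptions for illustration, not anything a particular browser or library does.

```python
import re

# Hypothetical helpers for illustration only; the names are not from any library.
_HEADER_RE = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)
_META_RE = re.compile(
    r'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE
)


def charset_from_header(content_type):
    """Return the charset named in a Content-Type header value, or None."""
    if not content_type:
        return None
    match = _HEADER_RE.search(content_type)
    return match.group(1).lower() if match else None


def charset_from_meta(raw_bytes, scan_limit=1024):
    """Return the charset named in a META tag near the top of the page, or None."""
    # latin-1 maps every byte to one character, so this decode cannot fail,
    # and bytes 32-127 come out as plain ASCII. That is enough to read the
    # tag before the real encoding is known. scan_limit is an assumed value.
    head = raw_bytes[:scan_limit].decode("latin-1")
    match = _META_RE.search(head)
    return match.group(1).lower() if match else None


def decode_html(raw_bytes, content_type=None, default="utf-8"):
    """Decode a page, preferring the HTTP header over the META tag."""
    encoding = (
        charset_from_header(content_type)
        or charset_from_meta(raw_bytes)
        or default  # assumed fallback when nothing is declared
    )
    return raw_bytes.decode(encoding, errors="replace")
```

For example, decode_html(page_bytes, "text/html; charset=iso-8859-1") would use the header value, while decode_html(page_bytes) alone would rely on the META tag. The header is checked first because an explicit HTTP charset takes precedence over the in-document declaration.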
Alex Rice <alex at mindlube.com> | Mindlube Software |
<http://mindlube.com>
what a waste of thumbs that are opposable
to make machines that are disposable -Ani DiFranco