XML Library and diacritics: how to?

Igor Couto igor at pixelmedia.com.au
Thu Sep 18 03:38:01 EDT 2003


Hi all,

I am also struggling with Unicode and XML trees. I, too, have a field 
with some accented Latin characters that must be saved as an attribute 
of an XML element:

On Wednesday, September 17, 2003, at 03:09  AM, Tuviah Snyder wrote:

>>>> revSetXMLAttribute docid,theNode,"Joël" -- does not work
> Entering Latin characters seems to work fine when using the XML tree
> view example, even without using UTF-8.

I just tried using the XML Tree View Stack, testing it to see if I 
could modify the contents of a node to include Latin accented letters, 
and it did not work for me. Perhaps I am doing something wrong - if so, 
please DO tell me! Here are the steps I took:

* Using MacOS X 10.2.6, Revolution 2.1:

1) Open the "xmltree-view" stack that shipped in the "Sample Stacks" 
folder
2) Click the "browse..." button, and select an existing XML document - 
any document. Some sample ones shipped with Rev, too.
3) Click the "Load XML from file" button. This will read the selected 
XML document, and build a tree, populating the field labeled "XML 
tree-view".
4) In the "XML tree-view" field, select an element/node - any one will 
do.

If the selected xml element has any character data, then this will be 
shown in the "Element Contents" field. We are going to add/replace the 
contents of this field with a word which includes an accented character:

5) Change the keyboard layout to "US Extended". If you don't know how 
to do this in MacOS X:
    a) Open your System Preferences
    b) Click on "International"
    c) In the "International" pane, click on the "Input Menu" tab
    d) In the list of available keyboard layouts, make sure "US 
Extended" has a tick next to it.
    e) Quit System Preferences
    f) In the input menu in your menubar (the little 'flag' next to 
"Help"), select "US Extended"

6) Type the following word in the "Element Contents" field: "Português" 
- that is spelled: P-O-R-T-U-G-U-E^-S (the "e" has a 'circumflex' 
accent). To type that special accented 'e', using your 'US Extended' 
keyboard:
    a) hold down the OPTION key, and press "6" on your keyboard. Let go 
of all keys.
    b) type "e".

That is the word for "Portuguese", in Portuguese.

If you followed all these steps, you will have TYPED the word into the 
"Element Contents" field. Now, to try and write it as part of our XML 
tree:

7) Click on the "Modify Contents" button.

Hmmm, it looks like it worked, huh? Not quite, though...the "Element 
Contents" field simply hasn't 'refreshed'. Let's manually 'refresh' our 
contents display, so that it truly reflects what has been put into the 
XML tree:

8) Click on another element, to deselect from the current one, and
9) Click back on our element, so we can see the contents again.

The "Element Contents" field displays "Portugu".

If, instead of entering this value as element contents, we try to 
enter it as an attribute, we get exactly the same result - it TRUNCATES 
the text just before the accented "ê".
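
For anyone who would rather reproduce this from a script than through 
the stack's buttons, a minimal sketch along the following lines should 
exercise the same XML library calls. It is only a sketch under stated 
assumptions: the one-element document, the "/root/item" path and the 
"lang" attribute name are made up for the test and are not part of the 
xmltree-view stack.

```livecode
on mouseUp
  -- Hypothetical one-element document, built in memory just for this test.
  put "<root><item>placeholder</item></root>" into tXML
  put revCreateXMLTree(tXML, false, true, false) into tDocID
  if tDocID is not an integer then
    -- On failure the function returns an error string instead of a tree ID.
    answer "Could not build the tree:" && tDocID
    exit mouseUp
  end if

  put "Português" into tWord

  -- Store the word both as element contents and as an attribute value.
  revPutIntoXMLNode tDocID, "/root/item", tWord
  revSetXMLAttribute tDocID, "/root/item", "lang", tWord

  -- Read both values straight back from the tree, so no field refresh is involved.
  put revXMLNodeContents(tDocID, "/root/item") into tContents
  put revXMLAttribute(tDocID, "/root/item", "lang") into tAttribute
  revDeleteXMLTree tDocID

  answer "Contents:" && tContents & return & "Attribute:" && tAttribute
end mouseUp
```

Put in a button script, this takes the keyboard layout and the field 
refresh out of the picture: if the dialog shows the shortened word, the 
truncation is happening inside the XML library itself.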

>
>>>> revSetXMLAttribute docid,theNode,uniEncode("Joël","UTF-8") -- does not work
>
> Try
>
> put unidecode(uniencode("Joël"),"utf8")

I tried that - it did not work for me. Again, if this is something that 
someone else has tried, and that HAS worked, then maybe I am doing 
something silly. In which case, I BEG of you, pleeeeeeeeeeease post 
your successful recipe here, so that we can all benefit from it.
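
For reference, the conversion suggested in the quoted reply can be 
spelled out as a pair of functions, roughly as below. The handler names 
toUTF8 and fromUTF8 are hypothetical, and "UTF8" is assumed to be the 
language name that uniEncode and uniDecode expect. Whether the XML 
library keeps the converted bytes intact is exactly the open question, 
so treat this as a sketch of what to try, not as a confirmed fix.

```livecode
-- Hypothetical helper: native text (e.g. Mac Roman) to UTF-8 bytes.
-- uniEncode turns native text into UTF-16; uniDecode then re-encodes it as UTF-8.
function toUTF8 pNative
  return uniDecode(uniEncode(pNative), "UTF8")
end toUTF8

-- Hypothetical helper: UTF-8 bytes back to native text.
-- uniEncode reads the bytes as UTF-8 into UTF-16; uniDecode converts that to native.
function fromUTF8 pUTF8
  return uniDecode(uniEncode(pUTF8, "UTF8"))
end fromUTF8
```

The idea would be to pass toUTF8("Joël") as the value when writing to 
the tree, and to wrap whatever is read back from the tree in fromUTF8() 
before putting it into a field.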

If this is a bug that has not yet been reported, then please let us 
know also, so we can add it to Bugzilla!

Many thanks,


--
Igor de Oliveira Couto