Help with Unicode Text
Dar Scott
dsc at swcp.com
Mon Mar 28 14:30:09 EST 2005
On Mar 28, 2005, at 12:06 PM, Dan Friedman wrote:
> Anyone know how to replace a return char in a unicode string?
There are two problems with your method.
The first is only a potential problem, depending on what you want to
do. Do you mean the ASCII carriage return? Or the Revolution newline
character (coded the same as the ASCII line feed)?
The second is that the test character is a single-byte character.
However, each character in a unicode string is two bytes, a 16-bit value
in host byte order, that is, UTF-16. Even then you can't just convert the
character to two bytes for the platform and search. You might match the
second half of one character and the first half of the next.
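To make the half-and-half match concrete, here is a small sketch. It assumes the older engine, where a string is a plain sequence of bytes; the handler name and variable names are made up for illustration, and the bytes are written out by hand in big-endian order rather than produced by uniEncode. The two characters U+0100 and U+0A00 have the bytes 01 00 0A 00, and the line feed U+000A has the bytes 00 0A, so a plain byte search can find a "line feed" in the middle of text that contains none.

```livecode
-- Hypothetical demo handler; all names here are illustrative only.
on demoStraddle
   local tText, tTarget, tPos
   -- U+0100 followed by U+0A00, big-endian bytes: 01 00 0A 00
   put numToChar(1) & numToChar(0) & numToChar(10) & numToChar(0) into tText
   -- U+000A (line feed) in the same byte order: 00 0A
   put numToChar(0) & numToChar(10) into tTarget
   -- plain byte search, with no regard for character boundaries
   put offset(tTarget, tText) into tPos
   -- characters start on odd byte positions (1, 3, 5, ...), so an even
   -- position means the match straddles two characters
   if tPos > 0 and tPos mod 2 = 0 then
      answer "False match straddling two characters at byte" && tPos
   end if
end demoStraddle
```

This is why the search has to step through the string two bytes at a time rather than use offset or replace directly.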
The pattern for repeating for each unicode character is like this:
    -- for each unicode char uc in sBMP
    repeat with i = 1 to length(sBMP)-1 step 2
      put char i to i+1 of sBMP into uc
      -- body
    end repeat
That assumes there are no surrogate pairs, that is, every character is
in the Basic Multilingual Plane and takes exactly two bytes.
One way to convert your ASCII test char c to its two-byte form is this
(the result is left in "it"):
    get uniEncode(c,"UTF8")
So, you can go through each unicode character, accumulating values, but
replacing those that need replacing.
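Putting the pieces together, a replacement function might look roughly like the sketch below. The function name uniReplaceChar and its parameter names are made up for illustration. It assumes the older engine where strings are byte sequences, and that the text contains no surrogates. It compares byte values with charToNum rather than comparing strings, so the default case-insensitive string comparison cannot cause a false match.

```livecode
-- Hypothetical helper, not a built-in:
--   pText : UTF-16 text in host byte order, no surrogates assumed
--   pOld  : single-byte character to look for (for example, return)
--   pNew  : UTF-16 replacement text (may be empty)
function uniReplaceChar pText, pOld, pNew
   local tTarget, tByte1, tByte2, tResult
   -- convert the one-byte test character to its two-byte form
   put uniEncode(pOld, "UTF8") into tTarget
   put charToNum(char 1 of tTarget) into tByte1
   put charToNum(char 2 of tTarget) into tByte2
   -- visit one two-byte character at a time
   repeat with i = 1 to length(pText) - 1 step 2
      if charToNum(char i of pText) = tByte1 and \
            charToNum(char i + 1 of pText) = tByte2 then
         put pNew after tResult
      else
         put char i to i + 1 of pText after tResult
      end if
   end repeat
   return tResult
end uniReplaceChar
```

For example, to turn each Revolution newline into a space (tUnicodeText is a placeholder for your own variable):

    put uniReplaceChar(tUnicodeText, return, uniEncode(" ", "UTF8")) into tUnicodeText

To replace ASCII carriage returns instead, pass numToChar(13) as the second argument.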
Dar
--
**********************************************
DSC (Dar Scott Consulting & Dar's Lab)
http://www.swcp.com/dsc/
Programming Services and Software
**********************************************