Finding invisible/non printable characters in a string
kee.nethery at elloco.com
Mon May 10 10:16:13 EDT 2021
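The ASCII control characters at the beginning of the ASCII table (RS, GS, BEL, etc.) typically display as a box. What you are describing sounds like zero-width Unicode characters. I think there are four (U+200B zero width space, U+200C zero width non-joiner, U+200D zero width joiner, and U+FEFF zero width no-break space). You could explicitly look for them.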
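As a rough sketch of "explicitly look for them", here is a short Python illustration (Python is used only as a neutral example; it is not LiveCode). The table of four characters and the helper name `find_zero_width` are assumptions made for this example, and the list is not claimed to be exhaustive.

```python
# Illustrative table of zero-width characters (an assumption for this sketch,
# not an exhaustive list of every invisible Unicode character).
ZERO_WIDTH = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_zero_width(text):
    """Return (index, name) pairs for each listed zero-width character in text."""
    return [(i, ZERO_WIDTH[ch]) for i, ch in enumerate(text) if ch in ZERO_WIDTH]


if __name__ == "__main__":
    sample = "ab\u200bcd"  # looks like "abcd" on screen
    for index, name in find_zero_width(sample):
        print(index, name)
```

The same idea should carry over to any language that lets you walk a string one codepoint at a time and compare against a small lookup table.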
> On May 10, 2021, at 7:09 AM, Paul Dupuis via use-livecode <use-livecode at lists.runrev.com> wrote:
> There are characters that consist of more than one codepoint - composite versions of characters for accents. See https://unicode-table.com/en/blocks/combining-diacritical-marks/
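To illustrate why a codepoint count larger than the visible character count does not by itself mean something invisible is present, here is a small Python sketch (Python is used only as a neutral example, not LiveCode). The helper name `count_combining` is hypothetical; `unicodedata.combining` and `unicodedata.normalize` are standard-library functions.

```python
import unicodedata


def count_combining(text):
    """Count codepoints that are combining marks (non-zero combining class)."""
    return sum(1 for ch in text if unicodedata.combining(ch) != 0)


if __name__ == "__main__":
    decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT
    composed = unicodedata.normalize("NFC", decomposed)
    # The decomposed form has two codepoints but displays as one accented letter.
    print(len(decomposed), count_combining(decomposed), len(composed))
```

In this example the extra codepoint is a perfectly visible accent, so a count mismatch alone is only a hint that the text deserves a closer look.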
> I think the best way is to scan the codepoints, looking for codepointToNum values that are 0-31 (exclude tab and CR/LF if you like) or 127 (DEL). There may be some others in the 128-255 range that are not printable. I forget off the top of my head.
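A minimal sketch of the scan described above, written in Python purely for illustration (in LiveCode the numeric value would come from codepointToNum, as mentioned in the thread). The function name `find_control_codepoints` and the `include_c1` flag are hypothetical. Treating 128-159 as non-printable is an assumption based on that range being the C1 control block; the thread itself only says "some others in the 128-255 range".

```python
# Tab, line feed and carriage return are treated as acceptable whitespace.
ALLOWED = {9, 10, 13}


def find_control_codepoints(text, include_c1=True):
    """Return (index, codepoint number) pairs for control characters in text.

    Flags 0-31 (except tab/LF/CR) and 127 (DEL). When include_c1 is true,
    128-159 is flagged as well (an assumption made for this sketch).
    """
    hits = []
    for i, ch in enumerate(text):
        n = ord(ch)
        if n in ALLOWED:
            continue
        if n < 32 or n == 127 or (include_c1 and 128 <= n <= 159):
            hits.append((i, n))
    return hits


if __name__ == "__main__":
    print(find_control_codepoints("a\x07b\tc\x7f"))
```

Returning positions rather than a simple yes/no makes it easier to show the user where the offending characters sit in the text.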
>> On 5/10/2021 5:49 AM, David V Glasgow via use-livecode wrote:
>> Hi folks, hope everyone is well.
>> Would I be right in thinking if codepoint count > the number of chars in a text string, then it probably contains invisible characters?
>> Or would I need to search through Hex to check?
>> Or something much easier and cleverer that I hadn’t even considered. Because that’s what this list and working with Livecode is like.
>> David Glasgow
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your subscription preferences: