First 1000 characters without loop?

Richard Gaskin ambassador at fourthworld.com
Thu Jun 22 21:19:51 EDT 2017


Monte Goulding wrote:

 >> On 23 Jun 2017, at 10:06 am, Richard Gaskin wrote:
 >>
 >> How can we know which is in use for a given string?
 >
 > You shouldn’t need to know. The engine will use native encoding where
 > possible for efficiency. A lot of the performance improvements between
 > LC 7 and 8 were using the right code paths based on whether the string
 > is native or unicode.

Seems murky.  I'd much rather at least have something like a byteLen 
function, which returns the number of bytes for a given string.  With 
that I can maintain byte offsets into a file with good performance and 
no ambiguity.
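
To illustrate what such a function would give me, here is a minimal sketch in Python rather than LiveCode. The names byte_len and line_byte_offsets are hypothetical, and UTF-8 is only an assumed default encoding. It is not a claim about how the engine stores strings internally.

```python
def byte_len(text: str, encoding: str = "utf-8") -> int:
    """Return the number of bytes `text` occupies once encoded."""
    return len(text.encode(encoding))


def line_byte_offsets(text: str, encoding: str = "utf-8") -> list[int]:
    """Return the 0-based byte offset at which each line starts."""
    offsets = []
    pos = 0
    # keepends=True keeps the line terminators so their bytes are counted too
    for line in text.splitlines(keepends=True):
        offsets.append(pos)
        pos += byte_len(line, encoding)
    return offsets
```

The answer depends on the encoding: the same one-character string can be one byte in Latin-1 and two in UTF-8, which is why an index of byte offsets only makes sense once the encoding on disk is pinned down.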


 >> Suppose I wanted to process a lot of text, so performance is
 >> critical. Using bytes would be optimal, since any chunk type or even
 >> Unicode characters may vary in length.
 >>
 >> So if I wanted to create an index of byte offsets into a large chunk
 >> of text, how would I know how long a character is?
 >
 > If it’s text encoded then you probably want to use character offsets
 > and let the engine worry about optimising it. If you know it’s binary
 > data then use bytes.

How do I find a substring in binary data in a way that will tell me the 
number of bytes of the offset?
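
For comparison, this is the behavior I have in mind, sketched in Python rather than LiveCode. The function name byte_offset is hypothetical, and UTF-8 is an assumed encoding for the search string.

```python
def byte_offset(needle: str, data: bytes, encoding: str = "utf-8") -> int:
    """Return the 0-based byte offset of `needle` in `data`, or -1 if absent."""
    # Encode the needle the same way the data was encoded, then search raw bytes
    return data.find(needle.encode(encoding))


data = "naïve café".encode("utf-8")
char_pos = data.decode("utf-8").find("café")  # offset counted in characters
byte_pos = byte_offset("café", data)          # offset counted in bytes
```

Here the two offsets differ because "ï" takes two bytes in UTF-8, so a character offset cannot be used directly as a file position.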

-- 
  Richard Gaskin
  Fourth World Systems
  Software Design and Development for the Desktop, Mobile, and the Web
  ____________________________________________________________________
  Ambassador at FourthWorld.com                http://www.FourthWorld.com



