How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

Geoff Canyon gcanyon at gmail.com
Thu Nov 1 20:04:13 EDT 2018


Nice! I *just* finished creating a GitHub repository for it, and adding
support for multi-char search strings, much as you did. I was coming to the
list to post the update when I saw your post.

Here's the GitHub link: https://github.com/gcanyon/offsetlist

Here's my updated version:

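function offsetList D,S,pCase
   -- returns a comma-delimited list of the offsets of D in S
   -- pCase: pass true for a case-sensitive search; anything else means false
   set the caseSensitive to pCase is true
   set the itemDel to D
   put length(D) into dLength
   -- start C so that each addition lands on the first char of a match
   put 1 - dLength into C
   repeat for each item i in S
      add length(i) + dLength to C
      put C,"" after R
   end repeat
   set the itemDel to comma
   -- S ends with D: every entry is a real match, so only drop the final comma
   if char -dLength to -1 of S is D then return char 1 to -2 of R
   -- otherwise the last entry is length(S) + 1, not a match: strip it,
   -- or return 0 if it was the only entry
   put length(C) + 1 into lenC
   put length(R) into lenR
   if lenC = lenR then return 0
   return char 1 to lenR - lenC - 1 of R
end offsetList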
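
For what it's worth, here is a minimal usage sketch tying this back to the
original question (the offset of the last instance). It assumes the
offsetList function above is available in the message path and that the
LiveCode version in use accepts multi-character item delimiters. The
handler and the variable names (tOffsets, tLast, tMulti, tNone) are made up
for illustration.

```livecode
on mouseUp
   -- single-character search: the commas sit at chars 2, 4 and 5
   put offsetList(",", "a,b,,c") into tOffsets    -- "2,4,5"
   -- the last item of the list is the offset of the last instance
   put item -1 of tOffsets into tLast             -- 5

   -- multi-character search string
   put offsetList("ab", "xxabyyab") into tMulti   -- "3,7"

   -- no match at all: the function returns 0
   put offsetList("z", "abc") into tNone          -- 0

   answer tLast & return & tMulti & return & tNone
end mouseUp
```

Since a miss returns 0, item -1 of the result is also 0 in that case, which
matches what offset() returns when nothing is found.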

On Thu, Nov 1, 2018 at 8:28 AM Niggemann, Bernd via use-livecode <
use-livecode at lists.runrev.com> wrote:

> Hi Geoff,
>
> thank you for this beautiful script.
>
> I modified it a bit to accept multi-character search string and also for
> case sensitivity.
>
> It definitely is a lot faster for unicode text than anything I have seen.
>
> -----------------------------
> function offsetList D,S, pCase
>    -- returns a comma-delimited list of the offsets of D in S
>    -- pCase is a boolean for caseSensitive
>    set the caseSensitive to pCase
>    set the itemDel to D
>    put the length of D into tDelimLength
>    repeat for each item i in S
>       add length(i) + tDelimLength to C
>       put C - (tDelimLength - 1),"" after R
>    end repeat
>    set the itemDel to comma
>    if char -1 of S is D then return char 1 to -2 of R
>    put length(C) + 1 into lenC
>    put length(R) into lenR
>    if lenC = lenR then return 0
>    return char 1 to lenR - lenC - 1 of R
> end offsetList
> ------------------------------
>
> Kind regards
> Bernd
>
> >
> > Date: Thu, 1 Nov 2018 00:15:37 -0700
> > From: Geoff Canyon
> > To: How to use LiveCode <use-livecode at lists.runrev.com>
> > Subject: Re: How to find the offset of the last instance of a
> >       repeating       character in a string?
> >
> > I was curious if using the itemDelimiter might work for this, so I wrote
> > the code below out of curiosity; but in my quick testing with single-byte
> > characters it was only about 30% faster than the above methods, so I
> > didn't bother to post it.
> >
> > But Ben Rubinstein just posted about a terrible slow-down doing pretty
> > much this same thing for text with unicode characters. So I ran a simple
> > test: with 8000-character strings that start with a single unicode
> > character, this is about 15x faster than offset() with skip. For
> > 100,000-character lines it's about 300x faster, so it seems to be immune
> > to the line-painter issues skip is subject to. So for what it's worth:
> >
> > function offsetList D,S
> >   -- returns a comma-delimited list of the offsets of D in S
> >   set the itemDel to D
> >   repeat for each item i in S
> >      add length(i) + 1 to C
> >      put C,"" after R
> >   end repeat
> >   set the itemDel to comma
> >   if char -1 of S is D then return char 1 to -2 of R
> >   put length(C) + 1 into lenC
> >   put length(R) into lenR
> >   if lenC = lenR then return 0
> >   return char 1 to lenR - lenC - 1 of R
> > end offsetList
> >
>
>