How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

Geoff Canyon gcanyon at gmail.com
Fri Nov 2 20:43:38 EDT 2018


I like that, changing it. Now available at
https://github.com/gcanyon/alloffsets

One thing I don't see how to do without significantly impacting performance
is to return all offsets when the matches overlap. For example:

allOffsets("aba","abababa")

would return 1,5, when it might be reasonable to expect it to return 1,3,5.
Using the offset function with numToSkip would make that easy; adapting
allOffsets to do so cleanly would be harder, I think.
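
For illustration, here is a minimal sketch of the offset-with-skip approach.
The handler name allOffsetsOverlapping is hypothetical and is not part of the
alloffsets repository. It assumes that offset() returns a position relative
to the skipped characters, and it gives up the itemDelimiter speed-up, so it
will likely be slower on long or unicode text.

```livecode
function allOffsetsOverlapping pFind, pString
   -- hypothetical helper: returns a comma-delimited list of every offset
   -- of pFind in pString, including overlapping matches; 0 if none
   local tSkip, tPos, tResult
   put 0 into tSkip
   repeat forever
      -- tPos is relative to the characters skipped so far
      put offset(pFind, pString, tSkip) into tPos
      if tPos is 0 then exit repeat
      -- convert to an absolute position; skipping exactly that many chars
      -- makes the next search start one char after this match began
      add tPos to tSkip
      put tSkip & comma after tResult
   end repeat
   if tResult is empty then return 0
   return char 1 to -2 of tResult
end allOffsetsOverlapping
```

With this sketch, allOffsetsOverlapping("aba","abababa") returns 1,3,5.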

gc

On Fri, Nov 2, 2018 at 12:17 PM Bob Sneidar via use-livecode <
use-livecode at lists.runrev.com> wrote:

> how about allOffsets?
>
> Bob S
>
>
> > On Nov 2, 2018, at 09:16 , Geoff Canyon via use-livecode <
> > use-livecode at lists.runrev.com> wrote:
> >
> > All of those return a single value; I wanted to convey the concept of
> > returning multiple values. To me listOffset implies it does the same thing
> > as itemOffset, since items come in a list. How about:
> >
> > offsets -- not my favorite because it's almost indistinguishable from offset
> > offsetsOf -- seems a tad clumsy
> >
> > On Fri, Nov 2, 2018 at 7:41 AM Bob Sneidar via use-livecode <
> > use-livecode at lists.runrev.com> wrote:
> >
> >> It probably should be named listOffset, like itemOffset or lineOffset.
> >>
> >> Bob S
> >>
> >>
> >>> On Nov 1, 2018, at 17:04 , Geoff Canyon via use-livecode <
> >>> use-livecode at lists.runrev.com> wrote:
> >>>
> >>> Nice! I *just* finished creating a github repository for it, and adding
> >>> support for multi-char search strings, much as you did. I was coming to the
> >>> list to post the update when I saw your post.
> >>>
> >>> Here's the GitHub link: https://github.com/gcanyon/offsetlist
> >>>
> >>> Here's my updated version:
> >>>
> >>> function offsetList D,S,pCase
> >>>  -- returns a comma-delimited list of the offsets of D in S
> >>>  set the caseSensitive to pCase is true
> >>>  set the itemDel to D
> >>>  put length(D) into dLength
> >>>  put 1 - dLength into C
> >>>  repeat for each item i in S
> >>>     add length(i) + dLength to C
> >>>     put C,"" after R
> >>>  end repeat
> >>>  set the itemDel to comma
> >>>  if char -dLength to -1 of S is D then return char 1 to -2 of R
> >>>  put length(C) + 1 into lenC
> >>>  put length(R) into lenR
> >>>  if lenC = lenR then return 0
> >>>  return char 1 to lenR - lenC - 1 of R
> >>> end offsetList
> >>>
> >>> On Thu, Nov 1, 2018 at 8:28 AM Niggemann, Bernd via use-livecode <
> >>> use-livecode at lists.runrev.com> wrote:
> >>>
> >>>> Hi Geoff,
> >>>>
> >>>> thank you for this beautiful script.
> >>>>
> >>>> I modified it a bit to accept multi-character search string and also for
> >>>> case sensitivity.
> >>>>
> >>>> It definitely is a lot faster for unicode text than anything I have seen.
> >>>>
> >>>> -----------------------------
> >>>> function offsetList D,S, pCase
> >>>>  -- returns a comma-delimited list of the offsets of D in S
> >>>>  -- pCase is a boolean for caseSensitive
> >>>>  set the caseSensitive to pCase is true
> >>>>  set the itemDel to D
> >>>>  put the length of D into tDelimLength
> >>>>  repeat for each item i in S
> >>>>     add length(i) + tDelimLength to C
> >>>>     put C - (tDelimLength - 1),"" after R
> >>>>  end repeat
> >>>>  set the itemDel to comma
> >>>>  if char -tDelimLength to -1 of S is D then return char 1 to -2 of R
> >>>>  put length(C) + 1 into lenC
> >>>>  put length(R) into lenR
> >>>>  if lenC = lenR then return 0
> >>>>  return char 1 to lenR - lenC - 1 of R
> >>>> end offsetList
> >>>> ------------------------------
> >>>>
> >>>> Kind regards
> >>>> Bernd
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>>
> >>>>> Date: Thu, 1 Nov 2018 00:15:37 -0700
> >>>>> From: Geoff Canyon
> >>>>> To: How to use LiveCode <use-livecode at lists.runrev.com>
> >>>>> Subject: Re: How to find the offset of the last instance of a repeating character in a string?
> >>>>>
> >>>>> I was curious whether using the itemDelimiter might work for this, so I
> >>>>> wrote the code below; but in my quick testing with single-byte characters
> >>>>> it was only about 30% faster than the above methods, so I didn't bother
> >>>>> to post it.
> >>>>>
> >>>>> But Ben Rubinstein just posted about a terrible slow-down doing pretty
> >>>>> much this same thing for text with unicode characters. So I ran a simple
> >>>>> test: with 8000-character strings that start with a single unicode
> >>>>> character, this is about 15x faster than offset() with skip. For
> >>>>> 100,000-character lines it's about 300x faster, so it seems to be immune
> >>>>> to the line-painter issues skip is subject to. So for what it's worth:
> >>>>>
> >>>>> function offsetList D,S
> >>>>> -- returns a comma-delimited list of the offsets of D in S
> >>>>> set the itemDel to D
> >>>>> repeat for each item i in S
> >>>>>    add length(i) + 1 to C
> >>>>>    put C,"" after R
> >>>>> end repeat
> >>>>> set the itemDel to comma
> >>>>> if char -1 of S is D then return char 1 to -2 of R
> >>>>> put length(C) + 1 into lenC
> >>>>> put length(R) into lenR
> >>>>> if lenC = lenR then return 0
> >>>>> return char 1 to lenR - lenC - 1 of R
> >>>>> end offsetList
> >>>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> use-livecode mailing list
> >>>> use-livecode at lists.runrev.com
> >>>> Please visit this url to subscribe, unsubscribe and manage your
> >>>> subscription preferences:
> >>>> http://lists.runrev.com/mailman/listinfo/use-livecode


