How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

Bob Sneidar bobsneidar at iotecdigital.com
Sun Nov 4 22:41:49 EST 2018


Simply advance by 1 from the last match instead of by the length of the search string. If the first iteration finds a match at offset 1, resume the search at character 2 rather than at offset + len(searchString), if you take my meaning.
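
For illustration, here is a rough sketch of that idea using the built-in offset function with its charsToSkip parameter. The handler name allOffsetsOverlapping and the variable names are made up for this example; this is not the code in Geoff's repository, and it gives up the speed of the itemDelimiter approach discussed further down the thread.

```livecode
function allOffsetsOverlapping pNeedle, pHaystack
   -- hypothetical helper: returns a comma-delimited list of every offset
   -- of pNeedle in pHaystack, including overlapping matches, or 0 if none
   put 0 into tSkip
   put empty into tResult
   repeat forever
      -- offset() counts from the end of the skipped characters
      put offset(pNeedle, pHaystack, tSkip) into tFound
      if tFound = 0 then exit repeat
      -- skipped chars + relative offset = absolute position of this match;
      -- skipping exactly that many chars resumes one char past its start
      add tFound to tSkip
      put tSkip & comma after tResult
   end repeat
   if tResult is empty then return 0
   return char 1 to -2 of tResult -- drop the trailing comma
end allOffsetsOverlapping
```

With this, allOffsetsOverlapping("aba","abababa") should give 1,3,5 rather than 1,5. Since it relies on offset with a skip count, expect the slowdown on long Unicode strings that Geoff measured.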

Bob S


> On Nov 2, 2018, at 17:43 , Geoff Canyon via use-livecode <use-livecode at lists.runrev.com> wrote:
> 
> I like that, changing it. Now available at
> https://github.com/gcanyon/alloffsets
> 
> One thing I don't see how to do without significantly impacting performance
> is to return all offsets if there are overlapping strings. For example:
> 
> allOffsets("aba","abababa")
> 
> would return 1,5, when it might be reasonable to expect it to return 1,3,5.
> Using the offset function with numToSkip would make that easy; adapting
> allOffsets to do so would be harder to do cleanly I think.
> 
> gc
> 
> On Fri, Nov 2, 2018 at 12:17 PM Bob Sneidar via use-livecode <
> use-livecode at lists.runrev.com> wrote:
> 
>> how about allOffsets?
>> 
>> Bob S
>> 
>> 
>>> On Nov 2, 2018, at 09:16 , Geoff Canyon via use-livecode <
>>> use-livecode at lists.runrev.com> wrote:
>>> 
>>> All of those return a single value; I wanted to convey the concept of
>>> returning multiple values. To me listOffset implies it does the same thing
>>> as itemOffset, since items come in a list. How about:
>>> 
>>> offsets -- not my favorite because it's almost indistinguishable from offset
>>> offsetsOf -- seems a tad clumsy
>>> 
>>> On Fri, Nov 2, 2018 at 7:41 AM Bob Sneidar via use-livecode <
>>> use-livecode at lists.runrev.com> wrote:
>>> 
>>>> It probably should be named listOffset, like itemOffset or lineOffset.
>>>> 
>>>> Bob S
>>>> 
>>>> 
>>>>> On Nov 1, 2018, at 17:04 , Geoff Canyon via use-livecode <
>>>>> use-livecode at lists.runrev.com> wrote:
>>>>> 
>>>>> Nice! I *just* finished creating a github repository for it, and adding
>>>>> support for multi-char search strings, much as you did. I was coming to the
>>>>> list to post the update when I saw your post.
>>>>> 
>>>>> Here's the GitHub link: https://github.com/gcanyon/offsetlist
>>>>> 
>>>>> Here's my updated version:
>>>>> 
>>>>> function offsetList D,S,pCase
>>>>> -- returns a comma-delimited list of the offsets of D in S
>>>>> set the caseSensitive to pCase is true
>>>>> set the itemDel to D
>>>>> put length(D) into dLength
>>>>> put 1 - dLength into C
>>>>> repeat for each item i in S
>>>>>    add length(i) + dLength to C
>>>>>    put C,"" after R
>>>>> end repeat
>>>>> set the itemDel to comma
>>>>> if char -dLength to -1 of S is D then return char 1 to -2 of R
>>>>> put length(C) + 1 into lenC
>>>>> put length(R) into lenR
>>>>> if lenC = lenR then return 0
>>>>> return char 1 to lenR - lenC - 1 of R
>>>>> end offsetList
>>>>> 
>>>>> On Thu, Nov 1, 2018 at 8:28 AM Niggemann, Bernd via use-livecode <
>>>>> use-livecode at lists.runrev.com> wrote:
>>>>> 
>>>>>> Hi Geoff,
>>>>>> 
>>>>>> thank you for this beautiful script.
>>>>>> 
>>>>>> I modified it a bit to accept a multi-character search string and also for
>>>>>> case sensitivity.
>>>>>> 
>>>>>> It definitely is a lot faster for unicode text than anything I have seen.
>>>>>> 
>>>>>> -----------------------------
>>>>>> function offsetList D,S, pCase
>>>>>> -- returns a comma-delimited list of the offsets of D in S
>>>>>> -- pCase is a boolean for caseSensitive
>>>>>> set the caseSensitive to pCase
>>>>>> set the itemDel to D
>>>>>> put the length of D into tDelimLength
>>>>>> repeat for each item i in S
>>>>>>    add length(i) + tDelimLength to C
>>>>>>    put C - (tDelimLength - 1),"" after R
>>>>>> end repeat
>>>>>> set the itemDel to comma
>>>>>> if char -tDelimLength to -1 of S is D then return char 1 to -2 of R
>>>>>> put length(C) + 1 into lenC
>>>>>> put length(R) into lenR
>>>>>> if lenC = lenR then return 0
>>>>>> return char 1 to lenR - lenC - 1 of R
>>>>>> end offsetList
>>>>>> ------------------------------
>>>>>> 
>>>>>> Kind regards
>>>>>> Bernd
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Date: Thu, 1 Nov 2018 00:15:37 -0700
>>>>>>> From: Geoff Canyon
>>>>>>> To: How to use LiveCode <use-livecode at lists.runrev.com>
>>>>>>> Subject: Re: How to find the offset of the last instance of a repeating character in a string?
>>>>>>> 
>>>>>>> I was curious if using the itemDelimiter might work for this, so I wrote
>>>>>>> the below code out of curiosity; but in my quick testing with single-byte
>>>>>>> characters it was only about 30% faster than the above methods, so I didn't
>>>>>>> bother to post it.
>>>>>>> 
>>>>>>> But Ben Rubinstein just posted about a terrible slow-down doing pretty much
>>>>>>> this same thing for text with unicode characters. So I ran a simple test:
>>>>>>> with 8000-character strings that start with a single unicode character,
>>>>>>> this is about 15x faster than offset() with skip. For 100,000-character
>>>>>>> lines it's about 300x faster, so it seems to be immune to the line-painter
>>>>>>> issues skip is subject to. So for what it's worth:
>>>>>>> 
>>>>>>> function offsetList D,S
>>>>>>> -- returns a comma-delimited list of the offsets of D in S
>>>>>>> set the itemDel to D
>>>>>>> repeat for each item i in S
>>>>>>>   add length(i) + 1 to C
>>>>>>>   put C,"" after R
>>>>>>> end repeat
>>>>>>> set the itemDel to comma
>>>>>>> if char -1 of S is D then return char 1 to -2 of R
>>>>>>> put length(C) + 1 into lenC
>>>>>>> put length(R) into lenR
>>>>>>> if lenC = lenR then return 0
>>>>>>> return char 1 to lenR - lenC - 1 of R
>>>>>>> end offsetList
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> use-livecode mailing list
>>>>>> use-livecode at lists.runrev.com
>>>>>> Please visit this url to subscribe, unsubscribe and manage your
>>>>>> subscription preferences:
>>>>>> http://lists.runrev.com/mailman/listinfo/use-livecode
>>>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 
>> 




