charIndex property

Mon Jul 31 12:08:08 EDT 2023

I have no idea why pasting placed *'s all over the place!

On 7/31/2023 11:54 AM, Paul Dupuis via use-livecode wrote:
> Bob,
>
> Here is a version of Mark's method, for trueWords, sentences, and 
> paragraphs, with the added parameter of pDirection to get the char 
> index of the start of the chunk or the end of the chunk containing the 
> character position pChunkIndex.
>
> private function rwCharIndex pText, pChunkType, pChunkIndex, pDirection
>    -- pText is the full text
>    -- pChunkType is one of: words|sentences|paragraphs
>    -- pChunkIndex is the integer index in the indicated units,
>    --   i.e. "word",7 is the 7th word
>    -- pDirection is one of: first|last, meaning either the first
>    --   character of the chunk or the last character
>    -- error checking: empty is returned if an error occurs with the
>    --   parameters
>    if pText is empty then return empty
>    if pChunkType is not among the items of "words,sentences,paragraphs" then return empty
>    if pChunkIndex is not an integer then return empty
>    if pDirection is not among the items of "first,last" then return empty
>    -- tN holds the sentinel offset (N in the original paste); it is
>    -- declared so the handler also compiles with strict variable checking
>    local tL, tN
>    switch pChunkType
>       case "words"
>          switch pDirection
>             case "first"
>                put null into trueWord pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into tN
>                delete codeunit tN to -1 of pText
>                return (the number of chars in pText + 1)
>                break
>             case "last"
>                put length(trueWord pChunkIndex of pText) into tL
>                put null into trueWord pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into tN
>                delete codeunit tN to -1 of pText
>                return (the number of characters in pText + tL)
>                break
>          end switch
>          break
>       case "sentences"
>          switch pDirection
>             case "first"
>                put null into sentence pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into tN
>                delete codeunit tN to -1 of pText
>                return (the number of chars in pText + 1)
>                break
>             case "last"
>                put length(sentence pChunkIndex of pText) into tL
>                put null into sentence pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into tN
>                delete codeunit tN to -1 of pText
>                return (the number of characters in pText + tL)
>                break
>          end switch
>          break
>       case "paragraphs"
>          switch pDirection
>             case "first"
>                put null into paragraph pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into tN
>                delete codeunit tN to -1 of pText
>                return (the number of chars in pText + 1)
>                break
>             case "last"
>                put length(paragraph pChunkIndex of pText) into tL
>                put null into paragraph pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into tN
>                delete codeunit tN to -1 of pText
>                return (the number of characters in pText + tL)
>                break
>          end switch
>          break
>    end switch
> end rwCharIndex
>
>
>
>
> On 7/31/2023 11:44 AM, Bob Sneidar via use-livecode wrote:
>> I replaced the code in the original function with this code and it 
>> won’t compile.
>>
>> Do you mind posting the full working function again?
>>
>> Bob S
>>
>>
>>> On Jul 27, 2023, at 2:06 PM, Mark Waddingham via use-livecode 
>>> <use-livecode at lists.runrev.com> wrote:
>>>
>>> Oh those pesky chunks which don’t ‘cover’ the target string (which 
>>> is actually all of them except codeunit/point/char come to think of 
>>> it). I should have run through a few more examples in my head before 
>>> posting….
>>>
>>> Alternative attempt:
>>>
>>> put null into word N to -1 of S
>>> delete codeunit codeunitOffset(null, S) to -1 of S
>>> return the number of chars in S + 1
>>>
>>> The problem before was the chars which do not form part of the last 
>>> chunk and remain after deletion.
>>>
>>> The above puts in a sentinel char which can be searched for to find 
>>> where the requested chunk started.
>>>
>>> Second time lucky? ;)
>>>
>>> Mark.
>>>
>>>> On 27 Jul 2023, at 21:23, Paul Dupuis via use-livecode 
>>>> <use-livecode at lists.runrev.com> wrote:
>>>>
>>>> On 7/27/2023 4:31 AM, Mark Waddingham via use-livecode wrote:
>>>>>> On 2023-07-26 18:02, Paul Dupuis via use-livecode wrote:
>>>>>> If I have some text in a field, I can use the "charIndex" 
>>>>>> property (see Dictionary) to obtain the character position of the 
>>>>>> first character of a chunk.
>>>>>>
>>>>>> Does anyone know of a clever way to do the equivalent of the 
>>>>>> charIndex for an arbitrary chunk expression for a 
>>>>>> container/variable (i.e. not an actual field object)?
>>>>> This should work I think:
>>>>>
>>>>>    function charIndexOfWord pWordIndex, pTarget
>>>>>       delete word pWordIndex to -1 of pTarget
>>>>>       return the number of characters in pTarget + 1
>>>>>    end charIndexOfWord
>>>>>
>>>>> Deletion of chunks works from the first char that makes up the 
>>>>> computed range, so you are left with all the characters which sit 
>>>>> before it.
>>>>>
>>>>> The index of the character immediately before the start of the 
>>>>> specified word is just the number of characters which sit 
>>>>> before it; and so the index of the first char of the specified 
>>>>> word (which is what charIndex gives you in a field) is that +1.
>>>>>
>>>>> The above should work for both +ve and -ve indices, and the 
>>>>> obvious changes will make it work for other string chunks (i.e. 
>>>>> change 'Word' for <chunk>).
>>>>>
>>>> Mark,
>>>>
>>>> Thank you very much. This was a brilliant approach and I should 
>>>> have thought of it myself. However, it is not quite an accurate 
>>>> substitute for the charIndex property of a field. The following 
>>>> example illustrates the issue:
>>>>
>>>> pTarget is [The quick brown fox jumps over the lazy dog. The lazy 
>>>> dog was named "Oz".]
>>>> pWordIndex is 8 (having been derived from searching for 'lazy', the 
>>>> 8th word)
>>>>
>>>> Using [] to quote strings.
>>>> delete word 8 to -1 of pTarget -- deletes [lazy] to ["Oz"] but not 
>>>> the period (.) at the end since it is not considered part of word -1.
>>>> This leaves pTarget as [The quick brown fox jumps over the .]
>>>> The number of characters in pTarget + 1 is actually not the 
>>>> position of the [l] in [lazy], which is character 36, but the [a] 
>>>> in [azy], character 37, due to the period being left.
>>>>
>>>> There are some similar issues, being off by 1 or more, with 
>>>> sentences and paragraphs in longer text.
>>>>
>>>> Thank you very much for chiming in with a good direction to try.
>>>>
>>>> Paul Dupuis
>>>> Researchware
>>>>
>>>>
>>>> _______________________________________________
>>>> use-livecode mailing list
>>>> use-livecode at lists.runrev.com
>>>> Please visit this url to subscribe, unsubscribe and manage your 
>>>> subscription preferences:
>>>> http://lists.runrev.com/mailman/listinfo/use-livecode
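
For reference, the sentinel steps described in the thread (put null over the chunk range, find the null with codeunitOffset, delete from there, count what is left) can be collected into one small handler. This is only a sketch under stated assumptions: the handler name charIndexOfTrueWord and the variable tOffset are made up for illustration, the "return empty when no sentinel is found" guard is an addition not discussed in the thread, and the text is assumed not to contain a null character already.

```livecode
-- Hypothetical helper: the name and the guard are illustrative, not from the thread.
function charIndexOfTrueWord pWordIndex, pTarget
   local tOffset
   -- Overwrite the requested word through the last word with one null sentinel.
   -- pTarget is passed by value, so the caller's text is not modified.
   put null into trueWord pWordIndex to -1 of pTarget
   -- The sentinel now sits where the requested word used to begin.
   put codeunitOffset(null, pTarget) into tOffset
   -- Assumption: if no sentinel is found, report failure as empty.
   if tOffset is 0 then return empty
   -- Drop the sentinel and everything after it, including any trailing
   -- punctuation that was not part of the last word.
   delete codeunit tOffset to -1 of pTarget
   -- What remains is exactly the text that precedes the requested word.
   return the number of chars in pTarget + 1
end charIndexOfTrueWord
```

Called as charIndexOfTrueWord(8, tText) on the "lazy dog" example text from the thread, the value to compare against is 36, the position the thread gives for the [l] in [lazy]. Swapping trueWord for sentence or paragraph gives the other variants.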