charIndex property
Paul Dupuis
paul at researchware.com
Mon Jul 31 12:08:08 EDT 2023
I have no idea why pasting placed *'s all over the place!
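
For anyone who wants to try the quoted function, here is a minimal usage sketch. It assumes the handler is placed in the same script as rwCharIndex (the function is declared private), and the names tText, tFirst and tLast plus the choice of a mouseUp handler are only for illustration. It uses the sample sentence from earlier in the thread, where [lazy] is the 8th word.

```livecode
-- Illustrative only: must live in the same script as the private
-- function rwCharIndex quoted in this message.
on mouseUp
   local tText, tFirst, tLast
   -- Build the sample sentence from the thread; quote is the LiveCode
   -- constant for a double-quote character.
   put "The quick brown fox jumps over the lazy dog. The lazy dog was named" \
         && quote & "Oz" & quote & "." into tText
   -- Character position of the first and last character of trueWord 8
   put rwCharIndex(tText, "words", 8, "first") into tFirst
   put rwCharIndex(tText, "words", 8, "last") into tLast
   answer "first:" && tFirst & return & "last:" && tLast
end mouseUp
```

Going by the figures in the quoted thread, I would expect 36 for "first" (the [l] in [lazy]) and, since [lazy] is four characters long, 39 for "last".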
On 7/31/2023 11:54 AM, Paul Dupuis via use-livecode wrote:
> Bob,
>
> Here is a version of Mark's method, for trueWords, sentences, and
> paragraphs, with the added parameter of pDirection to get the char
> index of the start of the chunk or the end of the chunk containing the
> character position pChunkIndex.
>
> private function rwCharIndex pText, pChunkType, pChunkIndex, pDirection
>    -- pText is the full text
>    -- pChunkType is one of: words|sentences|paragraphs
>    -- pChunkIndex is the integer index in the indicated units, i.e.
>    -- "word",7 is the 7th word
>    -- pDirection is one of: first|last, meaning either the 1st character
>    -- of the chunk or the last character
>    -- error checking: empty is returned if an error occurs with the
>    -- parameters
>    if pText is empty then return empty
>    if pChunkType is not among the items of "words,sentences,paragraphs" then return empty
>    if pChunkIndex is not an integer then return empty
>    if pDirection is not among the items of "first,last" then return empty
>    local tL, N
>    switch pChunkType
>       case "words"
>          switch pDirection
>             case "first"
>                put null into trueWord pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into N
>                delete codeunit N to -1 of pText
>                return (the number of chars in pText + 1)
>                break
>             case "last"
>                put length(trueWord pChunkIndex of pText) into tL
>                put null into trueWord pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into N
>                delete codeunit N to -1 of pText
>                return (the number of characters in pText + tL)
>                break
>          end switch
>          break
>       case "sentences"
>          switch pDirection
>             case "first"
>                put null into sentence pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into N
>                delete codeunit N to -1 of pText
>                return (the number of chars in pText + 1)
>                break
>             case "last"
>                put length(sentence pChunkIndex of pText) into tL
>                put null into sentence pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into N
>                delete codeunit N to -1 of pText
>                return (the number of characters in pText + tL)
>                break
>          end switch
>          break
>       case "paragraphs"
>          switch pDirection
>             case "first"
>                put null into paragraph pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into N
>                delete codeunit N to -1 of pText
>                return (the number of chars in pText + 1)
>                break
>             case "last"
>                put length(paragraph pChunkIndex of pText) into tL
>                put null into paragraph pChunkIndex to -1 of pText
>                put codeunitOffset(null,pText) into N
>                delete codeunit N to -1 of pText
>                return (the number of characters in pText + tL)
>                break
>          end switch
>          break
>    end switch
> end rwCharIndex
>
>
>
>
> On 7/31/2023 11:44 AM, Bob Sneidar via use-livecode wrote:
>> I replaced the code in the original function with this code and it
>> won’t compile.
>>
>> Do you mind posting the full working function again?
>>
>> Bob S
>>
>>
>>> On Jul 27, 2023, at 2:06 PM, Mark Waddingham via use-livecode
>>> <use-livecode at lists.runrev.com> wrote:
>>>
>>> Oh those pesky chunks which don’t ‘cover’ the target string (which
>>> is actually all of them except codeunit/point/char come to think of
>>> it). I should have run through a few more examples in my head before
>>> posting….
>>>
>>> Alternative attempt:
>>>
>>> Put null into word N to -1 of S
>>> Delete codeunit (codeunitoffset(null, S)) to -1 of S
>>> Return the number of chars in S + 1
>>>
>>> The problem before was the chars which do not form part of the last
>>> chunk and remain after deletion.
>>>
>>> The above puts in a sentinel char which can be searched for to find
>>> where the requested chunk started.
>>>
>>> Second time lucky? ;)
>>>
>>> Mark.
>>>
>>> Sent from my iPhone
>>>
>>>> On 27 Jul 2023, at 21:23, Paul Dupuis via use-livecode
>>>> <use-livecode at lists.runrev.com> wrote:
>>>>
>>>> On 7/27/2023 4:31 AM, Mark Waddingham via use-livecode wrote:
>>>>>> On 2023-07-26 18:02, Paul Dupuis via use-livecode wrote:
>>>>>> If I have some text in a field, I can use the "charIndex"
>>>>>> property (see Dictionary) to obtain the character position of the
>>>>>> first character of a chunk.
>>>>>>
>>>>>> Does anyone know of a clever way to do the equivalent of the
>>>>>> charIndex for an arbitrary chunk expression for a
>>>>>> container/variable (i.e. not an actual field object)?
>>>>> This should work I think:
>>>>>
>>>>> function charIndexOfWord pWordIndex, pTarget
>>>>> delete word pWordIndex to -1 of pTarget
>>>>> return the number of characters in pTarget + 1
>>>>> end charIndexOfWord
>>>>>
>>>>> Deletion of chunks works from the first char that makes up the
>>>>> computed range, so you are left with all the characters which sit
>>>>> before it.
>>>>>
>>>>> The index of the character immediately before the start of the
>>>>> specified word is just the number of characters which sit
>>>>> before it; and so the index of the first char of the specified
>>>>> word (which is what charIndex gives you in a field) is that +1.
>>>>>
>>>>> The above should work for both +ve and -ve indices, and the
>>>>> obvious changes will make it work for other string chunks (i.e.
>>>>> change 'Word' for <chunk>).
>>>>>
>>>> Mark,
>>>>
>>>> Thank you very much. This was a brilliant approach and I should
>>>> have thought of it myself. However, it is not quite an accurate
>>>> substitute for the charIndex property of a field. The following
>>>> example illustrates the issue:
>>>>
>>>> pTarget is [The quick brown fox jumps over the lazy dog. The lazy
>>>> dog was named "Oz".]
>>>> pWordIndex is 8 (having been derived from searching for 'lazy', the
>>>> 8th word)
>>>>
>>>> Using [] to quote strings.
>>>> delete word 8 to -1 of pTarget -- deletes [lazy] to ["Oz"] but not
>>>> the period (.) at the end since it is not considered part of word -1.
>>>> This leaves pTarget as [The quick brown fox jumps over the .]
>>>> The number of characters in pTarget + 1 is actually not the
>>>> position of the [l] in [lazy], which is character 36, but the [a]
>>>> in [azy], character 37, due to the period being left.
>>>>
>>>> There are some similar issues, being off by one or more, with
>>>> sentences and paragraphs in longer text.
>>>>
>>>> Thank you very much for chiming in with a good direction to try.
>>>>
>>>> Paul Dupuis
>>>> Researchware
>>>>
>>>>
>>>> _______________________________________________
>>>> use-livecode mailing list
>>>> use-livecode at lists.runrev.com
>>>> Please visit this url to subscribe, unsubscribe and manage your
>>>> subscription preferences:
>>>> http://lists.runrev.com/mailman/listinfo/use-livecode