Need Help With String Pattern Matching

Gregory Lypny gregory.lypny at videotron.ca
Sat Jun 11 15:48:00 EDT 2016


Hello everyone,

I’ve just come back to LiveCode and I'm a little rusty. I used to do some basic text analysis of files where the lines containing strings of interest were consistent and therefore easy to spot. I am now working on files where the chunk of text that contains the data I want is more ambiguous. I figure I should be using matchChunk and was wondering if anyone might give me some tips on how to do the following. The chunk that I want to extract will have a certain word or phrase near its start and a certain word or phrase near its end. There may be many such chunks in the document, but the best candidate contains certain other strings. Here’s an example:

The chunk starts with the word *owner* or the phrase *beneficial owner*.

The chunk ends with *all directors* or *less than one percent*.

The chunk contains all of the following:
- At least four or five big numbers, e.g., 234,879
- At least two percentages, e.g., 3.4%, or percentage signs

If you are curious, this would more or less identify an ownership table in a proxy statement filed with the Securities and Exchange Commission. These are archived at the SEC in text and HTML, with filings going back to about 1994.
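
To make the criteria listed above concrete, here is a rough sketch of the logic in Python rather than LiveCode, purely as an illustration and not a tested solution for real filings. The function names (candidate_chunks, best_chunk), the threshold of four big numbers, the definition of a "big number" as digits with comma-separated thousands, and the rule of preferring the shortest qualifying chunk are all assumptions of mine.

```python
import re

# Start and end markers taken from the example; matching is case-insensitive.
START = re.compile(r"\b(?:beneficial\s+)?owner\b", re.IGNORECASE)
END = re.compile(r"all\s+directors|less\s+than\s+one\s+percent", re.IGNORECASE)

# Assumption: a "big number" has comma-separated thousands, e.g. 234,879.
BIG_NUMBER = re.compile(r"\b\d{1,3}(?:,\d{3})+\b")
PERCENT_SIGN = re.compile(r"%")


def candidate_chunks(text):
    """Return every span running from a start marker to the next end marker."""
    chunks = []
    for start in START.finditer(text):
        end = END.search(text, start.end())
        if end:
            chunks.append(text[start.start():end.end()])
    return chunks


def best_chunk(text, min_numbers=4, min_percents=2):
    """Return the shortest chunk meeting both count thresholds, or None."""
    qualifying = [
        chunk
        for chunk in candidate_chunks(text)
        if len(BIG_NUMBER.findall(chunk)) >= min_numbers
        and len(PERCENT_SIGN.findall(chunk)) >= min_percents
    ]
    # Assumption: the tightest qualifying chunk is the most likely table.
    return min(qualifying, key=len) if qualifying else None
```

Calling best_chunk on the full text of a filing would return the candidate table as a string, or None when no chunk passes both counts. The same two-step idea (find the candidate spans first, then count the numbers and percentage signs inside each one) should carry over to other tools, though I have not tried it that way.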

Any tips or examples would be much appreciated.

Regards,

Gregory






