a regular expression question, or at least a text manipulation question

Peter Alcibiades palcibiades-first at yahoo.co.uk
Wed Aug 27 16:35:43 EDT 2008


How do you do the following?

I have a series of lines which go like this:

|  [record separator, new record starts]
AAA consectetur adipisicing elit, sed
BBB lorem ipsum
CCC consectetur adipisicing elit, sed
CCC laboris nisi ut aliquip ex ea
DDD ut aliquip ex ea commodo
| [record separator]
AAA adipisicing elit, sed   [new record starts]

| is the record separator.

In the above, it's CCC that is repeated, but it could be any prefix.  Also, the 
repeated CCC line directly follows the first one.  This will always be the case.

I want to go through the file.  When I find a single prefix (like AAA), that 
line should be written to the output file.  When the next line starts with the 
same prefix (as in the CCC case), I want to put both occurrences on the same 
line.  So the desired output would be

AAA consectetur adipisicing elit, sed
BBB lorem ipsum
CCC consectetur adipisicing elit, sed CCC laboris nisi ut aliquip ex ea
DDD ut aliquip ex ea commodo
EOR
AAA adipisicing elit, sed

How do I detect a repetition of that sort and do this? 
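
To make the question concrete, here is a rough sketch of the behaviour I am after, written in Python rather than LiveCode purely for illustration. The function name merge_repeated_prefixes and the separator parameter are made up for this example, and it assumes the prefix is simply the first whitespace-delimited word of each line. It passes separator lines through unchanged instead of rewriting them as EOR.

```python
def merge_repeated_prefixes(lines, separator="|"):
    """Join consecutive lines that share the same first word onto one line.

    Hypothetical helper: assumes each line has already had its line ending
    removed and that the prefix is the first whitespace-delimited word.
    """
    merged = []
    previous_prefix = None
    for line in lines:
        words = line.split()
        # Blank lines and record separators never merge with anything.
        if not words or words[0] == separator:
            merged.append(line)
            previous_prefix = None
            continue
        prefix = words[0]
        if prefix == previous_prefix:
            # Same prefix as the line before: append to that output line.
            merged[-1] = merged[-1] + " " + line
        else:
            merged.append(line)
        previous_prefix = prefix
    return merged


sample = [
    "|",
    "AAA consectetur adipisicing elit, sed",
    "BBB lorem ipsum",
    "CCC consectetur adipisicing elit, sed",
    "CCC laboris nisi ut aliquip ex ea",
    "DDD ut aliquip ex ea commodo",
    "|",
    "AAA adipisicing elit, sed",
]
result = merge_repeated_prefixes(sample)
```

Because each line is only compared with the one immediately before it, this relies on the repeated prefix always being adjacent, which is the case in my data.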

A similar question: if the line is

CCC  adipisicing elit, sed TAB CCC  adipisicing elit, sed

How do you detect the multiple occurrence (I can do this with regex) and then 
write out in place of the above expression (this I don't see how to do) the 
following:

CCC  adipisicing elit, sed CCC  adipisicing elit, sed

Obviously, the pseudo latin is different in each case, so no way to check 
using that.
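
Again only as an illustration of what I mean, and in Python rather than LiveCode: the sketch below assumes that TAB in the example stands for a literal tab character, possibly with spaces around it, and that the prefix is the first word on the line. The names REPEAT and join_repeated are made up for this example. A backreference checks that the word after the tab is the same as the prefix, and the substitution puts a single space where the tab was.

```python
import re

# Group 1 is the prefix (first word), group 2 the text up to the tab.
# The lookahead requires the same prefix, as a whole word, right after the tab.
REPEAT = re.compile(r"^(\S+)(\s.*?)[ ]*\t[ ]*(?=\1(?:\s|$))")


def join_repeated(line):
    """Replace the tab before a repeated prefix with a single space.

    Hypothetical helper: only the first repetition on the line is handled,
    and lines without a repeated prefix are returned unchanged.
    """
    return REPEAT.sub(r"\1\2 ", line)


fixed = join_repeated("CCC  adipisicing elit, sed \t CCC  adipisicing elit, sed")
```

Since the pattern only compares the prefix on either side of the tab, it does not matter that the pseudo latin differs from line to line.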

Peter



More information about the use-livecode mailing list