3 megs of text
Sarah Reichelt
sarahr at genesearch.com.au
Mon Mar 4 19:04:01 EST 2002
Can you sort the list by date & time? If so, you could try this:
put bigData into dataA
filter dataA with "*text_A*"
-- dataA only contains the lines with "text_A"
put bigData into dataB
filter dataB with "*text_B*"
put dataA & cr & dataB into shortData
sort lines of shortData dateTime by item 1 to 2 of each
-- now you just have the text_A & text_B lines
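With made-up log lines like these (the dates, times & format are only
illustrative), shortData would then hold just the text_A & text_B lines,
sorted by date & time:

04/03/2002,09:15:00,text_A
04/03/2002,09:17:30,text_B
04/03/2002,11:02:00,text_A
04/03/2002,11:05:45,text_B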
Then use David's suggestions for the main loop (a sketch follows):
- use repeat for each line
- store the previous line rather than re-finding it each time
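Something like this might work for the "repeat for each" version (a sketch
only: it assumes gDate & gTime are in formats that convert understands, so
the times can be turned into seconds and subtracted):

put 0 into spentTime
put empty into dateList
put empty into prevLine
repeat for each line thisLine of shortData
  if prevLine contains "text_A" and thisLine contains "text_B" then
    -- build "date time" stamps & convert to seconds for the subtraction
    put item 1 of prevLine && item 2 of prevLine into startStamp
    put item 1 of thisLine && item 2 of thisLine into endStamp
    convert startStamp to seconds
    convert endStamp to seconds
    add endStamp - startStamp to spentTime
    put item 1 of thisLine & cr after dateList
  end if
  put thisLine into prevLine
end repeat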
OR use lineOffset to find the next text_A line and check the one after it for
text_B:
repeat
  get lineOffset("text_A", shortData)
  if it = 0 or shortData is empty then
    -- you're finished
    -- do whatever you want with the info
    exit repeat -- don't forget this or you'll be stuck in the loop
  end if
  if line it+1 of shortData contains "text_B" then
    -- do your date & time stuff (see the sketch after this loop)
  end if
  delete line 1 to it of shortData
end repeat
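For the "date & time stuff", the body of that if could look something like
this (again a sketch: it assumes convert can parse your gDate && gTime
combination, and that spentTime starts at 0 and dateList starts empty):

  put item 1 of line it of shortData && item 2 of line it of shortData into startStamp
  put item 1 of line (it+1) of shortData && item 2 of line (it+1) of shortData into endStamp
  convert startStamp to seconds
  convert endStamp to seconds
  add endStamp - startStamp to spentTime
  put item 1 of line (it+1) of shortData & cr after dateList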
Note: the lineOffset approach might be fast enough even without the
preliminary filtering.
Hope this helps,
Sarah
> I have about 3 MB of text data in one single logfile which I need to parse
> and do some date & time maths on. I don't want to go through every single
> line of it - it would simply take far too much time.
>
> Each line has 3 items: gDate,gTime,"text_A" or "text_B" or "text_rubbish".
>
> 1.) I need to get rid of every line that contains "text_rubbish" (item 3).
> 2.) If "text_A" in line i is followed by "text_B" in line i+1, then add
> (gTime of line i+1) minus (gTime of line i) to spentTime.
> 3.) Put gDate into dateList.
>
> A 480 KB file took my G3/233 about 18 minutes to do this, using "normal"
> repeat loops with one if-then-else statement inside.
>
> Any ideas to speed this up?
>
> Thank you very much for any help and best regards,
>
> Ralf