3 megs of text

Sarah Reichelt sarahr at genesearch.com.au
Mon Mar 4 19:04:01 EST 2002


Can you sort the list by date & time? If so, you could try this:

put bigData into dataA
filter dataA with "*text_A*"
-- dataA only contains the lines with "text_A"

put bigData into dataB
filter dataB with "*text_B*"

put dataA & cr & dataB into shortData
sort lines of shortData dateTime by item 1 to 2 of each
-- now you just have the text_A & text_B lines
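
Alternatively, since item 3 of every line is one of text_A, text_B or
text_rubbish, you may be able to skip the sort entirely: filter out the
rubbish lines in a single pass and keep the file's original order. A
minimal sketch, assuming your engine's filter command supports the
"without" form:

put bigData into shortData
filter shortData without "*text_rubbish*"
-- shortData now holds the text_A & text_B lines in their original order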

Then use David's suggestions for the main loop (there's a sketch below):
- use "repeat for each line"
- store the previous line rather than re-finding it each time
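
Here's a rough sketch of that pattern. It assumes gTime (item 2) is
already a plain number such as seconds - if it's a formatted time, use
"convert" on it first. I've also assumed dateList should get the date
of the text_A line; adjust if you want the text_B line's date:

put 0 into spentTime
put empty into dateList
put empty into prevLine
repeat for each line thisLine in shortData
    if prevLine contains "text_A" and thisLine contains "text_B" then
        add (item 2 of thisLine) - (item 2 of prevLine) to spentTime
        put item 1 of prevLine & cr after dateList
    end if
    put thisLine into prevLine
end repeat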

OR use lineOffset to find the next text_A line and check the one after for
text_B

repeat
    get lineOffset("text_A", shortData)
    if it = 0 or shortData is empty then
        -- you're finished
        -- do whatever you want with the info

        exit repeat  -- don't forget this or you'll be stuck in the loop
    end if

    if line it+1 of shortData contains "text_B" then
        -- do your date & time stuff
    end if

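    -- drop everything up to & including this text_A line,
    -- so the next lineOffset search starts after it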
    delete line 1 to it of shortData
end repeat

Note: the lineOffset approach might be fast enough even without the
preliminary filtering.
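
If the repeated deletes turn out to be the slow part (deleting from the
front of a big variable means copying everything after it), you could
try lineOffset's optional third parameter instead and skip over the
lines you've already checked - another sketch:

put 0 into linesToSkip
repeat
    put lineOffset("text_A", shortData, linesToSkip) into foundLine
    if foundLine = 0 then exit repeat
    -- foundLine is relative to the skipped portion,
    -- so this gives the absolute line number of the text_A line
    add foundLine to linesToSkip
    if line (linesToSkip + 1) of shortData contains "text_B" then
        -- do your date & time stuff
    end if
end repeat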

Hope this helps,
Sarah


> I have about 3 MB of text data in one single logfile which I need to parse
> and do some maths for time and date manipulations with. I don't want to go
> through every single line of it - it simply would take far too much time.
> 
> Each line has 3 items: gDate,gTime,"text_A" or "text_B" or "text_rubbish".
> 
> 1.) I need to get rid of every line that contains "text_rubbish" (item 3).
> 2.) If "text_A" in line i is followed by "text_B" in line i+1 then add
> (gTime of line i+1) minus (gTime of line i) to spentTime and 3.) put gDate
> into dateList.
> 
> A 480 KB file took my G3/233 about 18 minutes to do this by using "normal"
> repeat loops with 1 if-then-else statement inside.
> 
> Any ideas to speed this up?
> 
> Thank you very much for any help and best regards,
> 
> Ralf



