process data in steps of 50 lines

Matthias Rebbe matthias_livecode at me.com
Tue Jun 7 17:56:51 EDT 2011


Hi Mark,

many thanks for that. I will check it tomorrow. It's late here and I am afraid if I look into it now I will get no sleep. ;)

Regards,

Matthias
On 07.06.2011 at 23:46, Mark Talluto wrote:

> Hi Matthias,
> 
> The code below is based on a locally saved text file that is vertical bar delimited.
> 
> The code counts a particular value in each customer record. The true branch of the if statement counts the entire database; the else branch counts only the customers that are currently being viewed, a smaller subset of the same database.
> 
> on countLicenses
>      put switchMode() into tCurrentDatabase
> 
>      if fld "search" is empty then
>            --COUNT ENTIRE DATABASE
>            --SORT PUTTING ALL THE NON NUMERIC COUNTS AT THE TOP
>            set the itemdel to "|"
>            sort lines of tCurrentDatabase numeric ascending by item 16 of each
> 
>            --REMOVE THE LINES THAT DO NOT HAVE A NUMERIC VALUE
>            repeat until tValue = "done"
>                  if tCurrentDatabase is empty then exit repeat --GUARD: AVOID AN ENDLESS LOOP IF EVERY LINE WAS DELETED
>                  put item 16 of line 1 of tCurrentDatabase into tValue
>                  if tValue < 1 then delete line 1 of tCurrentDatabase else put "done" into tValue
>            end repeat
>            put the num of lines of tCurrentDatabase into fld "visible licenses"
>      else
> 
>            --CREATE LIST OF IDS TO FOCUS ON FROM THE USERS FIELD
>            put fld "users" into tCurrentUsersDatabase
>            set the itemdel to tab
> 
>            repeat for each line thisLine in tCurrentUsersDatabase
>                  put item 1 of thisLine & lf after tIDs
>            end repeat
>            if the last char of tIDs is LF then delete the last char of tIDs --REMOVES TRAILING LF
> 
>            --CHUNK THE DATABASE INTO SMALLER GROUPS
>            put 600 into tChunkSize
>            put the num of lines of tCurrentDatabase into tTotalLines
>            put 1 into x
>            put trunc(tTotalLines/tChunkSize) + 1 into tIterations --TRUNC KEEPS THE REPEAT COUNT A WHOLE NUMBER
> 
>            repeat tIterations
>                  put line 1 to tChunkSize of tCurrentDatabase into aDatabase[x]
>                  delete line 1 to tChunkSize of tCurrentDatabase
>                  add 1 to x
>            end repeat
> 
>            --WALK THROUGH EACH LINE AND COUNT THE LICENSES
>            put 0 into tCount
>            put 1 into x
>            set the itemDel to "|"
> 
>            repeat tIterations
>                  repeat tChunkSize
>                        --GET LINE TO WORK ON
>                        put line 1 of aDatabase[x] into thisLine
>                        if thisLine is empty then exit repeat --CHECK FOR AN EMPTY CHUNK BEFORE COMPARING IDS
>                        if item 1 of thisLine <> line 1 of tIDs then
>                              delete line 1 of aDatabase[x] --delete line from memory to make next check faster
>                              next repeat
>                        end if
>                        delete line 1 of aDatabase[x] --delete line from memory to make next check faster
>                        delete line 1 of tIDs --delete line from memory to make next check faster
> 
>                        --COUNT LICENSES
>                        add item 16 of thisLine to tCount
>                  end repeat
>                  add 1 to x
>            end repeat
>            put tCount into fld "visible licenses"
>      end if
> end countLicenses
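[Editor's note: the else branch above relies on tIDs and tCurrentDatabase listing the customers in the same relative order, so each chunk can be consumed front-to-front against the ID list. A minimal Python sketch of that walk (function and parameter names are illustrative, not from the handler):]

```python
def count_visible_licenses(database_lines, visible_ids, value_item=15):
    """Walk the database and the ID list front to front; both are
    assumed to list the customers in the same relative order."""
    total = 0
    ids = list(visible_ids)
    for line in database_lines:
        if not ids:
            break                 # every visible customer was found
        fields = line.split("|")  # vertical-bar delimited record
        if fields[0] != ids[0]:
            continue              # not a currently viewed customer
        ids.pop(0)                # consume the matched ID
        total += int(fields[value_item])
    return total
```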
> 
> 
> Best regards,
> 
> Mark Talluto
> http://www.canelasoftware.com
> 
> 
> 
> On Jun 7, 2011, at 2:07 PM, Matthias Rebbe wrote:
> 
>> Hi Mark,
>> 
>> could you please explain how you did that? How do you chunk the data into groups?
>> 
>> That is not clear to me.
>> 
>> Regards,
>> 
>> Matthias
>> On 07.06.2011 at 20:46, Mark Talluto wrote:
>> 
>>> On Jun 7, 2011, at 6:27 AM, Mark Schonewille wrote:
>>> 
>>>> Since the data is already in memory, there is no reason to process it in steps of 50. Also, using repeat with x = ... is very slow. Use repeat for each with a counter instead:
>>> 
>>> 
>>> I believe there is a major speed benefit to chunking the data into smaller groups.  We optimized a data processing app with this technique and brought tasks from minutes of processing down to milliseconds.
>>> 
>>> The technique chunks the data into groups of 600 lines. Then you use a basic repeat, with no "with" or "for each". On each pass you read just the first line of the chunk and then delete that line, so every subsequent lookup scans a shorter and shorter string. You will see a major improvement in speed.
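[Editor's note: a sketch of that chunking pattern in Python. The 600-line chunk size comes from Mark's description; the vertical-bar fields and the `value_item` position are assumptions based on the countLicenses handler earlier in the thread:]

```python
def count_licenses(database, chunk_size=600, value_item=15):
    """Count item 16 (index 15) of every record, chunking the data
    into smaller groups and always consuming the first line."""
    lines = database.splitlines()
    # Chunk the database into groups of chunk_size lines.
    chunks = [lines[i:i + chunk_size]
              for i in range(0, len(lines), chunk_size)]
    total = 0
    for chunk in chunks:
        while chunk:
            # Read just the first line, then delete it, so the next
            # read works on an ever-shorter chunk.
            fields = chunk.pop(0).split("|")
            if fields[value_item].isdigit():
                total += int(fields[value_item])
    return total
```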
>>> 
>>> 
>>> Best regards,
>>> 
>>> Mark Talluto
>>> http://www.canelasoftware.com
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> use-livecode mailing list
>>> use-livecode at lists.runrev.com
>>> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
>>> http://lists.runrev.com/mailman/listinfo/use-livecode
>> 
>> 
> 
> 
