process data in steps of 50 lines

Mark Talluto userev at canelasoftware.com
Tue Jun 7 17:46:54 EDT 2011


Hi Matthias,

The code below works on a locally saved text file that is vertical bar delimited.

The code counts a particular value (item 16) in each customer record.  The true branch of the if handles the entire database.  The else branch handles only the customers currently being viewed, which are a smaller subset of the same database.

on countLicenses
      put switchMode() into tCurrentDatabase
      
      if fld "search" is empty then
            --COUNT ENTIRE DATABASE
            --SORT PUTTING ALL THE NON NUMERIC COUNTS AT THE TOP
            set the itemdel to "|"
            sort lines of tCurrentDatabase numeric ascending by item 16 of each
             
            --REMOVE THE LINES THAT DO NOT HAVE A NUMERIC VALUE
            repeat until tValue = "done"
                  if tCurrentDatabase is empty then exit repeat --AVOIDS AN ENDLESS LOOP WHEN NO LINE IS LEFT
                  put item 16 of line 1 of tCurrentDatabase into tValue
                  if tValue < 1 then delete line 1 of tCurrentDatabase else put "done" into tValue
            end repeat
            put the num of lines of tCurrentDatabase into fld "visible licenses"
      else
        
            --CREATE LIST OF IDS TO FOCUS ON FROM THE USERS FIELD
            put fld "users" into tCurrentUsersDatabase
            set the itemdel to tab
             
            repeat for each line thisLine in tCurrentUsersDatabase
                  put item 1 of thisLine & lf after tIDs
            end repeat
            if the last char of tIDs is LF then delete the last char of tIDs --REMOVES TRAILING LF
             
            --CHUNK THE DATABASE INTO SMALLER GROUPS
            put 600 into tChunkSize
            put the num of lines of tCurrentDatabase into tTotalLines
            put 1 into x
            --ROUND UP TO A WHOLE NUMBER OF CHUNKS SO THE LAST PARTIAL CHUNK IS INCLUDED
            put ((tTotalLines + tChunkSize - 1) div tChunkSize) into tIterations
             
            repeat tIterations
                  put line 1 to tChunkSize of tCurrentDatabase into aDatabase[x]
                  delete line 1 to tChunkSize of tCurrentDatabase
                  add 1 to x
            end repeat
             
            --WALK THROUGH EACH LINE AND COUNT THE LICENSES
            put 0 into tCount
            put 1 into x
            set the itemDel to "|"
             
            repeat tIterations
                  repeat tChunkSize
                        --GET LINE TO WORK ON
                        if aDatabase[x] is empty then exit repeat --chunk is used up, move on to the next one
                        put line 1 of aDatabase[x] into thisLine
                        if item 1 of thisLine <> line 1 of tIDs then
                              delete line 1 of aDatabase[x] --delete line from memory to make next check faster
                              next repeat
                        end if
                        if thisLine is empty then exit repeat
                        delete line 1 of aDatabase[x] --delete line from memory to make next check faster
                        delete line 1 of tIDs --delete line from memory to make next check faster
                         
                        --COUNT LICENSES
                        add item 16 of thisLine to tCount
                  end repeat
                  add 1 to x
            end repeat
            put tCount into fld "visible licenses"
      end if
end countLicenses
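
The chunking step on its own could be sketched as a small function, in case that makes it easier to follow. This is only an illustration and not the code from our app. The names chunkLines and sumLicenses are made up for this example, and it assumes pChunkSize is at least 1 and that the value to total is in item 16 of a vertical bar delimited line. It has not been timed, so it says nothing about speed by itself.

```livecode
-- Hypothetical helper: splits pData into groups of pChunkSize lines.
-- Returns an array keyed 1..n with one group of lines per key.
-- Assumes pChunkSize is 1 or more.
function chunkLines pData, pChunkSize
      local tChunks, x
      put 0 into x
      repeat until pData is empty
            add 1 to x
            put line 1 to pChunkSize of pData into tChunks[x]
            delete line 1 to pChunkSize of pData
      end repeat
      return tChunks
end chunkLines

-- Hypothetical usage: totals item 16 of every line, one group at a time.
-- Each pass reads line 1 of the current group and then deletes it.
function sumLicenses pData
      local tChunks, tGroupCount, tTotal, tLine, x
      put chunkLines(pData, 600) into tChunks
      put the number of lines of (the keys of tChunks) into tGroupCount
      set the itemDel to "|"
      put 0 into tTotal
      repeat with x = 1 to tGroupCount
            repeat until tChunks[x] is empty
                  put line 1 of tChunks[x] into tLine
                  delete line 1 of tChunks[x]
                  -- skip records whose item 16 is not numeric
                  if item 16 of tLine is a number then add item 16 of tLine to tTotal
            end repeat
      end repeat
      return tTotal
end sumLicenses
```

You would call it with something like: put sumLicenses(tCurrentDatabase) into fld "visible licenses". Unlike the handler above, this sketch does not filter by the IDs in the users field.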


Best regards,

Mark Talluto
http://www.canelasoftware.com



On Jun 7, 2011, at 2:07 PM, Matthias Rebbe wrote:

> Hi Mark,
> 
> could you please explain how you did that? How do you chunk the data into groups?
> 
> That is not clear to me.
> 
> Regards,
> 
> Matthias
> On 07.06.2011 at 20:46, Mark Talluto wrote:
> 
>> On Jun 7, 2011, at 6:27 AM, Mark Schonewille wrote:
>> 
>>> Since the data is already in memory, there is no reason to process it in steps of 50. Also, using repeat with x =... is very slow. Use repeat for each with a counter instead:
>> 
>> 
>> I believe there is a major speed benefit to chunking the data into smaller groups.  We optimized a data processing app with this technique and brought tasks from minutes of processing down to milliseconds.
>> 
>> The technique chunks the data into groups of 600 lines.  Then you use a basic repeat, not the "with" or "for each" forms.  On each pass you read just the first line of the current group and then delete that line from the group.  You will see a major improvement in speed.
>> 
>> 
>> Best regards,
>> 
>> Mark Talluto
>> http://www.canelasoftware.com
>> 
>> 
>> 
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
> 
> 




