Tools & techniques for one-off consolidation of multiple 'similar' CSV files?

Bob Sneidar bobsneidar at iotecdigital.com
Mon Apr 4 18:40:23 EDT 2022


Okay no answer to my question so here is why I ask. SQL databases typically have a limit on how many columns you can have, and how many total bytes a record will take. If it is possible to import each CSV file as a separate table, that would be ideal. If not you need to make yourself aware of those limits, and if you are going to exceed them, then SQL is not the way to go. 

CSV format is OK if the text fields are enclosed in quotes, and the numeric fields can be enclosed in quotes or not, it doesn't matter. If text fields ARE enclosed in quotes, then Excel should be able to open it natively, even if there are commas in the field text, that is not as delimiters. Id the fields are NOT enclosed in quotes, and the field text DOES contain commas, you are essentially screwed. 

Bob S


> On Apr 4, 2022, at 10:37 , Bob Sneidar via use-livecode <use-livecode at lists.runrev.com> wrote:
> 
> Does all the data need to be in a single table? 
> 
> Sent from my iPhone
> 
>> On Apr 4, 2022, at 10:31, Mike Kerner via use-livecode <use-livecode at lists.runrev.com> wrote:
>> 
>> keith,
>> are all the files structured the same way? are they all gathered in the
>> same place?
>> LC's big strength, IMHO, is text handling, so you're right in its
>> wheelhouse.
>> for the simplest example, let's assume that all the files have all the same
>> column layout and they're all in the same folder.
>> in that case, you would
>> * grab the list of files in the folder
>> * exclude files that shouldn't be included - maybe filter everything
>> that doesn't have a .csv suffix. you could do this during the loop, below,
>> or ahead of time.
>> * iterate through the files by reading each one, and appending the contents
>> to a variable/container (if you do this, don't forget to make sure that
>> when you append each file, the last line ends with a line delimiter)
>> * create a new file
>> * save the variable/container to the file
>> 
>> slightly more complicated: the layouts aren't the same
>> you can either:
>> * rearrange the columns upon reading the file or
>> * use something like an sqlite db and create a record for each row,
>> assigning the column in each record based on the column name in the csv
>> file,  or
>> * use a LC array, using the column names in the file as the keys of the
>> array.
>> * create a new file
>> * output the result of whichever of the three solutions you chose to the
>> new file.
>> 
>> both the easy and less-easy scenarios should take, i'm guessing, somewhere
>> between twenty and fifty lines of code, and are easy to implement. if you
>> have a couple hundred files, i think that the LC solution would be much
>> faster and easier to write, test, and run than the drag-and-drop solution.
>> 
>>> On Mon, Apr 4, 2022 at 1:04 PM Keith Clarke via use-livecode <
>>> use-livecode at lists.runrev.com> wrote:
>>> 
>>> Hi folks,
>>> I need to consolidate a couple of hundred CSV files of varying sizes
>>> (dozens to hundreds of rows) and column structures (some shared columns but
>>> many unique), into a single superset dataset that comprises all data from
>>> all files.
>>> 
>>> There are too many files and columns to attempt a manual process, so in
>>> I’ve been trying to achieve this with LiveCode - by iterating through the
>>> files to parse the column header rows into into row template and column
>>> mapping arrays used to build the combined table. However, I'm struggling to
>>> both compile the superset and render the results.
>>> 
>>> I feel I may be designing a very complicated solution for a problem that
>>> has probably been solved before. I wonder if I’m missing a simpler method -
>>> with LiveCode or perhaps with Excel or ‘drag & drop’ into some kind of
>>> self-organising database GUI for SQLite, etc?
>>> 
>>> Thanks in advance for any ideas.
>>> Best,
>>> Keith
>>> _______________________________________________
>>> use-livecode mailing list
>>> use-livecode at lists.runrev.com
>>> Please visit this url to subscribe, unsubscribe and manage your
>>> subscription preferences:
>>> http://lists.runrev.com/mailman/listinfo/use-livecode
>>> 
>> 
>> 
>> -- 
>> On the first day, God created the heavens and the Earth
>> On the second day, God created the oceans.
>> On the third day, God put the animals on hold for a few hours,
>>  and did a little diving.
>> And God said, "This is good."
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode



More information about the use-livecode mailing list