Tools & techniques for one-off consolidation of multiple 'similar' CSV files?

Mike Kerner MikeKerner at roadrunner.com
Mon Apr 4 13:29:23 EDT 2022


keith,
are all the files structured the same way? are they all gathered in the
same place?
LC's big strength, IMHO, is text handling, so you're right in its
wheelhouse.
for the simplest example, let's assume that all the files have all the same
column layout and they're all in the same folder.
in that case, you would
* grab the list of files in the folder
* exclude files that shouldn't be included - maybe filter everything
that doesn't have a .csv suffix. you could do this during the loop, below,
or ahead of time.
* iterate through the files by reading each one, and appending the contents
to a variable/container (if you do this, don't forget to make sure that
when you append each file, the last line ends with a line delimiter)
* create a new file
* save the variable/container to the file

slightly more complicated: the layouts aren't the same
you can either:
* rearrange the columns upon reading the file or
* use something like an sqlite db and create a record for each row,
assigning the column in each record based on the column name in the csv
file,  or
* use a LC array, using the column names in the file as the keys of the
array.
* create a new file
* output the result of whichever of the three solutions you chose to the
new file.

both the easy and less-easy scenarios should take, i'm guessing, somewhere
between twenty and fifty lines of code, and are easy to implement. if you
have a couple hundred files, i think that the LC solution would be much
faster and easier to write, test, and run than the drag-and-drop solution.

On Mon, Apr 4, 2022 at 1:04 PM Keith Clarke via use-livecode <
use-livecode at lists.runrev.com> wrote:

> Hi folks,
> I need to consolidate a couple of hundred CSV files of varying sizes
> (dozens to hundreds of rows) and column structures (some shared columns but
> many unique), into a single superset dataset that comprises all data from
> all files.
>
> There are too many files and columns to attempt a manual process, so in
> I’ve been trying to achieve this with LiveCode - by iterating through the
> files to parse the column header rows into into row template and column
> mapping arrays used to build the combined table. However, I'm struggling to
> both compile the superset and render the results.
>
> I feel I may be designing a very complicated solution for a problem that
> has probably been solved before. I wonder if I’m missing a simpler method -
> with LiveCode or perhaps with Excel or ‘drag & drop’ into some kind of
> self-organising database GUI for SQLite, etc?
>
> Thanks in advance for any ideas.
> Best,
> Keith
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode
>


-- 
On the first day, God created the heavens and the Earth
On the second day, God created the oceans.
On the third day, God put the animals on hold for a few hours,
   and did a little diving.
And God said, "This is good."


More information about the use-livecode mailing list