CSV to TSV (was Re: Tools & techniques for one-off consolidation of multiple 'similar' CSV files?)
Alex Tweedly
alex at tweedly.net
Tue Apr 5 13:17:01 EDT 2022
Hi Keith,
That code will fail for any commas that occur within quoted entries -
they will be wrongly converted to TABs.
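
To see the failure concretely, here is a small sketch that runs the same replace steps on a single row. The sample data and the variable name tLine are made up for illustration; they are not from your script.

```livecode
on mouseUp
   -- made-up sample row: three fields, the second has a comma inside quotes
   local tLine
   put "1," & quote & "Smith, John" & quote & ",42" into tLine

   -- the same replace steps as in the quoted script
   replace tab with space in tLine
   replace quote & comma & quote with tab in tLine
   replace comma with tab in tLine
   replace quote with empty in tLine

   set the itemDelimiter to tab
   -- reports 4 items although the row only has 3 fields,
   -- because the comma in "Smith, John" also became a tab
   put the number of items of tLine
end mouseUp
```

Every later cell in such a row shifts one column to the right, which would match the symptom you describe.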
I'd suggest getting csvToTab (a community effort by Richard Gaskin, me
and a whole host of others over the years) as a good starting place, and
perhaps finishing place. It will handle most CSV oddities (not all of
them - that is provably impossible :-).
This does an efficient walk through the data, remembering whether it is
inside or outside quoted entries, and hence handles commas accordingly.
https://github.com/macMikey/csvToText/blob/master/csvToTab.livecodescript
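
As a rough illustration of the inside/outside-quotes idea only, here is a minimal sketch. It is not the csvToTab code from the link above; the function name simpleCsvToTab and its parameter are hypothetical, and it assumes quotes are used only to wrap fields.

```livecode
function simpleCsvToTab pData
   local tInsideQuotes, tResult, tChar
   put false into tInsideQuotes
   repeat for each char tChar in pData
      if tChar is quote then
         -- a quote flips the inside/outside state and is dropped from the output
         put not tInsideQuotes into tInsideQuotes
      else if tChar is comma and not tInsideQuotes then
         -- only commas outside quotes are field delimiters
         put tab after tResult
      else if tChar is tab then
         -- tabs in the content would clash with the new delimiter
         put space after tResult
      else if tChar is cr and tInsideQuotes then
         -- a line break inside a quoted field would otherwise split the row
         put space after tResult
      else
         put tChar after tResult
      end if
   end repeat
   return tResult
end simpleCsvToTab
```

It could be called as `put simpleCsvToTab(tFileData) into tFileData`. Note the limits of the sketch: a doubled quote inside a quoted field (an escaped quote) is simply dropped rather than kept as a literal quote, and other oddities are not handled at all, which is why the community handler is the better starting place.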
Alex.
On 05/04/2022 17:02, Keith Clarke via use-livecode wrote:
> Hi folks,
> Thanks all for the responses and ideas on consolidating multiple CSV files into one - much appreciated.
>
> Ben - Thank you for sharing your working recipe. This lifted my spirits as it showed I was on the right path (very nearly!) and you moved me on a big step from where I was stuck.
>
> My script was successfully iterating through folders and files, with filtering to get a file list of just CSVs with their paths for onward processing. I’d also identified the need to maintain registers of (growing) column names, together with a master row template and a mapping of the current file’s column headers in row-1 to the master to align output columns. I got stuck when I set up nested repeat loops for files, then lines, then items and was trying to deal with row 1 column headers and data rows at the same time, which got rather confusing. Separating the column name processing from parsing row data made life a lot simpler and I’ve now got LC parsing the ~200 CSV files into a ~60,000 row TSV file that opens in Excel.
>
> However… I’m getting cells dropped into the wrong columns in the output file. So, I’m wondering if delimiters are broken in my CSV-to-TSV pre-processing. Can anyone spot any obvious errors or omissions in the following...
> -- convert from CSV to TSV
>
> replace tab with space in tFileData -- clear any tabs in the content before setting as a delimiter
>
> replace quote & comma & quote with tab in tFileData -- change delimiter for quoted values
>
> replace comma with tab in tFileData -- change delimiter for unquoted values
>
> replace quote with "" in tFileData -- clear quotes in first & last items
>
> set the itemDelimiter to tab
>
> Best,
> Keith