reading CSV text file

Jim Ault JimAultWins at yahoo.com
Fri Jun 22 19:04:29 EDT 2007


> My problem is, I do not know how I can read the file and detect each item of
> a line, as the items are separated by commas and the text fields/items could
> contain commas, too. How can I tell Revolution to ignore the comma in the
> "text area of an item"?

The short answer is to ask for a tab-delimited file instead.

I have had to do this task before because we bought industry data that was
only available in CSV.  Probably the worst format ever, but everyone seems
to use it.  Tab-delimited is so much better.

Part of the situation is that you have to deal with:
    text values with surrounding quotes
    text values with embedded quotes
    text values with embedded commas

Embedded commas are problematic in any solution you program.
  --> Fuji Cleaning Cartridge für DLT Streamer, retail

The basic concept you need to keep is that you are building a character
scanner with flags to represent conditions.

A  reading value w/o quote, thus number (commas illegal now)
B  reading value w/ quote as first char, thus string
C  found quote followed by comma, thus end of string
D  found quote not followed by comma, thus embedded quote
E  reading string, found comma before closing quote, thus embedded comma

Thus 
embedded commas occur between quotes
embedded quotes occur between (,"  and ",)
as in -> ,"chain, 12" gold",
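
The five conditions above can also be folded into one pass. What follows is
only a minimal sketch, not tested against real data: the handler name
csvLineToTab and the t/p variable names are made up for illustration, and it
assumes one record per line, no tabs inside the data, and no doubled-quote
("") escaping.

```livecode
-- Hypothetical helper: converts one CSV line into a tab-delimited line.
function csvLineToTab pLine
   local tResult, tInString, tPendingQuote, tAtFieldStart
   put false into tInString
   put false into tPendingQuote
   put true into tAtFieldStart
   repeat for each char tChar in pLine
      if tPendingQuote then
         -- the previous char was a quote inside a string; classify it now
         put false into tPendingQuote
         if tChar is comma then
            put false into tInString -- C: quote followed by comma, end of string
         else
            put quote after tResult -- D: quote not followed by comma, embedded
         end if
      end if
      if tInString then
         if tChar is quote then
            put true into tPendingQuote -- decide when the next char arrives
         else
            put tChar after tResult -- includes E: embedded commas
         end if
      else if tChar is comma then
         put tab after tResult -- a real field delimiter
         put true into tAtFieldStart
      else if tChar is quote and tAtFieldStart then
         put true into tInString -- B: quote as first char, thus string
         put false into tAtFieldStart
      else
         put tChar after tResult -- A: unquoted value, thus number
         put false into tAtFieldStart
      end if
   end repeat
   return tResult
end csvLineToTab
```

A quote that is still pending when the line ends is the closing quote of the
last field, so it is simply dropped. Holding each in-string quote back for one
character is what lets the scanner tell case C from case D without looking
ahead.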
------------------------------------

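Job 1 is to scan for commas where readingString = true
and convert each one to numToChar(3)
------------ NOTE { this is off the top of my head, so
--------------  test drive before actually launching
------------------  missiles }

put quote into q
put false into readingString
put empty into prevCH
put empty into newTextBlock
repeat for each char CH in textBlock
   put CH into thisCH
   if readingString and prevCH is q and (CH is comma or CH is return) then
      put false into readingString -- C: quote then comma closed the string
   else if not readingString and CH is q and (prevCH is empty or prevCH is comma or prevCH is return) then
      put true into readingString -- B: quote at the start of a field
      put empty into thisCH -- so the opening quote never counts as a closing one
   end if
   if CH is comma and readingString then
      put numToChar(3) after newTextBlock -- E: embedded comma
   else
      put CH after newTextBlock
   end if
   put thisCH into prevCH
end repeat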
--------------------------------------
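Job 2 is to scan for quotes where readingString = true
and convert each one to numToChar(4)
put quote into q
put false into readingString
put false into pendingQuote
put empty into cleanBlock
repeat for each char CH in newTextBlock
   if pendingQuote then
      -- the previous char was a quote inside a string; now we can classify it
      put false into pendingQuote
      if CH is comma or CH is return then
         put false into readingString
         put q after cleanBlock -- C: end of string
      else
         put numToChar(4) after cleanBlock -- D: embedded quote
      end if
   end if
   if CH is q and readingString then
      put true into pendingQuote -- hold it until we see the next char
   else
      if CH is q then put true into readingString -- B: opening quote
      put CH after cleanBlock
   end if
end repeat
if pendingQuote then put q after cleanBlock -- closing quote at end of data
put cleanBlock into newTextBlock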

--------------------------------------
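Job 3 is to turn the remaining (real) delimiters into tabs,
then put the embedded characters back
put quote into q
replace comma with tab in newTextBlock -- every comma left is a field separator
replace q with empty in newTextBlock -- every quote left only wrapped a string

replace numToChar(3) with comma in newTextBlock -- embedded commas
replace numToChar(4) with q in newTextBlock -- embedded quotes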
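
Once the text is tab-delimited, picking fields out is just a matter of the
itemDelimiter. A small sketch (the function name columnOf and its parameters
are made up here, and it assumes pTabText is tab-delimited text with one
record per line, such as the result of the three jobs):

```livecode
-- Hypothetical helper: returns one column of tab-delimited text, one value per line.
function columnOf pTabText, pColumn
   local tValues
   set the itemDelimiter to tab
   repeat for each line tLine in pTabText
      put item pColumn of tLine & return after tValues
   end repeat
   delete the last char of tValues -- drop the trailing return
   return tValues
end columnOf
```

For example, put columnOf(newTextBlock, 3) would list the third field of
every record. Embedded commas no longer get in the way because the comma is
not the delimiter any more.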
-----------------------------

By the way, you are dead if the following was typed into the database,
because the quote-plus-comma after the first 12 looks exactly like the end
of a string:

"chain, gold, 12", pendant, ruby, 2" clasp, diam 0.5", 128,35.00

Good luck,

Jim Ault
Las Vegas


On 6/22/07 2:36 PM, "runrev260805 at m-r-d.de" <runrev260805 at m-r-d.de> wrote:

> Hi,
> 
> I have to read a text file, which is comma separated and contains " as text
> identifier. The item delimiter is ,
> My problem is, I do not know how I can read the file and detect each item of
> a line, as the items are separated by commas and the text fields/items could
> contain commas, too. How can I tell Revolution to ignore the comma in the
> "text area of an item"?
> 
> Here´s an example line
> 
> "7722"," ","Fuji Cleaning Cartridge für DLT Streamer, retail","DLT
> Reinigungskassette für bis zu 20 Reinigungen nicht für DLT 1 und
> VS!","","","BAND","183","Diverse
> Hersteller","42419",31.03,38.79,35.27,34.48,34.48,34.48,32.66,0.000,0.000,"",24
> 
> As you can see, some fields/items are included in " ,others (the numeric
> values) are not. The field included in " could contain , .
> 
> Any idea, how i can solve that. I have to mention that in the textfields/items
> there could be also the " as product description e.g.  TFT 19"  or HDD 2.5"




