Importing data into RevDB
Kay C Lan
lan.kc.macmail at gmail.com
Mon Dec 7 21:10:26 EST 2009
When I first had headaches working with csv I ended up, in Excel prior to
exporting, doing a 'Replace All'
, to ccccc
' to sqsqsq (single quote)
" to dqdqdq (double quote)
tab to ttttt
carriage returns to crcrcr
the basic principle being replacing all problem characters with a character
or character sequence that you know wont occur in the data. In one stack I
have, because the data is so valuable, I even go to the trouble of searching
the data first for ttttt to confirm it doesn't occur before doing the
replace sequence, and if it does it will keep adding additional t's until it
finds a safe replace set of characters.
This eliminated ALL comma count errors or quote problems when dealing with
the data in Rev. But then I soon discovered that if I went with TAB
delimited for the Excel output I was just wasting my time replacing all the
, ' and " (commas, single and double quotes). All I needed to do was replace
tab and carriage returns/line feeds and Export as TAB delimited and change
the itemDelimiter in Rev to tab.
As I said, I have some very important data, and it's vital that it is not
corrupted, so if you have an example from you magazine data, where replacing
all the tabs and cr first, then exporting as tab delimited will NOT result
in a line of data in Rev = a row of data in Excel, or an item of data in Rev
= a cell of data in Excel, I'd like to know because I'll need to determine
if such a problem could effect my data.
PS I've been down the XML route as well and personally I find: Replace All,
Export as Tab delimited, Repeat for each line, much much faster than doing
anything with XML.
PPS Of course this only works if you can control the data export output. If
the data is already set in stone, and it was outputted as csv, I'm deeply
sorry and can feel your pain;-(
On Tue, Dec 8, 2009 at 1:17 AM, Andre Garzia <andre at andregarzia.com> wrote:
> I've just imported 30 MB worth of text into a SQLite database over a remote
> connection. What I tried using first was a CSV file as well and I could not
> make it work. I am importing magazine articles, the comma count is all
> and unless I wrote a super dupper RegEx thing to cope with quote values
> can contain commas and quotes up to unlimited levels in a field then it
> would never work. I ended up using XML as my data source format and all of
> sudden everything started falling into place.
More information about the Use-livecode