Database or large text file processing??
Alex Tweedly
alex at tweedly.net
Mon Oct 10 07:42:33 EDT 2005
Ian Leigh wrote:
>
> I wish to retrieve details about a particular file which are held in
> a text file, fairly large at about 264000 lines. The file contains
> details about many files but I only need to retrieve one file at a
> time. I am wondering about the best way to deal with this. The text
> file is set out in such a way that the filename is found first (with
> a specific character) and the information about follows. It is always
> in the same format but obviously some have more text than others (is
> this making sense) so there aren't specific field lengths etc..
>
So I take it your file looks something like:
+ <filename1>
info about filename1
which can be guaranteed to not contain any plus signs
because that's the "specific character" you mention
+ <filename2>
with not much info
+ <filename3> etc.
And each time the program is run, you get a filename (from the user) and
only want the info on that one file.
Is that close enough to an accurate description?
> I could try and bring that file in and use arrays I suppose but I
> don't know what effect that would have on performance and filesize.
You might need, or want, arrays in other languages, but don't need them
in Rev for this kind of thing.
> I don't know anything about using databases with rev but I wonder if
> using them would be a more elegant solution.
Don't see anything there that needs a database.
> I don't want to have to import the text file every time the program
> is run and the text file itself is subject to occasional updating.
> This led me to think that just search through the text file itself
> might make it a bit more robust but I could do with some advice about
> which way to go.
>
You'll want to read the file each time (to deal with the updating
issue). But it will be really, really quick.
I don't think 240K lines (say, under 10 MB) should be a problem to read into a
variable in Rev, so I'd recommend, at least as a first try, reading the whole
file and searching it with lineOffset.
Here's a little script I tried out:
   on mouseUp
      local tInfo, tFile, tCatalog
      answer file "Specify a catalog file"
      if it is empty then exit mouseUp -- user cancelled the dialog
      put it into tCatalog
      put fld "Input" into tFile
      put getFileInfo(tCatalog, tFile) into tInfo
      put tInfo into fld "Field"
   end mouseUp

   function getFileInfo pCatalog, pFile
      local tAllData, tStart, tEnd
      -- read the whole catalog into a variable
      put URL ("file:" & pCatalog) into tAllData

      -- find the header line for the requested file
      put lineOffset("+ " & pFile, tAllData) into tStart
      if tStart = 0 then
         -- file not found
         return "file " & pFile & " not found"
      end if
      -- find the next header line, skipping the first tStart lines;
      -- the result is counted from the end of the skipped lines
      put lineOffset("+", tAllData, tStart) into tEnd
      if tEnd = 0 then
         -- no later header, so this entry runs to the end of the file
         put -1 into tEnd
      else
         put tStart + tEnd - 1 into tEnd
      end if
      return line tStart to tEnd of tAllData
   end getFileInfo
and it retrieves the info in < 1 second on a 242439 line file (finding a
file about 100 lines from the end).
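One thing to watch: lineOffset does a substring match by default, so asking for "report" would also hit a header such as "+ report2.txt". If that matters, a variant along these lines might help. It is only a sketch, and it assumes each header line is exactly "+ " followed by the filename with nothing after it (no trailing spaces or other decoration); getFileInfoExact is just a name made up for illustration.

```livecode
function getFileInfoExact pCatalog, pFile
   local tAllData, tStart, tEnd
   put URL ("file:" & pCatalog) into tAllData

   -- assumption: the header is the entire line, i.e. "+ " & filename
   set the wholeMatches to true
   put lineOffset("+ " & pFile, tAllData) into tStart
   if tStart = 0 then return "file " & pFile & " not found"

   -- back to substring matching to find the next line containing "+"
   set the wholeMatches to false
   put lineOffset("+", tAllData, tStart) into tEnd
   if tEnd = 0 then
      -- no later header, so this entry runs to the end of the file
      put -1 into tEnd
   else
      put tStart + tEnd - 1 into tEnd
   end if
   return line tStart to tEnd of tAllData
end getFileInfoExact
```

The wholeMatches property goes back to false by itself when the handler finishes, so it won't affect searches elsewhere in the stack.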
Hope that gives you some ideas ...
--
Alex Tweedly http://www.tweedly.net