MatchText, MatchChunk and the needle in the haystack
Bryan McCormick
bryan at deepfoo.com
Mon Mar 19 16:40:17 EDT 2007
Dave and Jim,
Thanks for the suggestions. I think I have now exhausted everything but the
ugly option: walk through by hand and find the errors/solutions. For the record:
1) The textBlock length is not changing since I am not deleting anything
from the block (or adding). I am using pos to advance the search ahead
in the string so that removal is not necessary. Or at least it should
not be. This is confirmed.
2) Did the test for nulls. Nuttin' honey. Good idea though; I had
forgotten what a pain those can be. I am nonetheless flushing all the
text before processing by removing nulls if they exist (they don't so
far, but that is not to say they might not turn up in one or two of the
files, so it won't hurt).
3) Because these records are very simple and do not contain a whole lot,
I was able to blow off any ASCII below 32 and above 127. Doing this
confirmed that the hyphens all appear to be hyphens. That was a good
idea though, it could easily have been an oddball character.
4) Just to double check (it's more than double now but you get the
drift) I checked to see if I could find known values in the suspect strings.
For example, I searched for "5-Jan-99" explicitly because it is one of
the errors. Here is a fragment of the string it is in:
form5-Jan-99War
And yes, it is there when I type in "put offset("form5-Jan-99War", fld
1)" into the msg box. Finds the offset without issue.
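For anyone who wants to repeat the checks above outside of Revolution, here is a rough sketch in Python. It is not the script under discussion: the function names clean_text and find_all are made up for illustration, and the sample string is just the fragment above repeated with a null and a tab stuck in the middle. It strips everything outside the 32 to 127 range (which takes care of nulls too) and then looks for a known value by advancing a position through the text instead of deleting from it.

```python
def clean_text(text):
    """Keep only characters with codes 32..127; nulls and control chars are dropped."""
    return "".join(ch for ch in text if 32 <= ord(ch) <= 127)


def find_all(needle, haystack):
    """Return the 1-based offset of every non-overlapping match of needle.

    The haystack is never modified; a position is advanced past each hit,
    so the length of the text stays constant during the search.
    """
    offsets = []
    pos = 0
    while True:
        hit = haystack.find(needle, pos)
        if hit == -1:
            break
        offsets.append(hit + 1)   # report 1-based, like offset() does
        pos = hit + len(needle)   # skip ahead instead of deleting text
    return offsets


# Hypothetical sample: the fragment twice, with a null and a tab between.
raw = "form5-Jan-99War\x00\tform5-Jan-99War"
block = clean_text(raw)
hits = find_all("5-Jan-99", block)
print(len(raw), len(block), hits)
```

If the known value shows up at the offsets you expect after cleaning, the text itself is probably fine and the problem is more likely in the matching logic.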
So the good news is at least the text is not getting munged in some
awful way.
Unfortunately that means that there is just something brain-dead obvious
yet hidden in the script.
I am going to take a rest before I have a melt and come back to it in a
little while.
The better news is at least there are lines that can be checked now.
Remember these were single long strings before this stage, so if I have
to do the ugly and go line by line, at least now I have lines.