Stripping CRLF from incoming html files

Sannyasin Sivakatirswami katir at hindu.org
Tue Nov 9 22:55:28 EST 2004


I set up a script to download a lot of HTML files to cards. Then I'm 
extracting the "crucial" text and leaving all the wrapper stuff 
(headers, footers, etc.) behind. In the process I'm getting a lot of 
vertical white space, which I usually remove with something like

replace (cr&cr&cr) with (cr&cr) in tInputData
replace (cr&cr&cr) with (cr&cr) in tInputData
replace (cr&cr&cr) with (cr&cr) in tInputData

three times, which is usually enough to get it down to single blank 
lines between divs, etc.
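
A fixed number of passes can leave triple breaks behind when the runs are long. A minimal sketch of a loop that keeps going until none remain follows. The function name collapseBlankLines is made up for illustration, and it only collapses breaks that really are the cr constant.

```livecode
-- Hypothetical helper: collapse runs of blank lines down to one blank line.
function collapseBlankLines pText
   -- Each pass makes pText shorter, so the loop is guaranteed to end.
   repeat while pText contains (cr & cr & cr)
      replace (cr & cr & cr) with (cr & cr) in pText
   end repeat
   return pText
end collapseBlankLines
```

It could be called as: put collapseBlankLines(tInputData) into tInputData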

But I discovered this wasn't working. I grabbed the white space, 
read the ASCII values, and was getting

10
13
10
13
10
13
10
13
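
A check like this can be done with a short helper. This is only a sketch: the function name charCodes and the variable tWhiteSpace in the usage line are invented for the example.

```livecode
-- Hypothetical helper: list the ASCII value of every character in pText.
function charCodes pText
   local tCodes
   repeat for each char tChar in pText
      put charToNum(tChar) & space after tCodes
   end repeat
   -- "word 1 to -1" drops the trailing space added by the loop.
   return word 1 to -1 of tCodes
end charCodes
```

For instance: put charCodes(tWhiteSpace) shows the values on one line, separated by spaces.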

So I thought I would try to write a script to clean this up. I'm 
working on a Mac, OS X.

I tried

replace CRLF with numToChar(10)

but it didn't work

I tried

replace numToChar(13) with "" in tInputData

it also didn't work

I'm missing something really simple here.

??
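
One possible cleanup, offered as a sketch rather than a confirmed fix: in this language the constants cr, return and lf all stand for ASCII 10, so (cr&cr&cr) can never match a run that has ASCII 13 mixed in. The replace command also needs a container (the "in tInputData" part). Assuming the stray characters really are ASCII 13, the hypothetical function below (normalizeLineEndings is an invented name) turns every line ending into a single ASCII 10.

```livecode
-- Hypothetical helper: make every line ending a single ASCII 10.
function normalizeLineEndings pText
   -- CRLF pairs (13 followed by 10) become one linefeed.
   replace crlf with lf in pText
   -- Any ASCII 13 still left over becomes a linefeed as well.
   replace numToChar(13) with lf in pText
   return pText
end normalizeLineEndings
```

Usage would be: put normalizeLineEndings(tInputData) into tInputData, run before the (cr&cr&cr) replaces so that they have something to match.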
