Not shy at all...
Jan Schenkel
janschenkel at yahoo.com
Sun Jun 6 13:33:57 EDT 2004
--- Bob Nelson <bobnelson at mac.com> wrote:
> ...so let's dive in with both feet.
>
> HyperCard was my friend - and remained my friend until the advent of OS X
> and a new machine that won't boot into OS 9.x any more. Sad, since I used
> it for all sorts of cool tricks, especially the massaging of copious amounts
> of data that needed a good "cleaning" before dropping it into MySQL or
> FileMaker.
>
Hi Bob,
Welcome to the Revolution. For the sort of work you're
doing, it is an excellent choice: the URL library makes
it easy to read data, and the database library lets you
connect to MySQL directly and to FMPro via web calls or
ODBC.
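As a rough sketch of how those two libraries fit
together (the URL, the connection details and the table
name "pages" are placeholders I made up, not anything
from your setup), something along these lines should
get you started:
--
on mouseUp
  -- fetch the page into a variable; the URL is a placeholder
  put URL "http://www.example.com/data.html" into tData
  if the result is not empty then
    answer "Download problem:" && the result
    exit mouseUp
  end if
  put tData into field "Imported_Raw"
  -- open a MySQL connection; host, database, user and
  -- password are all placeholders
  put revOpenDatabase("MySQL", "localhost", "mydb", "myuser", "mypassword") into tConnID
  if tConnID is not a number then
    -- on failure an error string comes back instead of an ID
    answer "Connection problem:" && tConnID
    exit mouseUp
  end if
  -- store the first line; :1 is filled from the variable
  -- whose name is passed as the last parameter
  put line 1 of tData into tFirstLine
  revExecuteSQL tConnID, "INSERT INTO pages (body) VALUES (:1)", "tFirstLine"
  revCloseDatabase tConnID
end mouseUp
--
Do check the revOpenDatabase and revExecuteSQL entries
in the documentation for the exact parameters your
version expects.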
> A new project came along and prompted me to go hunting. One of the old
> HyperCard sites recommended Revolution or SuperCard, so I've been demo'ing
> Revolution for a couple of days to see how the package operates compared to
> other options - including RealBASIC.
>
> So I've got a little script that does a moderately simple thing: Grab a web
> page, bring it back, strip useless data out of it (right now, I've just got
> it stripping out the extra returns and leading spaces per line) and the next
> step will be to kill the HTML on the page so I can mine the data...
>
> My layout and code are fairly simple:
>
> Two fields and two buttons so I can work through the example - the first
> field is the 'holder' of the remote URL which has been retrieved
> (Imported_Raw) and the second field will be the restructured output when I'm
> done.
>
> Here's the code, for those who want to dive deeper...
>
> on mouseUp
>   put 0 into i
>   repeat forever
>     add 1 to i
>     if char 1 of line i of field "Imported_Raw" is numToChar(13) then
>       delete line i of field "Imported_Raw"
>       put "Ate one return at line " & i & " of " & the number of lines of field "Imported_Raw" & " total lines."
>       subtract 1 from i
>     end if
>     repeat while char 1 of line i of field "Imported_Raw" = " "
>       delete char 1 of line i of field "Imported_Raw"
>       put "Ate one space at line " & i & " of " & the number of lines of field "Imported_Raw" & " total lines."
>     end repeat
>     if line i of field "Imported_Raw" is the last line of field "Imported_Raw" then
>       exit repeat
>     end if
>   end repeat
>
> end mouseUp
>
Allow me to optimize this script :-)
--
on mouseUp
  -- copy the field data into a variable
  put field "Imported_Raw" into tData
  -- calculate the number of lines once
  put the number of lines of tData into tNumLines
  -- initialise the line counter and the output variable
  put 0 into i
  put empty into tCleanData
  -- use the speedy 'repeat for each' construct
  repeat for each line tLine in tData
    -- update the progress
    add 1 to i
    if i mod 100 = 1 then
      -- show progress in the message box
      put "Processing line" && i && "of" && tNumLines
    end if
    -- eat the leading and trailing spaces
    put word 1 to -1 of tLine into tLine
    -- skip the line if nothing is left after trimming
    if tLine is empty then next repeat
    -- append this bit to a different variable
    put tLine & return after tCleanData
  end repeat
  -- drop the trailing return and update the field just once
  delete the last char of tCleanData
  put tCleanData into field "Imported_Raw"
end mouseUp
--
>
> Here's what I noticed about execution:
>
> 1. Importing the URL is awesome - a great feature that makes my life soooo
> much easier for this project! And fast, too!
Yup, it's a life saver :-)
> 2. The page I grabbed consisted of 140,000 lines of code. After removing
> extra line feeds, the number of lines is around 80,000.
Not too shabby, but certainly not beyond Revolution's
capabilities.
> 3. This script runs VERY slow, compared to relatively the same script in
> HyperCard running under 9.2.1 -- as an example, Revolution has been running
> this script for more than 18 hours and still hasn't finished processing.
> (And that's running on a Dual 2 GHz, 4 Gb RAM, OS X most current version
> with all updates.) Under HC, the similar script executed in about an hour -
> running on an iMac G3/233 with 1 Gb and OS 9.2.1 -- any comments regarding
> execution speed?
The version I produced should require far less
overhead and zip along at a very good speed.
The main problem with your approach is that it
constantly updates the data in your field, and every
update results in a redraw. On top of that, you're
asking the engine to recount the number of lines on
every pass, and to locate line i by counting returns
from the top of the field each time, so the work grows
much faster than the number of lines does.
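If you'd like to see the difference for yourself, here
is a small timing sketch. It only walks the text and
does no cleaning at all; the variable names are my own,
and I'd suggest trying it on a few thousand lines
first, as the indexed loop can take a while on the full
page:
--
on mouseUp
  put field "Imported_Raw" into tData
  -- slow pattern: each 'line i of' lookup counts returns
  -- from the top of the text again
  put the milliseconds into tStart
  repeat with i = 1 to the number of lines of tData
    put line i of tData into tLine
  end repeat
  put the milliseconds - tStart into tIndexedTime
  -- fast pattern: 'repeat for each' walks the text once
  put the milliseconds into tStart
  repeat for each line tLine in tData
    -- nothing to do here, we are only timing the walk
  end repeat
  put the milliseconds - tStart into tForEachTime
  -- report both timings in the message box
  put "indexed:" && tIndexedTime && "ms, for each:" && tForEachTime && "ms"
end mouseUp
--
Both loops read from a variable, so any gap you measure
comes from the line lookups alone; working directly on
the field adds the redraw cost on top of that.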
> 4. I don't see any mechanisms for determining progress of the operation --
> although I may have certainly missed something. Are there any progress
> bars, etc., that one can use in Revolution?
There are "progress bar" controls; see the following
recent mailing list posts on how to use them:
<http://lists.runrev.com/pipermail/use-revolution/2004-June/037480.html>
<http://lists.runrev.com/pipermail/use-revolution/2004-June/037484.html>
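In short, a progress bar is a scrollbar control whose
style is set to "progress". A minimal sketch, assuming
you have placed such a scrollbar on the card and named
it "Progress" (that name is my invention), could look
like this:
--
on mouseUp
  put field "Imported_Raw" into tData
  put the number of lines of tData into tNumLines
  -- "Progress" is a hypothetical scrollbar with its
  -- style set to "progress"
  set the startValue of scrollbar "Progress" to 0
  set the endValue of scrollbar "Progress" to tNumLines
  put 0 into i
  repeat for each line tLine in tData
    add 1 to i
    -- only touch the control every 100 lines so the
    -- redraws stay cheap
    if i mod 100 = 0 then set the thumbPosition of scrollbar "Progress" to i
    -- ... clean up tLine here ...
  end repeat
  -- make sure the bar ends up completely filled
  set the thumbPosition of scrollbar "Progress" to tNumLines
end mouseUp
--
Updating the bar on every single line would bring back
the redraw overhead, hence the check on i.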
> 5. Looking through all the examples I can find, as well as documentation, I
> noted that there aren't many examples related to text manipulation - and
> importing/exporting text, etc., in/out of your stack. I'm sure I missed
> something on this front, as I'm sure people would be doing this all the
> time... Can anyone point me in a direction?
>
I hope the above helps. Do browse through the entire
documentation as well: it includes a Cookbook with
examples.
> Finally, I'm impressed with the professional layout of the product - this
> could well be the perfect 'update' (I'm sure they don't like to hear that at
> Rev!) to HyperCard. I'm looking forward to a book, like The Complete
> HyperCard Handbook, that lays out the functionality of Revolution as an
> awesome reference book.
>
There's Dan Shafer's book, "Revolution: Software at
the Speed of Thought"; for more information:
<http://www.runrev.com/resources/shaferbook.shtml>
> Thanks for your time!
>
> Bob
>
Hope this helped,
Jan Schenkel.
=====
"As we grow older, we grow both wiser and more foolish at the same time." (La Rochefoucauld)