Richmond goes data-mining (a.k.a. shovelling through the sh..)

Richmond Mathewson richmondmathewson at gmail.com
Mon Apr 26 14:56:47 EDT 2010


People who just want the list of RunRev features not available in the
Linux version can "cut the crap" by scrolling down to the bottom of this
message.
------------------------------------------------------------------------------
After a heavy hint from Peter Alcibiades, and being a bit 'fried'
having had to lecture at the local University for Saturday and Sunday
on 2 marathon 7/8 hour lecture sessions to extra-mural students
(most of them were completely "off the wall" by the time I'd finished
with them) on semantic, grammatical and morphological change (talk about
bu**ering up one's weekend) and a certain embarras de richesses
in connexion with work on my Devawriter Pro (plug, plug, shameless
plug; coming soon, watch this space) I thought I would try my sweaty
paws at data-mining.

So; downloaded BvG's 'BvG Docu' stack - nifty number that sucked those
clotted rev files like a lamia feasting on a half-rotted corpse and spewed
out the xml files like a Glaswegian after a night on the town; heavy!

Then mucked around with a stack to chew its way through those xml files
and spot those that didn't have:

<unix support="true"

in them

and while I'm being campy and 'artistic' I will wonder "out loud" why the
good folks at RunRev have conflated UNIX with Linux; guaranteed to rub
2 lots of people up the wrong way . . .   :) Almost as bad as mixing
Microsoft DOS up with FreeDOS . . . pass me the mouthwash Dr Watson.

and Bingo (OK, OK, I'm a creepy kinda guy; I embedded a sound file that
goes "Bingo" on script completion), out comes a list of all the
multifarious facets / aspects (call them what you will) of RunRev that work
on Macintosh or Windows (that is an inclusive OR for you logicians out
there) and DO NOT work on Linux.
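
For anyone who would rather not build their own 'Chewer', here is a rough sketch of the same check in Python rather than RunRev. It is not the stack described above, just one plausible way to do it: it assumes the docs sit as .xml files in a single folder, and the function names are made up for illustration.

```python
from pathlib import Path

# The marker string quoted in the message above.
MARKER = '<unix support="true"'


def lacks_unix_support(xml_text):
    """Return True when the text does not contain the marker string."""
    return MARKER not in xml_text


def files_without_unix_support(folder):
    """Return the sorted names of .xml files in `folder` lacking the marker.

    Plain substring search, no XML parsing; that is an assumption about
    how the files are laid out, not a verified fact about the docs.
    """
    missing = []
    for path in sorted(Path(folder).glob("*.xml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if lacks_unix_support(text):
            missing.append(path.name)
    return missing
```

Calling files_without_unix_support on the folder of exported XML files would give the list of entries with no Linux ("unix") support flag.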

It is a long list; in fact it is probably so long that the Use-List will
spit back my message as over-quota if I append it here . . .  :(

I am actually writing this as the stack does the chewing because it looks
as though it is developing ulcers or acute colitis.

Um . . . my temperature gauge on my G4 is going up alarmingly; I wonder why?

Now; BvG's wonderful stack disgorged 1,624 xml files and an index file;
however, my "Chewer" keeps getting stuck at file 1507 . . .

Oh, Gosh, Richmond has found another devil in the machine.

Well, maybe; but I am more interested in the results of the data-mining so
I am going to have to FORCE QUIT RunRev (curses, all my results go down
the tubes) and divide BvG's XML files up into 2 folders and do them all
over again.

Wow; duplicated the 'BvG Docu' folder twice; naming the first 'BvG Part 1'
and the second 'BvG Part 2'; then tried to delete 818 files from the first
and the G4 went 'all stroppy' . . .

Um; same again with 'Part 2' - somebody 'out there' tell me that there is
a computer that doesn't throw a tantrum on being asked to move buckets of
files to the trash.

Now this is bad news: I have just sent off for permission to download
The Brooklyn-Geneva-Amsterdam-Helsinki Parsed Corpus of Old English, which
is a 'stinker' of a set of files. Why? Because I want to go a-data-mining
on behalf of my wife who is looking at a fairly bizarre semantic shift
associated with "WITH"; where it moved from meaning 'against' to its
current meaning, displacing the Germanic word "MED". If RunRev baulks
(Aaah; what a lovely verb) at 1500 files things are going to get pretty
sticky; and there I was thinking that RunRev as a data-miner would probably
be a better thing than the rather primitive and distinctly unattractive
Java 'thing' the corpus linguistics people at Penn State put out.

Oh, No; now the thing has "sce**d up" on the .DS_Store files. Why? Why?

One wonders why copying a directory / folder and deleting half its contents
generates a .DS_Store file; and why they choke RunRev. Actually, I've had
problems with .DS_Store files elsewhere.
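
For what it is worth, the Finder on Mac OS X drops a hidden .DS_Store file into folders it has opened, to remember view settings, so any script that walks "every file in the folder" will trip over it. A small sketch of one way round that, in Python rather than RunRev, is to keep only visible files ending in .xml. The function name is hypothetical.

```python
from pathlib import Path


def xml_files_only(folder):
    """Return sorted names of visible .xml files in `folder`.

    Skips sub-folders, hidden dot-files such as .DS_Store, and anything
    that does not end in .xml (case-insensitive).
    """
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file()
        and not p.name.startswith(".")
        and p.suffix.lower() == ".xml"
    )
```

Filtering the file list first means the scanning loop never sees the hidden files at all.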

Cracked open 3011.xml (1507th XML file) with TextWrangler: nothing obvious
to stop RunRev in its tracks . . . Later; much later; obviously something
very odd indeed about 3011.xml: on setting the 'Chewer' to chew from file
number 1507 the thing seized up; on setting it to run from 1508 it ran
perfectly normally.

Humpf: bunged a 'save this stack' in one of the repeat loops, so even if
I do have to force quit RunRev not everything will be lost.
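
The same "save as you go" idea, sketched in Python rather than RunRev: write each hit to a results file the moment it is found, and allow the run to start from a given position so a troublesome file can be stepped over. This is only an illustration under assumptions; the function name, the tab-separated output and the zero-based `start` position are all invented here, and `check` stands in for whatever test is being applied.

```python
from pathlib import Path


def chew_with_checkpoint(names, check, results_path, start=0):
    """Run `check` on each name from position `start`, logging hits at once.

    Hits are appended to `results_path` and flushed after every write, so
    a forced quit loses little of what was already found. Positions are
    zero-based. Returns the number of hits found in this run.
    """
    hits = 0
    with Path(results_path).open("a", encoding="utf-8") as out:
        for index, name in enumerate(names[start:], start=start):
            if check(name):
                out.write(f"{index}\t{name}\n")
                out.flush()  # hand the line to the OS straight away
                hits += 1
    return hits
```

Because the file is opened in append mode, a second run started just past the file that caused the seizure adds to the earlier results instead of wiping them.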

Blast; still got stuck at file 1507. Just had a look and RunRev isn't
using much memory (2.7 MB RAM and about 600 MB virtual on a machine with
2GB RAM), but it pushes the temperature up something rotten.

Finished, at last; there are 204 things listed; so they can be downloaded
here:
----------------------------------------------------------------------------------
http://andregarzia.on-rev.com/richmond/STUFF/RR non-Linux.txt

I'm off to bed. sincerely, Richmond.


