Shifted Results from External
Dar Scott
dsc at swcp.com
Wed May 3 16:36:01 EDT 2006
(Hello, everybody, I'm trying to crawl back out from my cave and see
what's going on here.)
I have an external problem. Well, that's where the symptoms show up.
I created an external that is a thin layer over some Win32 calls
(targeted to XP). It does extensive error checking. The code
looks clean and the external has tested out.
However, in the customer's environment, which includes another
external of mine that runs a customer-supplied ActiveX module,
something strange happens after a while. Based on the data from the
customer, it looks like the results from system calls are shifted by
one. That is, if the calls are f1(), g(), g(), f2(), g(), then at
some point each call starts returning the data that should have been
returned for the previous call: empty, f1(), g(), g(), f2(). I have
no queues in my code, but it looks as if data is queued and an extra
value is left in or inserted at some point. (The customer also
reported memory usage growth.)
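
A minimal sketch of the symptom described above, not of the actual
cause: it assumes, purely for illustration, that results pass through
a FIFO and that a fault leaves one stray empty entry in it. The names
ResultPipe, injectStray and runSequence are hypothetical and come from
neither the external nor Rev.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical model only: a FIFO sitting between the caller and the
// real call. It exists to reproduce the reported pattern on paper.
class ResultPipe {
public:
    // Simulates a fault that leaves one extra (empty) entry queued.
    void injectStray() { pending_.push_back(""); }

    // Queue the real result, then hand back whatever is at the front.
    std::string call(const std::string& realResult) {
        pending_.push_back(realResult);
        std::string out = pending_.front();
        pending_.pop_front();
        return out;
    }

private:
    std::deque<std::string> pending_;
};

// Runs the call sequence from the report and collects what the
// caller would see for each call.
std::vector<std::string> runSequence(bool withStray) {
    ResultPipe pipe;
    if (withStray) pipe.injectStray();
    const std::vector<std::string> calls = {"f1", "g", "g", "f2", "g"};
    std::vector<std::string> seen;
    for (const std::string& r : calls) {
        seen.push_back(pipe.call(r));
    }
    return seen;
}
```

With no stray entry the caller sees f1, g, g, f2, g. With one stray
entry every later result arrives one call late, which matches the
pattern in the customer's data.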
I suspect that some other module in my customer's environment is
breaking things: the other external, the ActiveX module, Rev 2.6.1,
XP, or maybe even the Transcript. I think either the
external's memory is getting smashed (or the heap or Rev) or
something is going wrong with malloc/free. I'm pretty sure it is not
this external (famous last words).
The module links statically to the C runtime and, as best I can
tell, nothing substitutes for malloc. In all cases *retString is
set. (A quick check shows gibberish is returned if it is not.)
Strong exception catching is used. The results of all CRT and Win32
calls are checked. I checked the calls to malloc and free, and in my
external they balance. I make no calls to CRT functions that use
malloc (according to the MSDN documentation). I haven't yet looked
into where malloc gets its memory, maybe the process heap. Anybody
know?
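
The balance check above was done by reading the code. A run-time
version could look like the following sketch, which assumes the
external routes every allocation through two wrappers. The names
trackedAlloc, trackedFree and liveBlocks are hypothetical, and the
counter is not thread-safe as written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical wrappers: the external would call these instead of
// malloc/free directly so that a live-block count is always available.
static long g_liveBlocks = 0;

void* trackedAlloc(std::size_t n) {
    void* p = std::malloc(n);
    if (p != nullptr) ++g_liveBlocks;  // count only successful allocations
    return p;
}

void trackedFree(void* p) {
    if (p != nullptr) {                // freeing a null pointer is a no-op
        --g_liveBlocks;
        std::free(p);
    }
}

// Number of blocks allocated through the wrappers and not yet freed.
long liveBlocks() { return g_liveBlocks; }
```

Reporting liveBlocks() now and then during a long test would show
whether the memory growth the customer saw comes from this external.
Any block that is handed to the host to free would have to be counted
separately.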
These use my C++ libraries for externals, but these have worked for a
long time and in lots of environments. (More famous last words.)
The test stack does not seem to be blowing the Transcript call stack,
but does have some interesting uses of wait with messages.
I'm not able to duplicate this in my environment on 3 machines. I've
made an effort to make sure the environments are the same as that of
my customer, but was in the middle of that when the troubleshooting
effort was stopped. The customer test stack makes lots of different
kinds of calls and uses send a lot. In any case, the test is not
small and it takes a while to fail in the customer's environment.
Since I couldn't replicate the bug (I know how RunRev feels with some
of the Rev bugs), I sent some variations that might shift the
symptoms or even report what went wrong. Unfortunately, one of them
(one that uses malloc less) did not display the problem, and testing
of the batch of variations stopped right there, most untried.
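
One kind of variation that could report what went wrong is to tag
every result with a call counter and the name of the function that
produced it, so that a shifted result identifies itself. This is a
sketch under that assumption. tagResult and tagMatches are
hypothetical names, the "seq:function:payload" format is invented for
illustration, and in practice the check would be done on the script
side.

```cpp
#include <cassert>
#include <string>

// Hypothetical diagnostic: count calls and prefix each result with
// "<seq>:<function>:" before it is returned to the caller.
static unsigned long g_callSeq = 0;

std::string tagResult(const std::string& funcName,
                      const std::string& payload) {
    ++g_callSeq;
    return std::to_string(g_callSeq) + ":" + funcName + ":" + payload;
}

// Returns true when the tagged result names the expected function.
// An empty or untagged result fails the check.
bool tagMatches(const std::string& tagged,
                const std::string& expectedFunc) {
    std::size_t first = tagged.find(':');
    if (first == std::string::npos) return false;
    std::size_t second = tagged.find(':', first + 1);
    if (second == std::string::npos) return false;
    return tagged.substr(first + 1, second - first - 1) == expectedFunc;
}
```

If a call to g() ever came back tagged as f1, the sequence number in
the tag would also show how far the results had slipped.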
I realize this is very weird and folks on this list, even external
builders, may not have seen this, but I thought I'd give it a try.
I hope my customer can get his product to run reliably and I want to
vindicate this external.
I can come up with a model for almost anything, but this baffles me.
What can cause this?
OK, here is a model, but it is pretty wild: I know external calls
are slow, but I would be surprised if Rev is pushing & pulling data
through queues to another thread that runs external calls.
Dar Scott
Rev guy on the northern Rio Grande