How does compiling work?

Julian Ohrt lists.runrev.com at ohrt.org
Fri Sep 9 04:42:17 EDT 2011


Wow! Thank you very much, Richard!

Sounds a little like PHP, which on loading creates opcodes that are 
executed on the fly but also allows interpreted code (using the eval() 
function).

Too bad that there is no official "how it works" explanation, as Stephen 
pointed out.

Thanks again!
Julian



On 08.09.2011 16:00, Richard Gaskin wrote:
> Julian Ohrt wrote:
>
>> Is there any documentation how compiling of livecode works internally?
>> Is it a compiler which can produce native code (for Windows, Linux,
>> etc.)? Are the scripts packaged within the executable together with an
>> interpreter and interpreted at run time? Or is it more like a virtual
>> machine approach?
>
> Yes, I think it could be said that LiveCode has more in common with a
> virtual machine than almost any other metaphor.
>
> My understanding of the under-the-hood mechanics is very limited, but
> that won't stop me from trying. :)
>
> There are many layers to code execution and the languages which work at
> each level, which could be summarized as:
>
> - CPU instruction set/Object code: the instructions the processor is able
> to handle on its own, purely binary code; these are very primitive,
> consisting largely of moving stuff from one memory location to another,
> some basic math routines, etc. Most mortals never write machine code
> directly, relying on assemblers or compilers to translate their more
> human-readable code into machine instructions.
>
> - Assembler: a way of working directly with the CPU instruction set, but
> with the advantage of using mnemonic labels for the instructions ("MOVE"
> rather than "0111010"). Generally speaking, there is a one-to-one
> relationship between Assembler instructions and machine instructions.
>
> - C: Designed as a substitute for Assembler, C allows you to execute
> many hundreds or even thousands of machine instructions with relatively
> little code, but it's still somewhat close to the CPU in terms of memory
> management, data types, options for register use, etc.
>
> - C++/C#/Objective-C: a set of languages, with their libraries and
> compilers, based on C that implement object-oriented programming,
> executing many more instructions per line of code and usually involving
> frameworks that handle many of the common tasks an application will
> perform.
>
> - Scripting: Instructions written in very high-level languages which
> often completely automate things like memory management, type
> conversion, garbage collection, etc., triggering a great many machine
> instructions for each line of code, favoring developer convenience at a
> small cost to efficiency and memory.
>
> At each successive level, the number of machine instructions triggered by
> a line of code is generally higher than at the level before, meaning ever
> more of the work is done by the system rather than the programmer.
>
> Much of the LiveCode engine is written in C++ (with some portions in
> straight C, I believe), and the LiveCode scripting language is often
> compiled to an intermediary bytecode, which in the list above might be
> between C++ and Scripting.
>
> Bytecode is very different from true object code, in that object code
> represents the instructions as the CPU itself expects to handle them,
> while bytecode still needs an intermediary mechanism (such as the
> LiveCode engine) to translate it into machine instructions.
>
> Bytecode representations are much closer to machine instructions than
> scripts are, so translating them at run time is often as simple as
> jumping from one entry to another in a densely packed and highly
> optimized lookup table.
>
> Moreover, bytecode accounts for only a fairly small part of what runs
> when your script executes; in many cases it jumps directly into
> compiled object code in the engine, which was written in C++ and
> compiled to machine code using some of the best modern compilers. So in
> effect, as Ousterhout puts it in his seminal paper on scripting (see
> <http://www.stanford.edu/~ouster/cgi-bin/papers/scripting.pdf>), good
> scripting languages are often just a sort of "glue" between true
> machine-compiled routines. Bytecode makes that glue smaller and more
> efficient.
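
A toy sketch of the lookup-table idea described above, written in Python rather than LiveCode. Everything in it is an assumption made for illustration: the opcode numbers, the BINARY_OPS table and the run() function are hypothetical and say nothing about how the LiveCode engine actually encodes or dispatches scripts. It only shows how a small bytecode can act as glue that hands the real work to routines the host has already compiled.

```python
# Toy illustration only: a hypothetical stack machine, not LiveCode's engine.
import operator

# Opcode numbers, made up for this sketch.
PUSH, ADD, MUL, HALT = 0, 1, 2, 3

# Lookup table: opcode -> routine supplied by the host.
# In CPython, operator.add and operator.mul are implemented in compiled C,
# which loosely mirrors bytecode jumping into machine-compiled engine code.
BINARY_OPS = {ADD: operator.add, MUL: operator.mul}


def run(bytecode):
    """Execute a flat list of opcodes and operands; return the top of the stack."""
    stack = []
    pc = 0  # index of the next opcode to execute
    while pc < len(bytecode):
        op = bytecode[pc]
        pc += 1
        if op == PUSH:
            # The operand is stored right after the opcode.
            stack.append(bytecode[pc])
            pc += 1
        elif op in BINARY_OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[op](left, right))
        elif op == HALT:
            break
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack[-1] if stack else None


# Bytecode for the expression (2 + 3) * 4.
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
result = run(program)
```

The loop itself does very little; almost all of the arithmetic happens inside the table's routines, which is the "glue" point made in the quoted paragraph.
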
>
> The scripts you write in LiveCode are what gets saved with the file (at
> least that's what I see when I look at a saved stack file; I can find
> the scripts but if the bytecode gets saved with it it's amazingly small
> because I can't find it at all).
>
> It's my understanding that when a stack is opened, its scripts are
> compiled to bytecode as the stack's object records are unpacked and the
> message path is set up. This "runtime compilation" involves parsing your
> script and translating that into binary tokens that execute much more
> efficiently. When executing, this bytecode is translated to direct
> machine instructions on the fly, but as you can see with LiveCode's
> blazing performance, neither the runtime compilation to bytecode nor the
> translation of the bytecode into machine instructions is particularly
> costly. And by separating the tasks, the more costly parsing of the
> script is done only once, which is one of the reasons why LC outperforms
> fully-interpreted systems (another reason is careful pruning of the
> lookup table used in that parsing and in the subsequent bytecode jumps,
> but that's another story).
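
The "parse once, execute many times" split can be sketched with Python's built-in compile() and eval(). This is an analogy only, assuming nothing about LiveCode's internals; the expression text and the function names run_precompiled and run_reparsed are hypothetical.

```python
# Analogy in Python, not LiveCode: parse once, execute many times.
source = "x * x + 1"

# Step 1: parse the text and compile it to a code object (bytecode), once.
code = compile(source, "<script>", "eval")


def run_precompiled(values):
    # Step 2: execute the already-compiled bytecode for each value.
    return [eval(code, {"x": v}) for v in values]


def run_reparsed(values):
    # For contrast: passing the string makes eval() parse it on every call.
    return [eval(source, {"x": v}) for v in values]


results = run_precompiled([1, 2, 3])
```

Both functions return the same values; the difference is only in how often the source text has to be parsed.
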
>
> In fact, since so much of the actual execution takes place in the
> engine's machine-compiled code, performance for many tasks is on par
> with other systems where you have to wait for a compiler every time you
> change your code. :)
>
> There are exceptions to the general rule that script statements are
> translated to bytecode in advance of execution. For example, the "do"
> command and the "value" function both require parsing during execution,
> since they work with strings whose values cannot be known in advance,
> and therefore cannot be compiled in advance.
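
A rough Python analogy for why something like "do" or "value" has to be parsed during execution: the text being evaluated is only assembled while the program is running. The helper value_of() and the variable names are hypothetical, and Python's eval() stands in for the LiveCode function here; it is not a description of how the engine implements it.

```python
# Python analogy for a "value"-style function; names are hypothetical.
def value_of(expression_text, variables):
    # The text arrives only at run time, so it has to be parsed here,
    # during execution, rather than when the script was loaded.
    return eval(expression_text, {"__builtins__": {}}, dict(variables))


# The expression is built from pieces that exist only at run time.
operator_symbol = "+"  # imagine this came from user input
expression = "a " + operator_symbol + " b"
total = value_of(expression, {"a": 2, "b": 5})
```
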
>
> But those tokens also make good examples of LiveCode's efficiency: while
> technically slower than alternative syntax which can be precompiled to
> bytecode, the time it takes the engine to parse those expressions and
> translate them into a form which can be executed is usually measured in
> microseconds, sometimes fractions of microseconds.
>
> Along those lines, compare the time it takes LiveCode to compile a
> script when you push the script editor's "Compile" button to compilation
> times in almost any other system. With each script compiled to bytecode
> separately, and with its means of doing so being rather well tuned over
> a great many years, it's almost instantaneous - you'll never wait for a
> progress bar when compiling in LiveCode. :)
>
>
> In summary, LiveCode attempts to find a sweet spot between raw
> performance and developer convenience. You could write faster-executing
> code in Assembler, but who would want to? Even using languages like C++
> will often take orders of magnitude more development time to accomplish
> similar goals. LiveCode's two-step compilation allows for blazing fast
> performance with nearly unprecedented return on your development time.
>
> IMO, an almost ideal sweet spot indeed.
>
> --
> Richard Gaskin
> Fourth World
> LiveCode training and consulting: http://www.fourthworld.com
> Webzine for LiveCode developers: http://www.LiveCodeJournal.com
> LiveCode Journal blog: http://LiveCodejournal.com/blog.irv