Why is node.js faster than LC server?

Mark Waddingham mark at livecode.com
Wed Dec 6 02:26:42 EST 2017


On 2017-12-05 18:55, Richard Gaskin via use-livecode wrote:
> Can you describe what you mean by fork() here?  In a discussion a
> while back I made reference to an earlier post you'd made about
> fork(), but the response from the engine team left me with the
> impression that it's merely the way processes launch other processes.
> Of course we can launch processes right now in LC, so clearly there's
> something here I'm missing (forgive the naivete of the question; I'm
> only halfway into Robert Love's Linux System Programming).

'fork' is a UNIX system call which creates a new process by copying 
everything about the current process. In particular:

   1) The entire virtual memory space is copied using copy-on-write (so 
there is no copy until a page is modified)

   2) All file descriptors (fds), which are ref-counted inside the 
kernel, are copied into the new process by increasing the ref-count 
(this is not copy-on-write: both processes share the same kernel objects).

However, only the thread which calls 'fork()' is cloned in this manner - 
the new process has only one thread.

So fork basically creates a clone of the current process sharing *all* 
file descriptor based resources (which is pretty much how all kernel 
resources are represented beyond memory, threads and processes) with 
just a single thread.
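
As a minimal sketch of the two points above, here is a small Python 
example (Python is used purely for illustration; os.fork() is a thin 
wrapper over the system call and only exists on UNIX-like systems). The 
name fork_demo and the counter value are made up for the example. The 
child changes a variable, which only affects its own copy of memory, and 
reports the new value through a pipe whose fds both processes share.

```python
import os


def fork_demo():
    """Fork once; return (counter seen by parent, counter seen by child)."""
    counter = 100
    # The pipe is created before fork(), so both processes share its fds.
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0.
        try:
            os.close(read_fd)
            counter += 1  # touches only the child's copy of the memory
            os.write(write_fd, str(counter).encode())
        finally:
            os._exit(0)  # exit at once, without running the parent's cleanup

    # Parent: fork() returned the child's pid.
    os.close(write_fd)
    data = b""
    while True:
        chunk = os.read(read_fd, 1024)
        if not chunk:  # EOF: every copy of the write end has been closed
            break
        data += chunk
    os.close(read_fd)
    os.waitpid(pid, 0)
    return counter, int(data)
```

Calling fork_demo() returns (100, 101): the parent's counter is 
untouched, while the child's modified copy arrives over the shared pipe.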

In 99.9999% of cases where you find a fork(), you will find an exec* 
syscall almost immediately afterwards in the new process (fork returns 0 
in the new process, and the new process's pid in the original process) - 
the exec* family of calls completely replaces the calling process's 
image with that of a new executable (the pid stays the same). The only 
bit of code you tend to find between 
fork() and exec on the new process side is code which shuffles fds 
around to ensure the new process shares only the fds you want.
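
The fork / fd-shuffle / exec pattern described above can be sketched as 
follows, again in Python for illustration only. The helper name 
run_and_capture is hypothetical. It relies on the fact that fds created 
by Python's os.pipe() are close-on-exec by default, whereas the target of 
os.dup2() is inheritable by default, so only the redirected stdout 
survives the exec.

```python
import os


def run_and_capture(argv):
    """Run argv in a new process; return (its stdout bytes, its exit code)."""
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: shuffle fds so that stdout is the pipe, then exec.
        try:
            os.close(read_fd)
            if write_fd != 1:
                os.dup2(write_fd, 1)  # fd 1 (stdout) now refers to the pipe
                os.close(write_fd)    # drop the duplicate descriptor
            os.execvp(argv[0], argv)  # only returns (by raising) on failure
        finally:
            os._exit(127)             # reached only if exec failed

    # Parent: read everything the new program writes to its stdout.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    return b"".join(chunks), os.WEXITSTATUS(status)
```

For example, run_and_capture(["echo", "hello"]) runs the system's echo 
program and hands back whatever it wrote, along with its exit code.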

You can call fork() from any language which provides either an FFI or 
external-style plugins - not that it does you any good, in almost any 
language, unless you call exec* very soon after. The sharing 
of state means that the new process is not going to play well with the 
original, and the copying only goes so far as address space, fds and the 
single thread - not anything it might be connected to (such as a window 
server).

The only code which is fork() safe to do anything more than that is code 
which has been written *from the ground up* to work with it, and then it 
would be for very specialized purposes.

Basically, 'fork' is a red herring here - it is not useful generally, 
and certainly not in a high-level language environment such as LiveCode 
(or pretty much any other environment where you don't control everything 
down to the last bit in your process). [ Note: You would be fine using 
fork() and fd-shuffling calls followed by exec() from LCB, just not a 
great deal more ].

In the context we are talking about, the main useful thing fork() 
provides is the ability to start a new process and pass it the file 
descriptor of a socket from the original process - i.e. 'handing off' a 
connection from one process to another. However, that can be done 
without using fork - you can pass file descriptors to other processes on 
UNIX by using UNIX domain sockets and sendmsg with an SCM_RIGHTS 
ancillary message (you can do a similar thing on Windows; the 
abstractions are different, but the underlying effect is the same).
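
A minimal sketch of passing a descriptor over a UNIX domain socket with 
sendmsg, in Python for illustration. The function names send_fd, recv_fd 
and handoff_demo are made up, and to keep the example short both ends of 
the control channel live in one process; in real use the receiving end 
would be a separate worker process connected over a UNIX domain socket.

```python
import array
import socket


def send_fd(sock, fd):
    """Send one open file descriptor over a UNIX domain socket."""
    fds = array.array("i", [fd])
    # Some ordinary data must accompany the SCM_RIGHTS ancillary message.
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    _, ancdata, _, _ = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def handoff_demo():
    """Hand a 'client connection' across a control channel and use it."""
    # Stand-in for an accepted client connection.
    client, accepted = socket.socketpair()
    # Control channel between the dispatcher and a worker.
    dispatcher, worker = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    send_fd(dispatcher, accepted.fileno())
    new_fd = recv_fd(worker)  # a new fd for the same connection
    accepted.close()          # the dispatcher drops its own copy

    conn = socket.socket(fileno=new_fd)  # wrap the received descriptor
    client.sendall(b"ping")
    reply = conn.recv(4)

    for s in (conn, client, dispatcher, worker):
        s.close()
    return reply
```

Here handoff_demo() returns b"ping": the connection keeps working through 
the received descriptor even after the sender has closed its own copy.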

Another thing which fork() does allow is the idea of a 'zygote' virtual 
machine process. All virtual machines have a setup phase which does 
exactly the same work on every launch. A zygote is a pre-initialized 
virtual machine process which has done everything except load user code. 
Android Dalvik works like this - there is a single zygote Dalvik process 
which has all the setup of the VM done, and is just sitting waiting to 
load user code to execute. When a new Java-based Android process is 
required, the zygote is forked and user code loaded. Note that this 
requires a great deal of care to get right due to what fork() actually 
does, as outlined above.
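
The zygote idea can be sketched in a few lines of Python. This is only a 
toy under stated assumptions, not how Dalvik does it: expensive_setup 
stands in for VM initialization, user_code is any callable, and both 
names (like spawn_from_zygote) are hypothetical. The setup runs once, and 
each fork inherits the result through copy-on-write memory.

```python
import os


def expensive_setup():
    """Stand-in for VM initialization that is the same for every user."""
    return {n: n * n for n in range(1000)}


def spawn_from_zygote(state, user_code):
    """Fork the initialized process, run user_code there, return its output."""
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: the setup is already done; only the user code is new.
        try:
            os.close(read_fd)
            result = user_code(state)
            os.write(write_fd, str(result).encode())
        finally:
            os._exit(0)

    # Zygote: collect the child's result and stay ready for the next fork.
    os.close(write_fd)
    data = b""
    while True:
        chunk = os.read(read_fd, 1024)
        if not chunk:
            break
        data += chunk
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data.decode()
```

With state = expensive_setup() computed once, a call such as 
spawn_from_zygote(state, lambda s: s[12]) returns "144" without repeating 
the setup in the child.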

For something like Android where you have restricted resources and want 
to spin-up new processes as quickly as possible, this makes sense. 
However, in the case we are talking about (server-based process 
clusters, essentially) it is easier just to fire up many replica 
processes which just sit idle until used - with new connections being 
handed off to them as needed.

Warmest Regards,

Mark.

-- 
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps



