Why is node.js faster than LC server?
Mark Waddingham
mark at livecode.com
Wed Dec 6 02:26:42 EST 2017
On 2017-12-05 18:55, Richard Gaskin via use-livecode wrote:
> Can you describe what you mean by fork() here? In a discussion a
> while back I made reference to an earlier post you'd made about
> fork(), but the response from the engine team left me with the
> impression that it's merely the way processes launch other processes.
> Of course we can launch processes right now in LC, so clearly there's
> something here I'm missing (forgive the naivete of the question; I'm
> only halfway into Robert Love's Linux System Programming).
'fork' is a UNIX system call which creates a new process by copying
everything about the current process. In particular:
1) The entire virtual memory space is copied using copy-on-write (so
there is no copy until a page is modified)
2) All file descriptors (fds) (which inside the kernel are
ref-counted) are copied into the new process by increasing the ref-count
(this is not copy-on-write).
However, only the thread which calls 'fork()' is cloned in this manner -
the new process has only one thread.
So fork basically creates a clone of the current process sharing *all*
file descriptor based resources (which is pretty much how all kernel
resources are represented beyond memory, threads and processes) with
just a single thread.
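
As a rough illustration of the two points above, here is a minimal Python
sketch (Python only because it exposes fork() directly; the name fork_demo
is made up for this example). The child changes a value in memory and the
parent never sees the change, while the pipe created before the fork is
usable from both sides:

```python
import os


def fork_demo():
    """Fork once; return (value seen by parent, bytes reported by child)."""
    counter = [0]                   # ordinary memory: copied copy-on-write
    read_fd, write_fd = os.pipe()   # fds: shared with the child after fork()

    pid = os.fork()
    if pid == 0:
        # Child side: fork() returned 0.
        try:
            os.close(read_fd)
            counter[0] = 42         # touches only the child's copy of the page
            os.write(write_fd, b"child saw %d" % counter[0])
        finally:
            os._exit(0)             # exit without running the parent's cleanup

    # Parent side: fork() returned the child's pid.
    os.close(write_fd)
    report = os.read(read_fd, 1024)  # arrives through the shared pipe
    os.close(read_fd)
    os.waitpid(pid, 0)
    return counter[0], report


if __name__ == "__main__":
    print(fork_demo())
```

The parent still sees 0 in its own memory, but receives the child's message
through the pipe, because the pipe's fds were shared rather than copied.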
In 99.9999% of cases where you find a fork(), you will find an exec*
syscall almost immediately afterwards in the new process (fork returns 0
in the new process, and the new process's pid in the original process) -
the exec* family of calls completely replaces the program running in a
process with a new executable (the pid stays the same). The only bit of
code you tend to find between
fork() and exec on the new process side is code which shuffles fds
around to ensure the new process shares only the fds you want.
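
The usual fork / shuffle fds / exec pattern can be sketched like this, again
in Python and under the assumption of a UNIX-like system. The function name
run_with_stdout_captured is hypothetical; the only fd shuffling done here is
making a pipe the child's stdout:

```python
import os


def run_with_stdout_captured(argv):
    """Run argv in a new process; return (stdout bytes, exit code)."""
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child side: shuffle fds, then replace ourselves with the program.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)       # the pipe becomes fd 1 (stdout)
            os.close(write_fd)
            os.execvp(argv[0], argv)   # only returns (by raising) on failure
        finally:
            os._exit(127)              # reached only if exec failed

    # Parent side: read until the child (and so the write end) goes away.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return b"".join(chunks), code


if __name__ == "__main__":
    print(run_with_stdout_captured(["echo", "hello"]))
```

Nothing else runs in the child between fork() and exec, which is what keeps
this pattern safe.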
You can call fork() from any language which provides either an FFI or
external-like plugins - not that it does you any good in almost all
cases in any language unless you call exec* very soon after. The sharing
of state means that the new process is not going to play well with the
original, and the copying only goes so far as address space, fds and the
single thread - not anything it might be connected to (such as a window
server).
The only code which is fork() safe to do anything more than that is code
which has been written *from the ground up* to work with it, and then it
would be for very specialized purposes.
Basically, 'fork' is a red herring here - it is not useful generally,
and certainly not in a high-level language environment such as LiveCode
(or pretty much any other environment where you don't control everything
down to the last bit in your process). [ Note: You would be fine using
fork() and fd-shuffling calls followed by exec() from LCB, just not a
great deal more ].
In the context we are talking about, the main useful thing which
fork() provides is the ability to execute a new process and pass it
the file descriptor of a socket from the original process - i.e.
'handing off' a connection from one process to
another. However, that can be done without using fork - you can pass
file descriptors to other processes on UNIX by using UNIX domain sockets
and sendmsg (you can do a similar thing on Windows, the abstractions are
different, but the underlying effects are the same).
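
A minimal sketch of passing a file descriptor over a UNIX domain socket with
sendmsg and SCM_RIGHTS follows. The helper names send_fd, recv_fd and
handoff_demo are made up for illustration, and a pipe stands in for the
client connection you would hand off in a real server. fork() is used here
only to get two processes into one file; the pipe is created *after* the
fork, so the receiving process cannot have inherited it:

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one file descriptor over a UNIX domain socket."""
    payload = array.array("i", [fd])
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, payload)])


def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(
        1, socket.CMSG_LEN(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    return fds[0]


def handoff_demo():
    """Hand a pipe's write end to another process; return what it wrote."""
    ours, theirs = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    pid = os.fork()
    if pid == 0:
        # Receiving process: knows nothing about the pipe yet.
        try:
            ours.close()
            fd = recv_fd(theirs)
            os.write(fd, b"written by the receiving process")
            os.close(fd)
        finally:
            os._exit(0)

    # Original process: create the resource after the fork, then pass it.
    theirs.close()
    read_fd, write_fd = os.pipe()
    send_fd(ours, write_fd)
    os.close(write_fd)              # the receiver now holds its own reference
    data = os.read(read_fd, 1024)
    os.close(read_fd)
    ours.close()
    os.waitpid(pid, 0)
    return data


if __name__ == "__main__":
    print(handoff_demo())
```

The receiver gets its own fd number referring to the same kernel object,
which is the same ref-count sharing that fork() gives you, but for a single
descriptor and at a time of your choosing.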
Another thing which fork() does allow is the idea of a 'zygote' virtual
machine process. All virtual machines have setup time which does exactly
the same thing for all uses of it. A zygote is a pre-initialized virtual
machine process which has done everything except load any user code.
Android Dalvik works like this - there is a single zygote Dalvik process
which has all the setup of the VM done, and is just sitting waiting to
load user code to execute. When a new Java-based Android process is
required, the zygote is forked and user code loaded. Note that this
requires a great deal of care to get right due to what fork() actually
does, as outlined above.
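
The zygote idea can be sketched in a few lines of Python. This is only a toy
under stated assumptions: the Zygote class and its 'runtime' dictionary are
invented stand-ins for real VM setup, and it ignores the hard parts (threads,
connections to other services) mentioned above:

```python
import os


class Zygote:
    """Toy zygote: do the shared setup once, fork per piece of user code."""

    def __init__(self):
        # Stand-in for expensive, identical-every-time VM initialisation.
        self.runtime = {"squares": {n: n * n for n in range(1000)}}

    def spawn(self, user_code):
        """Run user_code(runtime) in a forked child; return its result text."""
        read_fd, write_fd = os.pipe()

        pid = os.fork()
        if pid == 0:
            # Child: the runtime is already set up, just load the user code.
            try:
                os.close(read_fd)
                result = user_code(self.runtime)
                os.write(write_fd, str(result).encode())
            finally:
                os._exit(0)

        os.close(write_fd)
        output = os.read(read_fd, 4096)
        os.close(read_fd)
        os.waitpid(pid, 0)
        return output.decode()


if __name__ == "__main__":
    zygote = Zygote()
    print(zygote.spawn(lambda runtime: runtime["squares"][12]))
```

Each child starts with the prepared runtime for free (copy-on-write), and
anything it changes stays in the child - the zygote itself is untouched and
ready to be forked again.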
For something like Android where you have restricted resources and want
to spin-up new processes as quickly as possible, this makes sense.
However, in the case we are talking about (server-based process
clusters, essentially) it is easier just to fire up many replica
processes which just sit idle until used - with new connections being
handed off to them as needed.
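
A rough sketch of that arrangement: a few replica workers sit idle on a
control socket, and a dispatcher accepts connections and hands each one off
by passing its fd. All names here (start_workers, worker_loop, demo) are
hypothetical, the round-robin choice of worker is just the simplest policy,
and the workers are started with fork() purely to keep the example in one
file - since connections are passed afterwards, the workers could equally be
launched as ordinary separate processes:

```python
import array
import os
import socket


def send_fd(sock, fd):
    payload = array.array("i", [fd])
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, payload)])


def recv_fd(sock):
    fds = array.array("i")
    msg, ancdata, _flags, _addr = sock.recvmsg(
        1, socket.CMSG_LEN(fds.itemsize))
    if not msg:
        return None                     # dispatcher closed the control socket
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    return fds[0] if len(fds) else None


def worker_loop(index, control):
    """Sit idle until handed a connection; answer it; repeat."""
    while True:
        fd = recv_fd(control)
        if fd is None:
            return
        conn = socket.socket(socket.AF_INET, socket.SOCK_STREAM, fileno=fd)
        with conn:
            conn.sendall(b"handled by worker %d\n" % index)


def start_workers(count):
    """Start idle replica workers; return a list of (pid, control socket)."""
    workers = []
    for index in range(count):
        ours, theirs = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
        pid = os.fork()
        if pid == 0:
            try:
                ours.close()
                for _pid, other in workers:
                    other.close()       # drop control sockets of older workers
                worker_loop(index, theirs)
            finally:
                os._exit(0)
        theirs.close()
        workers.append((pid, ours))
    return workers


def demo(worker_count=2, connection_count=4):
    """Accept connections on loopback and hand each to a worker in turn."""
    workers = start_workers(worker_count)
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    replies = []
    for n in range(connection_count):
        client = socket.create_connection(listener.getsockname())
        conn, _addr = listener.accept()
        _pid, control = workers[n % len(workers)]
        send_fd(control, conn.fileno())  # hand the connection off
        conn.close()                     # the worker holds its own reference
        with client.makefile("rb") as reader:
            replies.append(reader.readline())
        client.close()
    listener.close()
    for pid, control in workers:
        control.close()                  # workers see EOF and exit
        os.waitpid(pid, 0)
    return replies


if __name__ == "__main__":
    print(demo())
```

Note that the dispatcher never touches the connection beyond accepting it;
once the fd has been sent, the worker owns the conversation with the client.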
Warmest Regards,
Mark.
--
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps