Architecture of Node.js’ Internal Codebase
Jeff Atwood, co-founder of Stack Overflow, once wrote in his famous programming blog Coding Horror:

“Any application that can be written in JavaScript, will eventually be written in JavaScript.”
I’ve recently answered a Stack Overflow question regarding the architecture of Node.js’ internal codebase, which inspired me to write this article.
The official doc isn’t very helpful at explaining what Node.js is:

“Node.js is a JavaScript runtime built on Chrome’s V8 JavaScript engine. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient.”
In order to understand this statement and the actual power behind it, let’s break down Node.js’ components, elaborate on some key terms, and then explain how different pieces interact with one another to make Node.js the powerful runtime it is:
V8: Google’s open-source, high-performance JavaScript engine, written in C++. Node.js embeds V8 to compile and execute the JavaScript code you write.
libuv: the C library that provides asynchronous features. It maintains event loops, a thread pool, file system I/O, DNS functionalities, and network I/O, among other critical functionalities.
Other C/C++ Components/Dependencies: such as c-ares, crypto (OpenSSL), http-parser, and zlib. These dependencies provide low-level functionality such as networking, compression, and encryption.
I/O: shorthand for Input/Output. It denotes any computer operation handled primarily by the system’s I/O subsystem. I/O-bound operations usually involve interactions with disks and drives; examples include database access and file system operations. Related concepts include CPU-bound, memory-bound, etc. A good way to classify an operation is to check which resource, when increased, makes the operation noticeably faster. For example, if an operation speeds up significantly when given more CPU power, it is CPU-bound.
Non-blocking/Asynchronous: Normally, when a request comes in, an application handles it and halts all other operations until the request is processed. This immediately presents a problem: when a large number of requests arrive at the same time, each request has to wait until the previous ones are processed. In other words, an earlier operation blocks the ones following it. Worse, if an earlier request takes a long time (e.g. calculating the first 1000 prime numbers, or reading 3GB of data from a database), all other requests are blocked for a long time. To address this issue, one can resort to multiprocessing and/or multithreading, each with its own pros and cons. Node.js handles things differently. Instead of spawning a new thread for every new request, it handles all requests on one single main thread, and that is pretty much all the main thread does: handle requests. All (I/O) operations contained in a request, such as file system access or database reads/writes, are sent in the background to the worker threads maintained by libuv (mentioned above). In other words, I/O operations in the requests are handled asynchronously, not on the main thread. This way the main thread is never blocked, because the heavy lifting is shipped elsewhere. You (and thus your application code) only ever work with the one and only main thread. The worker threads in libuv’s thread pool are shielded from you; you never work directly with (or need to worry about) them, as Node.js takes care of them for you. This architecture makes I/O operations especially efficient. However, it is not without disadvantages. Operations include not only I/O-bound ones, but CPU-bound, memory-bound, etc. as well, and out of the box Node.js only provides asynchronous functions for I/O tasks. There are ways to work around CPU-intensive operations, but they are not the focus of this article.
Event-Driven: typically, in almost all modern systems, after the main application kicks off, processes are initiated by incoming requests. However, how things go from there differs, sometimes drastically, among technologies. Typical implementations handle a request procedurally: a thread is spawned for the request; operations are performed one after another; if an operation is slow, all following operations halt on that thread; when all operations complete successfully, a response is returned. In Node.js, by contrast, all operations are registered with Node.js as events, waiting to be triggered, either by the main application or by requests.
Runtime (System): the Node.js runtime is the entire codebase (the components mentioned above), both low-level and high-level, that together supports the execution of a Node.js application.
PUTTING EVERYTHING TOGETHER
Now that we have a high-level overview of Node.js’ components, we’ll investigate its workflow to get a better sense of its architecture and how different components interact with one another.
When a Node.js application starts running, the V8 engine will run the application code you write. Objects in your application will keep a list of observers (functions registered to events). These observers will get notified when their respective events are emitted.
When an event is emitted, its callback function will be enqueued into an event queue. As long as there are remaining events in the queue, the event loop will keep dequeuing them and putting them onto the call stack. It should be noted that only when the previous event is processed (i.e. the call stack is cleared) will the event loop put the next event onto the call stack.
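This queue-then-stack ordering can be observed directly: even a zero-delay timer callback waits in the event queue until the current call stack has fully unwound.

```javascript
const order = [];

order.push('A'); // runs immediately on the call stack

// The callback is placed in the event queue; the event loop only moves
// it onto the call stack once the current stack is clear.
setTimeout(() => {
  order.push('C');
  console.log(order.join(' ')); // → A B C
}, 0);

order.push('B'); // still part of the current stack, so it runs before 'C'
```

Despite the 0 ms delay, 'C' always prints last, because the event loop never interrupts code that is already on the call stack.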
On the call stack, when an I/O operation is encountered, it will be handed over to libuv for processing. By default, libuv maintains a thread pool of four worker threads, although the number can be raised to add more threads. If a request is file-system I/O or DNS-related, it will be assigned to the thread pool for processing; other requests, such as networking, are handled by platform-specific mechanisms instead (see the libuv design overview).
For I/O operations that make use of the thread pool (i.e. file I/O, DNS, etc.), the worker threads will interact with Node.js’ low-level libraries to perform operations such as database transactions, file system access, etc. When the processing is over, libuv will enqueue the event back into the event queue for the main thread to work on. While libuv handles asynchronous I/O operations, the main thread does not wait for the outcome; it moves on instead. The event returned by libuv gets the opportunity to be handled by the main thread again when the event loop puts it back onto the call stack. This completes the life cycle of an event in a Node.js application.
Think of a Node.js application as a Starbucks cafe. A highly efficient and well-trained waiter (the one and only main thread) takes orders. When a large number of customers visit the cafe at the same time, they wait in line (enqueued in the event queue) to be served by the waiter. Once a customer is served, the waiter passes the order to a manager (libuv), who assigns each order to a barista (a worker thread or platform-specific mechanism). The barista uses different ingredients and machines (low-level C/C++ components) to make different kinds of drinks, depending on the customers’ requests. Typically there will be four baristas on duty (the thread pool) to specifically make lattes (file I/O, DNS, etc.). However, when the peak hits, more baristas can be called in to work (though this should be done at the beginning of the day, NOT during the lunch rush, for example). Once the waiter passes an order to the manager, he does not wait for the coffee to be made before serving another customer. Instead, he calls the next customer (the next event dequeued by the event loop and pushed onto the call stack). You can think of an event currently on the call stack as a customer at the counter being served. When the coffee is done, it is sent to the end of the customer line; the waiter calls out the name when the coffee reaches the counter, and the customer gets his coffee. (This last part of the analogy sounds weird in real life; however, when you consider the process from a program’s perspective, it makes sense.)
This completes the high-level overview of Node.js’ internal codebase and an event’s typical life cycle. This overview, however, is very general and does not address many issues and details, for example CPU-bound operation handling, Node.js design patterns, etc. More topics will be covered in other articles.