未加星标

Understanding the NodeJS cluster module

字体大小 | |
[前端(javascript) 所属分类 前端(javascript) | 发布者 店小二03 | 时间 2017 | 作者 红领巾 ] 0人收藏点击收藏

TL;DR NodeJS processes runs on a single process, which means it does not take adavantage from multi-core systems by default. If you have an 8 core CPU and run a NodeJS program via $ node app.js it will run in a single process, wasting the rest of CPUs.

Hopefully for us NodeJS offers the cluster module that contains a set of functions and properties that help us to create programs that uses all the CPUs. Not a surprise the mechanism the cluster module uses to maximize the CPU usage was via forking processes, similar to the old fork() system call Unix systems.

Introducing the cluster module

The cluster module is a NodeJS module that contains a set of functions and properties that help us forking processes to take advantage of multi-core systems. It is propably the first level of scalability you must take care in your node application, specifally if you are working in a HTTP server application, before going to a higher scalability levels (I mean scaling vertically and horizontally in different machines).

With the cluster module a parent/master process can be forked in any number of child/worker processes and communicate with them sending messages via IPC communication. Remember there is no shared memory among processes.

Next lines are a compilation of sentences from the NodeJS documentation I have taken the liberty to copy&pasta to put it in a way I think can help you understand thw whole thing in a few lines.

A single instance of Node.js runs in a single thread. To take advantage of multi-core systems the user will sometimes want to launch a cluster of Node.js processes to handle the load.

The cluster module allows easy creation of child processes that all share server ports.

The worker (child) processes are spawned using the child_proces.fork() method, so that they can communicate with the parent via IPC and pass server handles back and forth. The child_process.fork() method is a special case of child_process.spawn() used specifically to spawn new Node.js processes. Like child_process.spawn() , a ChildProcess object is returned. The returned ChildProcess will have an additional communication channel built-in that allows messages to be passed back and forth between the parent and child, through the send() method. See subprocess.sen() for details.

It is important to keep in mind that spawned Node.js child processes are independent of the parent with exception of the IPC communication channel that is established between the two. Each process has its own memory, with their own V8 instances. Because of the additional resource allocations required, spawning a large number of child Node.js processes is not recommended.

So, most of the magic is done by the child_process module, which is resposible to spawn new process and help communicate among them, for example, creating pipes. You can find a great article at Node.js Child Processes: Everything you need to know .

A basic example

Stop talking and lets see a real exampe. Next we show a very basic code that:

Creates a master process that retrives the number of CPUs and forks a worker process for each CPU, and Each child process prints a message in console and exit. const cluster = require('cluster'); const http = require('http'); const numCPUs = require('os').cpus().length; if (cluster.isMaster) { masterProcess(); } else { childProcess(); } function masterProcess() { console.log(`Master ${process.pid} is running`); for (let i = 0; i < numCPUs; i++) { console.log(`Forking process number ${i}...`); cluster.fork(); } process.exit(); } function childProcess() { console.log(`Worker ${process.pid} started and finished`); process.exit(); }

Save the code in app.js file and run executing: $ node app.js . The output should be something similar to:

$ node app.js Master 8463 is running Forking process number 0... Forking process number 1... Forking process number 2... Forking process number 3... Worker 8464 started and finished Worker 8465 started and finished Worker 8467 started and finished Worker 8466 started and finished Code explanation

When we run the app.js program an OS process is created that starts running our code. At the beginning the cluster mode is imported const cluster = require('cluster') and in the if sentence we check if the isMaster property.

Because the process is the first process the isMaster property is true and then we run the code of masterProcess function. This function has not much secret, it loops depending on the number of CPUs of your machine and forks the current process using the cluster.fork() method.

What the fork() really does is to create a new node process, like if you run it via command line with $node app.js , that is you have many processes running your app.js program.

When a child process is created and executed, it does the same as the master, that is, imports the cluster module and executes the if statement. Once of the differences is for the child process the value of cluster.isMaster is false , so they ends running the childProcess function.

Note, we explicitly terminate the master and worker processes with process.exit() , which by default return value of zero.

NOTE: NodeJS also offers the Child Processes module that simplifies the creation and comunication with other processes. For example we can spawn the ls -l terminal command and pipe with another process that handles the results.

Comunicating master and worker processes

When a worker process is created, an IPC channel is created among the worker and the master, allowing us to communicated between them with the send() method, which accepts a javascript object as parameter. Remember they are different processes (not threads) so we can’t use shared memory as a way of communcation.

From the master process, we can send a message to a worker process using the process reference, i.e. someChild.send({ ... }) , and within the worker process we can messages to the master simply using the current process reference, i.e. process.send() .

We have updated slighly the previous code to allow master send and receive messages from/to the workers and also the workers receive and send messages from the master process:

function childProcess() { console.log(`Worker ${process.pid} started`); process.on('message', function(message) { console.log(`Worker ${process.pid} recevies message '${JSON.stringify(message)}'`); }); console.log(`Worker ${process.pid} sends message to master...`); process.send({ msg: `Message from worker ${process.pid}` }); console.log(`Worker ${process.pid} finished`); }

The worker process is simply to understand. First we listen for the message event registering a listener with the process.on('message', handler) method. Later we send a messages with process.send({...}) . Note the message is a plain JavaScript object.

let workers = []; function masterProcess() { console.log(`Master ${process.pid} is running`); // Fork workers for (let i = 0; i < numCPUs; i++) { console.log(`Forking process number ${i}...`); const worker = cluster.fork(); workers.push(worker); // Listen for messages from worker worker.on('message', function(message) { console.log(`Master ${process.pid} recevies message '${JSON.stringify(message)}' from worker ${worker.process.pid}`); }); } // Send message to the workers workers.forEach(function(worker) { console.log(`Master ${process.pid} sends message to worker ${worker.process.pid}...`); worker.send({ msg: `Message from master ${process.pid}` }); }, this); }

The masterProcess function has been divided in two parts. In the first loop we fork as much workers as CPUs we have. The cluster.fork() returns a worker object representing the worker process, we store the reference in an array and register a listener to receive messages that comes from that worker instance.

Later, we loop over the array of workers and send a message from the master process to that concrete worker.

If you run the code the output will be something like:

$ node app.js Master 4045 is running Forking process number 0... Forking process number 1... Master 4045 sends message to worker 4046... Master 4045 sends message to worker 4047... Worker 4047 started Worker 4047 sends message to master... Worker 4047 finished Master 4045 recevies message '{"msg":"Message from worker 4047"}' from worker 4047 Worker 4047 recevies message '{"msg":"Message from master 4045"}' Worker 4046 started Worker 4046 sends message to master... Worker 4046 finished Master 4045 recevies message '{"msg":"Message from worker 4046"}' from worker 4046 Worker 4046 recevies message '{"msg":"Message from master 4045"}'

Here we are not terminating the process with process.exit() so to close the application you need to use ctrl+c .

Conclusion

The cluster module offers to NodeJS the needed capabilities to use the whole power of a CPU. Although not seen in this post, the cluster module is complemented with the child process module that offers plenty of tools to work with processes: start, stop and pipe input/out, etc.

Cluster module allow us to easily create worker processes. In addition it magically creates an IPC channel to communicate the master and worker process passing JavaScript objects.

In my next post I will show how important is the cluster module when working in an HTTP server, no matter if an API or web server working with ExpressJS. The cluster module can increase performance of our application having as many worker processes as CPUs cores.

本文前端(javascript)相关术语:javascript是什么意思 javascript下载 javascript权威指南 javascript基础教程 javascript 正则表达式 javascript设计模式 javascript高级程序设计 精通javascript javascript教程

主题: CPUNode.jsJavaScriptJava
tags: process,worker,message,cluster,processes
分页:12
转载请注明
本文标题:Understanding the NodeJS cluster module
本站链接:http://www.codesec.net/view/561315.html
分享请点击:


1.凡CodeSecTeam转载的文章,均出自其它媒体或其他官网介绍,目的在于传递更多的信息,并不代表本站赞同其观点和其真实性负责;
2.转载的文章仅代表原创作者观点,与本站无关。其原创性以及文中陈述文字和内容未经本站证实,本站对该文以及其中全部或者部分内容、文字的真实性、完整性、及时性,不作出任何保证或承若;
3.如本站转载稿涉及版权等问题,请作者及时联系本站,我们会及时处理。
登录后可拥有收藏文章、关注作者等权限...
技术大类 技术大类 | 前端(javascript) | 评论(0) | 阅读(56)