Production best practices: performance and reliability

Overview

This article discusses performance and reliability best practices for Express applications deployed to production.

This topic clearly falls into the “devops” world, spanning both traditional development and operations. Accordingly, the information is divided into two parts: things to do in your code, and things to do in your environment or setup.

Things to do in your code

Here are some things you can do in your code to improve your application’s performance: use gzip compression, avoid synchronous functions, do logging correctly, and handle exceptions properly.

Use gzip compression

Gzip compression can greatly decrease the size of the response body and hence increase the speed of a web app. Use the compression middleware for gzip compression in your Express app. For example:

const compression = require('compression')
const express = require('express')
const app = express()
app.use(compression())

For a high-traffic website in production, the best way to put compression in place is to implement it at the reverse proxy level (see Use a reverse proxy). In that case, you do not need to use the compression middleware. For details on enabling gzip compression in Nginx, see Module ngx_http_gzip_module in the Nginx documentation.
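
To get a feel for how much gzip can save, the following minimal sketch compresses a repetitive HTML fragment with Node’s built-in zlib module and reports the ratio of compressed size to original size. It is only an illustration: the compression middleware and Nginx do the equivalent work per response, and the gzipRatio helper and the sample markup are hypothetical names chosen for this example.

```javascript
const zlib = require('zlib')
const { promisify } = require('util')

// Promise-based wrapper around the asynchronous zlib.gzip() API.
const gzip = promisify(zlib.gzip)

// Hypothetical helper: returns compressed size divided by original size.
async function gzipRatio (text) {
  const input = Buffer.from(text, 'utf8')
  const output = await gzip(input)
  return output.length / input.length
}

// Repetitive markup, similar to a long HTML list, compresses very well.
const sampleHtml = '<li class="item">Hello</li>'.repeat(1000)

gzipRatio(sampleHtml).then((ratio) => {
  console.log(`compressed size is ${(ratio * 100).toFixed(1)}% of the original`)
})
```

Real responses are less repetitive than this sample, so the savings in practice are usually smaller.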

Don’t use synchronous functions

Synchronous functions and methods tie up the executing process until they return. A single call to a synchronous function might return in a few microseconds or milliseconds; in high-traffic websites, however, these calls add up and reduce the performance of the app. Avoid their use in production.

Although Node and many modules provide synchronous and asynchronous versions of their functions, always use the asynchronous version in production. The only time when a synchronous function can be justified is upon initial startup.
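
As a rough sketch of that distinction, the example below reads a configuration file synchronously once at startup, then uses the promise-based fs API for work that would happen while requests are being served. The file name, its contents, and the readGreeting helper are assumptions made up for this illustration.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical config file, created here only so the sketch runs on its own.
const configPath = path.join(os.tmpdir(), 'example-config.json')
fs.writeFileSync(configPath, JSON.stringify({ greeting: 'hello' }))

// Startup: a synchronous read is acceptable because no requests are served yet.
const config = JSON.parse(fs.readFileSync(configPath, 'utf8'))

// Request time: use the promise-based API so the event loop stays free.
async function readGreeting () {
  const text = await fs.promises.readFile(configPath, 'utf8')
  return JSON.parse(text).greeting
}

readGreeting().then((greeting) => {
  console.log(config.greeting === greeting) // prints true
})
```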

If you are using Node.js 4.0+ or io.js 2.1.0+, you can use the --trace-sync-io command-line flag to print a warning and a stack trace whenever your application uses a synchronous API. Of course, you wouldn’t want to use this flag in production; rather, use it beforehand to ensure that your code is ready for production. See the node command-line options documentation for more information.

Do logging correctly

In general, there are two reasons for logging from your app: for debugging and for logging app activity (essentially, everything else). Using console.log() or console.error() to print log messages to the terminal is common practice in development. But these functions are synchronous when the destination is a terminal or a file, so they are not suitable for production unless you pipe the output to another program.

For debugging

If you’re logging for purposes of debugging, then instead of using console.log(), use a special debugging module like debug. This module enables you to use the DEBUG environment variable to control what debug messages are sent to console.error(), if any. To keep your app purely asynchronous, you’d still want to pipe console.error() to another program. But then, you’re not really going to debug in production, are you?
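
The debug module is a third-party package. As a dependency-free illustration of the same idea, Node’s built-in util.debuglog() creates a logger that stays silent unless its section name is listed in the NODE_DEBUG environment variable. This is a stand-in for the technique, not the debug module’s API; the section name “search” and the describeQuery helper are hypothetical.

```javascript
const util = require('util')

// Prints to stderr only when NODE_DEBUG includes "search",
// for example: NODE_DEBUG=search node server.js
const debugSearch = util.debuglog('search')

// Hypothetical helper that traces its input while doing its work.
function describeQuery (query) {
  debugSearch('received query %j', query)
  return Object.keys(query).sort().join(',')
}

console.log(describeQuery({ q: 'express', page: '2' })) // prints "page,q"
```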

For app activity

If you’re logging app activity (for example, tracking traffic or API calls), instead of using console.log(), use a logging library like Winston or Bunyan. For a detailed comparison of these two libraries, see the StrongLoop blog post Comparing Winston and Bunyan Node.js Logging.

Handle exceptions properly

Node apps crash when they encounter an uncaught exception. Failing to handle exceptions and take appropriate action will make your Express app crash and go offline. If you follow the advice in Ensure your app automatically restarts below, then your app will recover from a crash. Fortunately, Express apps typically have a short startup time. Nevertheless, you want to avoid crashing in the first place, and to do that, you need to handle exceptions properly.

To ensure you handle all exceptions, use the following techniques: use try-catch and use promises.

Before diving into these topics, you should have a basic understanding of Node/Express error handling: using error-first callbacks, and propagating errors in middleware. Node uses an “error-first callback” convention for returning errors from asynchronous functions, where the first parameter to the callback function is the error object, followed by result data in succeeding parameters. To indicate no error, pass null as the first parameter. The callback function must correspondingly follow the error-first callback convention to meaningfully handle the error. And in Express, the best practice is to use the next() function to propagate errors through the middleware chain.
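
A minimal sketch of both conventions, assuming a hypothetical parsePort helper and a middleware that stores its result on a made-up req.parsedPort property:

```javascript
// Hypothetical helper that follows the error-first callback convention.
function parsePort (value, callback) {
  const port = Number(value)
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    // Failure: the error object is the first argument.
    return callback(new Error(`Invalid port: ${value}`))
  }
  // Success: null in the error slot, result data after it.
  callback(null, port)
}

// Express-style middleware: hand errors to next() instead of throwing.
function portMiddleware (req, res, next) {
  parsePort(req.query.port, (err, port) => {
    if (err) return next(err)
    req.parsedPort = port
    next()
  })
}
```

Calling next(err) with an argument tells Express to skip the remaining regular middleware and go to the error-handling middleware.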

For more on the fundamentals of error handling, see:

What not to do

One thing you should not do is to listen for the uncaughtException event, emitted when an exception bubbles all the way back to the event loop. Adding an event listener for uncaughtException will change the default behavior of the process that is encountering an exception; the process will continue to run despite the exception. This might sound like a good way of preventing your app from crashing, but continuing to run the app after an uncaught exception is a dangerous practice and is not recommended, because the state of the process becomes unreliable and unpredictable.

Additionally, using uncaughtException is officially recognized as crude. So listening for uncaughtException is just a bad idea. This is why we recommend things like multiple processes and supervisors: crashing and restarting is often the most reliable way to recover from an error.

We also don’t recommend using domains. The domain module generally doesn’t solve the problem and is deprecated.

Use try-catch

Try-catch is a JavaScript language construct that you can use to catch exceptions in synchronous code. Use try-catch, for example, to handle JSON parsing errors as shown below.

Use a tool such as JSHint or JSLint to help you find implicit exceptions like reference errors on undefined variables.

Here is an example of using try-catch to handle a potential process-crashing exception. This middleware function accepts a query field parameter named “params” that is a JSON object.

app.get('/search', (req, res) => {
  // Simulating async operation
  setImmediate(() => {
    const jsonStr = req.query.params
    try {
      const jsonObj = JSON.parse(jsonStr)
      res.send('Success')
    } catch (e) {
      res.status(400).send('Invalid JSON string')
    }
  })
})

However, try-catch works only for synchronous code. Because the Node platform is primarily asynchronous (particularly in a production environment), try-catch won’t catch a lot of exceptions.
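
The limitation can be seen in a small sketch. The failLater function below is a hypothetical operation that fails on a later turn of the event loop; the surrounding try-catch never sees the failure, while the promise’s own catch() does.

```javascript
// Hypothetical async operation that fails on a later turn of the event loop.
function failLater () {
  return new Promise((resolve, reject) => {
    setImmediate(() => reject(new Error('boom')))
  })
}

const outcome = { caughtByTry: false, caughtByPromise: false }

try {
  // The promise rejects after this try block has already finished,
  // so the catch clause below never sees the error.
  failLater().catch(() => { outcome.caughtByPromise = true })
} catch (e) {
  outcome.caughtByTry = true
}

// Give the rejection time to happen, then report what caught it.
setTimeout(() => console.log(outcome), 20)
```

Inside an async function, awaiting the promise within the try block does let the catch clause see the rejection, because execution of the block is suspended until the promise settles.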

Use promises

Promises will handle any exceptions (both explicit and implicit) in asynchronous code blocks that use then(). Just add .catch(next) to the end of promise chains. For example:

app.get('/', (req, res, next) => {
  // do some sync stuff
  queryDb()
    .then((data) => makeCsv(data)) // handle data
    .then((csv) => { /* handle csv */ })
    .catch(next)
})

app.use((err, req, res, next) => {
  // handle error
})

Now, all errors, asynchronous and synchronous, get propagated to the error middleware.

However, there are two caveats:

  1. All your asynchronous code must return promises (except emitters). If a particular library does not return promises, convert the base object by using a helper function like Bluebird.promisifyAll().

  2. Event emitters (like streams) can still cause uncaught exceptions. So make sure you are handling the error event properly; for example:

const wrap = fn => (...args) => fn(...args).catch(args[2])

app.get('/', wrap(async (req, res, next) => {
  const company = await getCompanyById(req.query.id)
  const stream = getLogoStreamById(company.id)
  stream.on('error', next).pipe(res)
}))

The wrap() function is a wrapper that catches rejected promises and calls next() with the error as the first argument. For details, see Asynchronous Error Handling in Express with Promises, Generators and ES7.
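
To see the wrapper at work without a running server, the sketch below calls a wrapped handler by hand. The failing handler and the stand-in next function are hypothetical; in a real app, Express supplies req, res, and next.

```javascript
// Same wrapper as above: forwards a rejected promise to next (the third argument).
const wrap = fn => (...args) => fn(...args).catch(args[2])

// Hypothetical handler that fails asynchronously.
const handler = wrap(async (req, res, next) => {
  throw new Error('company not found')
})

// Stand-ins for the objects Express would normally pass in.
const forwarded = []
const next = (err) => forwarded.push(err)

handler({}, {}, next).then(() => {
  console.log(forwarded[0].message) // prints "company not found"
})
```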

For more information about error-handling by using promises, see Promises in Node.js with Q – An Alternative to Callbacks.
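
Regarding the first caveat above, Node.js 8 and later also ship util.promisify(), which converts a single error-first callback function into one that returns a promise. The sketch below uses it in place of Bluebird.promisifyAll(); the findUser function and its data are hypothetical.

```javascript
const { promisify } = require('util')

// Hypothetical callback-style API following the error-first convention.
function findUser (id, callback) {
  setImmediate(() => {
    if (id !== 1) return callback(new Error(`No user with id ${id}`))
    callback(null, { id: 1, name: 'Ada' })
  })
}

// Promise-returning version of the same function.
const findUserAsync = promisify(findUser)

findUserAsync(1)
  .then((user) => console.log(user.name)) // prints "Ada"
  .then(() => findUserAsync(2))
  .catch((err) => console.error(err.message)) // prints "No user with id 2"
```

In an Express handler, the final .catch() would typically receive next, so that the error reaches the error middleware.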

Things to do in your environment / setup

Here are some things you can do in your system environment to improve your app’s performance: set NODE_ENV to “production”, ensure your app automatically restarts, run your app in a cluster, cache request results, use a load balancer, and use a reverse proxy.

Set NODE_ENV to “production”

The NODE_ENV environment variable specifies the environment in which an application is running (usually, development or production). One of the simplest things you can do to improve performance is to set NODE_ENV to “production.”

Setting NODE_ENV to “production” makes Express, among other things, cache view templates and generate less verbose error messages.

Tests indicate that just doing this can improve app performance by a factor of three!

If you need to write environment-specific code, you can check the value of NODE_ENV with process.env.NODE_ENV. Be aware that checking the value of any environment variable incurs a performance penalty, and so should be done sparingly.
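
One way to check sparingly is to read the variable once when the module loads and reuse the result. This is a minimal sketch; the buildErrorBody helper and the shape of its return value are hypothetical.

```javascript
// Read the environment variable once, at module load time.
const isProduction = process.env.NODE_ENV === 'production'

// Hypothetical helper: hide error details in production responses.
function buildErrorBody (err, production = isProduction) {
  return production
    ? { message: 'Internal Server Error' }
    : { message: err.message, stack: err.stack }
}

console.log(buildErrorBody(new Error('database unreachable'), true))
// prints { message: 'Internal Server Error' }
```

The second parameter exists only so the behavior can be exercised in both modes; request handlers would call buildErrorBody(err) and rely on the cached value.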

In development, you typically set environment variables in your interactive shell, for example by using export or your .bash_profile file. But in general, you shouldn’t do that on a production server; instead, use your OS’s init system (systemd or Upstart). The next section provides more details about using your init system in general, but setting NODE_ENV is so important for performance (and so easy to do) that it’s highlighted here.

With Upstart, use the env keyword in your job file. For example:

# /etc/init/env.conf
env NODE_ENV=production

For more information, see the Upstart Intro, Cookbook and Best Practises.

With systemd, use the Environment directive in your unit file. For example:

# /etc/systemd/system/myservice.service
Environment=NODE_ENV=production

For more information, see Using Environment Variables In systemd Units.

Ensure your app automatically restarts

In production, you don’t want your application to be offline, ever. This means you need to make sure it restarts both if the app crashes and if the server itself crashes. Although you hope that neither of those events occurs, realistically you must account for both eventualities by using a process manager to restart the app when it crashes, and by using the init system provided by your OS to restart the app or process manager when the server restarts.

Node applications crash if they encounter an uncaught exception. The foremost thing you need to do is to ensure your app is well-tested and handles all exceptions (see Handle exceptions properly for details). But as a fail-safe, put a mechanism in place to ensure that if and when your app crashes, it will automatically restart.

Use a process manager

In development, you started your app simply from the command line with node server.js or something similar. But doing this in production is a recipe for disaster. If the app crashes, it will be offline until you restart it. To ensure your app restarts if it crashes, use a process manager. A process manager is a “container” for applications that facilitates deployment, provides high availability, and enables you to manage the application at runtime.

In addition to restarting your app when it crashes, a process manager can enable you to control clustering and change settings such as the number of worker processes at runtime (see Run your app in a cluster).

The most popular process managers for Node are StrongLoop Process Manager, PM2, and Forever.

For a feature-by-feature comparison of the three process managers, see http://strong-pm.io/compare/.

Using any of these process managers will suffice to keep your application up, even if it does crash from time to time.

However, StrongLoop PM has lots of features that specifically target production deployment. You can use it and the related StrongLoop tools to, for example, run your app in a cluster and resize that cluster without stopping the app (see Using StrongLoop PM).

As explained below, when you install StrongLoop PM as an operating system service using your init system, it will automatically restart when the system restarts. Thus, it will keep your application processes and clusters alive forever.

Use an init system

The next layer of reliability is to ensure that your app restarts when the server restarts. Systems can still go down for a variety of reasons. To ensure that your app restarts if the server crashes, use the init system built into your OS. The two main init systems in use today are systemd and Upstart.

There are two ways to use init systems with your Express app: run your app in a process manager and install the process manager as a service with the init system, or run your app directly with the init system.

Systemd

Systemd is a Linux system and service manager. Most major Linux distributions have adopted systemd as their default init system.

A systemd service configuration file is called a unit file, with a filename ending in .service. Here’s an example unit file to manage a Node app directly. Replace the values enclosed in <angle brackets> with values for your system and app:

[Unit]
Description=<Awesome Express App>

[Service]
Type=simple
ExecStart=/usr/local/bin/node </projects/myapp/index.js>
WorkingDirectory=</projects/myapp>

User=nobody
Group=nogroup

# Environment variables:
Environment=NODE_ENV=production

# Allow many incoming connections
LimitNOFILE=infinity

# Allow core dumps for debugging
LimitCORE=infinity

StandardInput=null
StandardOutput=syslog
StandardError=syslog
Restart=always

[Install]
WantedBy=multi-user.target

For more information on systemd, see the systemd reference (man page).

StrongLoop PM as a systemd service

You can easily install StrongLoop Process Manager as a systemd service. After you do, when the server restarts, it will automatically restart StrongLoop PM, which will then restart all the apps it is managing.

To install StrongLoop PM as a systemd service:

$ sudo sl-pm-install --systemd

Then start the service with:

$ sudo /usr/bin/systemctl start strong-pm

For more information, see Setting up a production host (StrongLoop documentation).

Upstart

Upstart is a system tool available on many Linux distributions for starting tasks and services during system startup, stopping them during shutdown, and supervising them. You can configure your Express app or process manager as a service and then Upstart will automatically restart it when it crashes.

An Upstart service is defined in a job configuration file (also called a “job”) with a filename ending in .conf. The following example shows how to create a job called “myapp” for an app named “myapp” with the main file located at /projects/myapp/index.js.

Create a file named myapp.conf at /etc/init/ with the following content (replace the paths and the user and group names with values for your system and app):

# When to start the process
start on runlevel [2345]

# When to stop the process
stop on runlevel [016]

# Increase file descriptor limit to be able to handle more requests
limit nofile 50000 50000

# Use production mode
env NODE_ENV=production

# Run as www-data
setuid www-data
setgid www-data

# Run from inside the app dir
chdir /projects/myapp

# The process to start
exec /usr/local/bin/node /projects/myapp/index.js

# Restart the process if it is down
respawn

# Limit restart attempt to 10 times within 10 seconds
respawn limit 10 10

NOTE: This script requires Upstart 1.4 or newer, supported on Ubuntu 12.04-14.10.

Since the job is configured to run when the system starts, your app will be started along with the operating system, and automatically restarted if the app crashes or the system goes down.

Apart from automatically restarting the app, Upstart enables you to use these commands: start myapp, restart myapp, and stop myapp.

For more information on Upstart, see Upstart Intro, Cookbook and Best Practises.

StrongLoop PM as an Upstart service

You can easily install StrongLoop Process Manager as an Upstart service. After you do, when the server restarts, it will automatically restart StrongLoop PM, which will then restart all the apps it is managing.

To install StrongLoop PM as an Upstart 1.4 service:

$ sudo sl-pm-install

Then run the service with:

$ sudo /sbin/initctl start strong-pm

NOTE: On systems that don’t support Upstart 1.4, the commands are slightly different. See Setting up a production host (StrongLoop documentation) for more information.

Run your app in a cluster

In a multi-core system, you can increase the performance of a Node app by many times by launching a cluster of processes. A cluster runs multiple instances of the app, ideally one instance on each CPU core, thereby distributing the load and tasks among the instances.

Figure: Balancing between application instances using the cluster API

IMPORTANT: Since the app instances run as separate processes, they do not share the same memory space. That is, objects are local to each instance of the app. Therefore, you cannot maintain state in the application code. However, you can use an in-memory datastore like Redis to store session-related data and state. This caveat applies to essentially all forms of horizontal scaling, whether clustering with multiple processes or multiple physical servers.

In clustered apps, worker processes can crash individually without affecting the rest of the processes. Apart from performance advantages, failure isolation is another reason to run a cluster of app processes. Whenever a worker process crashes, always make sure to log the event and spawn a new process using cluster.fork().
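
The log-and-respawn step can be sketched as follows. The superviseWorkers function is a hypothetical helper, not part of Node; it receives the cluster module as a parameter so that the logic can be read (and exercised) without actually forking processes.

```javascript
// Hypothetical helper: start workers and replace any worker that exits.
// clusterModule is expected to be Node's cluster module in a primary process.
function superviseWorkers (clusterModule, workerCount, log = console.error) {
  for (let i = 0; i < workerCount; i++) {
    clusterModule.fork()
  }

  // The cluster module emits 'exit' whenever a worker process dies.
  clusterModule.on('exit', (worker, code, signal) => {
    log(`worker ${worker.process.pid} exited (code=${code}, signal=${signal}); forking a replacement`)
    clusterModule.fork()
  })
}

// Usage in the primary process (not executed in this sketch):
//   const cluster = require('cluster')
//   const os = require('os')
//   if (cluster.isPrimary || cluster.isMaster) {
//     superviseWorkers(cluster, os.cpus().length)
//   } else {
//     require('./app') // hypothetical module that starts the Express server
//   }
```

A production setup would usually also guard against a worker that crashes immediately on every start, which this sketch would respawn indefinitely.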

Using Node’s cluster module

Clustering is made possible with Node’s cluster module, which enables a master process to spawn worker processes and distribute incoming connections among the workers. However, rather than using this module directly, it’s far better to use one of the many tools out there that do it for you automatically; for example, node-pm or cluster-service.

Using StrongLoop PM

If you deploy your application to StrongLoop Process Manager (PM), then you can take advantage of clustering without modifying your application code.

When StrongLoop Process Manager (PM) runs an application, it automatically runs it in a cluster with a number of workers equal to the number of CPU cores on the system. You can manually change the number of worker processes in the cluster using the slc command line tool without stopping the app.

For example, assuming you’ve deployed your app to prod.foo.com and StrongLoop PM is listening on port 8701 (the default), use slc to set the cluster size to eight:

$ slc ctl -C http://prod.foo.com:8701 set-size my-app 8

For more information on clustering with StrongLoop PM, see Clustering in the StrongLoop documentation.

Using PM2

If you deploy your application with PM2, then you can take advantage of clustering without modifying your application code. You should ensure your application is stateless first, meaning no local data is stored in the process (such as sessions, websocket connections and the like).

When running an application with PM2, you can enable cluster mode to run it in a cluster with a number of instances of your choosing, such as one matching the number of available CPUs on the machine. You can manually change the number of processes in the cluster using the pm2 command line tool without stopping the app.

To enable cluster mode, start your application like so:

# Start 4 worker processes
$ pm2 start npm --name my-app -i 4 -- start
# Auto-detect number of available CPUs and start that many worker processes
$ pm2 start npm --name my-app -i max -- start

This can also be configured within a PM2 process file (ecosystem.config.js or similar) by setting exec_mode to cluster and instances to the number of workers to start.

Once running, the application can be scaled like so:

# Add 3 more workers
$ pm2 scale my-app +3
# Scale to a specific number of workers
$ pm2 scale my-app 2

For more information on clustering with PM2, see Cluster Mode in the PM2 documentation.

Cache request results

Another strategy to improve the performance in production is to cache the result of requests, so that your app does not repeat the operation to serve the same request repeatedly.

Use a caching server like Varnish or Nginx (see also Nginx Caching) to greatly improve the speed and performance of your app.
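
A caching server generally decides what to store, and for how long, based on its own configuration and on response headers such as Cache-Control. As a minimal sketch of the application side, the hypothetical cacheFor middleware below marks GET and HEAD responses as cacheable for a fixed number of seconds; whether a given cache honors the header depends on how that cache is configured.

```javascript
// Hypothetical middleware factory: marks responses as cacheable for a fixed time.
function cacheFor (seconds) {
  return function setCacheHeader (req, res, next) {
    // Only safe, idempotent requests should be stored by shared caches.
    if (req.method === 'GET' || req.method === 'HEAD') {
      res.setHeader('Cache-Control', `public, max-age=${seconds}`)
    }
    next()
  }
}

// Usage with Express (assuming an existing app object and handler):
//   app.get('/products', cacheFor(60), listProducts)
```

Responses that vary per user should not be marked public, since a shared cache could then serve one user’s data to another.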

Use a load balancer

No matter how optimized an app is, a single instance can handle only a limited amount of load and traffic. One way to scale an app is to run multiple instances of it and distribute the traffic via a load balancer. Setting up a load balancer can improve your app’s performance and speed, and enable it to scale more than is possible with a single instance.

A load balancer is usually a reverse proxy that orchestrates traffic to and from multiple application instances and servers. You can easily set up a load balancer for your app by using Nginx or HAProxy.

With load balancing, you might have to ensure that requests that are associated with a particular session ID connect to the process that originated them. This is known as session affinity, or sticky sessions, and may be addressed by the suggestion above to use a data store such as Redis for session data (depending on your application). For a discussion, see Using multiple nodes.

Use a reverse proxy

A reverse proxy sits in front of a web app and performs supporting operations on the requests, apart from directing requests to the app. It can handle error pages, compression, caching, serving files, and load balancing, among other things.

Handing over tasks that do not require knowledge of application state to a reverse proxy frees up Express to perform specialized application tasks. For this reason, it is recommended to run Express behind a reverse proxy like Nginx or HAProxy in production.