Scaling Socket.IO on Node.js with cluster and Redis

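I'm currently faced with the task of scaling a Node.js application on Amazon EC2. As I understand it, the way to do this is to have each server use cluster to run a worker process on every available CPU, with sticky connections so that each user who connects is "remembered", that is, routed back to the worker that holds their data from earlier requests.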

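Once that is done, the next best move as far as I know is to deploy as many servers as needed and use Nginx to load balance between all of them, again with sticky connections so that it knows which "child" server each user's data lives on.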

So when a user connects to the server, is this what happens?

Client connects -> find/choose a server -> find/choose a process -> Socket.IO handshake/connection, etc.

If not, please help me understand this load balancing task better. I also don't understand the importance of Redis in this situation.

Below is the code I'm using to put every CPU on one machine to work, with a separate Node.js process per CPU:

var express = require('express'),
    cluster = require('cluster'),
    net = require('net'),
    sio = require('socket.io'),
    sio_redis = require('socket.io-redis');

var port = 3502,
    num_processes = require('os').cpus().length;

if (cluster.isMaster) {
// This stores our workers. We need to keep them to be able to reference
// them based on source IP address. It's also useful for auto-restart,
// for example.
var workers = [];

// Helper function for spawning worker at index 'i'.
var spawn = function(i) {
    workers[i] = cluster.fork();

    // Optional: Restart worker on exit
    // A Worker's own 'exit' event passes (code, signal), not the worker.
    workers[i].on('exit', function(code, signal) {
        console.log('respawning worker',i);
        spawn(i);
    });
};

// Spawn workers.
for (var i = 0; i < num_processes; i++) {
    spawn(i);
}

// Helper function for getting a worker index based on IP address.
// This is a hot path so it should be really fast. The way it works
// is by converting the IP address to a number by removing the dots,
// then compressing it to the number of slots we have.
//
// Compared against "real" hashing (from the sticky-session code) and
// "real" IP number conversion, this function is on par in terms of
// worker index distribution, only much faster.
//
// Note: this assumes a dotted IPv4 string. An address containing ':'
// (IPv6 or IPv4-mapped IPv6) makes Number(s) NaN and yields no worker.
var worker_index = function(ip,len) {
    var s = '';
    for (var i = 0,_len = ip.length; i < _len; i++) {
        if (ip[i] !== '.') {
            s += ip[i];
        }
    }

    return Number(s) % len;
};

// Create the outside facing server listening on our port.
var server = net.createServer({ pauseOnConnect: true }, function(connection) {
    // We received a connection and need to pass it to the appropriate
    // worker. Get the worker for this connection's source IP and pass
    // it the connection.
    var worker = workers[worker_index(connection.remoteAddress,num_processes)];
    worker.send('sticky-session:connection',connection);
}).listen(port);
} else {
// Note we don't use a port here because the master listens on it for us.
var app = express();

// Here you might use middleware,attach routes,etc.

// Don't expose our internal server to the outside.
var server = app.listen(0,'localhost'),io = sio(server);

// Tell Socket.IO to use the redis adapter. By default,the redis
// server is assumed to be on localhost:6379. You don't have to
// specify them explicitly unless you want to change them.
io.adapter(sio_redis({ host: 'localhost',port: 6379 }));

// Here you might use Socket.IO middleware for authorization etc.

console.log("Listening");
// Listen to messages sent from the master. Ignore everything else.
process.on('message',function(message,connection) {
    if (message !== 'sticky-session:connection') {
        return;
    }

    // Emulate a connection event on the server by emitting the
    // event with the connection the master sent us.
    server.emit('connection',connection);

    connection.resume();
});
}

Best answer

I believe your general understanding is correct, although I'd like to make a few comments:

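Load balancing

You're right that one way to do load balancing is to have Nginx balance between the different instances, and inside each instance have cluster balance between the worker processes it creates. However, that is just one way, and not necessarily always the best one.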

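Between instances

For one, if you're on AWS anyway, you might want to consider using ELB. It was designed specifically for load balancing EC2 instances, and it makes configuring load balancing between instances trivial. It also provides a lot of useful features, and (with Auto Scaling) can make scaling extremely dynamic without requiring any effort on your part.

One feature of ELB that is particularly pertinent to your question is that it supports sticky sessions out of the box; it is just a matter of ticking a checkbox.

However, I have to add a major caveat here: ELB can break socket.io in bizarre ways. If you only use long polling you should be fine (assuming sticky sessions are enabled), but getting actual websockets to work is somewhere between extremely frustrating and impossible.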

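Between processes

While there are a lot of alternatives to cluster, both within Node and outside it, I tend to agree that cluster itself is usually perfectly fine.

However, one case where it does not work is when you want sticky sessions behind a load balancer, as you apparently do here.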

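First off, it should be made explicit that the only reason you need sticky sessions in the first place is that socket.io relies on session data stored in memory between requests (during the handshake for websockets, or basically throughout for long polling). In general, relying on data stored this way should be avoided as much as possible, for a variety of reasons, but with socket.io you don't really have a choice.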
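To make the dependency on in-memory state concrete, here is a minimal simulation. It does not use socket.io at all: `makeWorker`, `roundRobin`, `sticky` and the session ids are hypothetical names invented for this sketch, and each "worker" is just an object with a private `Map`.

```javascript
// Simulation only: each "worker" keeps handshake state in its own memory,
// which is the situation the paragraph above describes.
function makeWorker(name) {
    const sessions = new Map(); // private to this worker, not shared
    return {
        name,
        handshake(sid) { sessions.set(sid, { openedBy: name }); },
        poll(sid) { return sessions.has(sid) ? 'ok' : 'unknown session'; }
    };
}

const simWorkers = [makeWorker('w0'), makeWorker('w1')];

// Non-sticky routing: consecutive requests rotate across workers.
let next = 0;
function roundRobin() { return simWorkers[next++ % simWorkers.length]; }

// Sticky routing: the same client key always maps to the same worker.
function sticky(clientKey) { return simWorkers[clientKey % simWorkers.length]; }

roundRobin().handshake('abc');                     // handled by w0
const nonStickyResult = roundRobin().poll('abc');  // handled by w1

sticky(7).handshake('xyz');                        // 7 % 2 = 1, so w1
const stickyResult = sticky(7).poll('xyz');        // w1 again

console.log(nonStickyResult); // unknown session
console.log(stickyResult);    // ok
```

The follow-up request only succeeds when it reaches the worker that handled the handshake, which is all that "sticky" is meant to guarantee.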

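Now, this doesn't seem too bad, since cluster can support sticky sessions, using either the sticky-session module mentioned in socket.io's documentation or the snippet you seem to be using.

The thing is, since these sticky sessions are based on the client's IP, they won't work behind a load balancer, be it Nginx, ELB, or anything else, because all that is visible inside the instance at that point is the load balancer's IP. The remoteAddress your code tries to hash isn't actually the client's address at all.

That is, when your Node code tries to act as a load balancer between processes, the IP it uses will always be the IP of the other load balancer, the one that balances between instances. Therefore, all requests will end up in the same process, defeating the whole purpose of cluster.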
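The effect is easy to reproduce with the hashing helper from the question. In this sketch the helper is copied under the name `workerIndex`; the client addresses, the balancer address `10.0.0.12` and the worker count of 4 are made-up values for illustration.

```javascript
// Same logic as worker_index in the question: strip the dots, take modulo.
function workerIndex(ip, len) {
    let s = '';
    for (let i = 0; i < ip.length; i++) {
        if (ip[i] !== '.') s += ip[i];
    }
    return Number(s) % len;
}

const numWorkers = 4;

// Direct connections: different client IPs can land on different workers.
const direct = ['203.0.113.5', '203.0.113.6', '198.51.100.7']
    .map(function (ip) { return workerIndex(ip, numWorkers); });

// Behind a balancer: every socket appears to come from the balancer itself.
const balancerIp = '10.0.0.12'; // hypothetical private address
const proxied = ['client A', 'client B', 'client C']
    .map(function () { return workerIndex(balancerIp, numWorkers); });

console.log(direct);  // [ 3, 0, 3 ]
console.log(proxied); // [ 0, 0, 0 ]
```

With a single balancer in front, every connection hashes to the same index, so one worker does all the work while the others sit idle.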

You can see the details of this issue, along with a couple of ways to solve it (none of them particularly pretty), in this question.
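One direction that comes up for this kind of problem, and not necessarily the one the linked question recommends, is to hash the address the balancer reports in an `X-Forwarded-For` header instead of the socket's remoteAddress. The sketch below covers only the address selection; `clientAddress` is a hypothetical helper, and it assumes the balancer in front actually sets that header.

```javascript
// Hypothetical helper: choose which address to hash for stickiness.
// forwardedFor  - raw X-Forwarded-For header value, if any
// socketAddress - connection.remoteAddress as seen by this instance
function clientAddress(forwardedFor, socketAddress) {
    if (typeof forwardedFor === 'string' && forwardedFor.trim() !== '') {
        // The header is a comma-separated list; the left-most entry is the
        // client address as reported by the first proxy in the chain.
        return forwardedFor.split(',')[0].trim();
    }
    // No usable header: fall back to what the socket reports.
    return socketAddress;
}

console.log(clientAddress('198.51.100.7, 10.0.0.12', '10.0.0.12')); // 198.51.100.7
console.log(clientAddress(undefined, '10.0.0.12'));                 // 10.0.0.12
```

The awkward part is that the header only exists once the HTTP request has been read, whereas the master in the question hands off the raw socket before reading anything, so the master would have to peek at the first bytes of each connection. The header value can also be set by the client unless the balancer overwrites it, which matters if it is used for anything beyond routing.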

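The importance of Redis

As I mentioned earlier, once you have multiple instances/processes receiving requests from your users, in-memory storage of session data is no longer sufficient. Sticky sessions are one way to go, although other, arguably better solutions exist, among them the central session storage that Redis can provide. See this post for a fairly comprehensive review of the subject.

Since your question is about socket.io, though, I assume you probably meant Redis's specific importance for websockets, so:

When you have multiple socket.io servers (instances/processes), a given user is connected to only one of them at any given time. However, any of the servers may, at any time, want to emit a message to a given user, or even broadcast to all users, regardless of which server they are currently connected to.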

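To that end, socket.io supports "Adapters", of which Redis is one, that allow the different socket.io servers to communicate with each other. When one server emits a message, it goes into Redis, and then all servers see it (Pub/Sub) and can send it to their own users, making sure the message reaches its target.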
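The relay pattern can be sketched without Redis or socket.io. In the simulation below a Node.js `EventEmitter` stands in for the Redis channel, and `makeServer`, `bus`, the user names and the `'*'` broadcast marker are all hypothetical names for this illustration, not part of the adapter's API.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a Redis Pub/Sub channel: every subscriber sees every message.
const bus = new EventEmitter();

function makeServer(name) {
    const localUsers = new Set(); // users whose sockets live on this server
    const delivered = [];         // what this server pushed to its own users

    // Subscribe: deliver only to users this server actually holds.
    bus.on('message', function (msg) {
        if (msg.to === '*') {
            localUsers.forEach(function (u) { delivered.push(u + ':' + msg.text); });
        } else if (localUsers.has(msg.to)) {
            delivered.push(msg.to + ':' + msg.text);
        }
    });

    return {
        name: name,
        delivered: delivered,
        connect: function (user) { localUsers.add(user); },
        // Publish: the message goes to the shared channel, not to a socket.
        emit: function (to, text) { bus.emit('message', { to: to, text: text }); }
    };
}

const serverA = makeServer('A');
const serverB = makeServer('B');
serverA.connect('alice');
serverB.connect('bob');

serverA.emit('bob', 'hi');   // bob is not on A, but B sees it and delivers
serverB.emit('*', 'news');   // broadcast reaches users on both servers

console.log(serverA.delivered); // [ 'alice:news' ]
console.log(serverB.delivered); // [ 'bob:hi', 'bob:news' ]
```

In the question's code the real wiring is the single `io.adapter(sio_redis({ host: 'localhost', port: 6379 }))` call; after that, ordinary emits are relayed between processes by the adapter.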

This, again, is explained in socket.io's documentation on using multiple nodes, and perhaps even better in this Stack Overflow answer.
