
Latency/Load/Overhead caused by integer indexed arrays indexOf/splice/shift etc #2199


Closed
selay opened this issue Aug 4, 2015 · 17 comments

Comments

@selay

selay commented Aug 4, 2015

Recently I was debugging to find the reason for frequent server crashes, delays in transmission, and server capacity issues.

I found several issues in socket.io and hope it is useful to report them here. I had to hack the library to solve them, and the result was a remarkable improvement. However, I only changed the parts I use, which means my current changes are not mergeable and not compatible with existing modules.

Problem:
Room handling uses integer-indexed arrays to keep sockets, then calls indexOf repeatedly to find a socket's index. indexOf is called many times during a single emit. Additionally, splice is used to remove a socket, which means every socket change re-arranges several large arrays.
I ran a small simulation with 10000 sockets connecting, emitting, and randomly leaving.
(I compared socket.io against my hacked version, which uses an "associative array" where the key and value are the same socket id. There is no need for indexOf or splice, because array indexes never have to be rearranged.)

With 10000 sockets connected, emit is 2040 times (!) faster in the hacked version than in the current socket.io implementation. Server load is at least halved, and memory usage is about 30% lower.
The latency and delay issues disappeared.
(It apparently also solved an unrelated issue: when the server is too busy to accept a new handshake quickly, the client keeps making new polling requests, so multiple connections end up sharing the same socket.id. When a message arrived, it was echoed/duplicated many times.)

I don't understand the reason for using an integer-indexed array and then searching it on every operation. Since the position is not known, each lookup scans the entire array: with 10000 sockets, that is several full scans per emit; emit to 1000 sockets and the overhead is multiplied by 1000; a single disconnect rearranges all 10000 entries; and so on.

Actually, you can run a simple test comparing [] access against indexOf and see a performance difference of over 3000 times.

So, the current socket.io implementation is only suitable for a small-scale project where you don't expect more than about 3000 sockets. Otherwise you will unnecessarily need load balancing, multiple nodes, etc.
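A rough sketch of the kind of comparison described above (this is not the exact benchmark from the simulation; names are illustrative and the numbers vary by machine and engine):

```javascript
// Compare a linear indexOf scan against direct object key access
// for membership checks over 10000 socket ids.
var N = 10000;
var arr = [];
var obj = {};
for (var i = 0; i < N; i++) {
  arr.push('socket' + i);
  obj['socket' + i] = 'socket' + i;
}

console.time('indexOf (linear scan)');
var found = 0;
for (var j = 0; j < N; j++) {
  if (~arr.indexOf('socket' + j)) found++; // O(n) per lookup
}
console.timeEnd('indexOf (linear scan)');

console.time('key access (direct)');
var hit = 0;
for (var k = 0; k < N; k++) {
  if (obj['socket' + k]) hit++;            // O(1) per lookup
}
console.timeEnd('key access (direct)');
```

Both loops find all N entries; only the cost per lookup differs.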

@nuclearace
Member

Interesting. If this is really a problem I think it would be worth seeing some improvements.

@nuclearace
Member

@rauchg What are your thoughts?

@fourpixels

@selay - hey, this sounds pretty interesting. Can you share the modifications you've made, or at least the ideas behind them? I'm pretty interested in this fix, as I've already browsed the code and thought it was quite a strange approach.

How do you use the "associative array", what's the key?

I don't understand the array either. At first I thought it was there because they want the total number of connected sockets, but I never found anything like that in the code. Weird..

@selay
Author

selay commented Sep 2, 2015

Yes, using the socket id as the key (or property, if you prefer to call it an object) in the "associative array". As you know, the socket id is the default room for each socket, so when you send something to socket A, you actually send it to the room named after its socket id.
On disconnection, the socket id is removed from the "associative array", which is easy because you can cheaply add, remove, and look up entries by key.
So the modification is actually simple:
instead of

self.rooms.push(room); 

you can have

self.rooms[room] = room;

To check if it exists,
instead of

if (~this.rooms.indexOf(room))

you can have

if (this.rooms[room])

The original code to remove from room when socket disconnects is:

var idx = self.rooms.indexOf(room);
if (idx >= 0) {
  self.rooms.splice(idx, 1);
}

You can simply have:

delete self.rooms[room]; 

The same existence checks are repeated many times during a single emit, broadcast, etc., at the top of each function called, so the already large overhead gets multiplied.
Note: both indexOf and splice are extremely expensive compared to [] and delete (both of which access the entry directly). splice not only traverses the container, but after removal it also shifts every later element down one position. (With multiple nodes sharing state across servers, replacing the original container with a new one can even produce wrong results, because the threads work independently.) If you have 100,000 sockets and hundreds of them connect and leave every millisecond, that is a lot of work just to keep the container updated.
In other areas, it is used like this:

var idx = self.rooms.indexOf(room);
return self.rooms[idx];

First it finds the index with an exhaustive search, and only then accesses by index.
In modified case, it simply is:

return self.rooms[room]; // no need to know its position. 
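Putting the pieces above together, here is a minimal self-contained sketch of the object-keyed approach (RoomSet is an illustrative name, not something from socket.io):

```javascript
// Object-keyed room set: membership checks, inserts and removals are
// all O(1) key operations instead of linear array scans.
function RoomSet() {
  this.rooms = {};
}

RoomSet.prototype.add = function (room) {
  this.rooms[room] = room;          // replaces self.rooms.push(room)
};

RoomSet.prototype.has = function (room) {
  return Boolean(this.rooms[room]); // replaces ~self.rooms.indexOf(room)
};

RoomSet.prototype.remove = function (room) {
  delete this.rooms[room];          // replaces indexOf + splice
};

var set = new RoomSet();
set.add('lobby');
console.log(set.has('lobby')); // true
set.remove('lobby');
console.log(set.has('lobby')); // false
```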

In several other places there is other inefficient code. For example, the to and in prototype methods:

Socket.prototype.to =
Socket.prototype.in = function(name){
  this._rooms = this._rooms || [];
  if (!~this._rooms.indexOf(name)) this._rooms.push(name);
  return this;
};

You can change it to this:

Socket.prototype.to =
Socket.prototype.in = function(name){
  if (!this._rooms) this._rooms = {};
  if (!this._rooms[name]) this._rooms[name] = name;
  return this;
};

BY THE WAY, I JUST WROTE ALL THE CODE HERE FOR ILLUSTRATION WITHOUT LOOKING AT MY PROJECT, SO PLEASE DON'T ASSUME IT IS CORRECT, AND TEST BEFORE USING.

@darrachequesne
Member

Hi! I tried to implement the suggested modifications; is there any way to properly benchmark it?

  • In the object containing sockets, the key is the unique pair (namespace name, client id):
function Socket(nsp, client){
  [...]
  this.id = nsp.name + '::' + client.id;

Note: the separator :: is completely arbitrary, maybe not even needed.

  • In the object containing rooms, the key is simply the name of the room, as suggested by @selay.

  • In the file lib/socket.js, the array this._rooms could also be transformed, but I think socket.io-adapter has to be updated, as it expects an array as input (the cost of converting object values to an array of values every time Socket.prototype.emit is called would be too high, wouldn't it?)
  Socket.prototype.emit = function(ev){
      [...]
      this.adapter.broadcast(packet, {
        except: [this.id],
        rooms: this._rooms,
        flags: flags
      });
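If the adapter keeps expecting an array, the boundary conversion in question would look roughly like this (a sketch with made-up names, not the actual socket.io-adapter code):

```javascript
// Rooms kept as object keys internally, converted to an array only
// when calling the adapter. Object.keys() is O(n) per emit, which is
// the conversion cost discussed above.
var rooms = { room1: 'room1', room2: 'room2' };

function broadcastSketch(packet, adapter) {
  adapter.broadcast(packet, {
    except: ['some-socket-id'],
    rooms: Object.keys(rooms), // array view of the object-keyed set
    flags: {}
  });
}

// Fake adapter to show what the call receives:
broadcastSketch({ type: 2 }, {
  broadcast: function (packet, opts) {
    console.log(opts.rooms); // [ 'room1', 'room2' ]
  }
});
```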

PS: the Travis build has miserably failed 😱 (https://travis-ci.org/socketio/socket.io/builds/79921947), but I think it's related to the last commit on master (Error: Cannot find module 'base64id'). Maybe. Please 😇

@selay
Author

selay commented Sep 11, 2015

Yes, you need to hack any component or module that references the integer-indexed array. Any conversion can be a big overhead. The modules and adapters use what socket.io uses, so dependencies may require modification.

If you want to make this generic and contribute it to the project, I think the better way is to change only socket.io for the next version, and document the changes so that other module developers can make their modules compatible with the next release.

If I get some time, I will help to convert some modules as well.

I think the next release should focus on efficiency, and it could be version 2.0, separate from the 1.x line, which means other modules will need to support it.

@darrachequesne
Member

Hi! As suggested by @kinsi55, I also converted for-in loops to for + Object.getOwnPropertyNames loops, as that seems to be much more efficient (I don't think it improves readability, but it does seem faster; see this jsperf).

I have merged these changes into my repo, if anyone is interested, against the 1.3.5 and 1.3.7 releases (as I haven't figured out #2251 yet).

@selay
Author

selay commented Oct 1, 2015

Did you test memory usage?
I don't think it is a good idea to convert the for-in loops.
A for-in loop is pretty efficient (in both memory utilization and performance), better supported, general-purpose, neat, safe, and easy to maintain, and the performance gain from the conversion can only be marginal, and only in specific favorable situations (even with millions of iterations).
Especially for server-side code, where a crash cannot be tolerated, it is better to use for-in, as it checks many things, including whether the property is defined.
(The benefits outweigh the downsides.)
PS: Also, doesn't your own jsperf link show it to be slower than the for-in loop?
http://s16.postimg.org/wc09hkv1w/your_own_test.jpg

@kinsi55

kinsi55 commented Oct 1, 2015

@darrachequesne Hi. Apparently the answer I posted to your PR after deleting my initial comment didn't go through because my mobile connection was terrible, so I'll write it down again. I deleted my comment because I actually ran some tests on this in Chrome, and to me, although it doesn't make any sense, it looked like a normal for loop combined with Object.keys is slower than for-in. I couldn't believe that, to be honest, but it seems to be the case, as @selay showed too, while the jsperf had shown me for-in as 98% slower than for, which is why I initially even criticized the code. I guess you never stop learning with JS, and as @selay wrote, I am pretty sure memory usage would also be affected by this change, so I'm sorry, but I'd suggest changing it back.

@darrachequesne
Member

Well, I guess it depends on the browser 😃 (http://postimg.org/image/v1os2pfjv/)! But those results seem to vary with the number of attributes in the object (with few attributes, the for-in loop actually appears to be faster). Long story short... back to the (much-more-readable) future!

@kinsi55

kinsi55 commented Oct 2, 2015

I just tested it right now with this, since I was curious and I really want socket.io to be all it can be.

var obj = {};
for(var i = 0; i < 1000000; i++){
    obj[i] = 0;
}

function benchForI(){
    console.time("fori");
    var keys = Object.keys(obj);
    var kl = keys.length;
    for(var i = 0; i < kl; i++){
        obj[keys[i]] += 1;
    }
    console.timeEnd("fori");
}

function benchForIn(){
    console.time("forin");
    for(var n in obj){
        obj[n] += 1;
    }
    console.timeEnd("forin");
}

Results on Node 0.12.7:

> benchForI();
fori: 236ms
> benchForI();
fori: 210ms
> benchForI();
fori: 213ms
>
> benchForIn();
forin: 220ms
> benchForIn();
forin: 220ms
> benchForIn();
forin: 219ms
>
> benchForI();
fori: 212ms
> benchForIn();
forin: 219ms
> benchForI();
fori: 248ms
> benchForIn();
forin: 219ms
> benchForI();
fori: 210ms
> benchForIn();
forin: 221ms

Same on Node 4.1.1:

> benchForI();
fori: 200ms
> benchForI();
fori: 219ms
> benchForI();
fori: 210ms
>
> benchForIn();
forin: 246ms
> benchForIn();
forin: 235ms
> benchForIn();
forin: 235ms
>
> benchForI();
fori: 197ms
> benchForIn();
forin: 235ms
> benchForI();
fori: 199ms
> benchForIn();
forin: 242ms
> benchForI();
fori: 209ms
> benchForIn();
forin: 251ms

TL;DR: there really isn't a difference in Node that you should consider addressing in this case, though on 4.1.1 for..in seems consistently slower (yes, I tried it more often than listed here; for..in is, in this case, always slower than a normal for loop on Node 4.1.1).

@kapouer
Contributor

kapouer commented Oct 3, 2015

Hello, we submitted similar PRs:
#2141
socketio/socket.io-adapter#29
so a comparison of both sets of improvements might be fruitful.

@selay
Author

selay commented Oct 3, 2015

I guess anyone who has used socket.io in a production environment has already run into this issue or soon will.

Changing loop types, global to local variables, etc. will not provide much benefit, though of course optimization should eventually be done anywhere possible, on the condition that it doesn't hurt maintainability (for example, a for-in loop is straightforward and easy to maintain; Object.keys may use a for-in loop or another loop internally, depending on the implementation, so it is unlikely to be better; and even if it were, it makes the code horrible to maintain, which eventually leads to more bugs when an enhancement is made).

Also, such things are generally optimized by a (smart) interpreter anyway.

However, indexOf versus an "associative array" (object) is a different story. The performance difference is undeniably and significantly high. Also, fixing it does not really hurt maintainability, because [] access is very common and easy to follow. (Actually, it is more straightforward than indexOf, because the latter often confuses novice programmers with its return value of -1, not false, when nothing is found.)

So I think it is better to prioritize issues that deliver a self-evident improvement.

@darrachequesne
Member

@kapouer Hi! I hadn't seen your PR #2141; it seems they are indeed quite similar! All the changes are included in the PR #2239, aren't they (please correct me if I'm wrong)? Currently, the conversion from array elements to object attributes affects:

  • namespace#sockets (also included in your PR)
  • client#sockets
  • socket#rooms

Note: namespace#rooms and socket#_rooms could also be converted, but, as you pointed out, with no improvement, as they are used as a kind of "volatile buffer" for the to|in methods (and the adapter API would have to be updated accordingly).

The other PR is closed; we should rather take a look at socketio/socket.io-adapter#30, right? I think the latter may be interesting (converting !Object.keys().length to a statically stored Room#length); have you been able to benchmark it?
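That statically stored Room#length idea could be sketched like this (hypothetical code, not the actual socket.io-adapter implementation):

```javascript
// Maintain a counter instead of calling Object.keys(sockets).length
// on every emptiness check.
function Room() {
  this.sockets = {};
  this.length = 0;
}

Room.prototype.add = function (id) {
  if (!this.sockets.hasOwnProperty(id)) {
    this.sockets[id] = true;
    this.length++;
  }
};

Room.prototype.del = function (id) {
  if (this.sockets.hasOwnProperty(id)) {
    delete this.sockets[id];
    this.length--;
  }
};

var room = new Room();
room.add('a'); room.add('b'); room.add('a');
console.log(room.length); // 2
room.del('a');
console.log(room.length); // 1
```

The hasOwnProperty guards keep the counter consistent under duplicate adds and removals.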

@selay do you see any other improvement to be made on the current subject?

@kapouer
Contributor

kapouer commented Oct 8, 2015

Well, my point @darrachequesne was also to make sure we don't break the API, even for adapters; the current breakage between the master and 1.3.x branches is already painful. Besides that, I think you have a much clearer view of what's to do and how to do it. And no, I have no benchmarks on that. Maybe @manubb could help, but I'm not sure he has time for this.

@darrachequesne
Member

@rauchg Hi! Would you have time to review the current PR #2239 please?

@nkzawa
Contributor

nkzawa commented Jan 17, 2016

Should be fixed in #2239, thanks to @darrachequesne 😄

@nkzawa nkzawa closed this as completed Jan 17, 2016