28 May 2007

Future server optimizations and multi-threading

Personally, I think that threads are the root of all evil, especially in an MMO server. They make debugging very difficult, can cause serious and hard-to-trace bugs if not done properly, and the speed benefit is not that great (about 30-40% in practice, if done right).

All this being said, there are some cases where multithreading is a good idea for an MMO server:
1. When you are processing a lot of data that does not depend on other threads' results.
2. When you are CPU limited.
3. When you want the fastest response time possible (lowest latency).

Some MMO servers use a blocking/threading model. That means you have one thread for each player (or one thread for a group of players), and the server uses blocking sockets (the execution of the thread is suspended until there is incoming activity on its socket).
Other MMOs (like Eternal Lands) use a non-blocking, non-threaded model. That means a socket will not block the program execution when it doesn't have data; instead, you move on to the next socket to process the next player.
A third category of MMO servers uses a hybrid model: non-blocking, threaded.
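
To make the non-blocking, non-threaded model concrete, here is a minimal sketch, assuming POSIX sockets. The names set_nonblocking and poll_players are made up for illustration and are not taken from the Eternal Lands server; a real server would parse packets and handle disconnects where the comments say so.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put a socket into non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Visit every player socket once (hypothetical helper).
 * Returns how many sockets had data waiting. The sockets are assumed
 * to be in non-blocking mode already, so recv() never stalls the loop.
 */
int poll_players(const int *fds, int count, char *buf, size_t buf_size)
{
    int served = 0;

    assert(fds != NULL && buf != NULL);
    for (int i = 0; i < count; i++) {
        ssize_t n = recv(fds[i], buf, buf_size, 0);
        if (n > 0) {
            served++;   /* a real server would parse the packet here */
        } else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* No data waiting: do not block, just move on to the next player. */
        } else {
            /* n == 0 (peer closed) or a real error: a real server would
               disconnect this player here. */
        }
    }
    return served;
}
```

The hybrid model keeps the same non-blocking reads and only moves the expensive per-map or per-path work onto threads.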

This is what I plan to work on, after we are done with the update.

So, how will it work?
Right now, there are two routines that take most of the CPU time: the path finding, and the range calculations (the part that determines who sees who).

And it so happens that those two routines do not depend on previous results; they can be done in parallel.

The range calculation is basically done like this:
for each map
    for each player on the map
        test whether that player can see each other player on the same map
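
In C, the loop above could be sketched roughly as follows. Everything here is an assumption made for illustration: the map_t and player_t structures, the limits, and the simple square view range in can_see are not the actual server code, which may well use a different visibility test.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_PLAYERS_PER_MAP 64   /* hypothetical limit */
#define VIEW_RANGE 10            /* hypothetical view distance, in tiles */

typedef struct {
    int x, y;
} player_t;

typedef struct {
    int player_count;
    player_t players[MAX_PLAYERS_PER_MAP];
    /* visible[i][j] != 0 means player i can see player j */
    unsigned char visible[MAX_PLAYERS_PER_MAP][MAX_PLAYERS_PER_MAP];
} map_t;

/* Assumed visibility rule: within VIEW_RANGE tiles on both axes. */
static int can_see(const player_t *a, const player_t *b)
{
    return abs(a->x - b->x) <= VIEW_RANGE && abs(a->y - b->y) <= VIEW_RANGE;
}

/* Range calculation for one map: test every player against every other. */
void update_map_range(map_t *map)
{
    assert(map->player_count >= 0 && map->player_count <= MAX_PLAYERS_PER_MAP);
    for (int i = 0; i < map->player_count; i++)
        for (int j = 0; j < map->player_count; j++)
            map->visible[i][j] =
                (i != j) && can_see(&map->players[i], &map->players[j]);
}

/* Outer loop: each map only touches its own data. */
void update_all_ranges(map_t *maps, int map_count)
{
    for (int m = 0; m < map_count; m++)
        update_map_range(&maps[m]);
}
```

Since update_map_range reads and writes only the map it is given, the calls for different maps do not depend on each other.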

How can this be switched to multi-threading?
Well, we can have multiple threads (as many as there are physical cores in the system), and each one will process one map. Once a thread finishes, it moves on to the next unprocessed map. Of course, there will be some state table so that two or more threads won't process the same map, wasting time and causing conflicts. This table needs to be locked each time it is accessed (read or write) to prevent other threads from accessing it at the same time and messing things up.
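
A minimal sketch of that scheme, assuming POSIX threads. The state table is a plain array guarded by one mutex; MAP_COUNT, WORKER_COUNT, and all function names are hypothetical, and process_map is only a placeholder for the per-map range calculation.

```c
#include <assert.h>
#include <pthread.h>

#define MAP_COUNT 16      /* hypothetical number of maps */
#define WORKER_COUNT 4    /* ideally the number of physical cores */

enum { MAP_UNPROCESSED = 0, MAP_CLAIMED = 1 };

static int map_state[MAP_COUNT];          /* the shared state table */
static int times_processed[MAP_COUNT];    /* only here to check the sketch */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

/* Claim the next unprocessed map under the lock; -1 means none is left. */
static int claim_next_map(void)
{
    int claimed = -1;

    pthread_mutex_lock(&state_lock);
    for (int m = 0; m < MAP_COUNT; m++) {
        if (map_state[m] == MAP_UNPROCESSED) {
            map_state[m] = MAP_CLAIMED;
            claimed = m;
            break;
        }
    }
    pthread_mutex_unlock(&state_lock);
    return claimed;
}

/* Placeholder for the real per-map work; runs outside the lock. */
static void process_map(int m)
{
    assert(m >= 0 && m < MAP_COUNT);
    times_processed[m]++;
}

static void *worker(void *arg)
{
    int m;

    (void)arg;
    while ((m = claim_next_map()) != -1)
        process_map(m);
    return NULL;
}

/* Run one full pass over all maps; returns 0 on success, -1 on failure. */
int run_range_pass(void)
{
    pthread_t threads[WORKER_COUNT];
    int started = 0;

    for (int m = 0; m < MAP_COUNT; m++)
        map_state[m] = MAP_UNPROCESSED;
    for (int i = 0; i < WORKER_COUNT; i++) {
        if (pthread_create(&threads[i], NULL, worker, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return started > 0 ? 0 : -1;
}
```

The lock is held only while a map is being claimed, not while it is being processed, so the threads spend very little time waiting on each other.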

The path finding is slightly more complicated. Why? Because right now, we use an "as you go" model, where the path finding routine is called whenever a path needs to be determined.
However, most of the time it is not necessary to have a path right away, although in some cases it is (such as for determining whether you can access a certain location or not).
So the path finding function, which currently looks like:
int find_path(int player_id, int target_x, int target_y)
will be modified to look like:
int find_path(int player_id, int target_x, int target_y, int urgency)
The urgency value is a boolean: 1 means the result is needed right now, while 0 means that it can wait for a while.
If the value is 0, the server will not attempt to calculate the path, but will just set a flag in the player structure saying that a path is needed.
After everything has been read from the sockets, a global path finding routine checks every actor and calculates the path for those that need one. Each thread can then work on a different path, in a similar manner to the range calculation, since the paths do not depend on each other.
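
Here is a rough single-threaded sketch of the deferred scheme. Only the find_path signature comes from the plan above; the actor_t structure, its field names, compute_path, process_pending_paths, and the return values are assumptions made for illustration.

```c
#include <assert.h>

#define MAX_ACTORS 8   /* hypothetical limit */

typedef struct {
    int path_needed;         /* set when a deferred path request is waiting */
    int path_ready;          /* set once a path has been calculated */
    int target_x, target_y;
} actor_t;

static actor_t actors[MAX_ACTORS];

/* Stand-in for the real path search; here it always "succeeds". */
static int compute_path(int player_id, int target_x, int target_y)
{
    actors[player_id].target_x = target_x;
    actors[player_id].target_y = target_y;
    actors[player_id].path_needed = 0;
    actors[player_id].path_ready = 1;
    return 1;
}

/* Assumed convention: returns 1 if a path was found now, 0 if deferred. */
int find_path(int player_id, int target_x, int target_y, int urgency)
{
    assert(player_id >= 0 && player_id < MAX_ACTORS);
    if (urgency)
        return compute_path(player_id, target_x, target_y);

    /* Not urgent: just remember that this actor needs a path. */
    actors[player_id].path_needed = 1;
    actors[player_id].path_ready = 0;
    actors[player_id].target_x = target_x;
    actors[player_id].target_y = target_y;
    return 0;
}

/*
 * Called once per server loop, after all the sockets have been read.
 * Returns the number of paths calculated in this pass.
 */
int process_pending_paths(void)
{
    int done = 0;

    for (int i = 0; i < MAX_ACTORS; i++) {
        if (!actors[i].path_needed)
            continue;
        compute_path(i, actors[i].target_x, actors[i].target_y);
        done++;
    }
    return done;
}
```

In the threaded version, the loop in process_pending_paths is the part that would be split across threads, with each thread claiming a different actor in the same way the threads claim maps.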

Currently there is really no need for this, because even with 750 player/bot connections and over 1300 AI entities, the server has never gone above 15% CPU usage.
However, this will slightly improve the response time (by a few ms), and will allow us to host even more players (maybe even up to 10-20K, depending on how many CPUs we have).

4 Comments:

Blogger Pictor said...

Normally I wouldn't comment on a Devblog, but something about this confused me; Why is player pathfinding done on the server side?

Maybe I'm being presumptuous - I know nothing of Eternal Lands' internal server code - but wouldn't it be easier to simply let the client do the pathfinding and then check for validity on the server? It'd require a bit of tweaking to get it right, but you'd probably save a ton on server processing.

28/5/07 04:57  
Blogger Radu said...

Because if it's done on the client side, the client would have to send each move to the server, which means more delay and wasted bandwidth, because the server will need to validate each individual move before the client can make it.

28/5/07 12:33  
Anonymous Anonymous said...

"the client would have to send each move to the server"
Surely you're doing something similar by the player having to wait for the server to calculate the path? Could you not implement client-side path finding and then thread a server-side validation routine? Suppose it ultimately depends on what you regard more expensive: server or bandwidth load (at 15% max usage, you've not got much to worry about)

"and the speed benefit is not that great (about 30-40% in practice, if done right)."
Anyone who knows optimisation will tell you that a 30% increase in performance for most systems is a significant gain.

Good luck with the threading, on large systems it's a PITA as you probably know. Remember to K.I.S.S :D

29/5/07 08:16  
Blogger Radu said...

Placid:

If the client calculates the path, it HAS to wait for the server to VALIDATE it. Otherwise, if the client moves preemptively and the server sends a "can't go there", then the client will have to jump back, which is bad and ugly.

If the server calculates the path, it just sends a "move there" packet, and that's it.
If the client calculates it, it needs to wait for the packet to be sent to the server, then wait for the server to process it, and then wait for the server to send the OK or not OK.

So as you can see, it is faster for the server to do it.

In other games, especially FPS games, the client calculates everything and moves preemptively, which looks OK. However, in an MMO like EL, it can look bad.

31/5/07 19:50  
