jonas.barnaby@gmail.com asked

Server management

Hi,

In previous versions of the Game Manager there was an option to "kill" a server by clicking a checkbox. That option, now removed, was useful to us when a lobby crashed but the server was still alive, because it let us kill the server. Furthermore, when a lobby crashed the matchmaker kept sending players to it, so by killing the entire instance we forced another instance to be initialized with a fresh game process.

Now we cannot kill a server when a lobby crashes, so the matchmaker still sends players to it, resulting in users who cannot enter the game. The only way we have to kill servers now is by disabling all the regions in the corresponding build, but this does not work on a server where the game process has crashed. Another option is manually killing the crashed lobby in the "Active Games" tab, but right now there is no indication whatsoever that a lobby has crashed. In fact, we are locating our crashed lobbies by looking for abnormal game durations.

We need a way to know when a lobby has crashed, and to be able to kill the server instance.

Another problem we've detected (not related to the previous one) is that on many occasions, at the moment a server goes from "initializing" to "running", another server of the same build starts to initialize. Please note that these two problems severely impact our workflow.

Thank you so much for your help.

Regards,

Alberto

apis

1 Answer

brendan answered

We removed the option to kill server hosts (which kills all running instances of a game on the server in question) as it was causing excessive confusion for developers. Killing a server host means that there will be no logs captured, as it kills the entire server host without giving our game manager the chance to copy anything.

Since it's specifically the game instance that could hang (in the case of a crash, it would no longer be running, and so would free that slot), you can kill that on the Active Games tab, as you pointed out. But allowing a new instance to spin up on the same server host should not be a problem, as the instances run separately. Apart from specifying too high of a total number of instances for a host, or bad behavior such as locking a file (other than the log and output files) for write, they should not interfere with each other.

My question would be, how are you identifying that a server is in a hung state?

2 comments

jonas.barnaby@gmail.com commented

Hi, Brendan

As of now, the only way we have to detect a hung game is by seeing an abnormal match duration in the game manager. None of our game modes should last more than 10 minutes because they've got limited duration, so when a game is lasting more than, say, 15 minutes it's almost certainly hung (we're adding queue time + match time here). We're taking steps to prevent any kind of problem with our game and that should not happen on production, but bugs are a fact of life and they will certainly happen.
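The duration check described above could be automated along these lines. This is only a minimal sketch: the `ActiveGame` record and its field names are hypothetical, and how the list of active games is obtained is left out because it depends on whatever your own tooling exposes. The 15-minute threshold is the one mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Threshold from the comment above: modes are capped at about 10 minutes,
# so anything older than 15 minutes (queue + match) is treated as suspect.
HUNG_THRESHOLD = timedelta(minutes=15)


@dataclass
class ActiveGame:
    """Hypothetical record for one active game; field names are assumptions."""
    lobby_id: str
    started_at: datetime  # timezone-aware UTC start time


def find_suspect_games(games, now=None, threshold=HUNG_THRESHOLD):
    """Return the games that have been running longer than `threshold`."""
    now = now or datetime.now(timezone.utc)
    return [g for g in games if now - g.started_at > threshold]
```

A long duration is only a hint that a game has hung, so the result is best treated as a list of candidates to inspect or kill from the "Active Games" tab.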

Yes, you're absolutely right, we can kill the hung game via the "Active Games" tab in the game manager. However, is there a way (an API call or something) for us to know whether a given game has hung and then kill it? We could automate a process that gets all the current lobby IDs, finds out whether the associated game instances are still alive, and kills those that have hung. Something like an ack signal.

Thanks,

Alberto

brendan commented (replying to jonas.barnaby@gmail.com)

Sorry, but we don't currently have a way to detect if your custom game server code is no longer running correctly. We do want to provide a way to view logs before the server has stopped in a future update, but that would not provide alerts. I'd have to recommend using a monitoring service if you feel this is a significant risk for your title.
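For illustration, the kind of external monitoring suggested here, combined with the "ack signal" idea from the earlier comment, could be sketched as a small heartbeat registry. Everything in it is an assumption rather than part of any existing API: the `HeartbeatMonitor` class, its method names, and the 30-second timeout are hypothetical, and each game instance would need its own way of reporting in (for example, a periodic HTTP call to a service you host).

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per lobby and reports the ones that went quiet."""

    def __init__(self, timeout_seconds=30.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock    # injectable so the logic can be tested
        self._last_seen = {}   # lobby_id -> time of last heartbeat

    def beat(self, lobby_id):
        """Record a heartbeat ("ack") from a running game instance."""
        self._last_seen[lobby_id] = self._clock()

    def forget(self, lobby_id):
        """Stop tracking a lobby that ended normally or was killed."""
        self._last_seen.pop(lobby_id, None)

    def stale_lobbies(self):
        """Return lobby IDs whose last heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(
            lobby_id
            for lobby_id, seen in self._last_seen.items()
            if now - seen > self.timeout_seconds
        )
```

A periodic job could call `stale_lobbies()` and flag or kill the matching games. A hung process that stops sending heartbeats shows up much sooner this way than by waiting for an abnormal match duration.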
