We have implemented OnGSDKHealthCheck in our Unreal game server, and we are using Multiplayer Servers for our game. When a client requests or gets a game server, we return the server with the lowest player count.
Today one of our servers went to 100% CPU and the game server stopped logging anything, so it was completely dead. However, OnGSDKHealthCheck did not do what I expected it to do: detect the dead server and kill the container. My guess is that OnGSDKHealthCheck either didn't return anything or timed out when your system called it.
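To make the scenario concrete, here is a minimal sketch in plain C++ of a heartbeat-based check that a health callback could delegate to. It is not the GSDK API: `HeartbeatMonitor`, `Beat`, `IsHealthy`, and the silence threshold are hypothetical names and values chosen for illustration. It also assumes the health callback still gets called while the game loop is stuck, which may not hold for a fully hung process.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>

// Hypothetical helper, NOT part of the PlayFab GSDK.
// The game loop records a heartbeat every tick; the health check
// reports unhealthy once the heartbeat has been silent for too long.
class HeartbeatMonitor {
public:
    using Clock = std::chrono::steady_clock;

    explicit HeartbeatMonitor(std::chrono::milliseconds maxSilence)
        : maxSilenceMs_(maxSilence.count()), lastBeatMs_(NowMs()) {
        assert(maxSilence.count() > 0);  // a zero threshold would always fail
    }

    // Call once per tick from the main game loop.
    void Beat() { lastBeatMs_.store(NowMs(), std::memory_order_relaxed); }

    // Intended to be called from the health callback (any thread).
    bool IsHealthy() const { return IsHealthyAt(NowMs()); }

    // Same check against an explicit timestamp, which makes it testable.
    bool IsHealthyAt(long long nowMs) const {
        return nowMs - lastBeatMs_.load(std::memory_order_relaxed) <= maxSilenceMs_;
    }

    // Monotonic clock reading in milliseconds.
    static long long NowMs() {
        return std::chrono::duration_cast<std::chrono::milliseconds>(
                   Clock::now().time_since_epoch())
            .count();
    }

private:
    long long maxSilenceMs_;
    std::atomic<long long> lastBeatMs_;
};
```

In this sketch the game loop would call `Beat()` every tick and the health callback would return `IsHealthy()`, so a game thread pinned at 100% CPU and no longer ticking would start reporting unhealthy after the threshold (for example 5 seconds) has passed.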
As a result, we ended up routing players to a dead server. From the PlayFab side, the server still appeared active with connected players.
How does OnGSDKHealthCheck work? What can we do to avoid this issue next time?