Query data from other users

Byterockers' Games
started a topic on Mon, 21 September 2015 at 7:50 AM

In our game, players create their own maps, which are then played by other players (one player breaks into the base of another player). For that, we need a way to find other players based on their stats.

On our test server (a simple PHP server with a MySQL database) we used the following approach:

1) Find the player who has gone the longest without being selected (smallest last_target_fetch time):

$query = "SELECT `id`, `name`, `map`, `cash` FROM data WHERE `hash` != '$id' AND `map` IS NOT NULL ORDER BY `last_target_fetch` LIMIT 1"; 

2) Set this time value to the current time, so that the next time the query above runs, a different player is found:

$query = "UPDATE data SET `last_target_fetch` = NOW() WHERE `id` = '$row->id';";

Is there a way to make similar queries with PlayFab?

The only approach I found in the PlayFab documentation is to use leaderboards. I would create a leaderboard for a value like "last broke in time"; the leaderboard does the ordering, so I can get the first (or last) player, reset that player's time value, and then present their data to the current player.

Is this a proper approach? How quickly are these leaderboards updated? Is there a more direct way to access, search and filter the data in PlayFab?

In the future, we also need to be able to group users, i.e. only find maps of other users who have the same level as the current player. My approach would be to create a "last broke in time" leaderboard for each level (e.g. by using a prefix). For that, it would be nice to be able to delete user stats as well, so that players do not show up in every leaderboard.

1 Answer

11 Comments
Brendan Vanous said on Mon, 21 September 2015 at 1:20 PM

Unfortunately, ad-hoc queries across a database of millions of users are extremely inefficient. To keep our system highly scalable while keeping our pricing down, our service doesn't allow for that. However, it's easy to design for the functionality you need, as long as we know the actual behavior you're trying for - ad-hoc queries are only one way to approach a problem.

In your case, you want to find another user for the current user to attack, and you want to prevent people being attacked more than others by making sure that the time since last attack is a determining factor. Sounds good! It's certainly possible to use a leaderboard to record the last time a player was attacked, and so use it to pick a player who hasn't been attacked in a long time (and then update their attacked time in Cloud Script).

But as a possible alternative, here's another approach:

Create a Shared Group Data to hold a master index of player "sets". Each player "set" is a Shared Group Data covering a certain range of player levels (if you have that concept - it could be total amount of virtual currency spent to build the level, or any other factor you use to determine how "powerful" the user is). The master index is so that you can resize the number of groups you have, as your player base grows.
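
A minimal sketch of such a master index, written in plain JavaScript. The entry shape (`minLevel`, `maxLevel`, `groupId`) and the group IDs are assumptions made for illustration; in practice the index would be loaded from a Shared Group Data record rather than hard-coded.

```javascript
// Hypothetical master index: each entry maps a level range to the ID of
// the Shared Group Data "set" that holds players in that range.
const masterIndex = [
  { minLevel: 1, maxLevel: 10, groupId: "bucket_levels_1_10" },
  { minLevel: 11, maxLevel: 20, groupId: "bucket_levels_11_20" },
  { minLevel: 21, maxLevel: 30, groupId: "bucket_levels_21_30" },
];

// Return the group ID whose range contains the level, or null if none does.
function findBucketId(index, level) {
  for (const entry of index) {
    if (level >= entry.minLevel && level <= entry.maxLevel) {
      return entry.groupId;
    }
  }
  return null;
}

console.log(findBucketId(masterIndex, 14)); // prints bucket_levels_11_20
```

Because callers only go through `findBucketId`, resizing the buckets later means editing the index entries rather than the calling code.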

Now, each of the Shared Group Data "sets" would have data for the players who fit in the range - the Key for the record would be the modulus of the attacked user's PlayFab ID modulo N, where N is going to be a number you choose based upon how large you want the set of users to be in that Shared Group Data bucket. The Value would then be the PlayFab ID of that user, plus any info you need for the user (time last attacked). There are two main reasons you don't want it to hold 100% of the users:

  1. You need to make sure that the total data returned is under 1 MB (it's possible to query for more, but you'll get throttled past 1 MB, slowing you down),

  2. While you do want to hit users who haven't played in a while, there's a point beyond which you'll want the abandoned players out of the set. This is both to prevent hitting them with Push Notifications too much (which will cause them to turn off Push, usually) and so that you're focusing the attacks on players who still have a good chance of re-engaging.
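
To make the key scheme concrete, here is a hedged sketch in JavaScript. It assumes that PlayFab IDs are hexadecimal strings, that each Value is stored as a JSON string, and that a newer player whose ID maps to an already used key simply replaces the older entry. `bucketKey`, `putPlayer` and the record fields are hypothetical names, and the code relies on `BigInt`, so it needs a runtime that supports it.

```javascript
// Key for a player: the PlayFab ID (read as a hex number) modulo n.
// BigInt is used because a 16-digit hex ID can exceed the range in which
// a JavaScript Number represents integers exactly.
function bucketKey(playFabId, n) {
  return (BigInt("0x" + playFabId) % BigInt(n)).toString();
}

// Store the player under that key. A player with the same key overwrites
// the previous one, so the bucket never holds more than n records.
function putPlayer(bucket, playFabId, lastAttacked, n) {
  bucket[bucketKey(playFabId, n)] = JSON.stringify({ playFabId, lastAttacked });
  return bucket;
}

// With n = 10, IDs 0x1 and 0xB (decimal 11) share key "1",
// so the second write replaces the first.
const exampleBucket = {};
putPlayer(exampleBucket, "0000000000000001", 1442800000000, 10);
putPlayer(exampleBucket, "000000000000000B", 1442900000000, 10);
console.log(Object.keys(exampleBucket)); // prints [ '1' ]
```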

As users level up, you could remove them from their old bucket (the Shared Group Data "set") when you add them to the new one, or you could have an overlap range you calculate, so that players in the top/bottom of a range show up in two buckets (and are updated in each when attacked).
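
The overlap variant can be sketched as follows (plain JavaScript; the index entries, the `overlap` margin and the function name are illustrative assumptions). A player whose level lies within `overlap` levels of a range boundary is listed in both neighbouring buckets:

```javascript
// Hypothetical level ranges, each backed by one Shared Group Data "set".
const levelIndex = [
  { minLevel: 1, maxLevel: 10, groupId: "bucket_levels_1_10" },
  { minLevel: 11, maxLevel: 20, groupId: "bucket_levels_11_20" },
];

// Return every bucket whose range, widened by `overlap` levels on both
// sides, contains the player's level.
function bucketsForLevel(index, level, overlap) {
  return index
    .filter((e) => level >= e.minLevel - overlap && level <= e.maxLevel + overlap)
    .map((e) => e.groupId);
}

console.log(bucketsForLevel(levelIndex, 5, 1));  // only the 1-10 bucket
console.log(bucketsForLevel(levelIndex, 10, 1)); // both buckets
```

Every bucket returned for a player has to be updated when that player is attacked, which is the cost of the overlap.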

When you need to resize your buckets, you would create the new Shared Group Data, and start updating it when you update the old one, until you've reached some time period you've defined internally for the switch to using just the new one.

It's possible to do this with leaderboards as well, having a set of leaderboards instead of the Shared Group Data "sets", but this system gives you the flexibility to have other factors you may want to consider stored as part of the Value, while rolling abandoned users out of the active set.

Brendan


Byterockers' Games said on Tue, 22 September 2015 at 12:59 AM

Hi Brendan,

thank you very much for your answer. I like your concept of using the group data, and I will definitely try that.

But I don't understand what you need the modulo for when creating the group key:
"the Key for the record would be the modulus of the attacked user's PlayFab ID modulo N"
How does it help to find players? Please explain a bit more =D

And how are these data sets ordered? Is it like a list (FIFO)? Basically, is there a guaranteed order when I iterate through the dictionary in Cloud Script like this:

for (var key in dictionary) {
  // do something with key
}

Or does your idea mean I have to iterate over every player in the group data bucket and check whether (s)he is the one who has not been attacked for the longest time?


Brendan Vanous said on Tue, 22 September 2015 at 3:47 PM

What I'm referring to is only storing a subset of players. So, PlayFab ID 1 mod 10 would be 1, as would PlayFab ID 11 mod 10. In the case of using 10 for your modulo operation, you'd only be storing a total of 10 entries (obviously, you'd pick a higher number). In this design, you would need to iterate on the list of players, to find the one who was attacked least recently (as well as matching on any other factors you might want to store and compare), which is another reason why you'll want to have multiple Shared Group Data "sets" across your user base.
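
That iteration could look roughly like this in JavaScript. The record format (a JSON string holding `playFabId` and a `lastAttacked` timestamp) and the function names are assumptions for illustration; reading and writing the Shared Group Data itself is left out.

```javascript
// bucket: key -> JSON string { playFabId, lastAttacked }.
// Returns { key, record } for the least recently attacked player other
// than the attacker, or null if the bucket holds nobody else.
function pickTarget(bucket, attackerId) {
  let best = null;
  for (const [key, value] of Object.entries(bucket)) {
    const record = JSON.parse(value);
    if (record.playFabId === attackerId) continue; // never match a player with themselves
    if (best === null || record.lastAttacked < best.record.lastAttacked) {
      best = { key, record };
    }
  }
  return best;
}

// Stamp the chosen player with the given time so that the next search
// prefers somebody else.
function markAttacked(bucket, target, now) {
  bucket[target.key] = JSON.stringify({ ...target.record, lastAttacked: now });
}

const sampleBucket = {
  "1": JSON.stringify({ playFabId: "A1", lastAttacked: 300 }),
  "2": JSON.stringify({ playFabId: "B2", lastAttacked: 100 }),
  "3": JSON.stringify({ playFabId: "C3", lastAttacked: 200 }),
};
const firstTarget = pickTarget(sampleBucket, "A1");
console.log(firstTarget.record.playFabId); // prints B2
markAttacked(sampleBucket, firstTarget, 400);
console.log(pickTarget(sampleBucket, "A1").record.playFabId); // prints C3
```

The scan does not depend on the order in which the entries come back, and any extra matching factors stored in the record can be checked inside the same loop.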

Brendan


Byterockers' Games said on Wed, 23 September 2015 at 12:07 AM

Hi Brendan,

thank you for your answer. Unfortunately, it does not really help me. As far as I understand, the PlayFab IDs are 16-digit hexadecimal numbers stored in a string. As far as I know, JavaScript does not support unsigned 64-bit integers (UInt64), which would be needed to hold such a big number. But even if they were normal integers, the group would never be filled with consecutive IDs. Another reason modulo won't work is that the IDs would need to be ordered in the list as well. Please correct me if I am wrong, but I can only find reasons why modulo will not work.

Finally, I still don't understand why I need modulo in the first place, since I have to iterate through the whole set anyway. As for the performance of your servers, this seems to be much less efficient than running a (normally highly optimized) query on a database.

Last time, I also asked: "And how are these data sets ordered? Is it like a list (FIFO)?" When I add an entry to the dictionary, will it become the first entry, the last entry, or end up somewhere random?

Is there a way to get faster support? We are located in Europe, and it looks like you are working from America, so I understand it takes a while to answer. But this issue has been blocking me for days now and is jeopardizing our sprint goal. I am really sorry, but I cannot wait a whole day for an incomplete answer.

Kind regards


Brendan Vanous said on Thu, 24 September 2015 at 11:03 AM

Hi again,

I actually meant you would pass up the modulo value - do you mean your title is written in JavaScript? If so, I believe there are libraries which enable 64 bit operations - I can help you to track one down, if needed.
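
For the modulo alone, a 64-bit library is not strictly required: the remainder of a hexadecimal string can be computed one digit at a time with ordinary numbers. A minimal sketch (the function name `hexMod` is made up for this example, and it assumes the ID contains only hex digits):

```javascript
// Compute (hex string) mod n digit by digit. The running remainder is
// always below n, so no intermediate value reaches 16 * n and plain
// JavaScript Numbers are sufficient.
function hexMod(hexId, n) {
  let remainder = 0;
  for (const ch of hexId) {
    const digit = parseInt(ch, 16);
    if (Number.isNaN(digit)) {
      throw new Error("not a hex digit: " + ch);
    }
    remainder = (remainder * 16 + digit) % n;
  }
  return remainder;
}

console.log(hexMod("000000000000000B", 10));   // prints 1
console.log(hexMod("FFFFFFFFFFFFFFFF", 1000)); // prints 615
```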

I'm not clear on why you would need consecutive IDs, however. The intent is that you would store a sub-set of users who fit in a particular bucket. This is in large part to make sure you're removing from the active set players who have abandoned the game, since having active players interacting is far more valuable. If you don't feel that's important, then the original design discussed above, using a leaderboard, should work fine.

Just to make sure you have 100% of the info: We do not provide ad-hoc queries into the user data. Relational databases are not highly scalable, requiring sharding to grow to support the levels of data required for massive user bases. So the two solutions presented are 1) to use leaderboards, as you described (and using GetLeaderboardAroundCurrentUser), which is a workable solution, though it does not remove abandoned players, and 2) to use a Shared Group Data system as described, so that you have a "living" dataset of your player base, and can match people who are known to be active. The Shared Group Data is a DynamoDB data store, and technically multiple nodes, so the order of the data in that set is not guaranteed. You would indeed pull the data for the set, then iterate on it to make a decision concerning the best match.

Believe me, we do appreciate that we have an international community of developers, and we're aware that different timezones can cause delays. We take pains to answer issues as quickly and completely as we can, but it can take a day or two for issues to be resolved at times. We're doing two things right now, to address that. First and foremost, we've been searching for an additional technical representative to be based in the EU, along with our current rep, Mark Val. If you know of anyone who would be interested in this opportunity, please do give them this link: https://playfab.com/jobs/?gh_jid=66232. Second, we are working to find ways to get the community more engaged, so that developers feel that they can exchange ideas and discuss issues directly in our forums. This is already an option, and we have several users you may have encountered who are quite active themselves. If you have recommendations on ways you'd like to see the community grow, feel free to post them to the General Discussion board, or email them to us at devrel@playfab.com.

That said, if you have a need for more direct support, we do have the ability to provide support contracts. If you'd like to look into that option, please email us at devrel, and we can put you in touch with the sales team, who can assist.

Brendan


Byterockers' Games said on Fri, 25 September 2015 at 1:58 AM

Hi Brendan,

thank you for your detailed answer. Now I start to understand our misunderstanding ;)

We do not use JavaScript; we use C# (Unity3D) in our project. But I would not do this calculation on the client side: for security reasons, the target selection should happen on the server side. So when I wrote JavaScript, I actually meant your Cloud Script.

You mentioned the Shared Group Data is a DynamoDB data store, so technically querying should not be difficult or performance-intensive.

But anyway, that is your decision and business model, and I won't argue with that. Just for your information, we moved to a competitor of yours, where we have much more control over our data, even with more or less direct access to the noSQL database.

Best regards


Brendan Vanous said on Sat, 26 September 2015 at 1:13 PM

Hi again,

Sorry to hear that, but I'm a bit confused. What you were originally asking about was a relational database model. What we provide is specifically a noSQL database - the DynamoDB key/value datastore model - so that we give titles as much flexibility as possible in their data while ensuring that they can scale to millions of users reliably and without complications that would cause increased costs at high CCU/DAU levels. Some good questions to make sure any backend service can answer are: what level of concurrency can they support, and what are their prices as you scale to millions of daily users? Do they have examples of high DAU titles operating on their service? (Obviously, we don't share private information from titles, but the number of user ratings on the iOS/Android pages is a good indicator of the size of the user base - look for titles with at least 250K ratings on a single platform, and preferably those with at least 500K across both.) Also, how flexible is your title's configuration? Can you make changes to Title Data or cloud-hosted logic if you need to respond to issues quickly, and have those changes immediately be reflected across all users?

For our service, the answers would be that we've tested to millions of daily users with no issues. The datastore is triple-redundant with daily backup, and nothing special needs to be done if a title suddenly scales to that level (which is what you see if you get featured on Penny Arcade, for example). Our pricing page always shows our current offering, and it does show a progression through to a million users, though I'll be very frank - if you're at that level of usage, we're going to give you a break on the pricing, as economy of scale benefits you, in that case. The most recent large-user-base title on our service is AdVenture Capitalist (iOS/Android). Any changes made to your game configuration, whether data or logic, are immediately available to the client.

Brendan


Byterockers' Games said on Mon, 28 September 2015 at 12:02 AM

I did not want to advertise another company in your forum, but since you ask: we went to GameSparks, which is based on MongoDB and gives us all the search features we need. They might not be as big as you, although that is just an assumption, since I have no idea about your or their revenue, user base or the like. Their best title, Lara Croft: Relic Run, "only" has about 170k ratings, but actually has more users on Google Play than your AdVenture Capitalist. Sorry, but yes, they give us much more flexibility than PlayFab.

"Can you make changes to Title Data or cloud-hosted logic if you need to respond to issues quickly, and have those changes immediately be reflected across all users?"
The answer is Yes!

"What we provide is specifically a noSQL database - the DynamoDB key/value datastore model - so that we give titles as much flexibility as possible"
Like I mentioned, we need more flexibility than is "possible". And a noSQL database does not mean that you cannot run efficient and scalable search queries. I think the opposite is actually the case.

I don't want to be rude or to talk trash; I am just giving you honest feedback about our needs as game developers, so that you might be able to improve your service for others.

Kind regards


Brendan Vanous said on Mon, 28 September 2015 at 11:40 AM

Thanks for clarifying - we always appreciate feedback. Given your game, and given our current features, I agree that for you going with GameSparks makes sense.

Part of the reason we haven't offered direct database queries is because it can cause big performance issues. We designed our system to scale to support millions of players, and while you are absolutely correct that it is possible to write a noSQL query which would return results quickly, it's also possible to write queries that can bring down the whole system.

Going forward, we plan on addressing your needs in two ways:

1) Making it possible to activate a custom database just for your title. Next year we'll be making it possible to add on additional services, and for games that want their own per-title database, we can make that available -- this avoids the risk of one game affecting the performance of other games (as you would see in a multi-tenant system like the one GameSparks provides).

I'm sure you realize, however, that this effectively shifts the responsibility for writing and supporting the database to the developer. It requires you to define the schema and queries, and if anything goes wrong with them, it's on you to fix it -- and not having to deal with that sort of thing is one of the reasons why we think developers want backend systems like PlayFab.

2) Providing an asynchronous match-making feature that can find other players in our system based on parameters you provide. We've heard this request often enough that it's on our roadmap, though we don't have a date yet. But it's clear that a feature like this would have met your needs. We already have match-making for synchronous games; extending that to asynchronous would be the next step.

As you go down the path with GameSparks, we'd love to hear how it goes -- our services are quite different and we'd love to hear about what you find the strengths and weaknesses to be!

Brendan


Byterockers' Games said on Wed, 30 September 2015 at 12:21 AM

Hi Brendan,

both features you described sound really great. Sure, that brings great responsibility for the developers. The threat of theoretically limited API calls (GameSparks has a fair use agreement on that, but it is hard to know how your game actually performs) pushes the developer to build efficient code from the beginning. But like I said, more transparency on that would be nice.

I'll try to keep you informed about our experience with GameSparks.

Best regards


Brendan Vanous said on Wed, 30 September 2015 at 1:17 AM

Thanks - we'll have all the details on our fair use limits posted shortly, so that you'll be able to review them.

Brendan
