Question from a developer:
Once I've set up my S3 bucket for the PlayStream Event Archive, how are the events delivered? What's the format of the files, and how are they organized?
Edit: Updating and clarifying export timing.
In your S3 bucket, if you've specified a Prefix, there will be a folder with that name, where all the events will be delivered. If you didn't specify a Prefix, events are delivered starting at the root of the bucket.
First, there is a sub-folder for every event type, with the fully qualified event name as the folder name. Inside that are folders for years, inside those are folders for months, and inside those are folders for days. In the day folders, we write the events to .gz (GZip) files, up to once every 10-20 seconds (depending on the volume of events, files may be written significantly less often). The file names contain the fully qualified event name, a timestamp, version info, and a generated value.
So, I just ran a test in my own Title ID (5F4), logging in a user three times in a row. There's now a file in my S3 bucket in:
com.playfab.player_logged_in/2016/10/27/
named:
playfab-events-com.playfab.player_logged_in-2016-10-27T1744Z-v1-9e2f8385.jsonstream.gz
In it are the events for that 10-second period - the login calls I made:
{"EventName":"player_logged_in","Platform":"PlayFab","PlatformUserId":"ADADCB6F9583EA99","EventNamespace":"com.playfab","EntityType":"player","Source":"PlayFab","TitleId":"5F4","EventId":"593ed883a20e499ea69d77cfd48be040","EntityId":"ADADCB6F9583EA99","SourceType":"BackEnd","Timestamp":"2016-10-27T17:44:36.7772602Z","History":null,"CustomTags":null,"Reserved":null}
{"EventName":"player_logged_in","Platform":"PlayFab","PlatformUserId":"ADADCB6F9583EA99","EventNamespace":"com.playfab","EntityType":"player","Source":"PlayFab","TitleId":"5F4","EventId":"f8e0ddb47a774cb6900b8179e41a98be","EntityId":"ADADCB6F9583EA99","SourceType":"BackEnd","Timestamp":"2016-10-27T17:44:37.049571Z","History":null,"CustomTags":null,"Reserved":null}
{"EventName":"player_logged_in","Platform":"PlayFab","PlatformUserId":"ADADCB6F9583EA99","EventNamespace":"com.playfab","EntityType":"player","Source":"PlayFab","TitleId":"5F4","EventId":"08dae4a808634bd4a93011e90de9f0cc","EntityId":"ADADCB6F9583EA99","SourceType":"BackEnd","Timestamp":"2016-10-27T17:44:37.7936194Z","History":null,"CustomTags":null,"Reserved":null}
You'll notice that these are the same as the JSON you see in the PlayStream tabs of the Game Manager. This makes it easy to parse the data, whether into Redshift or another database, for analysis.
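If you want to pull the pieces of a file name apart (event name, timestamp, version, generated value), a regular expression is one way to do it. This is only a sketch: the pattern is inferred from the single sample file name in this thread, not from a documented format, and the names ARCHIVE_NAME and parse_archive_name are made up for illustration.

```python
import re

# Pattern inferred from one sample file name in this thread; treat it as an
# assumption rather than a documented naming scheme.
ARCHIVE_NAME = re.compile(
    r"^playfab-events-(?P<event>.+)-"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}T\d{4}Z)-"
    r"(?P<version>v\d+)-"
    r"(?P<suffix>[0-9a-f]+)\.jsonstream\.gz$"
)


def parse_archive_name(file_name):
    """Return the parts of the name as a dict, or None if it does not match."""
    match = ARCHIVE_NAME.match(file_name)
    return match.groupdict() if match else None


parts = parse_archive_name(
    "playfab-events-com.playfab.player_logged_in-2016-10-27T1744Z-v1-9e2f8385.jsonstream.gz"
)
print(parts)
```

Returning None for names that don't match lets you skip anything else that happens to live in the same bucket.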
Approximately how frequently are events written to the event archive? (e.g. approximately how long does it take for an event to show up in the S3 bucket after being generated via the server API?)
It varies, but as long as there's no connection issue between the AWS datacenters involved (which can happen, but is rare), it would be less than a minute. We package all the events for a period of time (10-20 seconds) and upload them as a single .gz file to a folder based on the event. The path is basically: {TitleID}/{event}/{year}/{month}/{day}/{.gz files go here}.
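To make the folder layout concrete, here is a small sketch of a helper that builds the day-level path for one event type, which can be handy when listing a single day's files. The function name day_prefix and its root parameter are hypothetical; root stands for whatever leading folder your bucket uses (or an empty string if there is none), and the two-digit month and day are an assumption based on the 2016/10/27 sample.

```python
from datetime import date


def day_prefix(root, event_name, day):
    """Build the folder path holding one event type's files for one day."""
    # Zero-padding of month and day is assumed, not confirmed by the thread.
    parts = [event_name, f"{day.year:04d}", f"{day.month:02d}", f"{day.day:02d}"]
    if root:
        parts.insert(0, root.strip("/"))
    return "/".join(parts) + "/"


example = day_prefix("", "com.playfab.player_logged_in", date(2016, 10, 27))
print(example)
```

With an empty root this reproduces the folder from the login test earlier in the thread.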
@Brendan When I try to parse the event archive file with Python, I get an error like so:
    events = json.loads(u_str)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
using
import json
import gzip
fp = gzip.open("test.gz")
contents = fp.read()  # contents now has the uncompressed bytes of foo.gz
fp.close()
u_str = contents.decode('utf-8')  # u_str is now a unicode string
print("data: ", u_str)
events = json.loads(u_str)
with or without utf-8. There seem to be some additional characters in the file. Is there a specific way to parse the PlayFab file for the event JSON? Please help.
Bear in mind that a JSON Stream file is newline delimited, while loads is looking to read a JSON object. If you use a "for line" to pass each line of JSON data into your loads call, that should work.
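As a rough sketch of that line-by-line approach in Python 3 (the function name read_events and the file name test.gz are just placeholders, and skipping blank lines is a defensive assumption on my part):

```python
import gzip
import json


def read_events(path):
    """Yield one dict per non-empty line of a gzipped JSON Stream file."""
    # "utf-8-sig" decodes plain UTF-8 and also tolerates a leading
    # byte-order mark, should one be present.
    with gzip.open(path, "rt", encoding="utf-8-sig") as stream:
        for line in stream:
            line = line.strip()
            if line:  # skip blank lines rather than failing on them
                yield json.loads(line)


if __name__ == "__main__":
    try:
        for event in read_events("test.gz"):
            print(event["EventName"], event["Timestamp"])
    except FileNotFoundError:
        print("test.gz not found; point read_events at a downloaded archive file")
```

Each line is decoded on its own, so json.loads only ever sees a single JSON object at a time.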