Intro to User Analytics
by Anders Drachen, Alessandro Canossa, Magy Seif El-Nasr
3) Core business attributes: The essential attributes related to the core of the business model of the company, for example, logging every time a user purchases a virtual item (and what that item is), establishes a friend connection in-game, or recommends the game to a Facebook friend — or any other attributes related to revenue, retention, virality, and churn. For a mobile game, geolocation data can be very interesting to assist target marketing. In a traditional retail situation, none of these are of interest, of course.
4) Stakeholder requirements: In addition, there can be an assortment of stakeholder requirements that need to be considered. For example, management or marketing may place a high value on knowing the number of Daily Active Users (DAU). Such requirements may or may not align with the categories mentioned above.
5) QA and user research: Finally, if there is any interest in using telemetry data for user research/user testing and quality assurance (recording crashes and crash causes, hardware configuration of client systems, and notable game settings), it may be necessary to augment the list of attributes accordingly.
When building the initial attribute set and planning the metrics that can be derived from them, you need to make sure that the selection process is as well informed as possible, and includes all the involved stakeholders. This minimizes the need to go back to the code and embed additional hooks at a later time — which is a waste that can be eliminated with careful planning.
That being said, as the game evolves during production and after launch (whether as a persistent game or through DLC and patches), it will typically be necessary to embed some new hooks in the code in order to track new attributes and thus sustain an evolving analytics practice. Sampling is another key consideration. It may not be necessary to track every time someone fires a gun; logging only 1 percent of these events may be enough. Sampling is a big issue in its own right, and we will therefore not delve further into the subject here, apart from noting that sampling can be an efficient way to cut resource requirements for game analytics.
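As a rough illustration of the sampling idea above, the following Python sketch logs only a fraction of high-frequency events. The event names, the `SAMPLE_RATES` table, and the `should_log` helper are hypothetical and exist only for this example; a real telemetry pipeline would differ.

```python
import random

# Hypothetical per-event sampling rates: keep 1 percent of shots, every purchase.
SAMPLE_RATES = {"weapon_fired": 0.01, "purchase": 1.0}

def should_log(event_type, sample_rates, rng=random):
    """Return True if this occurrence of event_type should be sent as telemetry."""
    rate = sample_rates.get(event_type, 1.0)  # events without a rate are always logged
    return rng.random() < rate

rng = random.Random(42)  # seeded only so the sketch is repeatable
logged = sum(should_log("weapon_fired", SAMPLE_RATES, rng) for _ in range(100_000))

# When analysing, scale the sampled count back up to estimate the true total.
estimated_total = logged / SAMPLE_RATES["weapon_fired"]
```

The scaled-up figure is only an estimate, so sampled attributes suit aggregate questions (how often is each weapon fired?) better than questions about individual events.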
Figure 2: The drivers of attribute selection for user behavior attributes. Given the broad scope of application of game analytics, a number of sources of requirements exist.
One important factor to consider during the feature selection process is the extent to which your attribute set can be driven by pre-planning: defining the game metrics and analysis results (and thereby the actionable insights) you wish to obtain from user telemetry, and selecting attributes accordingly.
Reducing complexity is necessary, but as you restrict the scope of the data-gathering process, you run the risk of missing important patterns in user behavior that cannot be detected using the preselected attributes. This problem is exacerbated in situations where the game metrics and analyses are also predefined. For example, relying solely on a set of Key Performance Indicators (such as DAU, MAU, ARPU, and LTV) can eliminate your chance of finding any patterns in the behavioral data that are not detectable via the predefined metrics and analyses. In general, the best solution is to strike a balance between a tightly predefined attribute set and broader, exploratory tracking, depending on the available analytics resources. For example, focusing exclusively on KPIs will not tell you about in-game behavior, e.g., why 35 percent of the players drop out on level 8; for that we need to look at metrics related to design and performance.
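To make the KPI discussion concrete, here is a minimal Python sketch that derives DAU and ARPU from a toy event log. The row layout `(day, user_id, revenue)`, the sample data, and both function names are assumptions made for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical telemetry rows: (day, user_id, revenue_in_usd)
events = [
    (date(2024, 1, 1), "u1", 0.0),
    (date(2024, 1, 1), "u2", 1.99),
    (date(2024, 1, 1), "u1", 0.0),
    (date(2024, 1, 2), "u1", 4.99),
    (date(2024, 1, 2), "u3", 0.0),
]

def daily_active_users(rows):
    """Count distinct users seen on each day."""
    users_by_day = defaultdict(set)
    for day, user_id, _ in rows:
        users_by_day[day].add(user_id)
    return {day: len(users) for day, users in users_by_day.items()}

def arpu(rows):
    """Average revenue per user over the whole period covered by rows."""
    users = {user_id for _, user_id, _ in rows}
    revenue = sum(amount for _, _, amount in rows)
    return revenue / len(users) if users else 0.0

dau = daily_active_users(events)
average_revenue = arpu(events)
```

Nothing in these rows records what a player did inside the game, which is why figures of this kind cannot explain a drop-out on a particular level.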
It is worth noting that when it comes to user analytics, we are working with human behavior, which is notoriously unpredictable. This means that predicting user analytics requirements can be challenging. This emphasizes the need to use both explorative methods (we look at the user data to see what patterns they contain) and hypothesis-driven methods (we know what we want to measure and know the possible results, just not which one is correct).
Strategies Driven by Designers’ Knowledge
During gameplay, a user creates a continual loop of actions and responses that keep the game state changing. This means that at any given moment, there can be many features of user behavior that change value. A first step toward isolating which features to employ during the analytical process could be a comprehensive and detailed list of all possible interactions between the game and its players. Designers are extremely knowledgeable about all possible interactions between the game and players; it’s beneficial to harness that knowledge and involve designers from the beginning by asking them to compile such lists.
Secondly, considering the sheer number of variables involved even in the simplest game, it is necessary to reduce the complexity through a knowledge-driven factor reduction: Designers can easily identify isomorphic interactions. These are groups of interactions, behaviors, and state changes that are essentially similar even if formally slightly different. For example, “restoring 5 HP with a bandage” and “healing 50 HP with a potion” are formally different but essentially similar behaviors. The isomorphic interactions are then grouped into larger domains. Lastly, you need to identify measures that capture all isomorphic interactions belonging to each domain. For example, for the domain “healing,” it’s not necessary to track the number of potions and bandages used; it is enough to record every state change to the variable “health.”
These domains have not been derived through objective factor reduction; there is a clear interpretive bias any time humans are asked to group elements in categories, even if designers have exhaustive expert knowledge. These larger domains can potentially contain all the possible behaviors that players can express in a game and at the same time help select which game variables should be monitored, and how.
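The “healing” domain described above could be instrumented roughly as follows. This is a sketch under assumptions: the `Player` class, its `change_health` method, and the in-memory `log` list (standing in for a real telemetry sink) are all invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Player:
    player_id: str
    health: int = 100
    log: list = field(default_factory=list)  # stand-in for a telemetry sink

    def change_health(self, delta, source):
        """Single choke point: every heal or hit passes through here."""
        old = self.health
        self.health = max(0, min(100, old + delta))  # clamp to 0..100
        if self.health != old:
            # One record per state change, whatever item caused it.
            self.log.append({"player": self.player_id, "attr": "health",
                             "old": old, "new": self.health, "source": source})

p = Player("u1", health=40)
p.change_health(+5, "bandage")   # formally different interactions...
p.change_health(+50, "potion")   # ...captured by the same measure
```

Because the hook sits on the variable rather than on each item, a new healing item added later is tracked without further code changes.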
Strategies Driven by Machine Learning
Machine learning is a field of study that gives computers the ability to learn without being explicitly programmed. More than an alternative to designer-driven strategies, automated feature selection is a complementary approach to reducing the complexity of the hundreds of state changes generated by player-game interactions. Traditionally, automated approaches are applied to existing datasets, relational databases, or data warehouses, meaning that the process of analyzing game systems, defining variables, and establishing measures for such variables falls outside the scope of automated strategies; humans have already defined which variables to track and how. Therefore, automated approaches identify only the most relevant and the most discriminating features out of all the variables monitored.
Automated feature selection relies on algorithms to search the attribute space and drop features that are highly correlated with others; algorithms can range from simple to complex. Methods include approaches such as clustering, classification, prediction, and sequence mining. These can be applied to find the most relevant features, since the presence of features that are not relevant for the definition of types affects the similarity measure, degrading the quality of the clusters found by the algorithm.
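At the simple end of that range, a correlation filter might look like the Python sketch below. It is one possible greedy approach, not the method any particular studio uses; the feature names, the sample values, and the 0.95 threshold are assumptions chosen for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treated as uncorrelated in this sketch
    return cov / sqrt(vx * vy)

def drop_correlated(features, threshold=0.95):
    """Greedy filter: keep a feature only if it is not highly
    correlated with any feature that has already been kept."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical per-player values for three tracked features.
features = {
    "shots_fired": [120, 300, 80, 450, 220],
    "ammo_used":   [118, 305, 79, 452, 219],  # nearly duplicates shots_fired
    "deaths":      [9, 2, 7, 5, 1],
}

selected = drop_correlated(features)
```

With this data, `ammo_used` is dropped because it carries almost the same information as `shots_fired`, while `deaths` is kept. The result depends on the order in which features are visited, which is one reason more elaborate methods exist.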
In a situation with infinite resources, it is possible to track, store, and analyze every user-initiated action — all the server-side system information, every fraction of a move of an avatar, every purchase, every chat message, every button press, even every keystroke. Doing so will likely cause bandwidth issues, and will require substantial resources to add the message hooks into the game code, but in theory, this brute-force approach to game analytics is possible.
However, it leads to very large datasets, which in turn leads to huge resource requirements in order to transform and analyze them. For example, tracking weapon type, weapon modifications, range, damage, target, kills, player and target positions, bullet trajectory, and so on, will enable a very in-depth analysis of weapon use in an FPS. However, the key metrics to evaluate weapon balancing could just be range, damage done, and the frequency of use of each weapon. Adding a number of additional variables/features may not add any new relevant insights, or may even add noise or confusion to the analysis. Similarly, it may not be necessary to log behavioral telemetry from all players of a game, but only a percentage (this is of course not the case when it comes to sales records, because you will need to track all revenue).
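One way to log behavioral telemetry for only a percentage of players, while still recording all revenue, is sketched below in Python. The hashing scheme, the 10 percent default, and the function names are assumptions for illustration only.

```python
import hashlib

def in_sample(player_id, percent):
    """Deterministically place a player in or out of the sampled cohort."""
    digest = hashlib.sha256(player_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket 0..99
    return bucket < percent

def should_record(player_id, event_type, percent=10):
    """Decide whether to store an event for this player."""
    if event_type == "purchase":
        return True  # sales records are never sampled
    return in_sample(player_id, percent)

# Roughly 10 percent of these hypothetical player IDs land in the cohort.
cohort = [f"player{i}" for i in range(10_000) if in_sample(f"player{i}", 10)]
```

Hashing the player ID, rather than drawing a random number per event, keeps each sampled player's history complete, so sequences of behavior can still be analyzed.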
In general, if selected correctly, the first variables/features that are tracked, collected, and analyzed will provide a lot of insight into user behavior. As more and more detailed aspects of user behavior are tracked, costs of storage, processing, and analysis increase, but the rate of added value from the information contained in the telemetry data diminishes.
What this means is that there is a cost-benefit relationship in game telemetry, which basically describes a simplified theory of diminishing returns: Increasing the amount of one source of data in an analysis process will yield a lower per-unit return.
A classic example in economic literature is adding fertilizer to a field. In an unbalanced system (underfertilized), adding fertilizer will increase the crop size, but after a certain point this increase diminishes, stops, and may even reduce the crop size. Adding fertilizer to an already-balanced system does not increase crop size, or may reduce it.
Fundamentally, game analytics follow a similar principle. An analysis can be optimized up to a specific point given a particular set of input features/variables, before additional (new) features are necessary. Additionally, increasing the amount of data fed into an analysis process may reduce the return, or in extreme cases lead to a negative return due to noise and confusion added by the additional data. There can of course be exceptions: for example, the cause of a problematic behavioral pattern that decreases retention in a social online game can rest in a single small design flaw, which can be hard to identify if the specific behavioral variables related to the flaw are not tracked.
Goals of User-Oriented Analytics
User-oriented game analytics typically have a variety of purposes, but we can broadly divide them into the following:
Strategic analytics, which target the global view on how a game should evolve based on analysis of user behavior and the business model.
Tactical analytics, which aim to inform game design in the short term, for example through an A/B test of a new game feature.
Operational analytics, which target analysis and evaluation of the immediate, current situation in the game, for example, informing what changes you should make to a persistent game to match user behavior in real time.
To an extent, operational and tactical analytics inform technical and infrastructure issues, whereas strategic analytics focuses on merging user telemetry data with other user data and/or market research.
When you’re plotting a strategy for approaching your user telemetry, the first factors you should concern yourself with are the existence of these three types of user-oriented game analytics, the kinds of input data they require, and what you need to do to ensure that all three are performed, and the resulting data reported to the relevant stakeholder.
The second factor to consider is how to satisfy both the needs of the company and the needs of the users. The fundamental goal of game design is to create games that provide a good user experience. However, the fundamental goal of running a game development company is to make money (at least from the perspective of the investors). Ensuring that the analytics process generates output supporting decision-making toward both of these goals is vital. Essentially, the underlying drivers for game analytics are twofold: 1) ensuring a quality user experience, in order to acquire and retain customers; 2) ensuring that the monetization cycle generates revenue, irrespective of the business model in question. User-oriented game analytics should inform both design and monetization at the same time. This approach is exemplified by companies that have been successful in the F2P marketplace and that use analysis methods like A/B testing to evaluate whether a specific design change improves both user experience (retention is sometimes used as a proxy) and monetization.
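As a hedged illustration of how such an A/B comparison might be evaluated, the Python sketch below applies a standard two-proportion z-test to retention counts. The counts are invented, the function name is hypothetical, and the choice of this particular test is an assumption rather than something the companies mentioned are known to use.

```python
from math import sqrt, erfc

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all: nothing to distinguish
    z = (p_b - p_a) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p_value

# Hypothetical day-7 retention: 400 of 2000 players retained in control (A),
# 460 of 2000 retained with the design change (B).
z, p_value = two_proportion_z_test(success_a=400, n_a=2000,
                                   success_b=460, n_b=2000)
```

The same function can be run on conversion counts (players who made a purchase) to check the monetization side of the same change, so that both goals are evaluated from one experiment.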
Up to this point, the discussion about feature selection has been at a somewhat abstract level, attempting to generate categories that guide selection and ensure comprehensive coverage, rather than generating lists of concrete metrics (shots fired/minute per weapon, kill/death ratio, jump success ratio). This is because it is nigh-on impossible to develop generic guidelines for metrics across all types of games and usage situations. This is not just because games do not fall within neat design classes (games share a vast design space and do not cluster at specific areas of it), but also because the rate of innovation in design is high, which would rapidly render recommendations invalid. Therefore, the best advice we can give on user analytics is to develop models from the top down, so you can ensure comprehensive coverage in data collection, and from the core out, starting from the main mechanics driving the user experience (for helping designers) and monetization (for helping make sure designers get paid). Additional detail can be added as resources permit. Finally, try to keep your decisions and process fluid and adaptable; it’s necessary in an industry as competitive and exciting as ours. (Source: Gamasutra)