Building tight games with game metrics
by Jens Peter Jensen
In July 2012 Daniel Cook, Chief Creative Officer at Spry Fox, published a blog post about making games easy to understand and play, which he called “Building tight game systems of cause and effect”. Here are a number of game metrics ideas and suggestions inspired by the techniques described in that post.
How to build “tightly”
In his blog post, Daniel Cook explains that “a tight system has clearly defined cause and effect. A loose system makes it more difficult to distinguish cause and effect relationships. There is no correct ‘tightness’ of a loop.” He then enumerates several methods to tweak the tightness of a game under development. In this post, we have tried to translate those aspects into actual game and player metrics. The list below is, of course, far from a complete guide to using game metrics to track game design “tightness”. As Daniel himself writes:
“Not all systems are readily amenable to the intuitive formation of models of cause and effect.”
Not all of the methods apply to all games. Depending on what the game designer is trying to accomplish, some methods will be important and others will not make any sense at all.
As a consequence, the game metrics examples and ideas mentioned here will not be applicable to all games and situations. Each tracking metric is generally designed to answer one specific question or query. The process of data tracking should always start with a question, and only then should the data to track be selected. Going the other way (from data to questions) can be just as interesting, but in game development, budget and time constraints make such strategies unfeasible. Therefore, the metrics mentioned here are specific to a very narrow set of tasks. Hopefully, they can be useful to game developers or inspire other uses of game metrics.
1. Strength of feedback
“Tighter: Multiple channels of aligned feedback such as colour, animation, sound, and touch that reinforce one another.”
“Looser: One channel of feedback that is weakly evident. In multiplayer FPS games often the only sense that you have that another player is near comes from the faint patter of their footsteps. Expert players gain immense satisfaction from being able to predict the location of their opponent by combining knowledge of the levels with tiny hints of where they might be.”
Corresponding Metric: Alarm and reaction index
If you want to track user response to feedback, there are several methods available depending on the desired feedback. For instance, if the player gets a warning and does not react to it, that indicates that the feedback from the warning is not strong enough. There are, of course, other factors that influence whether a player reacts to the alarm. Also, the player might acknowledge the alarm and choose to not react to it. But if there are more players who do not react than those who react, the feedback is probably not adequate and needs to be strengthened.
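As a rough sketch of such an alarm and reaction index, the Python snippet below computes the share of warnings that were followed by a player reaction within a time window. The `WarningEvent` record, its field names, and the five-second window are assumptions made for illustration; they do not come from any particular analytics tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WarningEvent:
    """One warning shown to one player (hypothetical telemetry record)."""
    player_id: str
    shown_at: float               # seconds since session start
    reacted_at: Optional[float]   # None if the player never reacted


def reaction_index(events: List[WarningEvent], window_s: float = 5.0) -> float:
    """Fraction of warnings followed by a reaction within window_s seconds."""
    if not events:
        return 0.0
    reacted = sum(
        1
        for e in events
        if e.reacted_at is not None and 0 <= e.reacted_at - e.shown_at <= window_s
    )
    return reacted / len(events)


events = [
    WarningEvent("a", 10.0, 12.0),   # reacted after 2 s
    WarningEvent("b", 30.0, None),   # never reacted
    WarningEvent("c", 5.0, 20.0),    # reacted, but far too late to count
    WarningEvent("d", 8.0, 9.5),     # reacted after 1.5 s
]
print(f"reaction index: {reaction_index(events):.2f}")
```

An index below 0.5 corresponds to the case described above, where more players ignore the warning than react to it.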
Hungry Shark Evolution pressure warning
It is also interesting to track indicators with game metrics. If the game uses indicators to convey game status or objectives, it could be interesting to track how much the player reacts to them. For instance if there is a pointer on the screen to lead the player towards the game objective, it would make sense to track how much the player actually follows the pointer. Again, there are many factors affecting whether a player follows a pointer: for example, whether or not it is the only objective and what possibility the game offers for exploration. But if the game is designed so that the player follows something and the player more often than not does not follow that thing, it would indicate that the feedback of the indicator is not strong enough.
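One possible way to quantify "how much the player follows the pointer" is to compare the direction the player moves with the direction the pointer shows. The sketch below averages the cosine similarity of the two directions over sampled frames; the sample format and the function name `follow_score` are assumptions for illustration only.

```python
import math


def follow_score(samples):
    """Mean cosine similarity between player movement and pointer direction.

    samples: list of ((move_x, move_y), (pointer_x, pointer_y)) pairs.
    Returns a value in [-1, 1]: 1 means the player always moves where the
    pointer points, -1 means always the opposite way. Samples where the
    player stands still (or the pointer has no direction) are skipped.
    """
    scores = []
    for (mx, my), (px, py) in samples:
        move_len = math.hypot(mx, my)
        pointer_len = math.hypot(px, py)
        if move_len == 0 or pointer_len == 0:
            continue
        scores.append((mx * px + my * py) / (move_len * pointer_len))
    return sum(scores) / len(scores) if scores else 0.0


samples = [
    ((1.0, 0.0), (1.0, 0.0)),    # moving exactly along the pointer
    ((0.0, 1.0), (1.0, 0.0)),    # moving sideways
    ((0.0, 0.0), (1.0, 0.0)),    # standing still, skipped
]
print(f"follow score: {follow_score(samples):.2f}")
```

A score near zero or below, in a level designed around following the pointer, would suggest that the indicator is too weak. As noted above, it says nothing about why the player went elsewhere.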
The problem with tracking the effect of feedback is that, while game metrics are excellent at determining exactly what the players do and when they do it, metrics are not good at determining player intent, the why. These methods can work, but the more complex the game and the more options the player has, the more difficult it is to derive player intent from metrics. To fully understand player intent it is more useful to use qualitative research methods such as think-aloud play sessions and interviews.
2. Noise
“Tighter: A clear signal of effect that is related to the cause.”
“Looser: A multiplicity of conflicting, attention sapping signals, which are not related to cause. One of the critical skills in Jeff Minter’s Space Giraffe is learning to see through the visual noise of the psychedelic backgrounds.”
Corresponding Metric: Acoustic and visual clutter
Noise is strongly connected to strength of feedback: feedback that is too strong, or that comes from too many sources, will create noise. Therefore, when increasing the strength of feedback, designers should be aware that their solutions to weak feedback can create noise and counteract the purpose of the increase.
A way to measure noise is to track how many sounds are being played at any one time, and the relative volume of each. If there are too many sounds playing at one time or at the same volume level, it will be more difficult for the player to distinguish between the sounds. When that happens, any meaning or feedback that the sounds are conveying will be in danger of being misheard or misunderstood. To use this method it is necessary to distinguish between important and unimportant sounds, as an explosion might be one of a thousand basic enemies dying, or the signal that the end boss is finally vulnerable to attack.
The same method can be used to track how many different things are being displayed on the screen at any one time. This method can also be used to enhance performance, to keep track of game progression, or to make sure that the game expands and speeds up at the desired rate.
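The counting idea above can be sketched as a peak-concurrency calculation over logged start and end times. The interval format is an assumption; the same function would work for sounds or for on-screen objects, and it ignores volume and importance, which would need extra fields.

```python
def peak_concurrent(intervals):
    """Largest number of intervals that overlap at any moment.

    intervals: list of (start, end) times in seconds, e.g. one per sound
    played or per object shown on screen (hypothetical log format).
    """
    points = []
    for start, end in intervals:
        points.append((start, 1))    # something starts
        points.append((end, -1))     # something stops
    # At equal timestamps, process stops (-1) before starts (+1), so that
    # back-to-back sounds are not counted as overlapping.
    points.sort(key=lambda p: (p[0], p[1]))
    active = peak = 0
    for _, delta in points:
        active += delta
        peak = max(peak, active)
    return peak


sounds = [(0.0, 2.0), (1.0, 3.0), (2.0, 4.0), (2.5, 2.8)]
print("peak concurrent sounds:", peak_concurrent(sounds))
```

Comparing this peak against a threshold chosen by the sound designer gives a simple clutter alarm per level or per encounter.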
3. Sensory type
“Tighter: Visual or tactile feedback is often more clearly perceived. Consider the many billions of dollars spent on improving visual feedback each year so that we can demonstrate the visceral impact of a player’s bullet on simulated flesh with ever greater fidelity. Tight visual feedback is highly functional; it communicates the effect to the player in an elegant efficient fashion. It is not just about making pretty pictures. In a recent update of Triple Town, we changed the colour scheme so that the background was the same general value as the foreground objects. The result was attractive, but players were pissed because the icons weren’t nearly as visible as before.”
“Looser: Auditory and smell are less clearly perceived. Not as much has been done here, but due to the looseness that comes with such systems it would seem that there are potential systems of mastery. It is perhaps ironic that most music games, a topic typically associated with auditory mastery, can be played with the sound turned off.”
Corresponding Metric: Finger screen blocking
Deciding what is clearly visible to the player and what is not is at the centre of game design, as computer games are perceived mainly visually. Mobile games add a challenge to the visual aspect, as the player’s main method of control is touching the screen. That raises the question of what the player cannot see when fingers are blocking part of the screen.
The visual design of games on mobile devices, especially on phones, becomes very important. It is possible to use game metrics to examine what the player can and cannot see while playing. Tracking which areas are being pressed will enable the creation of a heat map of what parts of the screen are covered and when. In the case of an action game where the player must hold the device in a certain way, it is possible to accurately extrapolate where the rest of the finger is based on the place of contact. By using metrics to track these factors both during development and after launch, it is possible to make a very accurate map of what parts of the screen are hidden from the player during play and for how long. Then the designer must make sure that the hidden parts of the screen do not contain any important or critical information. The data collected could also be used to determine if the game interface should be changed in the event that the game is played in an unexpected way.
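A minimal sketch of the touch heat map described above: touch positions are binned into a coarse grid, and the count per cell shows which screen regions are covered most often. The grid size and the pixel coordinate convention (origin at the top left, coordinates within the screen bounds) are assumptions; estimating where the rest of the finger lies would need an additional, device-specific model.

```python
from collections import Counter


def touch_heatmap(touches, screen_w, screen_h, cols=4, rows=4):
    """Count touches per grid cell.

    touches: iterable of (x, y) pixel positions within the screen bounds.
    Returns a Counter keyed by (row, col).
    """
    heat = Counter()
    for x, y in touches:
        # Clamp so that a touch exactly on the right or bottom edge
        # falls into the last cell instead of outside the grid.
        col = min(int(x / screen_w * cols), cols - 1)
        row = min(int(y / screen_h * rows), rows - 1)
        heat[(row, col)] += 1
    return heat


touches = [(10, 10), (20, 30), (790, 590), (800, 600), (400, 580)]
heat = touch_heatmap(touches, screen_w=800, screen_h=600)
for cell, count in heat.most_common():
    print(cell, count)
```

Cells with high counts are candidates for regions where no critical information should be drawn.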
4. Tapping existing mental models
“Tighter: Closely map the theme, feedback and system to existing mental models. Due to decades of exposure to pop culture, players know how zombies move and that they should be avoided. One means of quickly communicating the dozens of variables in a particular slow moving group of monsters is to label them ‘zombies’.”
“Looser: Step away from existing models and introduce the player to new systems that they’ve never experienced. Consider the metaphors involved in Tetris. Falling elements are something our brain can process as reasonably familiar. Tetriminos that you fit into lines that disappear to earn points while Russian music plays? That doesn’t fit any known metaphor that I know, yet it results in a great game.”
Corresponding Metric: User pattern recognition
It is very difficult for metrics to measure whether a player recognizes game design elements, or “natural mapping” as Donald Norman calls it. The problem is, again, that the “why” is elusive through metrics. A player might very well recognize that red barrels usually blow up, but all the metrics can track is that the player shot the barrel. It is, however, possible to track whether or not a player behaves as the designer expected. For instance, if there are also black barrels in the game, metrics can track whether the player mainly shoots them first or shoots them more than the red ones, or whether the player continues shooting them after having shot a red barrel.
This method only works when there are several similar game objects to compare, and game metrics are not the best way to examine whether a game uses “natural mapping” effectively. The challenge of identifying and using well-known and recognizable thought models falls on the designer.
5. Discreteness
“Tighter: Discrete states or low value numbers. Binary is the tightest. For example, recently we were playing with units moving at various speeds. By making them move at 1, 2, and 4 tiles/sec, it suddenly became very obvious to the player how each unit type was distinct. This is one of my favourite techniques for getting unruly systems under control.”
“Looser: Analogue values or very high value numbers. For example, in Angry Birds, you can give your bird a wide range of angles and velocities. This makes the results surprisingly uncertain. Think of how predictable (and boring) the game would be if you could only pick two distinct angles and velocities.”
Corresponding Metric: Player action / Player selection
By “discreteness” Daniel Cook seems to mean a mix of player action distinctiveness and consistency, user agency and game elements. That is a somewhat complex collection of concepts to understand, and might be generally hard to track. But it is possible to look at the sub-categories one at a time.
For instance, by looking at user agency it is possible to track how many times the user employs the different actions available. This will reveal which actions the user prefers and will also help indicate which actions are redundant or unimportant. If, for instance, there are four different types of attack and the players use only one of them and are still advancing in the game, that indicates that the three other types of attack are redundant. The game then needs to be redesigned, either by making the other types of attack necessary, by removing them, or by modifying them. This kind of tracking is used to determine which classes, weapons and building types are most popular in games.
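The action-usage tracking described above can be sketched as a simple frequency count. The action names and the 5% threshold for flagging an action as underused are illustrative assumptions.

```python
from collections import Counter


def action_usage(action_log, available_actions):
    """Share of uses per available action (0.0 for actions never used)."""
    counts = Counter(action_log)
    total = sum(counts[a] for a in available_actions)
    return {a: (counts[a] / total if total else 0.0) for a in available_actions}


def underused(usage, threshold=0.05):
    """Actions whose share of all uses falls below the threshold."""
    return sorted(a for a, share in usage.items() if share < threshold)


# Hypothetical log: four attack types exist, players mostly use one.
log = ["slash"] * 18 + ["kick"] * 2
usage = action_usage(log, ["slash", "kick", "throw", "block"])
print(usage)
print("possibly redundant:", underused(usage))
```

Combined with progression data, a list like this points to the attacks that players can safely ignore, which is the redundancy signal discussed above.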
The race and class census of World of Warcraft
6. Pacing
“Tighter: Short time lapses between cause and effect. When creating mouse over boxes like you find in Diablo, a common mistake is to add a delay between when the mouse is over the inventory item and when the hover dialog appears. If the delay is too short, the hover dialog pops up when the player doesn’t expect it. If the delay is too long, the dialog feels laggy and non-responsive. (In my experience, 200ms seems ideal. That’s right inside the perception gap where you’ve decided to do something, but your conscious mind hasn’t quite caught up.)”
“Looser: Long time lapses between cause and effect. Too long and the player misses that there is an effect at all. Imagine an RPG where you have a switch and a timer. If you hit the switch, a door opens 60 seconds later. Surprisingly few people will figure out that the door is linked to the switch. On the other hand, early investment in industry in Alpha Centauri resulted in alien attacks deep in the end game. This created a richer system of interesting trade off for players to manipulate over a long time span.”
Corresponding Metric: Cause and effect per game session
Determining whether a player understands the correlation between action and consequence is difficult to achieve through user research methods. Even if you ask players directly, they might not be able to fully explain how they perceive action and reaction. There has been, and still is, a lot of research conducted within the field of psychology on how humans understand the connection between cause and effect. It is a complicated field, and it therefore translates into a difficult task for game developers as well, especially if there is a large delay between the moments when cause and effect take place.
In The Walking Dead game by Telltale, dialog options can have effects that do not happen for hours.
As Cook writes, the general rule is: the shorter the time between cause and effect, the easier it is for users to understand their correlation. The longer the time between the two, the more likely it is that the user fails to associate the two events, and the more likely still that they associate the effect with another, incorrect cause.
As with Strength of Feedback and Noise, using metrics to determine if the player understands something is a challenging task, but it can be done by using creative metrics and by comparing the results of different metrics. Casual games, for example, are characterized by short game sessions that stop and start at the user’s discretion. This means that if there are instances where cause and effect are far from each other, they could be in different game sessions hours apart. That would make it very difficult for the player to understand the connection between the two.
One way to track the tightness of the time between cause and effect (or “pacing”) could be by determining if cause and effect are in the same game session. That translates into identifying the cause and effect of in-game actions, and then setting up metrics to measure how much game time and real time there is between cause and effect. This does mean that the game designer has to be able to determine what the effects of long term game play are, either by prediction or through play testing. But when this is done, it would be possible to define the actual metrics.
This should apply only to instances where there is a long time between cause and effect, long meaning anything over 200ms. Yes, the human mind is easily distracted, so a long time in this instance is anything that is not instantaneous. If there are any cases where the cause and effect are in different game sessions, the game designer should take extra care in communicating the connection between cause and effect. Again, the longer the time between cause and effect, the more effort has to be put into the game design in order to explain the connection. If the metrics data from the “cause and effect” is then compared to retention data, it could help identify moments where the player has difficulty understanding long-term cause and effect relations.
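Assuming the designer has already identified which effects belong to which causes, the session-based pacing metric can be sketched as below. The `CauseEffectLink` record and its fields are hypothetical; the functions report how many links cross a session boundary and the median real-time delay.

```python
import statistics
from dataclasses import dataclass


@dataclass
class CauseEffectLink:
    """A designer-identified pair of cause and effect (hypothetical record)."""
    cause_session: int
    cause_time: float     # real time in seconds, e.g. a Unix timestamp
    effect_session: int
    effect_time: float


def cross_session_share(links):
    """Fraction of links whose effect lands in a different session."""
    if not links:
        return 0.0
    crossing = sum(1 for l in links if l.cause_session != l.effect_session)
    return crossing / len(links)


def median_delay(links):
    """Median real-time delay between cause and effect, in seconds."""
    return statistics.median(l.effect_time - l.cause_time for l in links)


links = [
    CauseEffectLink(1, 0.0, 1, 0.2),        # near-instant feedback
    CauseEffectLink(1, 10.0, 2, 3610.0),    # effect an hour later, next session
    CauseEffectLink(2, 50.0, 2, 80.0),      # 30 s later, same session
    CauseEffectLink(3, 0.0, 5, 86400.0),    # a day later, two sessions on
]
print("cross-session share:", cross_session_share(links))
print("median delay (s):", median_delay(links))
```

Links that cross sessions are the ones where, as argued above, the design needs extra effort to communicate the connection.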
In Kingdoms of Camelot for iOS, it is not just difficult but impossible to defend against “old” players. It is not possible to make 100,000 troops during the one-week player protection.
“Pacing” is of course a much bigger issue in some games than in others. “Match three objects” games like Candy Crush Saga mostly have only instantaneous effects, while multiplayer strategy games like Kingdoms of Camelot and social tap economy games like Hay Day have a very long time between cause and effect. If a game developer wants to make a strategy game and aims to keep the player engaged for a long time, it is very important for them to be aware of causes and effects that can be weeks apart. For instance, in Kingdoms of Camelot it can be hard for players to understand why they are being farmed by stronger players as soon as the “new player protection” ends, and what they could have done to avoid that.
7. Linearity
“Tighter: Linearly increasing variables are more predictable. Consider the general friendliness of throwing a sword in a straight line in Zelda versus catching an enemy with an arcing boomerang while moving.”
“Looser: Non-linearly increasing variables, less so. The Medusa heads in Castlevania pose a surprisingly difficult challenge to many players because tracking them breaks the typical expectation of linear movement. Even something as commonplace as gravity throws most people off their game. After all, it took thousands of years before we figured out how to accurately land an artillery shell.”
Corresponding Metric: Measuring difficulty through object movement
It seems that Cook here refers to both the player’s ability to predict an action and the ease of successfully completing that action, which makes Linearity similar to the Tapping Existing Mental Models section in part 1. It basically means that players can easily predict where something is going if it moves in a straight line, and the more erratically something moves, the more difficult it is for the player to predict its movement. This is only relevant to games where the gameplay includes objects that move. In such cases, linearity is directly tied to game difficulty.
In Frontline Commando: D-Day hitting the planes is easy when they are flying straight.
Therefore it is possible to set up a metric that looks at the movement patterns of objects in games in order to measure difficulty. In an action game like Frontline Commando: D-Day, the player depends on predicting the movements of the enemy units in order to be able to hit them. The controls are clumsy and it is difficult to aim and shoot at the same time, so the player’s main course of action is to place the aim where the target is going to appear, then wait and shoot. That means that the difficulty of the game depends on the player’s ability to predict enemy movement. A game metric could in this case be set up to measure how straight or erratically the enemies move, based on their x, y, z coordinates. That would give the developer an index of how difficult the different levels are, and of how to tweak this difficulty.
Translating position data to meaningful metrics is more challenging in 3D games than in 2D games. In 3D it is more difficult to define a straight line and to know how players perceive and predict movement, but it is still possible to set some parameters that track this.
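One simple candidate for such a movement index is the ratio between an object's net displacement and the total distance it travelled. This "straightness" ratio is an assumption chosen for illustration, not a measure named in the original post; it works on sampled 2D or 3D coordinates alike, but it does not model how the player perceives the movement on screen.

```python
import math


def straightness(path):
    """Net displacement divided by distance travelled.

    path: list of coordinate tuples, e.g. (x, y, z) samples of one enemy.
    Returns 1.0 for a perfectly straight path and values approaching 0.0
    for erratic paths or paths that loop back on themselves.
    """
    if len(path) < 2:
        return 1.0
    travelled = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    if travelled == 0:
        return 1.0   # the object never moved
    return math.dist(path[0], path[-1]) / travelled


straight_plane = [(0, 0, 10), (5, 0, 10), (10, 0, 10)]
zigzag_plane = [(0, 0, 10), (1, 1, 10), (2, 0, 10), (3, 1, 10), (4, 0, 10)]
print(f"straight: {straightness(straight_plane):.2f}")
print(f"zigzag:   {straightness(zigzag_plane):.2f}")
```

Averaging this value over all enemies in a level gives one rough per-level number that can be compared against hit rates or completion rates.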
8. Indirection
“Tighter: Primary effects where the cause is directly related to the effect. In Zelda again, the primary attack is highly direct. You press a button, the sword swings out and a nearby enemy is hit.”
“Looser: Secondary effects where the cause triggers a secondary (or tertiary) system that in turn triggers an effect. Simulations and AI’s are notorious for rapidly becoming indecipherable due to numerous levels of indirection. In a game of SimEarth, it was often possible to noodle with variables and have little idea what was actually happening. However, the immense indirection yields systems that people can play with for decades.”
Corresponding Metric: Code tracking
This is very similar to the concept of “Pacing” described earlier, in terms of how strong the connection is between cause and effect, and the metrics described under Pacing can also be used to measure Indirection.
Another way to measure the strength of the connection between cause and effect is to measure how many agents affect the outcome of a player action. Simulator games are a good example of games where effects do not originate directly from player actions. Simulators have a lot of different variables influencing a very long list of states, which in turn influence other states. These kinds of games can very easily become so complicated that not even the game designers can figure out the cause and effect relationships any longer.
It is possible to set up a metrics system that tracks what pieces of program code influence what game states, and that way make a map of causality. This method is useful if the program code has grown in complexity beyond what the programming team can keep track of. It is also possible to get a sense of causality by simply mapping out when each state, value and method is being called in the code. This strategy can also be used as a debugging tool for missing code.
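A very small sketch of such a causality map: game state is kept in a wrapper that records which system wrote each value. The `TrackedState` class, the explicit `source` argument and the example systems are all hypothetical; a real code base would more likely hook this into its existing logging or profiling tools.

```python
from collections import defaultdict


class TrackedState:
    """Game state store that remembers which system wrote each key."""

    def __init__(self):
        self._values = {}
        self.writers = defaultdict(set)   # state key -> names of writers

    def write(self, key, value, source):
        self._values[key] = value
        self.writers[key].add(source)

    def read(self, key):
        return self._values[key]


# Hypothetical simulation systems that each touch some state.
def weather_system(state):
    state.write("rainfall", 0.8, source="weather_system")


def drought_event(state):
    state.write("rainfall", 0.1, source="drought_event")


def farming_system(state):
    # Crop yield depends on rainfall, one level of indirection.
    state.write("crop_yield", state.read("rainfall") * 100, source="farming_system")


state = TrackedState()
for system in (weather_system, drought_event, farming_system):
    system(state)

for key, sources in state.writers.items():
    print(key, "<-", sorted(sources))
```

States with many writers are the ones where cause and effect are hardest to follow, both for the player and for the team.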
9. Hidden information
“Tighter: Visible sequences that are readily apparent. For example, in Triple Town we signal that a current position is a match. The game isn’t about matching patterns so instead the design goal is to make the available movement opportunities as obvious as possible.”
“Looser: Hidden information or off screen information. A game like Mastermind is entirely about a hidden code that must be carefully deciphered via indirect clues. Board games that are converted into computer games often accidentally hide information. In a board game, the systems are impossible to hide because they are manually executed by the players. However, in computers the rules are often simulated in the background, turning a previously comprehensible system into mysterious gibberish.”
Corresponding Metric: used vs. displayed information
The quantity of information varies a lot from game to game. Angry Birds uses very few states and values, and most of them are binary, while SimCity has a very large number of states and values that all need to be calculated constantly to simulate the dynamic interactions of a city. It would not be possible to display all the information being processed at any given time in SimCity. Even if this information could be displayed, users might not be able to perceive it in a meaningful way.
In this unreleased 8-bit version of Angry Birds, the power and angle are displayed on the screen.
Despite the game’s perceived simplicity, there is hidden information in Angry Birds as well. For instance the power and angle are not displayed when aiming a shot, but in this case it is to make the game more difficult.
Hiding information is connected to both making the game comprehensible and difficult. Thus, how hiding information influences a game depends heavily on what kind of game it is and what the designer is trying to achieve.
One way to examine the level of hidden information in a game is by creating a metric that tracks the number of states and values that are being handled in the code at a specific point in time. Looking at the data from this metric, the game designer knows exactly what information is being used over time and can then decide what to show on the screen, adding or removing information depending on the effect they wish to create. This data can also be used to evaluate whether there are too many or too few processes going on in the background to achieve the desired level of complexity in the simulation.
This method could also be used to examine how much data there is on enemy units, and compare it to what information is available to the player. Then the difficulty can be raised or lowered by tweaking the amount of information, for example by indicating the time and place of the next enemy attack or by hiding the enemy’s strength.
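The used-versus-displayed comparison can be sketched with plain sets. The state names below are invented for illustration, loosely following the Angry Birds example where power and angle are used but not shown.

```python
def hidden_information(used_states, displayed_states):
    """Return the hidden states and the hidden share of all used states.

    used_states: names of states and values the game is processing.
    displayed_states: names of states currently shown to the player.
    """
    used = set(used_states)
    displayed = set(displayed_states) & used   # ignore purely decorative UI
    hidden = used - displayed
    ratio = len(hidden) / len(used) if used else 0.0
    return sorted(hidden), ratio


used = ["power", "angle", "bird_type", "score"]
shown = ["bird_type", "score"]
hidden, ratio = hidden_information(used, shown)
print("hidden:", hidden)
print(f"hidden share: {ratio:.0%}")
```

Sampling this at intervals during play shows how the hidden share changes over a session, and which specific values the designer could reveal or conceal to adjust difficulty.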
10. Probability
“Tighter: Deterministic where the same effect always follows a specific cause. In a game like chess, the result of a move is always the same; a knight moves in an L and will capture the piece it lands upon. You can imagine a variant where instead you roll a die to determine the winner. You can make that tighter again by constraining the probability so that certain characters roll larger dice than others. The 1d20 Pawn of Doom is a grand horror.”
“Looser: Probabilistic so that sometimes one outcome occurs but occasionally a different one happens. In one prototype I worked on there was both a long time scale between the action and the results as well as a heavily weighted but still semi-random outcome. Players were convinced that the game was completely random and had zero logic. If your pacing is fast enough and your feedback strong enough, you might be able to treat this as a slot machine.”
Corresponding Metric: Event occurrence rate
On one hand, a single element of randomness in a game is easy to understand, and the outcome, even though random, is still easy to keep track of. On the other hand, if you have hundreds or even thousands of outcomes that might be interdependent, it is very difficult to keep an overview of the game’s unpredictability and its impact. It can therefore be helpful to create a metric that actually calculates the occurrence rate of a specific event. That way, the designer can see if the events are happening at the expected rate.
For loot drops in hack and slash games like the Diablo series, a complicated formula is used to create a random item every time a monster is killed. To make sure that the distribution between common and rare items is the desired one, it is necessary to track what items are dropped.
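A sketch of an occurrence-rate check for loot drops: observed drop shares are compared against the rates the designer intended. The rarity names and target rates are made up, and the drops are simulated here with `random.Random.choices` only to have data to run against; in practice they would come from logged kills.

```python
import random
from collections import Counter


def observed_rates(drops):
    """Share of each item (or rarity tier) among all logged drops."""
    total = len(drops)
    if total == 0:
        return {}
    return {item: n / total for item, n in Counter(drops).items()}


def rate_deviation(observed, expected):
    """Observed minus expected rate for every expected item."""
    return {item: observed.get(item, 0.0) - p for item, p in expected.items()}


# Hypothetical design targets per rarity tier.
expected = {"common": 0.80, "rare": 0.18, "legendary": 0.02}

# Simulated drop log standing in for real telemetry.
rng = random.Random(42)
drops = rng.choices(list(expected), weights=list(expected.values()), k=10_000)

for item, delta in rate_deviation(observed_rates(drops), expected).items():
    print(f"{item:10s} deviation from target: {delta:+.4f}")
```

Large deviations, or deviations that grow when several interdependent random systems are combined, are the cases where the event is not happening at the expected rate.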
Here in Diablo 3 the player is rewarded with a carefully measured dose of randomly generated items.
Looting is a major part of the gameplay in the Diablo games, and many players complained about the loot drops in Diablo 3. It is very difficult to get the right balance between adequately rewarding players and keeping them motivated to look for new items. Using game metrics that track the item drop rate, the game designer can regulate the balance of the game. This is very important in MMO games with a complex economy. EVE Online, for instance, keeps very tight control over what items are created and in what quantity, to maintain an economic balance that is fair to both new and old players. (source: gameanalytics part 1, part 2)