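Cook seems to focus only on social elements that directly affect gameplay and overlooks the main element of social interaction: chat. If mobile and social games are designed around simple mechanics, they do not need complex player cooperation, and most examples of player-versus-player play are really player versus static data stored on a server. There are also indirect, cooperative player interactions, such as watering another player’s plants in FarmVille. One player posts something and other players water that player’s plants; this is not direct social interaction between players but interaction between a player and server state.
The simplest method is to set up a metric that searches for words or phrases the developer is interested in, for example keywords such as “game, good, great” or “game, bad, boring”. It is better to search more specifically, though, especially when the results are meant for human readers: if a search returns more than 200,000 lines of chat, no truly meaningful conclusion can be drawn from it. The mined data can be used without human interpretation, but the results then become rather vague. In general, the more specific the search, the more useful the result and the more meaningfully the data can be used.
“Tighter: Requires the player to simulate models at their desired speed. This is related to processing complexity and option complexity, since a player can only execute models at a given rate. Reduce time pressure and players are more likely to make causal connections. For example, NetHack has complex interacting systems that would take a real detective to decipher. To increase the likelihood of players making the connection, the game is set up as turn-based, so that players can take as much time as they need for each turn. You will find that as situations become more complex, even experienced players slow down in order to understand all the ramifications.”
If players have paid for a game, they have an extra incentive to keep playing, so the designer can make a more challenging game and still expect players to continue. In the free-to-play model the difficulty balance changes. Most free-to-play games monetize by letting players “speed up the game” or “make the game easier”. This means that the challenge of balancing difficulty does not make the game more fun; rather, it makes it hard for the developer to judge whether players will spend money in the game. A free-to-play game should therefore push players to their limits and beyond what they are used to, without driving them away.
A metric can be created that reflects the difficulty of games with timed levels (for example match-three games such as Candy Crush, Bejeweled Blitz and Bubble Mania). Beyond match-three games, this applies to any goal-based game with time limits. Set up metrics to collect the time each player takes to complete each level, the time at which players die or fail, and the time they spend in each round. You then get the average time players need to complete each level, how many players fail each level, and the point at which they give up.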
Building tight games with game metrics
by Jens Peter Jensen
In July 2012 Daniel Cook, Chief Creative Officer at SpryFox, published a blog post about making games easy to understand and play, which he called “Building tight game systems of cause and effect”. Here are a number of game metrics ideas and suggestions inspired by the techniques described in that post.
How to build “tightly”
In his blog post, Daniel Cook explains that “a tight system has clearly defined cause and effect. A loose system makes it more difficult to distinguish cause and effect relationships. There is no correct ‘tightness’ of a loop.” He then enumerates several methods to tweak the tightness of a game under development. In this post, we try to translate those aspects into actual game and player metrics. The list below is, of course, far from a complete guide to using game metrics to track game design “tightness”. As Daniel himself writes:
“Not all systems are readily amenable to the intuitive formation of models of cause and effect.”
Not all of the methods apply to all games. Depending on what the game designer is trying to accomplish, some methods will be important and others will not make any sense at all.
As a consequence, the game metrics examples and ideas mentioned here will not be applicable to all games and situations. Each tracking metric is generally designed to answer one specific question or query. The process of data tracking should always start with a question, and then the data to track can be selected. Going the other way (from data to questions) can be just as interesting, but in game development, economic and time constraints usually make such strategies unfeasible. Therefore, the metrics mentioned here are specific to a very narrow set of tasks. Hopefully, they can be useful to game developers or inspire other uses of game metrics.
1. Strength of feedback
“Tighter: Multiple channels of aligned feedback such as colour, animation, sound, and touch that reinforce one another.”
“Looser: One channel of feedback that is weakly evident. In multiplayer FPS games often the only sense that you have that another player is near comes from the faint patter of their footsteps. Expert players gain immense satisfaction from being able to predict the location of their opponent by combining knowledge of the levels with tiny hints of where they might be.”
Corresponding Metric: Alarm and reaction index
If you want to track user response to feedback, there are several methods available depending on the desired feedback. For instance, if the player gets a warning and does not react to it, that indicates that the feedback from the warning is not strong enough. There are, of course, other factors that influence whether a player reacts to the alarm. Also, the player might acknowledge the alarm and choose to not react to it. But if there are more players who do not react than those who react, the feedback is probably not adequate and needs to be strengthened.
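As a rough illustration of such an alarm and reaction index, the sketch below computes the share of warnings that are followed by a reaction from the same player within a short window. The event names (`warning`, `reaction`), the `Event` record and the 3-second window are assumptions made for the example, not part of any particular analytics tool.

```python
from dataclasses import dataclass

@dataclass
class Event:
    player: str
    kind: str   # hypothetical event names: "warning" or "reaction"
    t: float    # seconds since the session started

def reaction_index(events, window=3.0):
    """Share of warnings followed by a reaction from the same player within `window` seconds."""
    warnings = [e for e in events if e.kind == "warning"]
    if not warnings:
        return None  # nothing to measure
    reacted = 0
    for w in warnings:
        if any(
            e.kind == "reaction" and e.player == w.player and 0 <= e.t - w.t <= window
            for e in events
        ):
            reacted += 1
    return reacted / len(warnings)
```

A value well below 0.5 would correspond to the case described above, where more players ignore the warning than react to it. It still says nothing about why they ignored it.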
Hungry Shark Evolution pressure warning
It is also interesting to track indicators with game metrics. If the game uses indicators to convey game status or objectives, it could be interesting to track how much the player reacts to them. For instance if there is a pointer on the screen to lead the player towards the game objective, it would make sense to track how much the player actually follows the pointer. Again, there are many factors affecting whether a player follows a pointer: for example, whether or not it is the only objective and what possibility the game offers for exploration. But if the game is designed so that the player follows something and the player more often than not does not follow that thing, it would indicate that the feedback of the indicator is not strong enough.
The problem with tracking the effect of feedback is that, while game metrics are excellent at determining exactly what the players do and when they do it, metrics are not good at determining player intent: the why. These methods can work, but the more complex the game and the more options the player has, the more difficult it is to derive player intent from metrics. To fully understand player intent, it is more useful to use qualitative research methods such as think-aloud play sessions and interviews.
2. Noise
“Tighter: A clear signal of effect that is related to the cause.”
“Looser: A multiplicity of conflicting, attention sapping signals, which are not related to cause. One of the critical skills in Jeff Minter’s Space Giraffe is learning to see through the visual noise of the psychedelic backgrounds.”
Corresponding Metric: Acoustic and visual clutter
Noise is strongly connected to strength of feedback: feedback that is too strong and comes from too many sources will create noise. Therefore, when strengthening feedback, designers should be aware that their solutions to weak feedback can create noise and counteract the purpose of the increase.
A way to measure noise is to track how many sounds are being played at any one time, and the relative volume of each. If there are too many sounds playing at one time or at the same volume level, it will be more difficult for the player to distinguish between the sounds. When that happens, any meaning or feedback that the sounds are conveying will be in danger of being misheard or misunderstood. To use this method it is necessary to distinguish between important and unimportant sounds, as an explosion might be one of a thousand basic enemies dying, or the signal that the end boss is finally vulnerable to attack.
The same method can be used to track how many different things are being displayed on the screen at any one time. This method can also be used to enhance performance, to keep track of game progression, or to make sure that the game expands and speeds up at the desired rate.
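The counting idea above can be sketched as a small sweep over playback intervals. This is a minimal example that assumes the game logs a `(start, end)` time in seconds for each sound (or each on-screen object); the function name and the input format are hypothetical, and volume is left out for brevity.

```python
def peak_concurrent(intervals):
    """Return the largest number of intervals that overlap at any moment.

    intervals: list of (start, end) times in seconds.
    """
    points = []
    for start, end in intervals:
        points.append((start, 1))   # something starts playing
        points.append((end, -1))    # something stops playing
    # At equal times, process ends (-1) before starts (+1) so that
    # back-to-back sounds are not counted as overlapping.
    points.sort(key=lambda p: (p[0], p[1]))
    current = peak = 0
    for _, delta in points:
        current += delta
        peak = max(peak, current)
    return peak
```

Running this per level, and separately for sounds tagged as important, would give a first indication of where critical cues risk being drowned out.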
3. Sensory type
“Tighter: Visual or tactile feedback is often more clearly perceived. Consider the many billions of dollars spent on improving visual feedback each year so that we can demonstrate the visceral impact of a player’s bullet on simulated flesh with ever greater fidelity. Tight visual feedback is highly functional; it communicates the effect to the player in an elegant efficient fashion. It is not just about making pretty pictures. In a recent update of Triple Town, we changed the colour scheme so that the background was the same general value as the foreground objects. The result was attractive, but players were pissed because the icons weren’t nearly as visible as before.”
“Looser: Auditory and smell are less clearly perceived. Not as much has been done here, but due to the looseness that comes with such systems it would seem that there are potential systems of mastery. It is perhaps ironic that most music games, a topic typically associated with auditory mastery, can be played with the sound turned off.”
Corresponding Metric: Finger screen blocking
Deciding what is clearly visible to the player and what is not is at the centre of game design, as computer games are perceived mainly visually. With mobile games there is an added challenge with the visual aspect, as the player’s main method of control is touching the screen. That raises the question of what the player cannot see when fingers are blocking part of the screen.
The visual design of games on mobile devices, especially on phones, therefore becomes very important. It is possible to use game metrics to examine what the player can and cannot see while playing. By tracking where the player touches the screen, it is possible to build a heat map of what parts of the screen are covered and when. In the case of an action game where the player must hold the device in a certain way, it is possible to accurately extrapolate where the rest of the finger is based on the place of contact. By using metrics to track these factors both during development and after launch, it is possible to make a very accurate map of what parts of the screen are hidden from the player during play and for how long. Then the designer must make sure that the hidden parts of the screen do not contain any important or critical information. The data collected could also be used to determine if the game interface should be changed in the event that the game is played in an unexpected way.
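A touch heat map of this kind can be approximated by binning touch coordinates into a coarse grid. The sketch below is only a starting point under stated assumptions: touches arrive as `(x, y)` pixel positions, the grid size is arbitrary, and it records the contact point only, not the extrapolated area covered by the rest of the finger or hand.

```python
from collections import Counter

def touch_heatmap(touches, screen_w, screen_h, cols=4, rows=6):
    """Count touches per grid cell.

    touches: iterable of (x, y) pixel coordinates.
    Returns a Counter mapping (col, row) -> number of touches.
    """
    grid = Counter()
    for x, y in touches:
        # Clamp so that a touch on the far edge falls into the last cell.
        col = min(int(x * cols / screen_w), cols - 1)
        row = min(int(y * rows / screen_h), rows - 1)
        grid[(col, row)] += 1
    return grid
```

Cells with high counts are candidates for areas where important information should not be placed.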
4. Tapping existing mental models
“Tighter: Closely map the theme, feedback and system to existing mental models. Due to decades of exposure to pop culture, players know how zombies move and that they should be avoided. One means of quickly communicating the dozens of variables in a particular slow moving group of monsters is to label them ‘zombies’.”
“Looser: Step away from existing models and introduce the player to new systems that they’ve never experienced. Consider the metaphors involved in Tetris. Falling elements are something our brain can process as reasonably familiar. Tetriminos that you fit into lines that disappear to earn points while Russian music plays? That doesn’t fit any known metaphor that I know, yet it results in a great game.”
Corresponding Metric: User pattern recognition
It is very difficult for metrics to measure whether a player recognizes game design elements, or “natural mapping” as Donald Norman calls it. The problem is, again, that the “why” is elusive through metrics. A player might very well recognize that red barrels usually blow up, but all the metrics can track is that the player shot the barrel. It is possible to track whether or not a player behaves as the designer expected. For instance, if there are also black barrels in the game, metrics can track whether the player mainly shoots them first or shoots them more than the red ones, or if the player continues shooting them after they have shot a red barrel.
This method only works when there are several similar game objects to compare, and game metrics are not the best way to examine whether a game uses “natural mapping” effectively. The challenge of identifying and using well-known and recognizable thought models falls on the designer.
5. Discreteness
“Tighter: Discrete states or low value numbers. Binary is the tightest. For example, recently we were playing with units moving at various speeds. By making them move at 1, 2, and 4 tiles/sec, it suddenly became very obvious to the player how each unit type was distinct. This is one of my favourite techniques for getting unruly systems under control.”
“Looser: Analogue values or very high value numbers. For example, in Angry Birds, you can give your bird a wide range of angles and velocities. This makes the results surprisingly uncertain. Think of how predictable (and boring) the game would be if you could only pick two distinct angles and velocities.”
Corresponding Metric: Player action / Player selection
By “discreteness” Daniel Cook seems to mean a mix of player action distinctiveness and consistency, user agency and game elements. That is a somewhat complex collection of concepts to understand, and might be generally hard to track. But it is possible to look at the sub-categories one at a time.
For instance, by looking at user agency it is possible to track how many times the user employs the different actions available. This will reveal what actions the user prefers and will also help indicate what actions are redundant or unimportant. If, for instance, there are four different types of attack and the players use only one of them and are still advancing in the game, that indicates that the three other types of attack are redundant. The game then needs to be redesigned, either by making the other types of attack necessary, by removing them or by modifying them. This kind of tracking is used to determine what classes, weapons and building types are most popular in games.
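The attack example can be expressed as a simple usage-share calculation. This sketch assumes the game logs one action name per use; the action names and the 5% threshold are made up for illustration.

```python
from collections import Counter

def action_shares(action_log, available_actions):
    """Fraction of all logged uses that went to each available action."""
    counts = Counter(action_log)
    total = sum(counts[a] for a in available_actions)
    return {a: (counts[a] / total if total else 0.0) for a in available_actions}

def rarely_used(shares, threshold=0.05):
    """Actions whose usage share is below `threshold` (candidates for redesign)."""
    return sorted(a for a, s in shares.items() if s < threshold)

# Example: four attack types, but players rely almost entirely on one.
log = ["slash"] * 97 + ["kick"] * 3
shares = action_shares(log, ["slash", "kick", "stab", "throw"])
print(rarely_used(shares))
```

Combining the result with progression data (are these players still advancing?) is what turns a low share into evidence that an action is redundant rather than merely situational.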
The race and class census of World of Warcraft
6. Pacing
“Tighter: Short time lapses between cause and effect. When creating mouse over boxes like you find in Diablo, a common mistake is to add a delay between when the mouse is over the inventory item and when the hover dialog appears. If the delay is too short, the hover dialog pops up when the player doesn’t expect it. If the delay is too long, the dialog feels laggy and non-responsive. (In my experience, 200ms seems ideal. That’s right inside the perception gap where you’ve decided to do something, but your conscious mind hasn’t quite caught up)”
“Looser: Long time lapses between cause and effect. Too long and the player misses that there is an effect at all. Imagine an RPG where you have a switch and a timer. If you hit the switch, a door opens 60 seconds later. Surprisingly few people will figure out that the door is linked to the switch. On the other hand, early investment in industry in Alpha Centauri resulted in alien attacks deep in the end game. This created a richer system of interesting trade off for players to manipulate over a long time span.”
Corresponding Metric: Cause and effect per game session
Determining if a player understands the correlation between action and consequence is difficult to achieve through user research methods. Even if you ask the player directly, he or she might not be able to fully explain their view upon action and reaction to you. There has been and still is a lot of research conducted within the field of psychology on how humans understand the connection between cause and effect. It is a complicated field and therefore it translates into a difficult task for game developers as well, especially if there is a large delay between the moments when cause and effect take place.
In The Walking Dead game by Telltale, dialog options can have effects that do not happen for hours.
As Cook writes, the general rule is: the shorter the time between cause and effect, the easier it is for users to understand their correlation. The longer the time between the two, the more likely it is that the user fails to associate the two events, and the more likely it is that they associate the effect with another, incorrect cause.
As with Strength of Feedback and Noise, using metrics to determine if the player understands something is a challenging task, but it can be done by using creative metrics and by comparing the results of different metrics. Casual games, for example, are characterized by short game sessions that stop and start at the user’s discretion. This means that if there are instances where cause and effect are far from each other, they could be in different game sessions hours apart. That would make it very difficult for the player to understand the connection between the two.
One way to track the tightness of the time between cause and effect (or “pacing”) could be by determining if cause and effect are in the same game session. That translates into identifying the cause and effect of in-game actions, and then setting up metrics to measure how much game time and real time there is between cause and effect. This does mean that the game designer has to be able to determine what the effects of long term game play are, either by prediction or through play testing. But when this is done, it would be possible to define the actual metrics.
This should apply only to instances where there is a long time between cause and effect, long meaning anything over 200ms. Yes, the human mind is easily distracted, so a long time in this instance is anything that is not instantaneous. If there are any cases where the cause and effect are in different game sessions, the game designer should take extra care in communicating the connection between cause and effect. Again, the longer the time between cause and effect, the more effort has to be put into the game design in order to explain the connection. If the metrics data from the “cause and effect” is then compared to retention data, it could help identify moments where the player has difficulty understanding long-term cause and effect relations.
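Once the designer has identified which effect belongs to which cause, the measurement itself is small. The following sketch assumes each logged event carries a session identifier and a real-time timestamp; the field names `session_id` and `t` are hypothetical.

```python
def cause_effect_gap(cause, effect):
    """Real-time gap between a cause and its effect, and whether both fall in one session.

    cause, effect: dicts with 'session_id' and 't' (real time in seconds).
    """
    return {
        "real_seconds": effect["t"] - cause["t"],
        "same_session": cause["session_id"] == effect["session_id"],
    }

def cross_session_share(pairs):
    """Fraction of (cause, effect) pairs whose effect lands in a different session."""
    gaps = [cause_effect_gap(c, e) for c, e in pairs]
    if not gaps:
        return None
    return sum(not g["same_session"] for g in gaps) / len(gaps)
```

Pairs with a high cross-session share are the ones where extra in-game explanation is most likely needed, and the ones worth comparing against retention data.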
In Kingdoms of Camelot for iOS, it is not just difficult but impossible to defend against “old” players. It is not possible to make 100,000 troops during the one-week player protection.
“Pacing” is of course a much bigger issue in some games than in others. “Match three objects” games like Candy Crush Saga have mostly only an instantaneous effect, while multiplayer strategy games like Kingdoms of Camelot and social tap economy games like Hay Day have a very long time between cause and effect. If a game developer wants to make a strategy game and aims to keep the player engaged for a long time, it is very important for them to be aware of causes and effects that can be weeks apart. For instance, in Kingdoms of Camelot it can be hard for players to understand why they are being farmed by stronger players as soon as the “new player protection” ends, and what they could have done to avoid that.
7. Linearity
“Tighter: Linearly increasing variables are more predictable. Consider the general friendliness of throwing a sword in a straight line in Zelda versus catching an enemy with an arcing boomerang while moving.”
“Looser: Non-linearly increasing variables, less so. The Medusa heads in Castlevania pose a surprisingly difficult challenge to many players because tracking them breaks the typical expectation of linear movement. Even something as commonplace as gravity throws most people off their game. After all, it took thousands of years before we figured out how to accurately land an artillery shell.”
Corresponding Metric: Difficulty measuring through object movement
It seems that Cook here refers to both the player’s ability to predict an action and the ease of successfully completing that action, which makes Linearity similar to the Tapping Existing Mental Models section above. It basically means that players can easily predict where something is going if it goes in a straight line, and the more erratically something moves, the more difficult it is for the player to predict its movement. This is only relevant to games where the gameplay includes objects that move. In such cases, linearity is directly tied to game difficulty.
In Frontline Commando: D-Day hitting the planes is easy when they are flying straight.
Therefore, it is possible to set up a metric that looks at the movement patterns of objects in games in order to measure difficulty. In an action game like Frontline Commando: D-Day, the player depends on predicting the movements of the enemy units in order to be able to hit them. The controls are clumsy and it is difficult to aim and shoot at the same time, so the player’s main course of action is to move the aim to where the target is going to appear, then wait and shoot. That means that the difficulty of the game depends on the player’s ability to predict enemy movement. A game metric could in this case be set up to measure how straight or erratic the enemies’ movement is, based on their x, y, z coordinates. That would give the developer an index of how difficult the different levels are, and a basis for tweaking that difficulty.
Translating position data into meaningful metrics is more challenging in 3D games than in 2D games. In 3D it is more difficult to define a straight line and to describe how players perceive and predict movement, but it is still possible to set some parameters that track this.
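One simple candidate for such a parameter is a straightness index: the straight-line distance between an object’s first and last sampled positions divided by the length of the path it actually travelled. This is one possible choice rather than an established standard for this purpose, and it assumes positions are sampled as `(x, y, z)` tuples at regular intervals.

```python
import math

def straightness(points):
    """Net displacement divided by path length, between 0 and 1.

    points: list of (x, y, z) position samples for one object.
    Returns None if the object never moved.
    """
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if path == 0:
        return None
    return math.dist(points[0], points[-1]) / path
```

A value of 1.0 means perfectly straight movement; lower values mean more erratic paths. Averaging the index over all enemies in a level gives a rough per-level figure, although it does not capture how the movement looks from the player’s camera.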
8. Indirection
“Tighter: Primary effects where the cause is directly related to the effect. In Zelda again, the primary attack is highly direct. You press a button, the sword swings out and a nearby enemy is hit.”
“Looser: Secondary effects where the cause triggers a secondary (or tertiary) system that in turn triggers an effect. Simulations and AIs are notorious for rapidly becoming indecipherable due to numerous levels of indirection. In a game of SimEarth, it was often possible to noodle with variables and have little idea what was actually happening. However, the immense indirection yields systems that people can play with for decades.”
Corresponding Metric: Code tracking
This is very similar to the concept of “Pacing” described earlier, in terms of how strong the connection is between cause and effect, and the metrics described under Pacing can also be used to measure Indirection.
Another way to measure the strength of the connection between cause and effect is to measure how many agents affect the player action. Simulator games are a good example of games where effects do not originate directly from player actions. Simulators have a lot of different variables influencing a very long list of states, which in turn influence other states. These kinds of games can very easily become so complicated that not even the game designers can figure out the cause and effect relationships any longer.
It is possible to set up a metrics system that tracks what pieces of program code influence what game states, and that way make a map of causality. This method is useful if the program code has grown in complexity beyond what the programming team can keep track of. It is also possible to get a sense of causality by simply mapping out when each state, value and method is being called in the code. This strategy can also be used as a debugging tool for missing code.
9. Hidden information
“Tighter: Visible sequences that are readily apparent. For example, in Triple Town we signal that a current position is a match. The game isn’t about matching patterns so instead the design goal is to make the available movement opportunities as obvious as possible.”
“Looser: Hidden information or off screen information. A game like Mastermind is entirely about a hidden code that must be carefully deciphered via indirect clues. Board games that are converted into computer games often accidentally hide information. In a board game, the systems are impossible to hide because they are manually executed by the players. However, in computers the rules are often simulated in the background, turning a previously comprehensible system into mysterious gibberish.”
Corresponding Metric: Used vs. displayed information
The quantity of information varies a lot from game to game. Angry Birds uses very few states and values, and most of them are binary, while SimCity has a very large number of states and values that all need to be calculated constantly to simulate the dynamic interactions of a city. It would not be possible to display all the information being processed at any given time in SimCity. Even if this information could be displayed, users might not be able to perceive it in a meaningful way.
In this unreleased 8-bit version of Angry Birds, the power and angle are displayed on the screen.
Despite the game’s perceived simplicity, there is hidden information in Angry Birds as well. For instance the power and angle are not displayed when aiming a shot, but in this case it is to make the game more difficult.
Hiding information affects both how comprehensible and how difficult the game is. Thus, how hiding information influences a game depends heavily on what kind of game it is and what the designer is trying to achieve.
One way to examine the level of hidden information in a game is by creating a metric that tracks the number of states and values that are being handled in the code at a specific point in time. When looking at the data from this metric, the game designer knows exactly what information is being used over time and can then decide what to show on the screen, adding or removing information depending on what effect they wish to create. This data can also be used to evaluate if there are too many or too few processes going on in the background to achieve the desired level of complexity in the simulation.
This method could also be used to examine how much data there is on enemy units, and compare it to what information is available to the player. Then the difficulty can be raised or lowered by tweaking the amount of information, for example by indicating the time and place of the next enemy attack or by hiding the enemy’s strength.
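At its simplest, the comparison between used and displayed information is a set difference. The sketch below assumes the designer can list the names of the values the simulation uses at a given moment and the names of those shown on screen; the value names are invented for the example.

```python
def hidden_information(used, displayed):
    """Which used values are not shown to the player, and what share of the total they are."""
    used, displayed = set(used), set(displayed)
    hidden = used - displayed
    return {
        "hidden": sorted(hidden),
        "hidden_share": len(hidden) / len(used) if used else 0.0,
    }

# Example loosely modelled on aiming a shot where power and angle are not displayed.
report = hidden_information(
    used=["power", "angle", "bird_type", "score"],
    displayed=["bird_type", "score"],
)
print(report)
```

Tracking the hidden share over time, rather than at a single moment, would show whether the game hides more as it progresses.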
10. Probability
“Tighter: Deterministic where the same effect always follows a specific cause. In a game like chess, the result of a move is always the same; a knight moves in an L and will capture the piece it lands upon. You can imagine a variant where instead you roll a die to determine the winner. You can make that tighter again by constraining the probability so that certain characters roll larger dice than others. The 1d20 Pawn of Doom is a grand horror.”
“Looser: Probabilistic so that sometimes one outcome occurs but occasionally a different one happens. In one prototype I worked on there was both a long time scale between the action and the results as well as a heavily weighted but still semi-random outcome. Players were convinced that the game was completely random and had zero logic. If your pacing is fast enough and your feedback strong enough, you might be able to treat this as a slot machine.”
Corresponding Metric: Event occurrence rate
On one hand, using a single element of randomness in a game is easy to understand and the outcome, even though random, is still overseeable. On the other hand, if you have hundreds or even thousands of outcomes that might be interdependent, it is very difficult to keep an overview of the game’s unpredictability and its impact. It can therefore be helpful to create a metric that actually calculates the occurrence rate of a specific event. That way, the designer can see if the events are happening at the expected rate.
For loot drops in hack and slash games like the Diablo series, a complicated formula is used to create a random item every time a monster is killed. To make sure that the dispersion between common and rare items is the desired one, it is necessary to track what items are dropped.
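An event occurrence rate metric of this kind boils down to comparing observed frequencies against the rates the designer intended. The sketch below assumes a flat log of dropped item rarities and a hand-written table of expected probabilities; both the rarity names and the numbers are illustrative, not taken from any actual game.

```python
from collections import Counter

def observed_rates(drops):
    """Observed frequency of each item category in the drop log."""
    counts = Counter(drops)
    total = len(drops)
    return {item: n / total for item, n in counts.items()}

def rate_deviations(drops, expected):
    """Observed minus expected rate for every category the designer defined."""
    obs = observed_rates(drops)
    return {item: obs.get(item, 0.0) - p for item, p in expected.items()}

# Example: rare and legendary items drop less often than intended.
drops = ["common"] * 90 + ["rare"] * 10
expected = {"common": 0.80, "rare": 0.15, "legendary": 0.05}
print(rate_deviations(drops, expected))
```

With small samples, deviations like these are expected from chance alone, so the comparison only becomes meaningful over a large number of logged drops.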
Here in Diablo 3 the player is rewarded with a carefully measured dose of randomly generated items.
Looting was a major part of the gameplay in the Diablo games, and many players complained about it in Diablo 3. It is very difficult to get the right balance between adequately rewarding the player and keeping them motivated to look for new items. Using game metrics that track the item drop rate, the game designer can regulate the balance of the game. This is very important in MMO games with a complex economy. EVE Online, for instance, keeps very tight control over what items are created and in what quantity, to maintain an economic balance that is fair to both new and old players.
11. Processing Complexity
“Tighter: System requires simulating few steps to predict an outcome. In a vertically scrolling shooter, you see the bullet coming towards you. It doesn’t take a lot of thought to figure out that if you stay in that location you are going to be hit.”
“Looser: System requires simulating multiple steps to predict an outcome. On the other hand, in Triple Town, good players need to think dozens of moves ahead. Thinking through all the various machinations necessary to get the result you want adds a serious cognitive load to the player. A single mistake in the player’s calculations yields unexpected results.”
Corresponding Metric: Mechanics and actions per goal
In mobile games the general trend is to limit both game and processing complexity. This is in order to accommodate the casual nature and short play times that characterize the mobile platform. There are some strategy games that exhibit complexity in long-term player interaction, but most mobile game developers design for quick understanding and quick play, thus ignoring action consequences completely. With that in mind, if you need a game metric to measure the complexity of a mobile game, the game is probably already too complex for the average mobile consumer. In that case, the designer should consider whether the game should be fundamentally changed or at least directed at another market. Of course, you might also get away with making something new and revolutionary that breaks all conformities, but that is easier said than done, right?
One possibility is to track how many different game mechanics there are in the game and how often the player uses each one. This can be done on a general scale or on a goal-by-goal basis, depending on whether the game is level-based, goal-based or has a loose structure. Even if the game does not use levels, the designer should somehow separate the measurements. Games like Clash of Clans and Simpsons Tapped Out have a looser structure: the player can place buildings where they like within a defined area, but they still have to go through levels. In those cases, the goal of the game could be achieving levels. Other games with a loose structure gradually expand the area of the player’s base/castle/city. In those cases the goal could be the act of expanding. If the game has both level and expansion mechanics that delimit the goals, you can compare the complexity of the different types of goals to each other.
Expanding the town of Springfield can be a goal to track.
Or count the important buildings in Clash of Clans as a goal.
The metric could then check which actions and mechanics the player uses to get from one goal to the next and how long it took to execute those actions. The resulting data would show how many player actions were needed, and how many times the different game mechanics were used, to achieve each goal. This information can be used to determine the complexity of the different stages of the game. Based on this, the designer can choose to increase or decrease the complexity of the different stages to reach the desired rate of progression and difficulty.
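The metric described above could be sketched roughly as follows. The event format, a time-ordered list of (timestamp, kind, name) tuples for a single player, is a hypothetical one chosen for illustration; a real game would adapt this to whatever its telemetry actually records.

```python
from collections import Counter


def actions_per_goal(events):
    """Summarise what one player did between consecutive goals.

    events: time-ordered (timestamp_seconds, kind, name) tuples, where kind
    is "action" (name = mechanic used) or "goal" (name = goal reached).
    Actions logged after the last goal are ignored.
    """
    summaries = []
    mechanics = Counter()
    segment_start = events[0][0] if events else 0
    for timestamp, kind, name in events:
        if kind == "action":
            mechanics[name] += 1
        elif kind == "goal":
            summaries.append({
                "goal": name,
                "actions": sum(mechanics.values()),
                "mechanics": dict(mechanics),
                "seconds": timestamp - segment_start,
            })
            # Start counting afresh for the next goal.
            mechanics = Counter()
            segment_start = timestamp
    return summaries
```

Averaging these summaries over many players gives, per goal, the number of actions, the mix of mechanics and the time spent, which is the raw material for comparing the complexity of different stages.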
12. Option Complexity
“Tighter: Fewer options are available to consider. In a recent upgrade system I was building I give players 3 choices for their upgrades. I could have given them a menu of 60 upgrades, but that would be rather overwhelming. By focusing the user on a few important choices, I give them the mental space to think about each and pick the one with the biggest impact.”
“Looser: A large number of options must be considered. In a game of Go there are often dozens of potential moves and hundreds of secondary moves. This options complexity is a large part of why the game has been played for thousands of years.”
Corresponding Metric: Mechanics and actions per goal and options
Option Complexity seems to cover both gameplay options and appearance customization options. These are two very different things, but both add to the amount of game content. The main difference between the two is that appearance customization does not have to influence gameplay in any way, while gameplay options should influence the game in some way. In this article the focus will be on the game option complexity and not the appearance customization.
There are two different approaches to game upgrade/options mechanics. One is an open and unguided approach, where the player has many different options to choose from and aspects to explore or upgrade. This approach is non-restrictive and leaves everything up to the player. It is often used in RPGs like Skyrim and, to a lesser extent, in MMORPGs like World of Warcraft and in sandbox games like Minecraft.
The other approach is narrow and strict guidance. The player has a very limited range of options/upgrade possibilities and game progression is linear. The game is designed so that the player must do things in a specific order in order to advance. The designer can choose to do this in an obvious way or can give the player the illusion of choice. This is used in games like Call of Duty, God of War, Simpsons Tapped Out and Megapolis, and basically all tap-based economy games.
There are a lot of weapons and upgrades in Call of Duty Black Ops II, but they unlock through a strict line of progression.
Megapolis has a multitude of buildings, but they are not made available until the player reaches a certain level.
The two approaches should not be thought of as “one or the other”, as very few games go to the extreme in either direction. It is more a case of leaning towards one approach with more or less intensity.
The same metrics as described in the Processing Complexity section can be used to examine Option Complexity. Simply take the same data and look for upgrade game mechanics and for sections where the player can use different actions to achieve the same effect. Then you will get a sense of the Options Complexity metric. Again, the designer can use the data to tweak option complexity in order to lengthen or shorten game progression.
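One way to get a first sense of option complexity from logged data is to look at how many options players are offered per decision and how often each option is actually picked. The sketch below assumes a hypothetical log of (offered options, chosen option) pairs, one per upgrade decision; neither the format nor the function names come from Cook's article.

```python
from collections import Counter


def option_pick_rates(decisions):
    """Return, per option, the fraction of offers in which it was chosen.

    decisions: list of (offered_options, chosen_option) pairs.
    """
    offered = Counter()
    chosen = Counter()
    for options, pick in decisions:
        offered.update(set(options))
        chosen[pick] += 1
    return {option: chosen[option] / offered[option] for option in offered}


def average_options_offered(decisions):
    """Return the mean number of distinct options shown per decision."""
    if not decisions:
        return 0.0
    return sum(len(set(options)) for options, _ in decisions) / len(decisions)
```

Options with a pick rate near zero are candidates for rework or removal, while a high average number of options per decision suggests the game leans towards the loose end of the scale.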
13. Social Complexity
“Tighter: Another human broadly signals intent, capabilities and internal mental state. In an MMO, a player dresses as a high level healer and stands in a spot where adhoc groups meet up. There’s a good chance you know what they’ll do if you ask them to go adventuring together. Or in a managed trade window, you know exactly what you are getting when he puts up a potion for your sword. There is little ambiguity.”
“Looser: Another human disguises, distorts or mutes intent, capabilities and their mental state.”
Corresponding data mining: Chat and mail mining
Cook seems to focus only on the social aspects that directly influence gameplay, ignoring a key element of social interaction, namely the chat. As mobile and social games are designed to be simple, there is no need for complex player cooperation, and most examples of PVP are really just player vs. static data saved on a server. There are, of course, indirect collaborative player interactions like watering each other’s plants in Farmville and the like. One player posts something and someone else waters the plants, so this is not direct player-on-player social interaction, but actually player-on-server-state interaction.
In any game where players can interact, an ingame chat or mail system is usually implemented, and it is always very interesting for the game developer to keep an eye on what the players are talking about. Player-to-player communication is a treasure trove of information about what the players think of the game and how they play it. That makes the chat and/or mail system an obvious target for data mining. It is, however, difficult to properly extract information from the chat, because a lot of manual (human) interpretation is needed.
The simplest way is to set up a metric that searches for words or phrases that the developer is interested in. This could be done for general concerns (for example, whether the player likes the game) by searching for words and phrases like “game, good, awesome” or “game, bad, sucks”. But it is better to be more specific, especially if the results have to be read by humans. If the search query comes up with 200,000 lines of chat, it is not possible for a human to get something meaningful out of it. It is possible to use the mined data without human interpretation, but that makes the results very ambiguous. Generally, the more specific the search, the more useful the results, and the more easily humans can engage with the information in a meaningful way.
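A keyword search of this kind could look roughly like the sketch below. It keeps only chat lines that mention a topic term together with at least one sentiment term; the term lists and the plain list of chat strings are assumptions for illustration, and real chat would need more careful handling of slang, misspellings and other languages.

```python
import re


def matching_lines(chat_lines, topic_terms, sentiment_terms):
    """Return lines that contain a topic term and at least one sentiment term."""
    topic = {term.lower() for term in topic_terms}
    sentiment = {term.lower() for term in sentiment_terms}
    hits = []
    for line in chat_lines:
        # Split the line into lowercase words, ignoring punctuation.
        words = set(re.findall(r"[a-z']+", line.lower()))
        if words & topic and words & sentiment:
            hits.append(line)
    return hits
```

Calling it once with positive terms and once with negative terms gives two much shorter lists for a human to read, which is the point of narrowing the search.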
Player chat, here from Kingdoms of Camelot, is about everything; the task is to mine only the good stuff.
This approach can be used to gather many different types of information. The possibilities are limited only by imagination and by the resources needed to compute the metrics. There is an element of privacy invasion in this method, so be sure to always include a clause for monitoring and examining ingame chat and mail in the EULA.
14. Time Pressure
“Tighter: Requires simulating the model at the player’s preferred pace. This is related to processing and option complexity since players can only execute their models at a given pace. Players are more likely to make causal connections if the time pressure is greatly reduced. For example, the game NetHack has complexly interwoven systems that require real detective work to decipher. In order to increase the likelihood that players will make the connection, the game is set up as a turn-based game where players may take as much time as they want between turns. You’ll see that as the situation becomes more complex, even good players will slow down their play substantially so they can understand all the ramifications.”
“Looser: Requires simulating the model quickly. In a game of WarioWare, there isn’t really much complexity involved in each individual puzzle. However, we can dramatically ramp up the cognitive load and increase outcome uncertainty by setting a very short timer.”
Corresponding Metric: Completion time vs time limits
Time pressure here refers not only to time-limit challenges in games, but also to the speed at which the games actually play. Increasing the speed of the game is a reliable method of increasing difficulty, dating back to classics like Pac-Man.
Cook writes that the tight way is to let players run the game at the pace they are most comfortable with. But games built on time pressure usually increase in difficulty by gradually applying more of it, which means that they will eventually become uncomfortable anyway. The designer’s challenge then becomes balancing the rate of difficulty in such a way that the player feels challenged and stays interested.
If the player bought the game, then there is an extra financial motivation to keep playing, so the designer can make the game very challenging and still expect the player to play along. In the “free to play” or freemium model the balance of difficulty is changed. Now there is the added aspect of trying to motivate the player to spend money, while not pushing the player away. Most freemium games use a “make the game faster” or “make the game easier” monetization strategy. That means that the challenge of balancing the difficulty is not necessarily to make the game more interesting, but to make it difficult enough to push the player to make in-game purchases. So freemium games should push the player’s limits beyond the comfort level, while at the same time trying not to drive them away.
It is possible to make a metric that reflects the difficulty of games which use time-based levels (for example “match three” games like Candy Crush, Bejeweled and Bubble Mania). It doesn’t have to be a match three game; it can be any game that has a goal-based structure with a time restriction. Set up the metric to collect how long each player takes to complete each level, when the player dies or fails, and how long each play session is. Then you can get an average time to complete each level, a count of how many players fail each level, and an indication of when they give up.
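The data collection described above could be aggregated along these lines. The attempt format, (player_id, level, outcome, seconds) with outcome being "completed" or "failed", is a hypothetical one, and the highest level a player attempted is used as a rough proxy for where they gave up, which also counts players who are simply still playing.

```python
from collections import defaultdict
from statistics import mean


def level_stats(attempts):
    """Aggregate per-level completion time, failures and stopping points."""
    times = defaultdict(list)           # level -> completion times in seconds
    failed_players = defaultdict(set)   # level -> players who failed at least once
    last_level = {}                     # player -> highest level attempted
    for player, level, outcome, seconds in attempts:
        if outcome == "completed":
            times[level].append(seconds)
        else:
            failed_players[level].add(player)
        last_level[player] = max(level, last_level.get(player, level))
    levels = sorted(set(times) | set(failed_players))
    return {
        level: {
            "avg_completion_seconds": mean(times[level]) if times[level] else None,
            "players_failed": len(failed_players[level]),
            "players_stopped_here": sum(
                1 for stopped in last_level.values() if stopped == level
            ),
        }
        for level in levels
    }
```

Comparing these numbers against the completion times and failure rates the designer expected shows where the difficulty curve deviates from the plan.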
The Cyborg Cat is watching the player in Bejeweled Blitz, which uses time pressure as a game mechanic.
With this data, the designer can then find out if the game is more or less difficult than expected. If all the players failed the first level and the designer expected the average player to get to level 10 without dying, the game is too hard. It is a common design mistake to make the game too hard at first. But with the data collected through the metrics above, the designer can fine tune the difficulty.
If the game follows the freemium model, the same data can be used to tweak the game difficulty towards motivating in-game purchases. If the players are progressing too fast they might enjoy the game more, but they would also be less likely to buy extra time or powerups. In that case, the designer could use the above metrics to choose a suitable moment when the player should fail a level.
For example, the designer could aim for 10 minutes of seamless playing, just to get the player interested. Then he could introduce some mechanics which make the players fail a lot, in order to introduce the advantages of a paid game experience.
Tighter is more important on mobile devices
When looking at the 14 different parameters, it becomes clear that “tight” games are easier to play and understand, and are therefore best suited to the mobile market. So the “tight” approach is clearly the better way when designing a mobile game for the general mobile game consumer. If a designer wants to try something new on the mobile market, then taking the “loose” approach would surely achieve that. If the designer is trying to make a game for the mainstream mobile consumer, though, it would be a good idea to go through the 14 parameters and make sure the game is as tight as possible. If, on the other hand, the designer is into experimentation, looking at the 14 items and choosing where to be “loose” might inspire new ideas for gameplay mechanics.
These blog posts have given some examples of how to make use of game metrics in game development. Game metrics have to be very specific in order to be effective, so if they do not apply to your particular game, maybe they can at least inspire you to come up with your own relevant metrics.
In any case, if you are designing a game, no matter the platform, do yourself the favor of reading Daniel Cook’s blog post.