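A long read: weighing the fun in games from the perspective of metrics and parameters

Published: 2014-10-15 14:58:37 Tags: game parameters, game metrics, game iteration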

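Artistic and creative fields share a common pattern, particularly fields such as archaeology, art conservation, psychology, or medicine, which depend on a certain amount of intuition yet still have a “right answer” or “best way” to do things. The progression goes like this:

1. Practitioners see their field as a “soft science”; they do not know the best principles or practices. They eventually grasp how things work, but mostly through trial and error.

2. Someone creates a technology that solves many of these problems algorithmically. Practitioners rejoice: at last the field is a hard science, with no more guesswork. Most younger practitioners abandon the “old ways” and treat the technology as the answer to the field’s problems. The old guard, meanwhile, sees it as a threat to established practice and eyes it skeptically.

3. After widespread use, the limitations of the technology become apparent. Practitioners find that their work still contains a mysterious, emotional element; the technology may solve it someday, but that day is far off. Disillusionment spreads: people no longer trust their instincts, because in theory the technology does better, yet they do not trust the current technology either, because it does not work well enough yet. The younger practitioners concede that it is not the panacea they imagined; the old guard concedes that it is more useful than they assumed.

4. Eventually everyone settles into a pattern: they know which parts can be handled by computer programs and which need real human creative thinking, and the field grows stronger as the best of each is combined (游戏邦 note: learning which parts suit humans and which are best left to computers is itself a learning process that takes time).

Game design is currently just entering step 2. More and more people tell stories of how metrics and statistical analysis saved their company. We hear of MMOs that solve balance problems by watching player patterns before the players learn enough to exploit them. We hear that Zynga changed a font color from red to pink and got more players to click through and try its other games. Whole companies now exist just to help developers capture and analyze metrics. The industry is falling in love with metrics, but I predict that at least one company that relies entirely on metrics will fail before this shakes out, because it will stare so hard at the numbers that it forgets some of what players experience cannot be captured by metrics. Or maybe that will not happen.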


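In any case, there are currently three schools of thought on the use of metrics:

* The old-school Zynga model: design driven almost exclusively by metrics. Love it or hate it, Zynga’s huge count of monthly active unique users (MAU) is evidence that the model works.

* The rebellion against the Zynga model: metrics are easy to misread and easy to manipulate, so they are dangerous and do more harm than good. If you measure player behavior and find that more players use the login screen than any other in-game action, that does not mean you should add more login screens (游戏邦 note: on the assumption that whatever players do must be fun). If you design by metrics, you confine yourself to what can be designed by metrics alone and miss many interesting video game genres.

* The middle road: metrics have value. They help you tune your game and find local “peaks” of fun, making an already good game a little better by exploring the nearby design space. But intuition has its uses too; sometimes you need to leap into unexplored territory to find the global “peak”, and metrics alone will not get you there, because reaching it sometimes means making the game worse in one respect before it gets much better in another, and metrics will not steer you through that.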

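Suppose you want to build some metrics into your game so that you can do statistical analysis and keep the game balanced. What exactly should you measure? There are generally two schools of thought. One is to record everything you can think of: log first, think later. The reasoning is that it is better to collect too much than to collect only a few key items, discover a gap, and have to redo the tests.

The other school holds that “record everything” is fine in theory, but in practice you end up hunting for a needle in a haystack of extraneous information; worse, if you mine that mountain of data hard enough you start finding correlations and relationships that do not actually exist. On this view, you should decide ahead of time what you need for your next playtest and measure only that, so you do not get lost later.

So decide which approach suits you.

I think the choice depends mainly on the resources you have. If you and a few friends are making a small commercial game in Flash, you probably do not have time for extensive data mining, so work out the information you need up front, and add more metrics later if a new question requires data you are not yet collecting. If you are at a large company with an army of actuarial statisticians whose job is to look for correlations all day, then collect as much data as you like; you will probably find interesting things you would not otherwise have thought of.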

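What specific things do you measure?

Whether you say “collect only what we need” or “collect everything we can”, neither is an actual design. At some point you have to specify exactly what you need to measure.

Like game design itself, metrics is a second-order problem. Most of what you want to know about your game cannot be measured directly, so you have to find something measurable that correlates strongly with it and measure that instead.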

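Example: fun is hard to measure

Take a single-player Flash game. You want to know whether the game is fun, but fun is hard to measure directly. What correlates with fun that you can measure? Whether players keep playing for a long time, whether they play to the end and unlock many achievements, whether they come back for multiple sessions (especially if they replay even after they have “won”): all of these are measurable. Keep in mind that the correlation is not perfect; players may return for other reasons, for example because you added a crop-withering mechanic that punishes those who do not. But we can at least assume that players who keep playing have some reason to, and that is useful information. More to the point, if many players stop at the same point in the game and never come back, ask whether that part of the game is not enjoyable or why players quit there (游戏邦 note: if the point where players “stop” is the end of the game, they may have enjoyed it and won, but the game gave them no reason to keep playing. It all depends on the situation.)

Player usage patterns also matter, because whether people play, how often, and for how long are all related to how satisfied they are with the game. For games that require players to come back at regular intervals, two terms come up constantly: Monthly Active Uniques (MAU) and Daily Active Uniques (DAU). “Active” matters because it keeps you from counting dormant accounts of people who no longer play. “Unique” matters because one user who logs into FarmVille ten times a day does not count as ten users. You might think the monthly and daily figures are equivalent, with monthly being daily times 30, but in practice they differ depending on how quickly players churn. So the ratio of MAU to DAU tells you something about how many of your players are new and how many are returning.

For example, say you have a very sticky game with a small player base of only about 100 players, all of whom log in at least once a day. Your MAU is 100 and your average DAU is also 100, so MAU/DAU is 1. Now suppose instead that players play once and never return, but your marketing is strong enough to bring in 100 new players every day, who also never come back. Your average DAU is still 100, but your MAU is now 3000, so MAU/DAU is 30. MAU/DAU therefore ranges from 1 up to 28, 30, or 31 (depending on the number of days in the month).

Note: many metrics (such as those Facebook provides) are computed in different ways, so one set of numbers is often not comparable with another. For example, I once saw a website that listed the “worst” MAU/DAU ratio among the top 100 applications as a little over 33, which should be impossible, so the numbers were evidently being computed inconsistently somewhere. Some people report a percentage instead: the average share of monthly players who log in on a given day, which ranges from a minimum of about 3.33% (1/30 of monthly active players logging in each day) to 100% (every monthly active player logs in every day). This is DAU/MAU (not MAU/DAU) multiplied by 100. So whenever you see such numbers on an analytics site, check how they were computed so that you do not compare things that are not comparable.

Why do these numbers matter? First, if many players keep coming back, you probably have a good game. Second, you are more likely to make money from it, because the same people stop by every day. It is like running a physical store: a customer who window-shops once may never buy anything, but one who comes back to look at the same item every day will probably buy it eventually.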

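Measuring difficulty

Like fun, difficulty is essentially impossible to measure directly, but you can measure progression and failure to progress. How you measure progression varies from game to game.

For games that present skill-based challenges, such as retro arcade games, you can measure how long players take to clear each level and how many times they die on each level. More importantly, find out where and how they die. With this information it is easy to see where the hardest parts of the game are and whether anything disrupts the difficulty curve. I understand that Valve tracks its FPS games this way, and that it has a visualization tool that not only displays all of this but also plots it on a map of the level, so you can see where deaths cluster. Interestingly, starting with Half-Life 2: Episode Two, players can upload live reports to Valve’s servers, and Valve has displayed the metrics on a public page (游戏邦 note: this probably helps with the aforementioned privacy concerns, since players can see for themselves what is uploaded and how it is used).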

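Measuring game balance

What if you want to know whether your game is balanced? That cannot be measured directly either. But you can track any number attached to a player, action, or object in the game, which tells you a lot about normal play patterns and about the relative balance of strategies, objects, and everything else.

For example, suppose that in your strategy game each player chooses one of four actions per turn, and you have a way to track each player’s standing numerically. You can record, for each turn, which action each player took and how it affected that player’s standing in the game.

Or suppose you have a CCG where players build their own decks, a fighting game where each player picks a fighter, an RTS where players choose a faction, or an MMO or tabletop RPG where players choose a race and class combination. Whatever the game, you can track which choices are most and least popular, and which correlate most strongly with winning. Note that the two are not always the same: something everyone likes because it looks impressive and is easy to use can still be beaten by a skilled player using a less obvious strategy. Sometimes dominant strategies take months or even years, and tens of thousands of games, to emerge. The Necropotence card in Magic: The Gathering saw almost no play for about six months after release, until some top players worked out how to use it effectively. Its effects are complicated and obscure, but once people started experimenting with it they found it to be one of the most powerful cards. So popularity and correlation with winning are both useful metrics for balance.

If a game object is used far more than you expected, that may signal a balance problem. It may also mean that the object simply appeals more to your target audience for some other reason. In a fantasy game, for example, you may be surprised to find more players choosing Elves than Humans, which has nothing to do with balance. In some games, popularity shows that one play style is more fun than the others, and you can sometimes carry that over to other characters, classes, or cards to make the whole game more fun.

If a game object is used far less than expected, it may be underpowered or overcosted. But it may also just be not much fun to use, even if it is effective. Or it may be too complicated, with a steep learning curve relative to the rest of the game, so players do not experiment with it right away (游戏邦 note: this is dangerous if you rely on playtesters, because they may simply leave some objects alone and never test them).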


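Metrics have uses beyond game objects. One is measuring starting asymmetries, such as whether the player who moves first has an advantage or a disadvantage. Collect a lot of data on seating order and compare it with final results. This is common in professional games and sports. For example, I believe statisticians have calculated the home-field advantage in American football at about 2.3 points, and the first-move advantage in Go is 6.5 or 7.5 points depending on where you play. Data from Settlers of Catan tournaments show a very slight advantage for the player who moves second in a four-player game, small enough that it would normally be dismissed as random variation; it is only the sheer number of games played that gives the figure any weight.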

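Game design and ethics

The ethical issue here is that these metrics look at player behavior but not at the effect on players’ lives. Some games have been accused of blatantly manipulating players and exploiting known weaknesses in human psychology to keep them playing and paying. Facebook games stand out here, since they have taken the use of metrics further than any other kind of game. Taken to the extreme this sounds absurd: we think of playing games as voluntary, so the idea of a game “holding players prisoner” seems strange. On the other hand, any game you have played for a long time is one you are emotionally invested in, and that investment has cash value. If it seems silly to say that a game makes you spend money, consider this. Suppose I found all of your saved games, whether on console memory cards or hard disks or on your PC hard drive. For online games, your “saved game” lives on some company’s server. Then suppose I threatened to destroy them all. Not to worry, I would replace the hardware: you would get a free replacement hard drive and memory cards, and a fresh account on every online game you subscribe to. Now suppose I asked how much you would pay me not to go through with it. I bet the answer is more than zero, because those saved games are worth something to you. And if a game threatened to delete all your saves unless you bought some extra downloadable content, you would at least consider paying, not because you want the content but because you do not want to lose your saves.

To be fair, all games manipulate players’ psychology, as do movies, books, and every other medium. Most people have no problem with this; they still see the game experience itself as adding value to their lives.

But the value a game adds or takes away is not constant; like difficulty curves, it varies from person to person. That is why games such as MMOs can enrich the lives of millions of subscribers while also leading a small minority to lose marriages and families to their obsession, or even to die at the keyboard after neglecting basic bodily needs for too long.

So how far we can push players to play and pay before crossing an ethical line is a question worth considering, especially for games whose design is driven mainly by money-based metrics.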

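游戏邦 note: this article was first published on March 31, 2011; all times, events, and figures are as of that date.

(Compiled by 游戏邦. Please credit the source when reposting, or ask via WeChat zhengjintiao.)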

Here’s a common pattern in artistic and creative fields, particularly things like archaeology or art preservation or psychology or medicine where it requires a certain amount of intuition but at the same time there is still a “right answer” or “best way” to do things. The progression goes something like this:

1. Practitioners see their field as a “soft science”; they don’t know a whole lot about best principles or practices. They do learn how things work, eventually, but it’s mostly through trial and error.

2. Someone creates a technology that seems to solve a lot of these problems algorithmically. Practitioners rejoice. Finally, we’re a hard science! No more guesswork! Most younger practitioners abandon the “old ways” and embrace “science” as a way to solve all their field’s problems. The old guard, meanwhile, sees it as a threat to how they’ve always done things, and eyes it skeptically.

3. The limitations of the technology become apparent after much use. Practitioners realize that there is still a mysterious, touchy-feely element to what they do, and that while some day the tech might answer everything, that day is a lot farther off than it first appeared. Widespread disillusionment occurs as people no longer want to trust their instincts because theoretically technology can do it better, but people don’t want to trust the current technology because it doesn’t work that great yet. The young turks acknowledge that this wasn’t the panacea they thought; the old guard acknowledge that it’s still a lot more useful than they assumed at first. Everyone kisses and makes up.

4. Eventually, people settle into a pattern where they learn what parts can be done by computer algorithms, and what parts need an actual creative human thinking, and the field becomes stronger as the best parts of each get combined. But learning which parts go best with humans and which parts are best left to computers is a learning process that takes a while.

Currently, game design seems to be just starting Step 2. We’re hearing more and more people anecdotally saying why metrics and statistical analysis saved their company. We hear about MMOs that are able to solve their game balance problems by looking at player patterns, before the players themselves learn enough to exploit them. We hear of Zynga changing the font color from red to pink which generates exponentially more click-throughs from players to try out other games. We have entire companies that have sprung up solely to help game developers capture and analyze their metrics. The industry is falling in love with metrics, and I’ll go on record predicting that at least one company that relies entirely on metrics-driven design will fail, badly, by the time this whole thing shakes out, because they will be looking so hard at the numbers that they’ll forget that there are actually human players out there who are trying to have fun in a way that can’t really be measured directly. Or maybe not. I’ve been wrong before.

At any rate, right now there seem to be three schools of thought on the use of metrics:

* The old school Zynga model: design almost exclusively by metrics. Love it or hate it, 60 Million monthly active unique players laugh at your feeble intuition-based design.

* Rebellion against the old school Zynga model: metrics are easy to misunderstand, easy to manipulate, and are therefore dangerous and do more harm than good. If you measure player activity and find out that more players use the login screen than any other in-game action, that doesn’t mean you should add more login screens to your game out of some preconceived notion that if a player does it, it’s fun. If you design using metrics, you push yourself into designing the kinds of games that can be designed solely by metrics, which pushes you away from a lot of really interesting video game genres.

* The moderate road: metrics have their uses, they help you tune your game to find local “peaks” of joy. They help you take a good game and make it just a little bit better, by helping you explore the nearby design space. However, intuition also has its uses; sometimes you need to take broad leaps in unexplored territory to find the global “peaks,” and metrics alone will not get you there, because sometimes you have to make a game a little worse in one way before it gets a lot better in another, and metrics won’t ever let you do that.

Think about it for a bit and decide where you stand, personally, as a designer. What about the people you work with on a team (if you work with others on a team)?

In Part I, game designer Ian Schreiber outlines the debate between metrics-driven design and the more touchy-feely intuition-based design. In Part II, he explains the difficulties with trying to measure the “fun” in your game.

How much to measure?

Suppose you want to take some metrics in your game so you can go back and do statistical analysis to improve your game balance. What metrics do you actually take – that is, what exactly do you measure? There are two schools of thought that I’ve seen. One is to record anything and everything you can think of, log it all, mine it later. The idea is that you’d rather collect too much information and not use it, than to not collect a piece of critical info and then have to re-do all your tests.

Another school of thought is that “record everything” is fine in theory, but in practice you either have this overwhelming amount of extraneous information from which you’re supposed to find this needle in a haystack of something useful, or potentially worse, you mine the heck out of this data mountain to the point where you’re finding all kinds of correlations and relationships that don’t actually exist. By this way of thinking, instead you should figure out ahead of time what you’re going to need for your next playtest, measure that and only that, and that way you don’t get confused when you look at the wrong stuff in the wrong way later on.

Again, think about where you stand on the issue.

Personally, I think a lot depends on what resources you have. If it’s you and a few friends making a small commercial game in Flash, you probably don’t have time to do much in the way of intensive data mining, so you’re better off just figuring out the useful information you need ahead of time, and add more metrics later if a new question occurs to you that requires some data you aren’t tracking yet. If you’re at a large company with an army of actuarial statisticians with nothing better to do than find data correlations all day, then sure, go nuts with data collection and you’ll probably find all kinds of interesting things you’d never have thought of otherwise.

What specific things do you measure?

That’s all fine and good, but whether you say “just get what we need” or “collect everything we can,” neither of those is an actual design. At some point you need to specify what, exactly, you need to measure.

Like game design itself, metrics is a second-order problem. Most of the things that you want to know about your game, you can’t actually measure directly, so instead you have to figure out some kind of thing that you can measure that correlates strongly with what you’re actually trying to learn.

Example: measuring fun

Let’s take an example. In a single-player Flash game, you might want to know if the game is fun or not, but there’s no way to measure fun. What correlates with fun, that you can measure?

One thing might be if players continue to play for a long time, or if they spend enough time playing to finish the game and unlock all the achievements, or if they come back to play multiple sessions (especially if they replay even after they’ve “won”), and these are all things you can measure. Now, keep in mind this isn’t a perfect correlation; players might be coming back to your game for some other reason, like if you’ve put in a crop-withering mechanic that punishes them if they don’t return, or something. But at least we can assume that if a player keeps playing, there’s probably at least some reason, and that is useful information. More to the point, if lots of players stop playing your game at a certain point and don’t come back, that tells us that point in the game is probably not enjoyable and may be driving players away. (Or if the point where they stopped playing was the end, maybe they found it incredibly enjoyable but they beat the game and now they’re done, and you didn’t give a reason to continue playing after that. So it all depends on when.)
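As a rough illustration of spotting the point where players stop and do not come back, here is a minimal Python sketch. It assumes you have logged, for each player, the furthest level reached before they stopped playing; the function name, the level numbering, and the sample data are all hypothetical.

```python
from collections import Counter

def drop_off_by_level(last_level_reached, final_level):
    """Return {level: share of players who reached that level and stopped there}.

    last_level_reached holds one entry per player (hypothetical data).
    The final level is left out on purpose: stopping there may simply
    mean the player finished the game.
    """
    quits = Counter(last_level_reached)
    rates = {}
    for level in range(1, final_level):
        # Everyone whose furthest level is this one or later got this far.
        reached = sum(n for lvl, n in quits.items() if lvl >= level)
        if reached:
            rates[level] = quits[level] / reached
    return rates

# Ten made-up players in a five-level game.
sample = [1, 2, 3, 3, 3, 3, 4, 5, 5, 5]
rates = drop_off_by_level(sample, final_level=5)
worst = max(rates, key=rates.get)
print(worst, rates[worst])
```

A high rate at one level only tells you where to look; as noted above, it does not say why players left.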

Player usage patterns are a big deal, because whether people play, how often they play, and how long they play are (hopefully) correlated with how much they like the game. For games that require players to come back on a regular basis (like your typical Facebook game), the two buzzwords you hear a lot are Monthly Active Uniques and Daily Active Uniques (MAU and DAU). The “Active” part of that is important, because it makes sure you don’t overinflate your numbers by counting a bunch of old, dormant accounts belonging to people who stopped playing. The “Unique” part is also important, since one obsessive guy who checks FarmVille ten times a day doesn’t mean he counts as ten users. Now, normally you’d think Monthly and Daily should be equivalent, just multiply Daily by 30 or so to get Monthly, but in reality the two will be different based on how quickly your players burn out (that is, how much overlap there is between different sets of daily users). So if you divide MAU/DAU, that tells you something about how many of your players are new and how many are repeat customers.

For example, suppose you have a really sticky game with a small player base, so you only have 100 players, but those players all log in at least once per day. Here your MAU is going to be 100, and your average DAU is also going to be 100, so your MAU/DAU is 1. Now, suppose instead that you have a game that people play once and never again, but your marketing is good, so you get 100 new players every day but they never come back. Here your average DAU is still going to be 100, but your MAU is around 3000, so your MAU/DAU is about 30 in this case. So that’s the range, MAU/DAU goes from 1 (for a game where every player is extremely loyal) to 28, 30 or 31 depending on the month (representing a game where no one ever plays more than once).
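The two extremes in this example can be reproduced with a short Python sketch. It assumes a login log of `(player_id, date)` pairs for one calendar month; the function name and log format are illustrative assumptions, not any analytics provider's API.

```python
from collections import defaultdict
from datetime import date

def mau_dau_ratio(logins, days_in_month):
    """Return (MAU, average DAU, MAU / average DAU) for one month of logins."""
    players_by_day = defaultdict(set)
    for player_id, day in logins:
        # A set per day keeps repeat logins from counting twice ("Unique").
        players_by_day[day].add(player_id)
    monthly_players = set()
    for players in players_by_day.values():
        monthly_players |= players
    mau = len(monthly_players)
    average_dau = sum(len(p) for p in players_by_day.values()) / days_in_month
    return mau, average_dau, mau / average_dau

days = [date(2011, 4, d) for d in range(1, 31)]  # a 30-day month

# Sticky game: the same 100 players log in every day.
sticky = [(player, day) for day in days for player in range(100)]
sticky.append((0, days[0]))  # a second login on the same day changes nothing

# Churning game: 100 brand-new players every day, none of whom return.
churn = [(i * 100 + player, day) for i, day in enumerate(days) for player in range(100)]

print(mau_dau_ratio(sticky, 30))  # (100, 100.0, 1.0)
print(mau_dau_ratio(churn, 30))   # (3000, 100.0, 30.0)
```

The sketch divides by the number of days in the month rather than by the number of days that had logins, so a day with no players pulls the average DAU down instead of being skipped.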

A word of warning: a lot of metrics, like the ones Facebook provides, might use different ways of computing these numbers so that one set of numbers isn’t comparable to another. For example, I saw one website that listed the “worst” MAU/DAU ratio in the top 100 applications as 33-point-something, which should be flatly impossible, so clearly the numbers somewhere are being messed with (maybe they took the Dailies from a different range of dates than the Monthlies or something). And then some people compute this as a %, meaning on average, what percentage of your player pool logs in on a given day, which should range from a minimum of about 3.33% (1/30 of your monthly players logging in each day) to 100% (all of your monthly players log in every single day). This is computed by taking DAU/MAU (instead of MAU/DAU) and multiplying by 100 to get a percentage. So if you see any numbers like this from analytics websites, make sure you’re clear on how they’re computing the numbers so you’re not comparing apples to oranges.
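To make the two conventions concrete, the following Python sketch converts DAU and MAU into the percentage form and flags a reported MAU/DAU ratio that falls outside the possible range. Both helper names are made up for illustration.

```python
def daily_login_percent(dau, mau):
    """DAU/MAU as a percentage: the share of monthly players seen on an average day."""
    return 100.0 * dau / mau

def is_possible_ratio(mau_over_dau, days_in_month):
    """A MAU/DAU ratio computed over one month must lie between 1 and the month's length."""
    return 1.0 <= mau_over_dau <= days_in_month

print(daily_login_percent(100, 3000))  # roughly 3.33: nobody ever returns
print(daily_login_percent(100, 100))   # 100.0: everybody logs in every day
print(is_possible_ratio(33.4, 31))     # False: the dates or definitions do not match
```

A ratio that fails the check does not say which number is wrong, only that the daily and monthly figures were not computed over the same period or by the same rules.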

Why is it important to know this number? For one thing, if a lot of your players keep coming back, it probably means you’ve got a good game. For another, it means you’re more likely to make money on the game, because you’ve got the same people stopping by every day… sort of like how if you operate a brick-and-mortar storefront, an individual who just drops in to window-shop may not buy anything, but if that same individual comes in and is “just looking” every single day, they’re probably going to buy something from you eventually.

[This article is an excerpt from Level 8: Metrics and Statistics, part of Ian Schreiber's course on game balance called Game Balance Concepts.]

Ian Schreiber has been making games professionally since 2000, first as a programmer and then as a game designer. He currently teaches game design classes for Savannah College of Art and Design and Columbus State Community College. He has worked on five shipped games and hundreds of shipped students. You can learn more about Ian at his blog, Teaching Game Design.

Player difficulty, like fun, is another thing that’s basically impossible to measure directly, but what you can measure is progression, and failure to progress. Measures of progression are going to be different depending on your game.

For a game that presents skill-based challenges like a retro arcade game, you can measure things like how long it takes the player to clear each level, how many times they lose a life on each level, and importantly, where and how they lose a life. Collecting this information makes it really easy to see where your hardest points are, and if there are any unintentional spikes in your difficulty curve. I understand that Valve does this for their FPS games, and that they actually have a visualizer tool that will not only display all of this information, but actually plot it overlaid on a map of the level, so you can see where player deaths are clustered. Interestingly, starting with Half-Life 2 Episode 2 they actually have live reporting and uploading from players to their servers, and they have displayed their metrics on a public page (which probably helps with the aforementioned privacy concerns, because players can see for themselves exactly what is being uploaded and how it’s being used).
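One simple way to see where deaths cluster is to bucket death positions into grid cells and count them. The Python sketch below is only an assumption about how such a tally could be done, not a description of Valve's tool; the event format `(level, x, y)` and the sample coordinates are invented.

```python
from collections import Counter

def death_hotspots(deaths, cell_size, top_n=3):
    """Count deaths per (level, grid cell) and return the most common cells.

    deaths: iterable of (level, x, y) tuples in level coordinates (hypothetical).
    cell_size: width and height of a square grid cell, in the same units.
    """
    cells = Counter(
        (level, int(x // cell_size), int(y // cell_size))
        for level, x, y in deaths
    )
    return cells.most_common(top_n)

deaths = [
    ("level_2", 12.0, 3.5),
    ("level_2", 14.9, 1.0),
    ("level_2", 11.0, 4.0),
    ("level_1", 40.0, 8.0),
    ("level_2", 33.0, 9.0),
]
hotspots = death_hotspots(deaths, cell_size=5, top_n=1)
print(hotspots)  # [(('level_2', 2, 0), 3)]
```

The per-cell counts are the kind of data that could then be drawn over a level map; recording the cause of death alongside each position would cover the "how" as well as the "where".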

Yet another example: measuring game balance

What if instead you want to know if your game is fair and balanced? That’s not something you can measure directly either. However, you can track just about any number attached to any player, action or object in the game, and this can tell you a lot about both normal play patterns, and also the relative balance of strategies, objects, and anything else.

For example, suppose you have a strategy game where each player can take one of four different actions each turn, and you have a way of numerically tracking each player’s standing. You could record each turn, what action each player takes, and how it affects their respective standing in the game.

Or, suppose you have a CCG where players build their own decks, or a Fighting game where each player chooses a fighter, or an RTS where players choose a faction, or an MMO or tabletop RPG where players choose a race/class combination. Two things you can track here are which choices seem to be the most and least popular, and also which choices seem to have the highest correlation with actually winning. Note that this is not always the same thing; sometimes the big, flashy, cool-looking thing that everyone likes because it’s impressive and easy to use is still easily defeated by a sufficiently skilled player who uses a less well-known strategy. Sometimes, dominant strategies take months or even years to emerge through tens of thousands of games played; the Necropotence card in Magic: the Gathering saw almost no play for six months or so after release, until some top players figured out how to use it, because it had this really complicated and obscure set of effects… but once people started experimenting with it, they found it to be one of the most powerful cards ever made. So, both popularity and correlation with winning are two useful metrics here.

If a particular game object sees a lot more use than you expected, that can certainly signal a potential game balance issue. It may also mean that this one thing is just a lot more compelling to your target audience for whatever reason – for example, in a high fantasy game, you might be surprised to find more players creating Elves than Humans, regardless of balance issues… or maybe you wouldn’t be that surprised. Popularity can be a sign in some games that a certain play style is really fun compared to the others, and you can sometimes migrate that into other characters or classes or cards or what have you in order to make the game overall more fun.
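Popularity and correlation with winning can be tallied separately from the same match records. This Python sketch assumes each record is a `(choice, won)` pair, one per player per match; the function name and the race names in the sample are placeholders.

```python
from collections import defaultdict

def pick_and_win_rates(match_results):
    """Return {choice: (pick rate, win rate)} from (choice, won) records."""
    picks = defaultdict(int)
    wins = defaultdict(int)
    total = 0
    for choice, won in match_results:
        picks[choice] += 1
        wins[choice] += int(won)
        total += 1
    return {c: (picks[c] / total, wins[c] / picks[c]) for c in picks}

# Made-up records: Elves are picked most often but win least often.
results = (
    [("elf", True)] * 2 + [("elf", False)] * 4
    + [("human", True)] * 2 + [("human", False)]
    + [("dwarf", True)]
)
stats = pick_and_win_rates(results)
for choice, (pick_rate, win_rate) in sorted(stats.items()):
    print(f"{choice}: picked {pick_rate:.0%}, won {win_rate:.0%}")
```

In the sample, the most popular choice has the lowest win rate, which is the case where the two metrics disagree. With a single recorded game, the 100% win rate for dwarves carries no weight, so sample size needs checking before drawing balance conclusions.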

If a game object sees less use than expected, again that can mean it’s underpowered or overcosted. It might also mean that it’s just not very fun to use, even if it’s effective. Or it might mean it is too complicated to use, it has a high learning curve relative to the rest of the game, and so players aren’t experimenting with it right away (which can be really dangerous if you’re relying on playtesters to actually, you know, playtest, if they leave some of your things alone and don’t play with them).

Metrics have other applications besides game objects. For example, one really useful area is in measuring beginning asymmetries, a common one being the first-player advantage (or disadvantage). Collect a bunch of data on seating arrangements versus end results. This happens a lot with professional games and sports; for example, I think statisticians have calculated the home-field advantage in American Football to be about 2.3 points, and depending on where you play the first-move advantage in Go is 6.5 or 7.5 points (in this latter case, the half point is used to prevent tie games). Statistics from Settlers of Catan tournaments have shown a very slight advantage to playing second in a four-player game, on the order of a few hundredths of a percent; normally we could discard that as random variation, but the sheer number of games that have been played gives the numbers some weight.
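Why the number of games matters can be shown with a rough margin-of-error calculation. The Python sketch below assumes a two-player game with no draws and uses the normal approximation to the binomial; the win counts are invented for illustration and are not the Catan or Go figures.

```python
import math

def first_player_edge(first_player_wins, games, z=1.96):
    """Return (edge, margin) for the first player's win rate.

    edge: observed win rate minus the 50% expected in a fair two-player game.
    margin: half-width of an approximate 95% interval (normal approximation).
    """
    p = first_player_wins / games
    margin = z * math.sqrt(p * (1 - p) / games)
    return p - 0.5, margin

edge_small, margin_small = first_player_edge(52, 100)
edge_large, margin_large = first_player_edge(520_000, 1_000_000)

# The same 2-point edge is within the noise for 100 games
# and well outside it for a million games.
print(edge_small, margin_small)
print(edge_large, margin_large)
```

For a four-player game such as the Catan case, the fair baseline would be 25% per seat rather than 50%, so the subtraction in the sketch would need adjusting.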

A Note on Ethics

The ethical consideration here is that a lot of these metrics look at player behavior but they don’t actually look at the value added (or removed) from the players’ lives. Some games, particularly those on Facebook which have evolved to make some of the most efficient use of metrics of any games ever made, have also been accused (by some people) of being blatantly manipulative, exploiting known flaws in human psychology to keep their players playing (and giving money) against their will. Now, this sounds silly when taken to the extreme, because we think of games as something inherently voluntary, so the idea of a game “holding us prisoner” seems strange. On the other hand, any game you’ve played for an extended period of time is a game you are emotionally invested in, and that emotional investment does have cash value. If it seems silly to you that I’d say a game “makes” you spend money, consider this: suppose I found all of your saved games and put them in one place. Maybe some of these are on console memory cards or hard disks. Maybe some of them are on your PC hard drive. For online games, your “saved game” is on some company’s server somewhere. And then suppose I threatened to destroy all of them… but not to worry, I’d replace the hardware. So you get free replacements of your hard drive and console memory cards, a fresh account on every online game you subscribe to, and so on. And then suppose I asked you, how much would you pay me to not do that. And I bet when you think about it, the answer is more than zero, and the reason is that those saved games have value to you! And more to the point, if one of these games threatened to delete all your saves unless you bought some extra downloadable content, you would at least consider it… not because you wanted to gain the content, but because you wanted to not lose your save.

To be fair, all games involve some kind of psychological manipulation, just like movies and books and all other media (there’s that whole thing about suspending our disbelief, for example). And most people don’t really have a problem with this; they still see the game experience itself as a net value-add to their life, by letting them live more in the hours they spend playing than they would have lived had they done other activities.

But just like difficulty curves, the difference between value added and taken away is not constant; it’s different from person to person. This is why we have things like MMOs that enhance the lives of millions of subscribers, while also causing horrendous bad events in the lives of a small minority that lose their marriage and family to their game obsession, or that play for so long without attending to basic bodily needs that they keel over and die at the keyboard.

So there is a question of how far we can push our players to give us money, or just to play our game at all, before we cross an ethical line… especially in the case where our game design is being driven primarily by money-based metrics. As before, I invite you to think about where you stand on this, because if you don’t know, the decision will be made for you by someone else who does.
