
Seven strategies designers can use to measure game-artifact metrics

Published: 2012-02-03 00:13:33

Author: Mark J. Nelson


Game metrics without players

Mark J. Nelson

A common (and sensible) way for a game designer to improve his or her understanding of a design-in-progress is to playtest a prototype. Doing so gives the designer empirical information about what players do in the game (and when and how they do it), as well as about their subjective reactions. There has been considerable recent work in using visualization and AI tools to improve the process of collecting and understanding this empirical information. The most well-known visualization is probably the heatmap, a map of a game level color-coded by how frequently some event occurs in each part of the map, allowing a quick visual representation of, e.g., where players frequently die. This can be extended into more complex analysis of gameplay patterns, characterization of play styles, and analysis of subjective player experience.

In all these approaches, the source of information is exclusively the player. Empirical information is collected from players, by methods such as logging their playthroughs, tracking their physiological responses during play, administering a post-play survey, etc. Then this data is analyzed and visualized in order to understand the game and the gameplay it produces, with a view towards revising the design.

For some kinds of game-design questions, it’s sensible or even necessary for our source of information to be empirical data from players. If we want to know if a target audience finds a game fun, or what proportion of players notice a hidden room, we have them play the game and find out. But an additional purpose of playtesting is for the designer to better understand their game artifact, in the sense of a system of code and rules that works in a certain way. Some common results of playtesting aren’t really empirical facts at all. When the designer looks over a player’s shoulder and remarks, “oops, you weren’t supposed to be able to get there without talking to Gandalf first”, that isn’t an empirical fact about players or gameplay that’s being discovered, but a logical fact about how the game works.

While this kind of improved understanding of a game artifact can be discovered through playtesting, the only real role of the player in uncovering that kind of information is to put the game through its paces, so the designer can observe it in action. We need not treat the game as a black box only understandable by looking at what happens when players exercise it, though; we can analyze the game itself to determine how it operates. Indeed, designers do so: when they design rule systems and write code, they have mental models of how the game they’re designing should work, and spend considerable time mentally tracing through possibilities, carefully working out how rules will interact, and perhaps even building Excel spreadsheets before the game ever sees a playtester. Can we use AI and visualization techniques to augment that thinking-about-the-game-artifact job of the designer, the way we’ve augmented thinking about player experience?

As you might guess, my answer is yes. Here I’ll sketch seven strategies for doing so, several existing and others new. While they can be used as alternatives to player-based metrics and visualizations for some kinds of design questions, especially early on in prototyping (so that the designer can focus playtesting on more subjective or experiential questions), many also work naturally alongside existing metrics/visualization approaches. Although my own work has been based on modeling game mechanics in symbolic logic, here I’ll attempt to discuss strategies in a way that’s open to a range of technical approaches that could be used to realize them—focusing on what we might want to get out of analyzing games and why.

Strategy 1: “Is this possible?”

The easiest questions to ask are of the form: can X happen? Examples: Is the game winnable? Can the player collect every item? Can the player die while still in the first room? Can the player die right on this spot I’m clicking? Can both of these doors ever be locked at the same time?

This strategy answers any yes/no question whose answer is determined directly by the rules of the game: whether a game state is possible, or an event can ever happen, given all possible player behaviors. In fact questions might not even involve variation on the player’s part, but variation in the system: Do enemies ever spawn at the bottom of a pit?

This analysis strategy could be exposed directly to the designer in a query setup. Alternately, classes of answers can be visualized. For example, we can ask, for every square in a grid-divided level, whether the player could die there, and color-code the map accordingly, producing a version of the popular player-death heatmap that shows instead where it is possible for players to die.
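As a concrete illustration, here is a minimal Python sketch of that kind of query, written against a made-up tile-based level (the grid, the walls, and the monster positions are all hypothetical); a direct breadth-first search stands in for the logical-inference or simulated-player machinery a real system might use. It computes which tiles the rules let the player reach, intersects them with tiles a monster can strike, and prints a rules-derived "possible death" map rather than an empirical one.

from collections import deque

WIDTH, HEIGHT = 8, 6
WALLS = {(3, 1), (3, 2), (3, 3)}      # hypothetical level data
MONSTERS = {(5, 2), (6, 4)}           # a monster can strike any adjacent tile

def neighbors(pos):
    x, y = pos
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < WIDTH and 0 <= ny < HEIGHT and (nx, ny) not in WALLS:
            yield (nx, ny)

def reachable(start=(0, 0)):
    # breadth-first search over the positions the rules allow the player to occupy
    seen, frontier = {start}, deque([start])
    while frontier:
        for nxt in neighbors(frontier.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

# a tile is a possible death spot if the player can stand there and a monster can hit it
deadly = reachable() & ({n for m in MONSTERS for n in neighbors(m)} | MONSTERS)
for y in range(HEIGHT):
    print("".join("X" if (x, y) in deadly else "#" if (x, y) in WALLS else "."
                  for x in range(WIDTH)))

In a real game the state would include more than position (health, items, and so on), but the shape of the computation stays the same: search the possibility space, then color the map from the result.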

Consider this pair of player-death heatmaps on a toy example level:

On the left is a heatmap of where deaths empirically happened in a few playthroughs of a room in a Zelda-like game; and on the right, a map of where it’s possible to die. There’s a clear structural pattern immediately visible in the second figure, derived from the game rules rather than from empirical playtest data: the player can only die in two specific rows. In the first figure, this pattern hasn’t quite been made clear via the pattern of empirical player deaths. Especially with more complex patterns, it can take a lot of playtesting data to spot these sorts of structural features in the possibility space, which are usually caused by unnoticed interactions of rules, or interactions between game mechanics and level design. In addition, it can be useful to have both kinds of heatmaps, to allow the designer to disentangle which patterns are caused by the possibility space, and which are caused by patterns within player behavior. A figure like the second one can even be used to tweak level design at a more detailed level, e.g. to place safe spots.

Several techniques can be used to implement this strategy. For example, we can playtest games with a simulated player that evolves itself in order to try to achieve particular outcomes. My own work uses logical inference for query-answering. More specific (and likely more efficient) algorithms can be used for special cases of possibility as well; for example, flood-fill algorithms are sometimes used to make sure there are no disconnected parts of a level, and graph-reachability algorithms can be used for similar purposes. A challenge with using these special-case algorithms in a larger system is to automatically recognize when they're applicable, and in which variants; for example, basic flood-fill suffices for reachability as long as a level doesn't involve movable walls or keys.
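For the flood-fill special case, a sketch might look like the following; the level string is invented, and the check is only meaningful under the stated assumption that nothing in the level moves:

LEVEL = ["########",
         "#..##..#",
         "#..##..#",
         "########"]

def flood(start):
    # classic flood fill over walkable tiles; sound only while the walls are static
    seen, stack = set(), [start]
    while stack:
        x, y = stack.pop()
        if (x, y) in seen or LEVEL[y][x] == "#":
            continue
        seen.add((x, y))
        stack.extend([(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)])
    return seen

floor = {(x, y) for y, row in enumerate(LEVEL) for x, c in enumerate(row) if c == "."}
unreachable = floor - flood((1, 1))
print("tiles unreachable from the entrance:", sorted(unreachable))

If the level later gains a movable wall or a key-and-door mechanic, this shortcut silently stops being valid, which is exactly the applicability problem mentioned above.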

Strategy 2: “How is this possible?”

Beyond finding that something is possible, a designer often wants to know how it could happen. In player metrics this is answered by collecting a log or trace of the actions the player took, along with information about game state (such as the player’s position and health). These traces can be used both for debugging (to track down how something happened when it shouldn’t have been possible), and as a way to understand the dynamics of the game world, by looking at the paths that can lead to various game states. Here’s an admittedly fairly boring event log of a player dying after walking up to a monster and then doing nothing.

happens(move(player,north),1)

happens(attack(monster,player),2)

happens(attack(monster,player),3)

happens(die(player),3)

This log could be elaborated with the value of various game states at each point in the log, such as the player’s position and health. However, its boringness raises the follow-on question: can you show me not only ways that something is possible, but interesting ways that it could happen? For some outcomes, like those that should never happen, any log is interesting, but for others this is a trickier question. One approach is to let the designer interactively refine the trace they’re requesting. In this example, they could ask for a way the player dies without ever standing around doing nothing, and then go on to add more caveats if the result were still too mundane; this can be thought of as trace zooming.

In a simulation framework, the log of how something is possible would simply be a log of the actions taken during the successful simulation run (although it may take time to recompute new runs if something like interactive zooming is offered). In a logical framework, it can be posed as an abduction problem: finding a sequence of events that, if they happened, would explain how the sought-after outcome could come about; zooming would be abduction with added constraints on the explanations. It’s also possible to view finding a path to an outcome as a planning problem within the story world, and use a classical AI planner. For example, we can find solutions to game levels and display them as comic-like sequences of illustrated events.
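In code, the simplest version of trace-finding is a search that remembers the actions it took; a plain breadth-first search stands in here for the abduction or planning machinery described above. The three-room model below is invented (the rooms, the sword, and the damage numbers are all hypothetical), and the returned list of action names plays the same role as the happens(...) log shown earlier.

from collections import deque

def actions(state):
    # state = (room, hit points, has_sword); all rules here are toy examples
    room, hp, has_sword = state
    if room == 2 and not has_sword:
        yield ("pick_up_sword", (room, hp, True))
    if room == 3:                                  # a monster blocks this room
        dmg = 1 if has_sword else 2
        yield ("fight_monster", (room + 1, hp - dmg, has_sword))
    else:
        yield ("move_north", (room + 1, hp, has_sword))

def find_trace(start, goal):
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, trace = frontier.popleft()
        if goal(state):
            return trace
        for name, nxt in actions(state):
            if nxt not in seen and nxt[1] > 0:     # discard states where the player is dead
                seen.add(nxt)
                frontier.append((nxt, trace + [name]))
    return None

# "How can the player reach room 4 without ever picking up the sword?"
print(find_trace((0, 3, False), lambda s: s[0] >= 4 and not s[2]))

Zooming, in this framing, is just adding extra conditions to the goal test or filters on the returned trace.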

Strategy 3: Necessity and dependencies

Once we know what things are possible, and how they can happen, we might also want to know what must happen. Can you beat Final Fantasy VI without ever casting “Meteo”? Which quests can the player skip? Is this powerful sword I just added to the game needed or superfluous?

These kinds of questions also relate to the questions that can be asked via the first two strategies. Some kinds of necessity questions can be rephrased in terms of whether it’s possible to reach a particular state that lacks the property we want to determine the necessity of. For example, whether it’s necessary to level-up to level 5 before reaching the second boss is equivalent to asking whether it’s possible to reach that boss while at a level of 4 or below. Other kinds of necessity questions can be rephrased in terms of zooming in on traces. For example, asking whether a particular sword is necessary to beat the game is equivalent to asking for a gameplay trace where the player beats the game, which doesn’t contain the pick-up-that-sword event.

In empirical playtesting, it’s common to collect metrics about usage: how many players achieve a particular quest, use each item, etc. Similarly to how we can juxtapose empirical data with analytically determined possibilities in Strategy 1, in this strategy we can juxtapose empirical data with analytically determined necessities. Of course, if the empirical results show less than 100% for some item or event, it couldn’t have been necessary, but on the other hand there may be things that 100% of our playtesters did which aren’t actually necessary, which this game-analysis strategy would distinguish.

More automatic dependency analysis is also possible. For example, asking whether it’s necessary for event A to precede event B, or vice versa, can let us build up a graph of necessary event ordering, which at a glance indicates some of the causal structure of the game world. That includes causal structure that wasn’t explicitly written; for example, entering a locked room might have several explicit preconditions, like the player needing to find a key before entering, but also several implicit preconditions caused by interaction of other rules, like the player needing to find a particular suit of armor before entering (because there is simply no way they can successfully get to the room without having first acquired that suit of armor).
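A brute-force sketch of that pairwise analysis is shown below, again over an invented event model: every bounded playthrough is enumerated, and "A must precede B" is reported only when no playthrough contains B without A before it. Real games need smarter search than full enumeration, but the queries have this shape.

def successors(happened):
    # which events the (hypothetical) rules allow next, given what has already happened
    out = []
    if "get_key" not in happened:
        out.append("get_key")
    if "get_armor" not in happened:
        out.append("get_armor")
    if "open_door" not in happened and "get_key" in happened:
        out.append("open_door")
    if "beat_boss" not in happened and "open_door" in happened and "get_armor" in happened:
        out.append("beat_boss")
    return out

def all_traces(trace=()):
    nxt = successors(set(trace))
    if not nxt:
        yield trace
    for event in nxt:
        yield from all_traces(trace + (event,))

EVENTS = ["get_key", "get_armor", "open_door", "beat_boss"]
traces = list(all_traces())
for a in EVENTS:
    for b in EVENTS:
        if a == b:
            continue
        with_b = [t for t in traces if b in t]
        if with_b and all(a in t[:t.index(b)] for t in with_b):
            print(f"{a} must happen before {b}")

Note that the get_key-before-beat_boss edge is never written anywhere in the rules; it falls out of the interaction between the key, the door, and the boss, which is the kind of implicit precondition described above.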

To my knowledge, no existing game-analysis work explicitly aims at this kind of automatic necessity or dependency analysis. My own work follows the approach, described in this section, of reducing necessity and dependency analysis to a series of queries implemented using strategies 1 and 2.

Strategy 4: Thresholds

Sometimes there are magic numbers delineating the boundaries of possible behavior, or of a certain regime of game behavior. What is the shortest possible time to complete a Super Mario Bros. level? What range of money could a SimCity player possess when five minutes into the game?

This strategy can give useful information often not discovered in initial playtesting, for example by finding speedruns of a level, or cheats to quickly finish it, that the usually not-yet-expert-at-the-game players in a playtesting session wouldn’t have found. In addition, it can be paired profitably with empirical player data to give an idea of how close the particular range of data being observed comes to the theoretical bounds that the game’s rules define. For example, any metrics graph that graphs players as a distribution on a numerical scale could also draw bounds of the largest and smallest possible values—and not just largest or smallest in the sense of the value’s hypothetical range (e.g. a score defined as 0 to 100), but largest and smallest actually reachable in the specific context.

In addition to telling us whether playthroughs significantly different on a particular metric’s axis from the empirically observed ones are possible, the relationship between the empirical range of data and the theoretical extrema can tell us something about the players in our playtest. For example, in one experiment we discovered that the typical players in our playtest of an underground-mining game were much more cautious with returning to the surface to refuel than was strictly necessary. This can then be used in concert with Strategy 2 to figure out how to achieve the threshold values.

In a logic-based approach, thresholds can be found using a branch-and-bound method. A possible solution (of any value) is first found, and then a constraint is added that a new solution be better than the one already found (e.g. shorter, if we’re looking for the shortest playthrough). Then we look for another solution meeting the new constraints, and repeat until no additional solutions are found. This has the advantage of generality, but can be slow.
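A minimal sketch of that loop, over an invented level where a risky shortcut trades hit points for distance, might look like this (the walk and shortcut actions and all the numbers are made up):

def find_completion(max_len, state=(0, 3), trace=()):
    # state = (room, hit points); goal is to reach room 5 alive within max_len actions
    room, hp = state
    if hp <= 0 or len(trace) > max_len:
        return None
    if room >= 5:
        return trace
    moves = [("walk", (room + 1, hp)),
             ("shortcut_jump", (room + 2, hp - 2))]     # faster, but costs health
    for name, nxt in moves:
        result = find_completion(max_len, nxt, trace + (name,))
        if result is not None:
            return result
    return None

best = find_completion(max_len=20)                       # any completion at all
while best is not None:
    shorter = find_completion(max_len=len(best) - 1)     # demand a strictly better one
    if shorter is None:
        break
    best = shorter
print("shortest possible completion:", best)

Here the loop finishes almost instantly; in a real model each pass can be slow, which is the generality-versus-speed trade-off noted above.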

Future work on the problem could explicitly use an optimization method, whether a randomized one like genetic algorithms, or a mathematical one like linear programming. In addition, there are a wide range of algorithms to find maximal or minimal solutions to more specific problems. For example, given a model of a level, we can find the shortest spatial path through the level using a standard algorithm like Dijkstra’s algorithm. As with the specialized algorithms in Strategy 1, a difficulty in using these specialized algorithms would be automatically determining when they’re applicable; for example, the shortest spatial path through a level may not be the shortest actually achievable path, given the game mechanics—it might not even be a lower bound, if the mechanics include teleportation. On the other hand, these kinds of differences might also give information; the difference between the shortest spatial path through a level and the shortest path that a player could possibly achieve through a level might give an indication of its worst-case difficulty, for example, since it would mean that there is some minimal level of off-perfect-path movement the player would have to perform.
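For the spatial special case, a standard Dijkstra sketch over an invented terrain grid (with "~" as slow ground) is enough; the point is only that the number it returns bounds movement cost, not what the mechanics will actually let a player do.

import heapq

COST = {".": 1, "~": 3}                       # open ground vs. slow terrain (made up)
LEVEL = ["....~~..",
         ".##.~~#.",
         ".#..~~#.",
         "....~~.."]

def shortest_path_cost(start, goal):
    dist, frontier = {start: 0}, [(0, start)]
    while frontier:
        d, (x, y) = heapq.heappop(frontier)
        if (x, y) == goal:
            return d
        if d > dist[(x, y)]:
            continue
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= ny < len(LEVEL) and 0 <= nx < len(LEVEL[0]) and LEVEL[ny][nx] != "#":
                nd = d + COST[LEVEL[ny][nx]]
                if nd < dist.get((nx, ny), float("inf")):
                    dist[(nx, ny)] = nd
                    heapq.heappush(frontier, (nd, (nx, ny)))
    return None

print("cheapest spatial route:", shortest_path_cost((0, 0), (7, 3)))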

Strategy 5: State-space characterization

The strategies so far can be seen as trying to probe a game’s state-space from various perspectives. Could we more directly analyze and characterize the state-space of a game?

One possibility is to try to visualize the state space. In any nontrivial game, the full branching state graph will be unreasonably large to display outright. However, it may be possible to cluster or collapse the possible dynamics into a smaller representation of meaningful states and dynamics. In addition, techniques developed for summarizing the state space of empirical player data could be applied to summarizing sampled traces from the overall space of possible playthroughs.

More interactively, single traces can display some information about their neighbors in the state space; for example, branch points might show alternate events that could have happened at a given point in the trace, besides the event that actually happened in that trace. This can be used to “surf” the space of possible playthroughs, in a more exploratory manner than the specifically requested traces returned from Strategy 2. Alternately, we can start with empirically collected traces and observe their possibility-space neighbors; this can be used to bootstrap a small amount of playtesting data into a larger amount of exploration, by showing possible—but not actually observed—gameplay that is similar to the observed gameplay.
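As a tiny sketch of that idea, the following walks an already-recorded trace through a hypothetical action model and prints, at each step, the alternative events that were possible but did not happen:

def available_actions(room):
    # hypothetical rules: rooms form a corridor, and only room 2 contains a chest
    acts = {"go_east": room + 1}
    if room > 0:
        acts["go_west"] = room - 1
    if room == 2:
        acts["open_chest"] = room
    return acts

trace = ["go_east", "go_east", "open_chest", "go_east"]
room = 0
for step, chosen in enumerate(trace):
    options = available_actions(room)
    branches = sorted(a for a in options if a != chosen)
    print(f"step {step}: {chosen:10s} branch points: {branches}")
    room = options[chosen]

The same walk could start from an empirically logged trace instead of an invented one, turning a handful of playtests into a window onto nearby possible play.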

Moving into more mathematical territory, games largely based on mathematical approaches such as differential equations, influence maps, and dynamical systems might be analyzed using standard mathematical techniques, such as finding fixed points or attractors or displaying phase-space diagrams. Some basic experimentation along these lines is sometimes done during Excel prototyping, but this area (to my knowledge) remains largely unexplored.
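For instance, a resource curve of the sort usually prototyped in a spreadsheet can be probed directly. The income and upkeep formulas below are invented; the sketch simply iterates the per-turn update and looks for a fixed point:

def step(gold):
    income = 100 + 0.5 * gold        # hypothetical: income scales with holdings
    upkeep = 10 + 0.6 * gold         # hypothetical: upkeep scales slightly faster
    return gold + income - upkeep

gold = 50.0
for turn in range(200):
    nxt = step(gold)
    if abs(nxt - gold) < 1e-6:
        print(f"economy stabilizes near {nxt:.1f} gold after {turn} turns")
        break
    gold = nxt
else:
    print("no fixed point within 200 turns: the economy drifts or oscillates")

In this toy case the fixed point can also be read off algebraically (the update reduces to gold becoming 0.9 * gold + 90, which settles at 900), but iteration generalizes to rule systems too tangled to solve by hand.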

Strategy 6: Hypothetical player-testing

The strategies so far try to investigate the overall way a game operates. We could restrict this by trying to characterize only how a game operates with a particular, perhaps highly simplified, model of a player. Such a restriction is not intended mainly to insert a realistic player model, but to investigate how the game operates in various extreme or idealized cases. For example, what happens when the game is played by a player who always attacks, except heals when low on health? If that player does very well, the game might be a bit too simple. Or, in a multiplayer game, different players could be pitted against each other to see how they fare, which might tell us something about the design space.

The Machinations system simulates hypothetical players playing a Petri net game model, and collects outcomes after a number of runs. My own logic-based system applies strategies 1 through 4 conditioned on a player model, answering various questions about possibility, necessity, etc., under the added assumption that the player is acting in a particular manner. In a slightly different formulation, Monte-Carlo “rollouts” of boardgames pit two possible strategies against each other in a specific point in the game, to determine how they fare against each other.
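A minimal simulation-style sketch, with an invented combat model and the extreme policy from the example above (always attack, heal only when low), could look like this:

import random

def aggressive_policy(player_hp, monster_hp):
    # the hypothetical extreme player: always attack, heal only when low on health
    return "heal" if player_hp <= 3 else "attack"

def win_rate(policy, runs=1000, seed=0):
    rng, wins = random.Random(seed), 0
    for _ in range(runs):
        player_hp, monster_hp = 10, 12
        while player_hp > 0 and monster_hp > 0:
            if policy(player_hp, monster_hp) == "attack":
                monster_hp -= rng.randint(1, 4)
            else:
                player_hp = min(10, player_hp + 3)
            if monster_hp > 0:
                player_hp -= rng.randint(1, 4)
        wins += monster_hp <= 0
    return wins / runs

print("win rate of the always-attack player:", win_rate(aggressive_policy))

If this number comes back near 1.0, the reading suggested above applies: the game may be too forgiving of a player who never does anything but attack.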

Strategy 7: Player discovery

While hypothetical players can be useful for probing how a game behaves under various kinds of gameplay, I’ve found that designers often had difficulty inventing such hypothetical players, and instead wanted the process to work backwards: given a game prototype, could we automatically derive a simple player model that can consistently achieve certain outcomes? For example, rather than having to try out questions such as, “can this game be beaten by just mashing attack repeatedly?”, some designers would prefer we analyze the game and come back with: here is the simplest player model that consistently beats your game (e.g., “your game can be beaten by just mashing attack repeatedly”).

This question can be seen as a stronger or more generalized version of trace-finding (Strategy 2). Finding how a particular outcome is possible returns one possible instance where it could happen. Finding a player that can consistently make the outcome happen is a compressed description of many such instances.

There are several ways to invent these kinds of player models. One approach is to sample many possible traces reaching the requested state (using techniques from Strategy 2), and then inductively extract a player model from these traces, or perhaps several player models from different clusters of traces. There are already techniques from empirical gameplay metrics that can be used to infer such player models, which could be applied to extracted gameplay traces instead.

A different approach is to directly infer whether there exists a player model from a class of simplified players that can reach the desired state. For example: is there any single button a player can mash constantly to beat the game? If not, is there a 2-state button-mashing finite state machine that can consistently beat the game (perhaps alternating between attack and heal)? If not, we can query for state machines with more states, or other kinds of more complex player models. One axis of complexity is how “blind” the player model is: the button-mashing or alternate-between-two-states model ignores the game state completely. If there’s no simple blind finite state machine that can beat the game, how about one that only looks at one game state (or two game states)? If a player model of that kind exists, it would tell us something about the game state that is relevant for decision-making. I’ve been performing some experiments in this kind of player-model inference using logical abduction, but overall the space of possible approaches is quite open.
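A sketch of that escalation over blind player models, using an invented deterministic combat game, might read as follows; it tries every one-button masher, then every two-button alternation, and reports the first (simplest) model that wins:

from itertools import product

BUTTONS = ["attack", "heal", "defend"]

def beats_game(pattern, max_turns=50):
    # invented deterministic game: the player blindly cycles through `pattern`
    player_hp, monster_hp = 10, 20
    for turn in range(max_turns):
        action = pattern[turn % len(pattern)]
        if action == "attack":
            monster_hp -= 3
        elif action == "heal":
            player_hp = min(10, player_hp + 4)
        if monster_hp <= 0:
            return True
        player_hp -= 1 if action == "defend" else 2
        if player_hp <= 0:
            return False
    return False

candidates = [(b,) for b in BUTTONS] + list(product(BUTTONS, repeat=2))
for pattern in candidates:
    if beats_game(pattern):
        print("simplest blind player that beats the game:", pattern)
        break
else:
    print("no one- or two-button blind player beats this game")

In this toy model the analysis comes back with exactly the kind of answer described above: mashing attack alone gets the player killed, but blindly alternating attack and heal already wins, which would say something uncomfortable about the game's depth.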

In multiplayer games, player discovery can be related to game-theory terminology, such as finding optimal strategies (for various kinds of optimality), dominated strategies, etc. While using game theory for videogame analysis or balancing has often been discussed, it seems to resist practical application in part due to the mismatch in scale between the size of games that game-theory software typically handles, and even small videogame prototypes. In particular, most computational game theory assumes either a one-step game, or an iterated (staged) game with relatively few steps, typically as few as three or four; whereas most videogames go on for many timesteps, and have their dynamics emerge over at least slightly longer timescales. Overcoming this problem would require either finding a way to pose many of the interesting problems in terms of less-iterated game-theory problems, or else scaling the tools to many more iterations.

Finally, a large class of gameplay algorithms can be used as player-discovery algorithms of a sort, especially if they produce interesting internal structure that tells us something about the game they learn to play. For example, a reinforcement-learning algorithm that learns a state-value function would be an interesting source of data for augmenting the kinds of state diagrams in Strategy 5, adding to them information about how valuable each state is from the perspective of reaching a particular goal.
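As a last sketch, here is the flavor of that idea on a hypothetical four-room level graph; plain value iteration stands in for a full reinforcement learner, but it already yields a per-state value that could be painted onto the kind of state diagram discussed under Strategy 5.

GAMMA = 0.9
# state -> {action: next state}; "exit" is the goal, "pit" is a dead end (all invented)
GRAPH = {"start": {"left": "pit", "right": "hall"},
         "hall":  {"back": "start", "forward": "exit"},
         "pit":   {},
         "exit":  {}}

values = {state: 0.0 for state in GRAPH}
for _ in range(100):                       # sweep until (near) convergence
    for state, moves in GRAPH.items():
        if state == "exit":
            values[state] = 1.0            # reaching the exit is worth 1
        elif moves:
            values[state] = GAMMA * max(values[nxt] for nxt in moves.values())

for state in sorted(values, key=values.get, reverse=True):
    print(f"{state:5s} value = {values[state]:.2f}")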

* * *

While playtesting games is, and will remain, a valuable source of information about gameplay, the game artifact itself need not be a black box, since we can learn many things about a game design simply by better understanding the game itself—how the rules and code inside the game operate and structure play experience. This analysis of games can benefit from an ecosystem of metrics and visualization techniques that will hopefully grow as rich as that now featured in player metrics research.

Towards that end, I’ve sketched seven strategies for extracting knowledge from a game artifact: what information we might extract from a game, why we would want to extract it, and how it relates to the kinds of information we can find in playtests. While I’ve briefly mentioned existing work that tackles the implementation of some of these strategies, much remains unstudied, both in terms of undertaking many kinds of analysis at all, and for those that have been undertaken, in understanding the strengths and weaknesses of various technical approaches to modeling games and extracting information from them. My hope is that this set of proposed strategies will spur some work from diverse technical perspectives on implementing them, as well as start a conversation on the role of understanding game artifacts in game design. (Source: kmjn.org)

