
Balancing Multiplayer Games, Part 3: Fairness

Published: 2013-09-17 15:13:14

By Sirlin

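In asymmetric games, beyond making sure the game offers enough viable options during play, we have to make sure all the different starting options are fair against each other. That means each character in a fighting game and each race in a real-time strategy game should have a reasonable chance of winning a tournament in the hands of the right player. For collectible card games and team games like Guild Wars and World of Warcraft, we should at least provide several viable decks and class combinations that can win tournaments. (See also Parts 1 and 2 of this article.)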

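Self-Balancing Forces

To make this semi-impossible task easier, we should use self-balancing forces. They let us offer highly diverse options while building in fail-safes that protect us from unknown tactics players might develop in the future. I’ll illustrate with two games: Magic: The Gathering and Guilty Gear XX.

In Magic, the various game mechanics, such as counterspells, direct damage, healing, and so on, are divided among five colors. Players can build decks with as many of these five colors as they like, but the more colors they include, the harder it is to have the right mana to cast spells of each color.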

A diagram of the five colors of Magic (from Sirlin)

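As a result, decks tend to specialize, which gives them inherent weaknesses. Red, for example, cannot destroy enchantment cards, so even if a red deck ends up strong, it has a built-in weakness. Likewise, each color has two enemy colors, and those enemy colors usually include cards that are especially powerful against it. Again, if a red deck became too powerful, blue and white cards could keep it in check.

Finally, when Wizards of the Coast prints a new set with new mechanics, they usually include a card or two that are fairly weak but that specifically counter the new mechanic. I think they hope these specific counters won’t be needed, but if the metagame becomes overwhelmed by the new mechanics, there are at least some tools it can use to fight back.

For example, Magic’s Odyssey block focused on new mechanics involving the discard pile (called the “graveyard” in the game), and the card Morningtide could remove all cards from all graveyards. If players got too tricky with their graveyards, Morningtide was the counter, although in practice it wasn’t really needed. Later, the Mirrodin block focused on artifact cards. The card Annul could counter artifacts (and enchantments) for only one mana, and the card Damping Matrix prevented artifact abilities from working. In Mirrodin’s case, the artifact mechanics really did get out of hand. Annul and Damping Matrix were good ideas, but Mirrodin needed even stronger fail-safes.

This is the same idea as Yomi Layer 3, which I mentioned in Part 2. The idea is to build counters into the game so that even if something ends up too powerful, the game is resilient enough for players to deal with it.

Guilty Gear is an important example of fail-safe systems.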

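Guard Meter

Every time you hit the opponent, their “guard meter” goes down. The lower it is, the shorter their hitstun. So even if a string of moves is an “infinite combo,” meaning that once you land the first hit you could keep hitting forever, the opponent’s shrinking hitstun eventually lets them block and escape the combo.

Progressive Gravity

When you are juggled in the air during a combo, the gravity applied to your character increases over time. So even if a combo could juggle you forever, the victim’s body falls faster and faster, which eventually ruins the infinite juggle.

Green Blocking

Imagine an attack sequence against a blocking opponent in which you land a few hits in a row and are pushed back too far to continue. But on the last hit, you cancel it with a special move that carries your character forward. You then repeat the sequence and force the opponent to block forever. In case this kind of lockdown trap exists, Guilty Gear lets characters stop it with what I call “green blocking.” While blocking, you can spend some super meter to create a green force field that pushes the opponent far away from you, ruining the spacing of the trap.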

Guilty Gear (from Sirlin)

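These features are designed to solve problems the designers didn’t even know they had. They only knew that if the game ever ended up in a state of infinite combos or lockdowns, some fail-safe had to rescue it. These fail-safes also freed them to design wildly varied characters. No matter how crazy a character is or how scary a tactic turns out to be, the designers knew that the fail-safes shared by all characters would keep things at least somewhat in check.

Playtesting and Course-Correcting

Whether or not your game has fail-safe systems, at some point you have to design a diverse set of characters, races, or whatever, make each one coherent and interesting, and then trust that you’ll sort out the balance problems through playtesting. All the theory in the world will not save you from playtests.

You need to start tuning the game, reacting and learning as you go. Don’t let a producer turn tuning into a fixed checklist of items you are accountable for. It is a continuous process that lasts until you ship the game. Playtesting lets you discover things you couldn’t have predicted, and you should be open to those discoveries. The goal isn’t to make the exact game you originally envisioned, because your original vision didn’t take into account everything you learned from development and playtests. When you or the testers discover nuances or unexpected properties, you can build around them and incorporate them into the game’s balance.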

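The Tier List

While balancing Street Fighter, Kongai, and the card game Yomi, I used a similar approach with playtesters. I think this approach doesn’t really depend on genre; the key is managing the tier list.

The term “tier list” comes from the fighting game genre. It ranks how powerful each character is, from highest to lowest, while accepting that such a list cannot be exact. Instead of ranking 20 characters from 1 to 20, the idea is to group them into “tiers” of power. Remember that even if you had a 100% balanced game, players would still make tier lists. You should accept these lists as a given, and your job is to manage them.

In Kongai and Yomi, I even gave players a template for the tier list (the one most useful to me as a designer). First I ask them to think of three tiers: top, middle, and bottom. Then I tell them about two “secret tiers” that I hope stay empty.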

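0) God tier (no character should be in this tier; if one is, you are forced to play that character to be competitive)

1) Top tier (don’t be afraid to put your favorite characters here)

2) Middle tier (pretty good, but not quite as good as top)

3) Bottom tier (I can still win with them, but it’s hard)

4) Garbage tier (no one should be in this tier; it isn’t reasonable to play this character at all)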

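My first balancing goal is to make sure the god tier is empty. Of course some character will end up strongest, or tied for strongest, and that’s fine. But a “god tier” character is so strong that the rest of the game becomes pointless. We have to fix that, because it ruins the whole playtest (and the game). Also, anything in the god tier has such a high power level that we can’t even hope to balance the rest of the game around it.

My next goal is to get rid of garbage tier characters. They are so bad that no one wants to play them, but raising their power can usually move them somewhere into the top, middle, or bottom tier. Once they are in one of those three tiers, they are at least playable.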

Akuma (from Sirlin)

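Public Tier Lists

I like it when playtesters can see each other’s tier lists. The debates this produces are useful for me to read, and they help the testers sort out their ideas. Sometimes when someone placed a character unusually high or low, I dug deeper to find out whether that player really knew something the rest of us didn’t. Other times the player is just crazy, and the other testers are happy to point that out. We can also see what kind of consensus the testers reach, such as whether they all rank a certain character as the worst.

The most important milestone in each game was when the test community consistently produced tier lists with no characters in the god tier or garbage tier. Once you’ve achieved that, the next goal is to compress the tiers. That means making the difference between the best and worst characters as small as possible. Note that even if the bottom tier holds the same characters it did a month ago, you may have improved the game dramatically.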

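Adjusting the Tiers

In all the games I’ve balanced, I used the same approach of letting the top tier set the benchmark power level. In Street Fighter, I had an established top tier from the previous game as a starting point, but in Kongai and Yomi, who ended up in the top tier was somewhat accidental. Early on, though, once the god tier was emptied and it was clear which characters or decks were on top, I made that the target power level. In other words, the characters in that tier are “how the game is supposed to be.” Again, I didn’t plan who would be there, but I accepted how it turned out. So if the top tier is the target, it’s the bottom tier you should adjust the most. If the top tier is the intended power level, you don’t want to mess up the good things you have there. Instead, boost the bottom characters and compress the tiers as much as you can, so that the worst characters end up not far below the best.

I noticed some psychological factors again and again while making these adjustments. The first is that whenever I make a move or character worse (a “nerf”), players overreact. Sometimes the top tier creeps a little too high in power, or an otherwise average character turns out to have something unexpectedly good, or a move reduces the strategy in the game and has to be given up in exchange for something else. There are many reasons for nerfs.

I’ll use some made-up numbers to convey the general idea. Imagine a move is at power level 9 out of 10, which is too good for that character. More than once, when I lowered it to 8 out of 10, playtesters complained that the move was worthless and dropped the character at least one tier. This happened consistently. For some reason, players in every game seem unable to grasp that a top tier character made slightly worse can still be a top tier character.

This is one of the cases where you shouldn’t listen to the playtesters. Ignore their first reactions to nerfs, let them get used to the change, let them see whether they can still succeed with the new version of the move, and then take their feedback on that move or character seriously.

The other psychological effect to know about is what happens when you increase a move’s power. I learned about it from Rob Pardo’s lecture on balancing multiplayer games at the Game Developers Conference, tried it on all the games I balanced, and I think Rob is right. He said that if you have a move you aren’t sure how to balance, make it too powerful. If it is too weak, you run the risk that no one uses it at all. Then, when you raise its power, none of the testers will notice or care, because they have already decided the move is weak. Even when you raise it to a reasonable power level, it’s hard to get it on people’s radar.

Instead, Pardo said to start with the move too powerful. Then everyone will care about it. I did exactly this with T.Hawk, Fei Long, and Akuma in Street Fighter HD Remix, because I didn’t know how to set their power levels. Each of those characters was the best character in the game at some point in development, which meant I got lots of feedback from testers. Sometimes the “too powerful” version of a character turned out far too good, and sometimes only barely too good. Knowing where the upper limit was helped me pick appropriate power levels more quickly.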

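Illusions

Another point from Rob Pardo’s lecture on multiplayer games was not to balance the fun out of things. I agree with this. Don’t think of the game only as an abstract set of numbers that has to line up. You also have to think about how people will perceive it and whether it’s actually fun. Pardo said he wants players to feel that the tools they have are extremely powerful, even though they are actually fair.

An example from one of my games is Tafari, the Trapper in Kongai. Tafari’s main ability is that the enemy cannot switch characters while fighting him. Switching characters is one of the game’s main mechanics, so fighting him is like playing rock, paper, scissors without being able to throw rock. But from the start I gave Tafari several weaknesses, and he loses many fights if he has to fight on even footing. He is at his best when you bring him in against an already-weakened character.

I knew Tafari was not too powerful. I had experts test him, and they tended to rank him as middle tier. But as we added new testers, nearly all of them thought Tafari was too strong. I refused to change him, and after a year of testing, the best players still ranked him as middle tier, while inexperienced players still ranked him as top. Tafari is an illusion.

I’m telling you this because you have to be careful with feedback in cases where you intentionally made something feel more powerful than it actually is. If you can pull that off, it’s a success, because Tafari makes the game more interesting, creates lots of debate, and in the end he is balanced.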

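Counter Matches

In addition to the tier list, you should also think about all the specific matchups. Street Fighter HD Remix, for example, has 17 characters and 153 possible matchups. For the version of Street Fighter before HD Remix, experts tend to separate the characters into four tiers (none of them in the god tier or garbage tier), and they place Guile in the second tier. Although that means Guile’s power level is acceptable, he is at a severe disadvantage in two specific matches: Vega and Dhalsim. Is it OK for an overall good character to be countered by two specific characters? Not really.

If these were weapons in an FPS, units in an RTS, or characters in a team-based fighting game, it might be acceptable. You pick up weapons in an FPS after the game starts, so their balance doesn’t need to meet the strict requirements of an asymmetric game. Units in an RTS and characters in a team-based fighting game are examples of local imbalances. But in Guile’s case, you lock in your choice at the start of the game and are stuck with him for the entire game, so it really is a problem if he has some bad counter matches, even though players rate him fairly highly overall.

It’s always tricky to adjust anything in an asymmetric game. How can we help Guile against Dhalsim without affecting the other matches? There’s no easy answer, but I advise you to really solve the problem rather than cop out.

My solution to this problem was twofold. First, for reasons unrelated to this particular match, I changed the trajectory of Guile’s flash kick, which helps him against Dhalsim’s fireballs. Second, one of Guile’s problems was that Dhalsim’s low punches could go under Guile’s Sonic Boom and hit Guile from across the screen. I changed Dhalsim’s hitboxes so that Dhalsim now trades hits in this situation. This change has no effect on other matches, so it is a real solution to the problem.

A cheating solution would have been to special-case this match and give Guile more hit points. That sounds attractive because you don’t have to worry about messing up other matches, but this non-solution feels artificial. It messes with players’ expectations and intuitions about how many hit points Guile has.

A similar cop-out would be to create a giant table in an RTS of every unit versus every unit and special-case how much damage each does to the other. That also messes with players’ intuitions about how much damage each unit does, and it creates an invisible, unreliable system. I know you will be tempted to use these special-case solutions when balancing asymmetric games, but try your hardest to resist.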

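Conclusion

If you can, start your design with some self-balancing forces and fail-safe systems. Then create all of your game’s diversity and begin the long process of playtesting. As you learn more from playtesting, change course accordingly. Start tracking tiers, first fixing the god tier, then the garbage tier. Then compress the tiers so that even the bad characters are not much worse than the best ones. Finally, fix all the counter matches you can by actually solving the problems, and avoid cop-out solutions.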

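This article was compiled by GamerBoom (gamerboom.com). Reprints that do not retain the copyright notice are not permitted; to reprint, please contact GamerBoom.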

Balancing Multiplayer Games, Part 3: Fairness

By Sirlin

In asymmetric games, we have to care about making all our different starting options fair against each other in addition to making sure the game in general has enough viable options during gameplay. That means each character in a fighting game and each race in a real-time strategy game should have a reasonable chance of winning a tournament in the hands of the right player. For collectable card games and team games like Guild Wars and World of Warcraft’s arenas, we should instead say that at least “several” possible decks and class combinations should be able to win tournaments.

Self-Balancing Forces

To make this semi-impossible task easier, we should use self-balancing forces if possible. This will let us go nuts with diverse options while building in some fail-safes to protect us from unknown tactics that players might develop in the future. I’ll give examples of this from two games: Magic: The Gathering and Guilty Gear XX.

In Magic, the various game mechanics such as counterspells, direct damage, healing, and so on, are divided amongst five colors. Players can build decks with as many of these five colors as they want, but the more colors they include, the harder it is to have the right mana to actually play the various colors of spells.
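The cost of adding colors can be made concrete with a rough calculation. The sketch below is not from the original article: it assumes a hypothetical 60-card deck with 24 lands split evenly among the chosen colors and a 7-card opening hand, and it uses inclusion-exclusion to compute the chance that the hand holds at least one land of every color. The function name and all numbers are illustrative assumptions.

```python
from math import comb


def p_all_colors(num_colors, deck_size=60, lands=24, hand=7):
    """Chance that an opening hand holds at least one land of every color.

    Assumes the lands are split evenly among the colors (hypothetical numbers).
    """
    per_color = lands // num_colors
    total_hands = comb(deck_size, hand)
    # Inclusion-exclusion over the number of colors missing from the hand:
    # if j colors are missing, the hand is drawn from the deck minus their lands.
    favorable = sum(
        (-1) ** j * comb(num_colors, j) * comb(deck_size - j * per_color, hand)
        for j in range(num_colors + 1)
    )
    return favorable / total_hands


for colors in (1, 2, 3, 4):
    print(colors, round(p_all_colors(colors), 3))
```

Under these assumptions the chance drops from roughly 98% with one color to roughly 64% with two, and it keeps falling as more colors are added, which is the pressure toward specialization described above.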

A simple diagram of the 5 colors of magic.

Consequently, decks are forced to specialize, which gives them inherent weaknesses. The color red, for example, has no way to destroy enchantment cards, so even if a red deck ended up being strong, it has a built-in weakness (it must either accept that it can’t destroy enchantments, or weaken its consistency by trying to incorporate another color that can). Also, each color has two enemy colors, and those enemy colors often include cards that are specifically powerful against their enemy colors. Again, if a red deck became too powerful, there will be blue and white cards that keep red in check, at least somewhat.

Finally, when Wizards of the Coast prints a new set with new mechanics, they usually include a card or two that are tuned to be fairly weak, but that specifically counter the new mechanic. I think they hope that these specific counters are not needed, but if the metagame becomes completely overwhelmed by the new mechanics, then there are at least some fail-safes the metagame can use to fight the new mechanic.

For example, Magic’s Odyssey block focused on new mechanics involving the discard pile (called the “graveyard” in Magic), and the card Morningtide could remove all cards from all graveyards. If players started getting too tricky with their graveyards, Morningtide was a counter. In practice, this counter wasn’t really needed though. Later on, Magic’s Mirrodin block focused on artifact cards. The card Annul could counter artifacts (and enchantments) for only one mana, and the card Damping Matrix prevented artifact abilities from working. In Mirrodin’s case, the artifact mechanics really did get pretty out of hand. Annul and Damping Matrix were good ideas, but even stronger fail-safes were needed during Mirrodin.

This is really a similar concept to Yomi Layer 3 that I mentioned in part 2. The idea is to build in counters to the game so that even if some things end up more powerful than you expected, the game is resilient enough that players can deal with it.

Guilty Gear is a very important example for its fail-safe systems. I described that game’s system in detail in this article, but here’s a quick refresher.

Guard Meter

Every time you hit the opponent, their “guard meter” goes down. The lower it is, the shorter their hitstun is. That means that even if a string of moves is an “infinite combo”, meaning that once you land the first hit, you could keep hitting them forever, their shorter hitstun eventually lets them block to escape the combo.
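To see why shrinking hitstun ends an otherwise infinite combo, here is a minimal simulation. It is only a sketch: the frame counts and the linear decay are invented for illustration and are not Guilty Gear’s actual values or formula. A combo continues only while the hitstun from the last hit is at least as long as the gap before the next hit connects.

```python
def combo_length(base_hitstun=20, gap_frames=12, decay_per_hit=1, max_hits=1000):
    """Count the hits that connect before the defender can block.

    All numbers are hypothetical frame counts. max_hits caps the loop so that
    a truly infinite combo (no decay) still terminates.
    """
    hitstun = base_hitstun
    hits = 0
    while hits < max_hits and hitstun >= gap_frames:
        hits += 1
        # The guard meter drops with every hit, so the next hitstun is shorter.
        hitstun -= decay_per_hit
    return hits


print(combo_length())                 # combo with the fail-safe active
print(combo_length(decay_per_hit=0))  # same combo without it: hits the cap
```

With any positive decay, the loop ends without the designers needing to know which specific string of moves would have been infinite.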

Progressive Gravity

When you are juggled in the air during a combo, the gravity applied to your character gets greater and greater over time. So even if a combo could juggle forever somehow, the victim’s body falls faster and faster over time, which would eventually ruin the infinite juggle.

Green Blocking

Imagine an attack sequence against a blocking opponent where you do a few hits in a row that leave you pushed back, too far away to continue. But when you get to the last hit, you cancel it with a special move that makes your character move forward. After that, you repeat the sequence and force the opponent to block forever. In case this type of lock-down trap exists, Guilty Gear heads it off at the pass with a feature I call “green blocking.” While blocking, you can use some of your super meter to create a green force field that pushes the opponent pretty far away from you, letting you ruin the spacing of his trap.

Here’s that green blocking thing in Guilty Gear.

Each of these features is designed to solve a problem that the designers didn’t even know they had. They just knew that if the game ever ended up in a state of infinite combos or juggles or lockdowns, some fail-safe features would need to save them. Also, these fail-safe features freed them to design incredibly varied and extreme characters. No matter how crazy a character was, or how scary their rushdown tactics ended up, the designers knew that this defensive system of fail-safes shared by all characters would keep things at least somewhat in check.

Playtesting and Course-correcting

Whether or not your game has fail-safe systems, at some point you have to design a diverse set of characters / races / whatever, make each one coherent and interesting, then have the confidence that you’ll sort out the balance problems in playtesting. All the theory in the world will not save you from playtests, of course.

You need to start tuning the game, and react and learn as you go. Do not let a producer turn tuning into a fixed list of items that you are accountable for checking off, one by one. It’s an organic, continuous process that keeps going until you need to ship the game. Playtesting lets you discover things you couldn’t have predicted ahead of time, and you should be open to those discoveries. The goal isn’t to make the exact game you originally envisioned, because your original vision did not take into account all the things you learned from development and playtests. When you or the testers discover nuances or unexpected properties, you have the chance to build around those and incorporate them into the game’s balance.

The Tier List

During the balancing of Street Fighter, Kongai, and my card game called Yomi, I used a similar approach with playtesters. I think this approach doesn’t really depend on the genre, and the key idea is managing the tier list.

The term “tier list” is, I think, a term from the fighting game genre. It means a ranking of how powerful each character is from highest to lowest, but it also accepts that such a list cannot be exact. Instead of ranking 20 characters from 1 to 20, the idea is to group them together into “tiers” of power. Remember that if a divine being handed you a 100% perfectly balanced game, players would still make tier lists. You should accept the existence of these lists from players as a given, and it’s your job to manage this list.

In Kongai and Yomi, I even gave the players a template for the tier list that is most useful for me as a designer. First, I tell them to think of three tiers: top, middle, and bottom. Then I tell them about the two “secret tiers” that I hope are empty.

0) God tier (no character should be in this tier, if they are, you are forced to play them to be competitive)

1) Top tier (don’t be afraid to put your favorite characters here. Being top tier does not necessarily mean any nerfs are needed)

2) Middle tier (pretty good, not quite as good as top)

3) Bottom tier (I can still win with them, but it’s hard)

4) Garbage tier (no one should be in this. Not reasonable to play this character at all.)

My first goal of balancing is to get the god tier empty. Of course some character will end up strongest, or tied for strongest, and that is ok. But a “god tier” character is so strong as to make the rest of the game obsolete. We have to fix that immediately because it ruins the whole playtest (and the game). Also, the power level of anything in the god tier is so high, that we can’t even hope to balance the rest of the game around it.

My next goal is to get rid of the garbage tier characters. They are so bad that no one touches them, and it’s usually pretty easy to increase their power enough to get them somewhere between top, middle, and bottom. If they are somewhere in those three tiers (which gives you a lot of latitude actually), at least they are playable.

Public Tier Lists

I really like it when playtesters all see each other’s tier lists. The debate this spawns is very useful for me to read (or overhear in person) and for the playtesters to sort out their ideas. Sometimes when someone put a character unusually high or low on the list, I dug deeper and found out that the player really did know something most of the rest of us didn’t. Other times, that player was just crazy and the rest of the testers were happy to point that out. It’s also good to see what kind of consensus the testers come up with, like if they all rank a certain character as the worst, for example.

The biggest landmark moments in each of the games I balanced were when the tester communities consistently gave tier lists with no characters in the god tier or garbage tier. Once you’ve achieved that, the next goal is to compress the tiers. That means that you want the difference between the best and worst characters to be as small as possible. Notice that this means even if you have the same characters in the bottom tier that you did a month ago, you might have dramatically improved the game if all those “bad” characters are really only a hair worse than the tier above, rather than way worse.
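One way to work with shared tier lists is to collect them in a common format and summarize them. The sketch below is an assumption about how that could be done, not the author’s tooling: it uses the 0 to 4 tier numbers from the template above, takes the median placement per character as the consensus, flags individual placements two or more tiers away from it as worth a follow-up conversation, and reports any character whose consensus leans into the god or garbage tier. The tester and character names are placeholders.

```python
from statistics import median

TIERS = {0: "god", 1: "top", 2: "middle", 3: "bottom", 4: "garbage"}


def summarize(tier_lists):
    """tier_lists maps tester name -> {character: tier number 0..4}."""
    characters = {c for ranks in tier_lists.values() for c in ranks}
    consensus, outliers = {}, []
    for char in sorted(characters):
        votes = {t: ranks[char] for t, ranks in tier_lists.items() if char in ranks}
        mid = median(votes.values())
        consensus[char] = mid
        # A vote two or more tiers from the median deserves a closer look:
        # the tester may know something the others do not, or may be wrong.
        outliers += [(t, char, v) for t, v in sorted(votes.items()) if abs(v - mid) >= 2]
    # Consensus above top (below 1) or below bottom (above 3) must be fixed first.
    needs_fixing = [c for c, m in consensus.items() if m < 1 or m > 3]
    return consensus, outliers, needs_fixing


# Placeholder data: three testers ranking three unnamed characters.
lists = {
    "tester1": {"A": 1, "B": 2, "C": 4},
    "tester2": {"A": 1, "B": 2, "C": 4},
    "tester3": {"A": 3, "B": 2, "C": 3},
}
consensus, outliers, needs_fixing = summarize(lists)
print(consensus, outliers, needs_fixing)
```

A summary like this only says where to look. Whether an outlier is insight or noise still has to be settled in the debate between testers.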

Adjusting the Tiers

In all the games I balanced, I used the same approach of letting the top tier set the benchmark power-level. In Street Fighter, I already had an established top tier as a starting point from the previous game, but in Kongai and Yomi, it was somewhat accidental who ended up in the top tier. But early on, after the god tier was removed and it was pretty clear which characters / decks were top, I allowed that to be the target power level. In other words, the characters in that tier are “how the game is supposed to be.” Again, I didn’t plan exactly who would be here, but I accepted how it ended up and worked with it. So if the top tier is the target, it’s the bottom tier you should adjust the most. If the top tier is the intended power level, you don’t really want to mess up the good things you have going there. Instead, boost the bottom characters up and compress the tiers as much as you can, so you get the worst characters just barely below or equal to the best characters.

There are some psychological factors that I saw over and over again while making these adjustments. The first is that whenever I make a move or character worse (aka “nerfing”), players overreact. Sometimes that top tier creeps a little too high in power, or an otherwise average character ends up having something unexpected that’s crazily good, or a character has a move that really reduces the strategy in the game and needs to lose that in exchange for gaining something else. There’s lots of reasons for nerfs.

I’ll use some made-up numbers to convey the general idea here. Imagine a move is at power level 9 out of 10, and that’s just too good for that character. Time and time again, I saw that if I made the power level an 8 out of 10, playtesters would complain that the move was worthless and put the character down at least one tier. This happened consistently, and even in the cases where 8 out of 10 was still too powerful and it really needed to be a 7. For some reason, players in every game seem unable to grasp the concept that a top tier character who is made slightly worse can still be a top tier character.

This is one of the cases where I think you just can’t listen to the playtesters. Ignore their first reactions to nerfs, let them play it more and get used to it, let them see if they can still be successful with the new version of the move, then take their feedback on that move or character more seriously.

The other psychological effect to know about is what happens when you increase a move’s power. I learned about this from Rob Pardo’s lecture on balancing multiplayer games at the Game Developers Conference, and I tried it on all the games I balanced, and I think Rob is right. He said that if you have a move that you’re not really sure how to balance, make it too powerful. If you make it too weak, then you run the risk of no one using it at all. Then, when you slightly increase its power, none of the testers will notice or care. They already decided that move is weak. Then if you make it slightly more powerful still, they still won’t care. Even when you inch it up past the reasonable level of power, it’s hard to get it on people’s radar and that makes it really hard to know how to tune the move.

Instead, Pardo said to start with the move too powerful. Then everyone will know about it and care about it. I did exactly this with T.Hawk, Fei Long, and Akuma in Street Fighter HD Remix, because I had trouble figuring out their power levels. Each one of those characters was the best character in the game at some point in development, and that meant I got lots of feedback from testers about these characters. It also gave me a sense of where the top of the scale even was. Sometimes my “too powerful” versions of a character would end up waaaaay too good, or sometimes just barely too good. Knowing where the upper limit was helped me pick appropriate power levels more quickly. That said, I did have to deal with the inevitable cries that follow all nerfs, but that just goes with the territory here.

Illusions in Tiers

Another point from Rob Pardo’s speech on multiplayer games was not to balance the fun out of things. I’m very conscious of this as well. Don’t just think about the game as some abstract set of numbers that has to line up. You also have to think about how people will perceive it and whether it’s actually fun. Pardo said that he likes the player to feel like the tools they have are extremely powerful, even though they are actually fair.

An example of this in one of my games is Tafari, the Trapper in Kongai. Tafari’s main ability is that the enemy cannot switch characters while fighting him. Switching characters is one of the game’s main mechanics, so fighting him is like playing rock, paper, scissors with no rock. It seems, at first glance, ludicrously powerful. But from the start, I gave Tafari several weaknesses and he loses many fights if he ends up having to fight on even footing. He’s best when you bring him in against an already-weak character to finish them off.

I knew Tafari was not too powerful. I tested him with many experts and they tended to rank him as middle tier once they got the hang of him. As we added new testers over time, probably nearly 100% of them claimed that Tafari was too strong. I refused to change him though and after a year of testing, the best players still ranked him as middle tier, while inexperienced players still ranked him as top. Tafari is an illusion.

I’m telling you this because you have to be very careful with feedback in cases where you intentionally made something feel more powerful than it actually is. It’s a success if you can pull that off though, because Tafari makes the game more interesting, creates lots of debates, and at the end of the day, he is balanced.

Counter Matches

In addition to the tier list, you should also be thinking about all the specific matchups. Street Fighter HD Remix, for example, has 17 characters and 153 possible matchups. For the version of Street Fighter before HD Remix, experts tend to separate the characters into four tiers (none of them are god tier or garbage tier), and they place Guile in the respectable second tier. Even though that means Guile’s power level is acceptable, he is severely disadvantaged in two specific matches: Vega and Dhalsim. Is it ok that an overall good character gets countered by two specific characters? Not really.
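The figure of 153 matchups for 17 characters matches the count of unordered pairs when mirror matches (a character against itself) are included, 17 × 18 / 2. The short check below is an illustration added here, with a hypothetical function name. It also shows the count without mirror matches and how quickly the number grows with roster size.

```python
from itertools import combinations, combinations_with_replacement


def matchup_counts(num_characters):
    """Return (matchups including mirrors, matchups excluding mirrors)."""
    roster = range(num_characters)
    with_mirrors = sum(1 for _ in combinations_with_replacement(roster, 2))
    without_mirrors = sum(1 for _ in combinations(roster, 2))
    return with_mirrors, without_mirrors


for size in (8, 17, 30):
    print(size, matchup_counts(size))
```

Because the count grows roughly with the square of the roster size, every added character adds a full row of matchups that can hide a bad counter match.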

If these were weapons in an FPS or units in an RTS or characters in a team-based fighting game, then it might be acceptable. You pick up weapons in an FPS after the game starts, so their balance doesn’t need to meet the hard requirements of an asymmetric game. And units in an RTS and characters in a team-based fighting game are examples of local imbalances, which are fine (it’s the races and teams that need to be balanced). But in Guile’s case, you lock in your choice of Guile at the start of the game, then you are stuck with him the entire game, so it really is a problem if he has some bad counter matches, even though players rate him fairly highly overall.

It’s really tricky to adjust anything in an asymmetric game though. How can we help Guile in just the Dhalsim match without affecting all the other matches? There’s no easy answer here, but I advise you to really solve the problem, rather than copping out.

My real solution to this problem was two-fold. First, for reasons unrelated to this particular match, I changed the trajectory of Guile’s roundhouse flash kick. This happened to help a bit against Dhalsim’s fireballs, so we’ll count that as a lucky accident. Second, one of Guile’s problems is that Dhalsim’s low punches can go under Guile’s Sonic Boom projectiles and hit Guile from across the screen, with no repercussions. I changed Dhalsim’s hitboxes so that Dhalsim now trades hits in this situation, rather than hitting cleanly. This change has virtually no effect on any other match, so it’s a real solution to the problem.

A cheating solution would have been to special case this match and give Guile more hit points. This sounds attractive because you don’t have to worry about messing up other matches, but this non-solution feels really artificial. It messes with players’ expectations and intuitions about how many hit points Guile has.

A similar cop-out would be to create a giant table in an RTS of every unit versus every unit and special case how much damage they all do to each other. Again, it messes with player intuition about how damaging each unit is, and creates an invisible, wonky system. I know you’re going to be tempted to use these types of special case solutions when balancing asymmetric games, but try your hardest to avoid them.
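The contrast between a visible rule and a per-pair table can be sketched in code. Everything below is invented for illustration: the unit names, the stats, and the attack-minus-armor rule are assumptions, not taken from any particular game. The point is that a rule built from each unit’s own stats stays predictable to players and needs far fewer tuning values than a table with one hand-set entry per attacker and defender pair.

```python
# Hypothetical unit stats; every name and number here is made up.
UNITS = {
    "infantry": {"attack": 6, "armor": 1},
    "tank": {"attack": 12, "armor": 4},
    "scout": {"attack": 3, "armor": 0},
}


def damage(attacker, defender):
    """One rule players can learn: attack minus armor, never below 1."""
    return max(1, UNITS[attacker]["attack"] - UNITS[defender]["armor"])


def knobs_with_rule(num_units):
    """Tuning values under the rule: one attack and one armor per unit."""
    return 2 * num_units


def knobs_with_pair_table(num_units):
    """Tuning values under a special-case table: one entry per ordered pair."""
    return num_units * num_units


print(damage("tank", "infantry"), damage("scout", "tank"))
print(knobs_with_rule(50), knobs_with_pair_table(50))
```

With the rule, a player who knows a unit’s attack value can predict what it does to anything. With the pair table, no single number describes the unit at all.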

Conclusion

Start your design with some self-balancing forces and fail-safes if you can. Then go wild and create all your game’s diversity, then start the long road of playtesting. As you learn more from playtesting, change your course as you go. Start keeping track of tiers, first by fixing the god tier, then by fixing the garbage tier. Then compress the tiers so that even the bad characters are only slightly worse than the best characters. Finally, fix all the counter-matches you can by actually solving the puzzle and avoiding cop-out solutions. (Source: Sirlin)

