
《City Conquest》开发者谈AI在游戏设计中的运用

发布时间:2012-03-01 17:43:19

作者:Alex J. Champandard

随着游戏行业争相制作日益复杂的系统和机制,AI逐步被大家视作是实现此复杂性目标的杰出工具。值得一提的是,如今有越来越多的研究人员和公司开始把AI当作辅助游戏设计的工具,以此减少调试游戏内容属性所需的时间。

在同Paul Tozour的访谈中,我们将谈及其目前的kickstarter项目《City Conquest》,这是款通过引入进攻内容打破常规模式的塔防游戏。但文章的谈论焦点不是创新机制或功能,而是Paul如何逐步调试复杂参数,提高游戏平衡性。

City Conquest from handheldarcade.com


整体蓝图

你觉得通过对应函数表现“玩家趣味”或“玩家体验”是否有必要?我们是否能够在实践中做到这点?

在我从业的前16年里,我觉得游戏中的机器学习(游戏邦注:机器学习研究计算机怎样模拟或实现人类的学习行为,以获取新的知识或技能,重新组织已有的知识结构使之不断改善自身的性能)毫无意义。我过去常说“趣味性没有对应函数”。每当话题转至神经网络或基因算法之类的机器学习技巧时,我总会说“趣味性没有对应函数”,称这会让机器学习在游戏中变得毫无意义,最终不了了之。

如果你从表面价值来看,情况完全就是如此!我的意思是,假设你拥有一个能够告诉你游戏各部分“趣味性”情况的电脑程序,此程序要如何运作?它如何知晓游戏的某块内容过于重复?它要如何获悉与Swamp Boss战斗要比和Lava Boss对决更有趣,或者关卡5缺乏粘性是因为其中包含过多横断物,又或者Plasma Gun运用起来不怎么有趣,而Lightning Blaster却颇令人兴奋?

A Shield Generator from aigamedev.com


编写这样的程序就好比解决Turing Test问题,只是难度更高。你需要完全基于玩家对游戏的认知及互动方式创建模型,准确模拟人类大脑中成千上万的奖励中枢,以获悉什么能够真正令玩家感到开心,以及为什么让水管工跳跃和点击奶牛会给我们带来如此离奇的兴奋感。

当然,这完全不可能实现。

认为“趣味性不存在对应函数”的观点完全正确——这还颇有益处!在游戏AI开发中,行业新手有时会对机器学习的操作能力及合理运用地点产生不切实际的预期。若他们有学术背景,有时甚至会觉得手动解决游戏AI问题是个“错误”举措。他们期望通过“魔杖”立即创造所有AI。

你需要反击这种心态,打破这些幻想,提醒这类开发者:有时你可以通过机器学习让优质的AI系统变得更好,但想借助它从零打造出优质AI则非常糟糕。

很长时间以来,我都非常满意这个答案。

但我最终意识到,这个观点虽然正确,却无关紧要!

我的观念发生很大改变。我发现机器学习不是制作游戏AI的工具——它是进行游戏设计的工具。

你也许无法编写出能给娱乐价值精确打分的适应性函数,但若你能够清晰陈述自己的设计目标,那么在多数情况下,你都可以编写出能够衡量游戏某些部分是否满足这些目标的对应函数。

这首先涉及选择正确的设计目标,创造预期的娱乐体验,判定哪些设计目标适合基于AI的调试。

例如,在《City Conquest》中,我给每个进攻和防御建筑都设定了特定的策略角色,对于什么能构成高效的进攻和防御策略,我心中也有许多预期。但关于建筑应如何协同运作,我还设有相应的设计目标:

* 独特性:9种防御建筑(塔楼)和进攻建筑(用于生产单位的运输舰停机坪)都应该别具一格。

* 效用最低限度:不应出现“性能过弱”的建筑类型——每种建筑都应有其重要性,扮演明确的策略角色,而且总应存在某种情境,其中该建筑是玩家获胜的关键。

* 效用最高限度:不应存在“过分强大”的建筑类型——换而言之,不应存在任何单一建筑类型或少数建筑类型的组合,能让玩家仅凭它就战胜不作反应的对手。

* 成本等效:游戏中的所有建筑都应该同其在游戏中的实际效用达成资源成本等效。

* 联合作战:组合运用多种不同建筑类型,应远比只建造少数几种建筑类型有效(游戏邦注:无论是进攻建筑,还是防御建筑)。

* 反应性:玩家的最佳策略应主要取决于其他玩家的战略。

(要把握的是,我的机制目前还无法通过对应函数衡量这些元素的运作情况,我不希望自己听起来有些言过其实。但这是计算机智能能够帮你优化的典型设计目标,虽然这通常不是基于直接方式。)

你是否觉得机器学习算法在游戏行业中遭遇不公平对待?

不见得如此。这里存在一定的偏见,但这种偏见并非毫无根据。

The Disruptor tower from aigamedev.com


有些ML(神经网络)形式完全毫无作用;很多GA(基因算法)则是过于缓慢,无法在运行时间中运用;结果缺乏预测性;就如我之前提到的,行业新手有时会对机制的潜能持不切实际的态度,从而带来不良影响。没有什么比AI开发者寄希望于“机器学习”魔杖而不是着眼于自己编写代码更糟糕。

但我这么做并不是因为我是什么象牙塔里的学术研究者,或是不切实际的理想主义者。我算是行业老手:参与过两款《银河战士Prime》游戏的制作,是Gas Powered Games的第3号员工。我这么做是因为这确实有效,因为我是个资本主义者,我相信这能够提高质量,降低成本。

你为何会运用计算机智能处理游戏平衡问题?

我厌倦了设计AI,我意识到现在是时候用“AI”来做设计了。

这里是个思维实验:

在今后的100年里,设计师们将如何设计游戏?

我们并不知道那么遥远的未来游戏本身会是什么样子。“游戏”多半不会再保持当前的模式,也许它们会变成全息甲板之类的东西。但这无关紧要。

到2112年,设计师们将采用什么工作方式?

他们将采用什么类型的工具?

我不知道具体答案是什么,但这些无疑会和目前情况大不相同。游戏设计依然处在初级阶段——或者可以说是处在叛逆的前青春期阶段。

1个世纪后,“游戏设计师”的含义将发生翻天覆地的变化。不会再有设计师给予基本设计问题模糊答案,或者在未把握相关负面影响的情况下做出巨大游戏变更,抑或是以“迭代”名义投入过多时间探索设计领域。

但至少,我们将拥有业内的共享设计语言,对构成“优秀设计”的元素达成共识,得到更多强大的设计工具。

假设你想要建造一座桥,那么你就需要聘请有执照的建筑工程师,是吧?这类工程师需要学习3-5年的土木工程专业课程,方能达到建造桥梁的专业标准。

之所以要考虑这些是因为我们知道如果桥梁设计不当,就有可能出现坍塌,地震、大风或交通事故就有可能会摧毁桥梁,这些情况会危及群众生命,令桥梁不复存在。所以设计师需要投入众多时间学习如何设计出合格的桥梁。

大家深知这点,所以都觉得确认工程师是否合格是建造稳固桥梁的必要条件。

但在游戏中情况就不是如此。

我们倾向聘请对游戏设计怀有热情而不是真正能够胜任的设计师——我们偏好光说不做、有领袖气质的人员,而不是那些埋头苦干的严谨实践者。我们将几百万美元的项目交由未受过足够游戏设计训练的管理者负责,我们无法对他们进行进一步的培训或同他们达成共同设计理念。我们放纵他们盲目碰运气,然后不解为什么团队成员最终会变得筋疲力竭,项目最终会以失败告终,工作室总是屡屡惨遭失败。

会出现这种情况多半是因为我们没有在设计师遭遇失败时告知实际情况。我们掩盖设计失误的严重后果。当桥梁的某个位置发生坍塌时,情况通常显而易见,我们无法进行掩盖,整个世界都会知道究竟发生什么事情,但在游戏行业,设计失误颇令人尴尬,我们多半选择将此隐藏在工作室中。

The Ground Slammer from aigamedev.com


我们必须遏制这种情况。持续以这种方式设计游戏代价过高,过于冒险。设计已经成为瓶颈。

所以我们需要满足下述两个条件。

首先,游戏设计需要发展成真正的学科。并不是说游戏行业需要像土木工程领域那样执行持证上岗规定,但游戏设计学科需要摆脱自己的不成熟状态。

其次,我们需要更杰出的工具。

若回到30年前,询问设计师他们未来希望获得什么工具,他们多半会提到能够创建关卡及在三维空间中进行脚本处理的杰出工具。如今这些工具都已存在。如今我们拥有Unity、Unreal和Gamebryo之类的引擎,而这些正是设计师在1981年时所希望得到的工具。

但这些不是我们所需要的全部工具。Unity和Unreal无法让你深入把握游戏设计。它们能够帮你制作游戏,但无法协助你做出设计决策。

我们还需要更多工具,未来行业需要出现更多工具。我们没有意识到这点是因为这些工具尚未存在——所以我们没有意识到其存在的必要性,没有发现行业缺少这些工具。

现在我们已拥有游戏“引擎”,但我们没有意识到,“车”的其余部分还不存在。

我过去一直觉得机器学习技巧毫无用处。但现在我不再将机器学习视作游戏AI工具,而是逐步将其当作设计工具,这带来决定性的变化。

以华尔街为例,过去纽约证券交易所曾到处都是大声吆喝,互相比划复杂股票买卖手势的交易员。仅10年时间,整个交易所就发生翻天覆地的变化,过去的交易员被如今的计算机所取代。从那以后,行业就争相提供更快的服务器,更稳固的连接网络,更短暂的等待时间,旨在实现几微秒内的高效交易。我们再不会回到过去的时光。

或者可以将此同招聘棒球员进行比较。招聘棒球员在过去是个非常主观的过程,通常是10个经理坐在室内商量要聘请哪个人员来填补团队空缺,他们并不清楚自己需要寻找什么样的球员及如何评估能够促使团队顺利运作的客观因素。“我们不能聘请那家伙——他的女朋友很丑;说明他没有自信。”

在书籍和电影《点球成金》中,你会看到Billy Beane(由Brad Pitt扮演)如何聘请耶鲁大学的经济学博士,并采用统计建模方法,带领Oakland Athletics获得一连串的惊人胜利,将传统的棒球员招募变成定量学科。从那以后,行业再也没有回到过去的模式。

如今的游戏设计领域就像是没有电脑的纽约证券交易所,或是Billy Beane出现前的棒球员招募模式。

行业必将发生改变,我们不如欣然接受。摩尔定律依然有效,云计算群集让我们拥有海量可支配的运算能力。

运算成本逐步降低,而游戏设计师的雇佣成本则逐步提高。

如今我们终于能够运行高效执行基因算法所需的数百万场玩法模拟,并在几小时或几天内把这些洞见反馈到设计师手中,这必将改变整个设计过程。

那么未来我们是否能够完全替换掉所有的游戏设计师?

我希望未来行业能够呈现众多更强大的工具,应对具体的游戏设计问题。未来这些强大游戏引擎将继续存在,但行业将会出现类似GPS和导航系统的开发工具,引导开发者做出满足其设计目标的决策。

The Lightning Tower from aigamedev.com


这将提高优秀设计师的评判标准。所以优秀设计师将享有更强大的工具,能够在更短时间内设计出更杰出的作品,而假冒设计师会发现自己完全脱离圈子,需要另谋新职位。

这些情况都颇令人高兴。

目前你将AI视作游戏设计的辅助工具。你觉得运用基因算法优化游戏参数,而非只是找准优势策略,这是否具有可行性?

当然可以。我的方法只是针对一类特定平衡问题运用基因算法的一种特定方式。这只是个起点;既不是唯一的解决方案,也未必是最佳方案。

Phillipa Avery和Julian Togelius(与Alistar和van Leeuwen合作)撰写了一篇关于如何通过基因算法调整传统塔防游戏参数的优秀论文,他们直接演变了游戏中兵种和塔楼的参数。

华盛顿大学的Alex Jaffe也写了一篇有关此话题的优秀文章。我已提前阅读其中内容,但我答应他会进行保密。

我知道有家公司也基于相同理念——+7 Balance Engine。但他们采取的策略截然不同。据他们网页的资料显示,他们希望避免出现优势策略。

我认为这不是什么有效的方法。你需要赋予设计师权力,让他们决定是否应该存在优势策略。也许他们就希望存在一个优势策略!你不能拿着别人的作品说:“让我替你平衡这些内容”,因为你根本不知道对方所说的“平衡”是什么意思。你无法事先预测他们所有的设计目标。

这涉及向设计师呈现富有价值的反馈信息,这样他们才能够确保游戏玩法满足设计目标。

总体目标是创建能大幅扩展设计师对其所创造玩法的动态之洞察力的系统。我们希望创建这样的系统:能够清晰照亮适应性地形,呈现每个设计决策会催生哪些玩家策略。

我们想要让他们获悉他们所做出的设计调整决策将如何影响这些动态机制,这样设计师就能够提出这样的问题:“若我调整这里的参数会出现什么情况?”或者“我需要如何调整参数才能够让策略X的表现更令人满意?”

幕后情况

相比整体基础架构/流程,你如何评价底层进化算法的重要性?实现这部分代码花费了多长时间?

这大约占据我所有编程时间的10%。9月份时,我腾出3-4天时间专门编写Evolver的代码——我们不得不将Evolver拆分成独立的构建版本,禁用渲染器,增大更新循环的时间步,优化玩法代码,然后运用微软的PPL库让其并行运作。这使得我们一晚上就能运行数百万场完整游戏。

10月份以后,我们每天只需花30-90分钟进行调试。我们专门腾出一台Dell桌面PC,全天24小时运行Evolver。每隔1-2天,我们会终止Evolver,根据适应性得分抽出两个最佳玩家脚本(红蓝玩家种群各一个)。然后我们会将其插入游戏中重新播放,观察它们如何互相较量,并据此决定如何调整。接着我们调整代码,重新运行,只有在少数情况下才会深入研究,对Evolver本身做出调整。
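上述“按适应性得分抽出最佳玩家脚本”的步骤,可以用下面的C++代码示意(纯属笔者根据访谈描述所作的推测性示意,Script结构与字段均为假设,并非游戏实际代码):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// 假设的脚本结构:这里只保留适应性得分,命令列表从略
struct Script {
    double fitness = 0.0;
};

// 从一个种群(如红色或蓝色玩家的全部脚本)中取出适应性得分最高的脚本
Script ExtractBest(const std::vector<Script> &population)
{
    assert(!population.empty());
    return *std::max_element(
        population.begin(), population.end(),
        [](const Script &a, const Script &b) { return a.fitness < b.fitness; });
}
```

实际流程会对红蓝两个种群各调用一次,再将得到的两个脚本插回游戏中回放。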

所以这非常方便。这就像是拥有免费的虚拟游戏测试团队。这花费我几天的时间,但相比全职工作人员所需投入的成本,这完全是小菜一碟。

The Grenade Launcher from aigamedev.com


当然,这并不是什么万全之策。你依然需要手动进行游戏测试,偶尔将参数导入Excel,用Solver对它们进行曲线拟合,这依然是个不错的主意。但这些如今都是次要的小任务,不是平衡过程本身的核心。

所以只要我把握Evolver的输出内容,我就能够将此重新放回我的游戏中,目睹红蓝玩家一决雌雄。若我看到的是这样的情况:双方玩家都过度使用特定建筑或升级道具或是建筑组合,那么我就需要“削弱”这些功能(游戏邦注:或降低它们的力量,或提高首先购买这些道具的资金成本)。

同样,若游戏存在玩家从未购买过的建筑或升级道具,我们就需要询问自己其中原因。是否过于昂贵?是否未能发挥相应作用?或者还有别的情况?为什么Evolver觉得其无法在生存过程中发挥足够作用?所以我将通过提高道具的效用,或降低其运用成本改善这种情况。

最后,我检验各道具的具体运用情境,查看其是否符合我的设计目标。有时Evolver会运用我预料之中的策略;而有时则会出乎我的意料,此时我需要判定自己是否需要调整游戏阻止这种情况的出现,或是将其视作意料之外的新策略。

几位和我交谈过的AI研究者称我的方法只能够一次辨别一个优势策略。例如,你也许拥有两个策略A和B,二者都处于支配地位,按照我的操作方式,电脑只能辨别一个优势战略。

这点没错,但你要记住的是,调试是个持续的过程——而非几分钟就能完成。我持续运行Evolver,每隔1-2天就基于输出内容反复进行调试。所以如果今天碰巧发现A变成优势策略,没有关系,这意味着我将对此做出回应,A将受到削弱,这样B就可能在下次运作中变得更显眼。

你能否列举若干Evolver向你呈现的有趣结果?

这更像是个漏洞而非平衡问题,但我在10月30日发现一个有趣的现象:最佳红色玩家脚本总是以巨大优势获胜。查看这个逐步演变而来的脚本,它似乎没有做什么特别不寻常的事;在我看来,唯一的特别之处是,它在游戏初期就在某个特殊位置创建了Laser塔楼。

现在,Laser塔楼在《City Conquest》中是个非常特别的建筑。它是唯一没有射程限制的防御塔楼——它可以攻击整张地图。这个设计旨在对大量单元造成少量伤害。诀窍在于以正确方式布置塔楼,确保它能攻击到尽可能多的敌军。

所以当我开始在游戏中回放这些脚本时,我大吃一惊——红色玩家的Laser正好处在能够跨越整张地图瞄准蓝色玩家Collector的位置!消灭它们需要8-9次射击,但一旦成功,蓝色玩家的Collector就会被彻底摧毁且无法重建,于是蓝色玩家会陷入资源匮乏,每次都会输掉比赛。

同样,这更像是个漏洞,而不是游戏功能——Collector本该无懈可击,Laser甚至本不应该能够锁定它们,但无敌属性和锁定调整当时还没有实现,被安排在后期的计划中。但它很好地呈现了计算机智能如何在设计空间中找到人类设计师根本想象不到的路径。

Laser进攻蓝色玩家,红色玩家的Laser从画面外进攻


下面这个例子呈现的是一些与《City Conquest》聪明玩家的做法十分接近的优秀行为。这里是游戏最近一次运行的两幅截图。记住,你在这些截图中看到的内容(除水晶上金字塔形状的Mining Facilities外)完全由电脑演变而成:

这里是蓝色玩家的基地:

screenshot from aigamedev.com


此基地发生的五件趣事:

* 基地背部的单点路口:蓝色玩家要求所有敌人单元汇集至Capitol背后的单个进攻点。这是进攻点的最佳位置,因为它要求所有敌人单元在进攻前要环绕四周,这赋予蓝色玩家更多挫败它们的机会。

* 分离敌人的间隔物:蓝色玩家对红色玩家的军队分而治之,将它们拆分成绕城的南北两条路线。由于其中一条路线更长,敌人单元会在不同时间到达Capitol,这样应对它们就容易得多。

* 靠近顶部入口的适当干扰器能够让他在恰当时间点摧毁红色敌人的庇护,打断其自复原效应,这样它们就开始遭遇到蓝色玩家的防御打击。

* 画面左上方的升级版Shield Generator让蓝色玩家能够保护附近的单元免受攻击。

* 3个Lightning Tower巧妙地分布在蓝色玩家基地的前后左右,确保这些昂贵的建筑不会浪费时间同时瞄准相同单元。

这里是红色玩家基地的截图。注意它如何运用类似于蓝色玩家的智能策略,这些是我期望玩家在《City Conquest》中采取的巧妙策略。

screenshot 02 from aigamedev.com


* 单点入口:和蓝色玩家一样,红色玩家要求敌军进攻其Capitol后方。基地的设计促使蓝色单元向北前进,越过Mining Facilities上方,然后回到画面左侧的后方入口。

* 拆分敌军的分离器:和蓝色玩家一样,红色玩家也将敌军拆分成不同路线(游戏邦注:基地前方的东西路线),以打乱他们的到达时间。

* 适当位置的干扰器让红色玩家能够在基地前方和两侧摧毁敌人的庇护,打断自复原效应,包括敌人单元环绕至后方,最终围住Capitol的时候。

* 置于正确位置的Lightning Tower能够覆盖广泛空间,允许红色玩家在基地前后左右攻击敌人单元。这个配置让此相对昂贵的塔楼能够发挥尽可能多的效用。

* 一组Ground Slammer,分别处在红色玩家基地的前后方,让红色玩家能够创造地震效应,同时带给所有穿过其中的地面单元一定的区域破坏性。

* 一组Grenade Tosser也以相同方式分布,升级Grenade Tosser处于前方,另一非升级Grenade Tosser则处在后方。这些都让红色玩家能够以适中区域破坏性攻击大范围内的所有地面单元——足以显著弱化它们,让其他塔楼能够摧毁它们。

你运用什么类型的进化算法及个体表示方式?简单地说,你演变的究竟是什么?

在此具体情况中,我们只是想获得足以展现优秀玩家策略的模拟玩家。我们演变的是建造顺序脚本。

《City Conquest》是塔防游戏的变体,所以它只涉及3个动词:“创建”、“升级”和“出售”(我排除了类似施法的“游戏特殊能力”所涉及的动词)。而且出售建筑通常不是个好主意,因为你无法获得全额退款,所以Evolver没有理由这么做。

所以Evolver玩家脚本归根到底只是一系列创建和升级命令——在(X,Y)创建T类型建筑,或升级(X,Y)位置的建筑。命令没有附带任何时间信息;Evolver玩家只会尽快依次执行各命令。

由于《City Conquest》中的所有建筑都有资源成本(游戏邦注:以金子或水晶形式体现),因此建造顺序实际上是作为两个并行的独立列表来执行的,每个列表对应一种资源。
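这种命令脚本的表示方式可以用下面的C++代码示意(字段名称与取值均为笔者假设,并非游戏实际代码):

```cpp
#include <vector>

// 示意:脚本就是一串“在(X,Y)创建T类型建筑”或“升级(X,Y)处建筑”的命令,
// 不带时间信息,按顺序尽快执行
enum class CommandType { Build, Upgrade };

struct Command {
    CommandType type;
    int building_type;  // 建筑类型T,仅Build命令使用
    int x, y;           // 地图坐标(X, Y)
};

using BuildScript = std::vector<Command>;

// 示例:先在(3,5)创建类型2的建筑,再升级该位置的建筑
inline BuildScript MakeExampleScript()
{
    return { { CommandType::Build, 2, 3, 5 },
             { CommandType::Upgrade, 0, 3, 5 } };
}
```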

我们首先给红色和蓝色玩家各创建500个随机建造脚本。然后我们运行4轮比赛,每轮比赛都让500个红色玩家脚本和500个蓝色玩家脚本正面对抗,采用随机配对方式,这样在每轮比赛中,各蓝色玩家脚本都会与不同的红色玩家脚本较量。

我们可以轻松呈现C++伪代码:

static int const skNumTournaments = 4;
static int const skPopulationSize = 500;

// Run 4 tournaments which pit all 500 blue scripts and all 500 red scripts
// against each other
for ( int tournament_iterator = 0; tournament_iterator < skNumTournaments;
      ++tournament_iterator )
{
    int random_offset = rand() % skPopulationSize;

    for ( int i = 0; i < skPopulationSize; ++i )
    {
        // This ensures that ALL members of BOTH populations will play a game
        Script & blue = blue_player_scripts[ i ];
        Script & red =
            red_player_scripts[( i + random_offset ) % skPopulationSize];

        // Now play a full game, and adjust the scores for "blue" and "red"
        // as appropriate
        RunSimulation( blue, red );
    }
}

我们进行的每轮比赛都包含500场子游戏,以确保每轮比赛结束时能够获得广泛的覆盖范围及准确的适应性数值。比赛完成后,我们就会根据适应性得分给各玩家脚本排序,然后运用标准的遗传运算子——随机替换得分最低的脚本;对各脚本进行随机变异;采用“交叉”方式结合两个脚本。

“变异”将随机改变建筑类型,改变命令的位置,交换脚本中两个命令的先后次序,或在既有命令之上复制命令。
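文中描述的这几种变异操作可以用如下C++代码示意(结构与参数均为笔者的假设性示意,rand的用法沿用正文的伪代码;并非游戏实际实现):

```cpp
#include <cstdlib>
#include <utility>
#include <vector>

struct Command { int building_type; int x, y; };  // 简化的创建/升级命令
using BuildScript = std::vector<Command>;

// 对脚本施加一次随机变异:改变建筑类型、改变命令位置、
// 交换两个命令的先后次序,或复制一个既有命令
void Mutate(BuildScript &script, int num_building_types, int map_size)
{
    if (script.empty())
        return;
    int i = std::rand() % static_cast<int>(script.size());
    switch (std::rand() % 4)
    {
    case 0:  // 随机改变建筑类型
        script[i].building_type = std::rand() % num_building_types;
        break;
    case 1:  // 随机改变命令的位置
        script[i].x = std::rand() % map_size;
        script[i].y = std::rand() % map_size;
        break;
    case 2:  // 交换脚本中两个命令的先后次序
        std::swap(script[i],
                  script[std::rand() % static_cast<int>(script.size())]);
        break;
    case 3:  // 在既有命令之上复制命令
        script.insert(script.begin() + i, script[i]);
        break;
    }
}
```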

我们还添加额外保障措施,以确保脚本保持灵活性。例如,若Evolver试图执行建构命令,而此处已有建筑,那么在多数情况下,玩家会出售既有建筑,重新创建新内容,继续执行脚本。

当然,这是个相当简单的应用,这操作起来很简单,因为作为塔防游戏,《City Conquest》非常适合进行此操作。不是所有游戏都能够以此方式应对非反应性的创建脚本。多数游戏更为复杂,你需要通过整个AI机制,方能获得合理的模拟玩家。

在下个项目中,我们打算演变完整的行为树,这样才能形成更为复杂的反应式行为,而不是固定的创建命令。我们还计划将Evolver移至云计算群集,通过“进化岛”的方式划分种群。

你如何评估个体的适应性?进化算法通常对适应性函数的定义非常敏感。

是的,的确如此。这取决于具体的游戏,及你打算演变的内容。

在《City Conquest》中,这取决于其塔防对战玩法:Capitol是唯一能够被攻击的建筑。当玩家的Capitol健康值变成0时,游戏就结束。所以两个Capitol的健康值比例是说明谁获胜的绝佳指标。游戏结束时,获胜玩家的Capitol健康值能告诉你很多有关游戏如何结束的信息:在势均力敌的游戏中,双方玩家都能给对方的Capitol带来大量破坏;而在缺乏平衡性的游戏中,赢家通常能在不受多少伤害的情况下摧毁对手的Capitol。

各脚本都有对应的适应性指标。每个游戏回合后,Evolver就会将获胜玩家的Capitol健康值添加至获胜玩家的脚本中,而从失败玩家的脚本中扣除相应数值(游戏邦注:这带来这样的效应:玩家所获得的成就越大,双方脚本适应性积分所受到的影响就越大)。
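这个适应性更新规则很简单,可以用C++示意如下(结构为笔者假设,并非游戏实际代码):

```cpp
// 每场比赛结束后,将获胜方Capitol的剩余健康值加到获胜脚本的
// 适应性得分上,并从失败脚本中扣除同一数值——赢得越“漂亮”,
// 对双方脚本适应性的影响就越大
struct ScriptFitness {
    double fitness = 0.0;
};

void ApplyGameResult(ScriptFitness &winner, ScriptFitness &loser,
                     double winner_capitol_health)
{
    winner.fitness += winner_capitol_health;
    loser.fitness -= winner_capitol_health;
}
```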

当我着手制作Evolver时,我和几位研究AI设计和自动化游戏平衡的顶尖学术人员谈论我的计划。

他们中有些人建议我根据标准的“已知”玩家策略创建“原型”脚本,以确保演变覆盖广泛的合理人类策略。我在执行6种标准策略的过程中记录了自己的游戏玩法,然后在接下来的几个礼拜里尝试通过Evolver测试这6种原型脚本。

但我发现,随着我反复基于Evolver反馈重新平衡游戏内容,所有“原型”脚本都逐渐变得毫无意义。我无法每隔1-2周重新记录这些脚本,因为我的平衡状态发生改变,所以我放弃这个方法。

在开发阶段后期,我还尝试添加额外的适应性条件,鼓励逐步演变的脚本尝试更多我认为人类玩家会进行的操作。这些尝试大多未能奏效,最终保留下来的是在适应性函数中为创建和升级Skyscraper添加加分项。Skyscraper是《City Conquest》制胜的关键,因为它能扩展玩家的可建造区域,升级这些建筑不仅能够巩固玩家对领土的控制,而且能够带来更多额外资源。但这类收益不会立即反映在适应性函数中,因为脚本在添加Skyscraper后并不会马上利用额外的领域和资源。

Skyscraper适应性加分有效弥补了从创建和升级Skyscraper到真正从中受益之间存在的正常延迟。这等于是告诉Evolver:“相信我,你会想要这么做的,我们不想每次都坐等演变过程自行验证。”

结语

对于其他考虑在游戏中运用机器学习算法的开发者,你有什么建议?

我是自己初创公司的创始人,所以我首先就得到了公司高层的支持,能够自由进行实验,这是其中最主要的有利条件。

但我很难想象,自己在曾任职过的公司里进行这样的尝试。

机器学习还有待证明自己的作用。这很难在一夜之间实现。行业对于机器学习的怀疑态度让开发者很难在内部发起这类计划。这多半会由外部引入,而不是由工作室内部发起(游戏邦注:工作室内部会产生巨大的文化抵制情绪)。

这令我回想起自己在Ion Storm Austin制作《Thief: Deadly Shadows》和《Deus Ex: Invisible War》时的情境。我需要竭尽全力说服大家使用导航网格进行寻路。我觉得几乎所有程序员和设计师都反对这个想法。有些人觉得我有些疯狂;有些人把我当作不切实际的学术研究者,在他们看来,若我过度沉迷于优质寻路之类的晦涩学术问题,就该让我暂时远离实验室。

现在,这类构思非常普遍。运用导航网格元素的成功AAA游戏不胜枚举。

但当时这处于公开论战状态,我很难论证那些在我看来再浅显不过的观点。

机器学习和AI游戏设计的未来情况也将是如此。

所以我的建议是开发者首先确保自己享有控制权。若你无法在各个阶段得到充分的支持,就不要轻易进行尝试。否则一切都是徒劳。

游戏邦注:原文发布于2012年2月6日,文章叙述以当时为背景。(本文为游戏邦/gamerboom.com编译,拒绝任何不保留版权的转载,如需转载请联系:游戏邦)

Making Designers Obsolete? Evolution in Game Design

By Alex J. Champandard

As the games industry struggles with designing increasingly complex systems and mechanics, AI is proving to be a great tool for handling this complexity. In particular, there’s an increasing number of researchers and companies looking into using AI as a tool for assisting game design and consequently reduce time taken fine tuning the particular properties of game elements.

In this interview with Paul Tozour we will take a look at his current kickstarter project City Conquest, a tower defense game that breaks off the usual pattern by including an offensive part as well. Yet what we will be covering over this interview won’t be the novel mechanics or features but instead we will take a look at the evolutionary process Paul used to tune up the difficult parameters that keep the game balanced.

The Big Picture

Q: Did you find it necessary to try to express ‘player fun’ or ‘player experience’ into your fitness function? Is that even possible to achieve in practice?

For the first 16 years that I worked in the industry, I thought machine learning in games was pointless. I used to love to say “There’s no fitness function for fun.” Whenever a conversation turned to machine learning techniques like neural nets or genetic algorithms, I’d just say “There’s no fitness function for fun” and argue that that makes machine learning useless for games, and that would be the end of it.

And if you take it at face value, it’s absolutely correct! I mean, imagine that you had a computer program that could actually tell you how much “fun” any given part of a game was. How could a program like that even work? How could it know that a certain part of your game is too repetitive? How could it know that fighting the Swamp Boss is more fun than the Lava Boss, that level 5 isn’t engaging because it has too much traversal, that the Plasma Gun isn’t fun to use while the Lightning Blaster is a thrill?

Writing a program like that would be like solving the Turing Test, only harder. You’d have to be able to fully model the player’s perception of the game and their interaction with it and accurately model all of the thousands of little reward centers buried in the human brain to know what actually makes a human player happy, and why we get such a bizarre thrill from making plumbers jump and clicking on cows.

So of course, that’s impossible.

And saying “there’s no fitness function for fun” is actually more than correct – it’s also useful! Particularly in game AI development, people who are new to the industry sometimes come in with wildly unrealistic expectations for what machine learning can do and where it’s appropriate to use it. If they come from an academic background, sometimes they seem to think it’s “wrong” to solve game AI problems by hand, and they look for a magic wand they can wave to create the whole AI at once.

You have to push back against that mindset and shatter those illusions, and remind such developers that sometimes you can use machine learning to make a good AI system better, but it’s a terrible tool for trying to make a good AI in the first place.

And for a long time, I was satisfied with that answer.

But I eventually came around to the realization that while it’s correct, it’s also irrelevant!

And I had a huge change in perspective. I came to the realization that machine learning isn’t a tool for game AI – it’s a tool for game design.

You may not be able to write a fitness function to put an exact number on entertainment value, but if you can state your design goals clearly, then you can very often write a fitness function that measures whether some part of your game satisfies them.

And then it’s about selecting the right design goals in the first place to create the kind of entertainment experience you’re looking for, and determining which of those design goals are appropriate targets for any kind of AI-assisted tuning.

For example, with City Conquest, I had a specific tactical role in mind for every offensive and defensive building, and a lot of ideas as to what I expected would constitute effective offensive and defensive strategies. But I also had specific design goals for how the buildings should work together:

* Uniqueness: Each of the nine defensive buildings (towers) and offensive buildings (dropship pads for units) should be distinct from all the others

* Minimum bound on utility: No building type should be “underpowered” – every building should be important, and it should have some clear tactical role, and there should always be some scenario in which that building is the most important to winning.

* Maximum bound on utility: No building type should be “overpowered” – in other words, no single building type, or combination of a limited number of building types, should allow the player to win the game against a non-reactive opponent.

* Cost equivalence: Every building in the game should have a resource cost equivalent to its actual utility in the game.

* Combined arms: Combinations of several different building types should be far more effective than building only a small number of building types (both offensively and defensively).

* Reactivity: The optimal strategy for each player should depend significantly on the other player’s strategy.

[To be clear, my system doesn’t actually explicitly measure these in its fitness function right now, and I don’t want to sound like I’m overpromising. But it’s a good example of the kind of design goals that computational intelligence can help you optimize against, if not always directly.]

Q: Do you feel that machine learning algorithms have been unfairly prejudiced against in the game industry?

Not necessarily. There’s a bias, but it’s not completely unjustified.

Some forms of ML (neural networks) are genuinely useless; many of them (genetic algorithms) are much too slow to be used at runtime; the results aren’t predictable; and as I mentioned, newcomers to the industry are sometimes very unrealistic about its potential and give it a bad reputation. There’s nothing worse than an AI developer who wants to wave the magic wand of “machine learning” instead of writing code.

But I’m not doing this because I’m any kind of ivory-tower academic or a starry-eyed idealist. I’m an industry veteran; I worked on two Metroid Prime games and was employee #3 at Gas Powered Games. I’m doing it because it works, and because I’m a capitalist and I believe in raising quality and lowering costs.

Q: What led you to use computational intelligence for game balancing?

I got tired of designing AI, and I realized it was time to “AI” design.

Here’s a thought experiment:

How are people going to design games 100 years from now?

Never mind that we have no idea what games themselves will look like that far into the future. A “game” probably won’t look anything like it does today. Maybe they’ll be holodecks or something. But it doesn’t matter.

In the year 2112, how will game designers get their jobs done?

What kind of tools are they going to use?

I don’t know the answer, but there’s no question it’s not going to look like what we have today. Game design is still in its infancy – or maybe its messy and chaotic pre-teen years.

A century from now, what it means to be a “game designer” will have evolved beyond recognition. We won’t have designers giving hand-wavy answers to basic design questions, making sweeping design changes without understanding the side-effects, or spending inordinate amounts of their team’s time (and their company’s resources) exploring the design space in the name of “iteration”.

At a minimum, we’ll have a shared design language across the industry, a shared understanding of what constitutes “good design”, and vastly more powerful design tools.

Imagine that you wanted to build a bridge. You’d need to hire a licensed structural engineer, wouldn’t you? That engineer would have to have earned a civil engineering degree with 3-5 years of study under their belt, and then satisfy a host of requirements to become certified to build bridges.

And that happens because everybody understands that bridges can fail, and earthquakes and wind and traffic accidents can topple them if they’re not designed correctly, and things like that can kill people and destroy the bridges. So you have to spend a lot of time studying to understand how to build a bridge the right way.

Everybody appreciates all of that, so everybody accepts that certifying civil engineers is the right thing to do to make sure they build good bridges.

But there’s nothing like that for games.

We consistently hire people who are enthusiastic rather than people who are genuinely qualified – we prefer charismatic designers who can talk the talk over cautious practitioners who can walk the walk. We take million-dollar projects and put them in the hands of people with no adequate design training, and then we fail to train them further or build a shared design vision between them. We let them endlessly throw ideas against the wall to see what sticks, and then we wonder why we end up with burned-out teams, failed projects, and studio failures out the other end.

And mostly that happens because we don’t talk about what happens when designers fail. We hide the devastating consequences of design failure. When a bridge collapses somewhere, it’s too big to hide and the whole world knows about it … but in the game industry, a design collapse is painful and embarrassing and we have the luxury of hiding it in our own studios.

And that has to stop. It’s just too expensive and too risky to keep doing design that way. Design has become the bottleneck.

So we need two things to happen.

First, game design needs to grow into a real discipline. I’m not saying we need actual certification like you have in civil engineering (though that might not be a bad idea), but the discipline of game design needs to get out of its infancy (or chaotic pre-teen years) in a hurry.

And secondly, we need vastly better tools.

If you were to go back in time 30 years and asked designers what tools they’d like to see in the future, they would have described some amazing tools for building levels and doing scripting in three entire dimensions. And now those tools exist! We have engines like Unity and Unreal and Gamebryo that are everything a designer in 1981 would have asked for and more.

But these aren’t the only tools we need. Unity and Unreal can’t give you any insights on your game design. They can help you build your game, but they can’t tell you anything at all about the design decisions you make.

There’s so much more that can exist, and that will have to exist someday. It’s just that we don’t see that, because those tools haven’t been built yet – so we don’t realize they should exist, and we don’t realize they’re missing!

We have game “engines,” but we don’t realize that the rest of the car is missing.

Again, I used to think machine learning techniques were useless. But I got to a point where I stopped thinking of machine learning as a game AI tool and started thinking of it as a design tool, and that changed everything.

Take Wall Street as an example. There used to be a time when the New York Stock Exchange was full of traders yelling and making complicated hand gestures to each other to buy and sell shares. It only took a decade for all those pits to be abandoned and replaced with computers. Since then it’s been an ongoing arms race of ever faster servers and ever tighter connections and lower latency for high-frequency trading within microseconds. And it will never go back to the old days.

Or I could compare it to baseball recruiting. Baseball recruiting used to be a very subjective process. It was ten managers sitting in a room arguing about who to recruit to fix a broken baseball team, without really understanding what they should be looking for and no clear idea how to properly evaluate the factors that make a baseball team work. “We can’t hire that guy – he has an ugly girlfriend; that means no self-confidence!”

And in the book and movie Moneyball, you see how Billy Beane (played by Brad Pitt) hired an economics PhD from Yale and put together a statistical modeling approach that ultimately drove the Oakland Athletics to a staggering string of victories and changed baseball recruiting into a quantitative discipline. And it will never go back to the old days.

Game design right now is like the NYSE before computers, or baseball recruiting before Billy Beane.

And it will change, so we might as well embrace it. Moore’s Law is still in effect, and we have a massive amount of computing power at our disposal in cloud computing clusters.

Every day, computing power becomes a little bit cheaper and game designers become a little bit more expensive.

We’ve finally arrived at the point where we can actually run the millions of gameplay simulations we need to run to do GAs effectively and get those insights back into the hands of designers in a few hours or days, and it’s inevitable that that will transform the design process.

Q: So can we get to a point where we replace designers once and for all?

My hope is that the future will bring much more powerful tools to bear on game design problems. We’ll have all the powerful game engines we have today, but we’ll also have the equivalent of a GPS and navigation system for game designers to guide them toward the decisions that satisfy their design goals.

And that will also raise the bar for what a game designer is expected to do to work effectively. So the best designers will have more powerful tools that help them design better games, faster … and the pretenders will find themselves out of their league and will probably have to find new jobs.

Both of those are good things.

Q: Currently you’re using AI as a tool to assist the design process. Do you think it’s possible to use genetic algorithms to optimize the game’s parameters rather than just finding the dominant strategy?

Definitely. My approach is just one particular way of using GAs for one particular kind of balancing problem. It’s only a starting point; it’s not the only approach and it’s not necessarily the best one.

Phillipa Avery and Julian Togelius wrote a really neat paper (with Alistar and van Leeuwen) on using GAs to tune parameters for a more traditional tower defense game, where they evolved the parameters for their creeps and towers directly.

Alex Jaffe of UW also has a really terrific paper coming out on this topic (along with several co-authors). I’ve gotten a sneak preview but I’m sworn to secrecy.

There’s one company I know of with a broadly similar overall concept – the +7 Balance Engine. But their approach is very different. Their webpage says that they want to ensure that there are no dominant strategies.

I don’t think that’s a valid approach. You have to empower designers, and let them decide whether there should be any dominant strategies. Maybe they do want one dominant strategy! You can’t take someone else’s game and say “Here, let me balance that for you” because you can’t know what they mean by “balance.” You can’t anticipate all of their design goals in advance.

It’s about giving valuable feedback to the designers so they can ensure that their gameplay meets their design goals.

The big-picture goal is to build systems that dramatically expand designers’ visibility over the dynamics of the gameplay they create. We’d like to build systems that shine a floodlight on the fitness landscape and which player strategies emerge from every design decision.

And we want to be able to give them visibility on how any design change they could make will affect those dynamics, so designers can ask questions like, “what happens if I tune this parameter here – how does that change the set of viable player strategies?” Or alternatively, “What would I need to change in my parameters to make strategy X more (or less) desirable?”

Behind The Scenes

Q: How would you rate the importance of the underlying evolutionary algorithm compared to the overall infrastructure/process? How long did it take to implement this part of the code?

It’s been ~10% of my total engineering time. There were 3-4 days of dedicated coding to create Evolver in September – we had to split Evolver into a separate build, disable the renderer, increase the time step in the update loop, optimize the gameplay code, and then make it parallel using Microsoft’s PPL library. That got us to a point where we can run several million entire games overnight.
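As a rough sketch of that overnight-batch setup (hypothetical code, not from the actual game: std::async stands in here as a portable equivalent of the PPL parallelism the project actually used, and RunSimulation is a stub):

```cpp
#include <future>
#include <vector>

// Stub for a full headless game: renderer disabled, large fixed time step.
// Returns 1 if blue wins, 0 otherwise (placeholder logic for illustration).
int RunSimulation(int blue_script_id, int red_script_id)
{
    return (blue_script_id + red_script_id) % 2;
}

// Launch a batch of games concurrently and count blue's wins.
int RunBatchInParallel(int num_games)
{
    std::vector<std::future<int>> results;
    results.reserve(num_games);
    for (int i = 0; i < num_games; ++i)
        results.push_back(std::async(std::launch::async, RunSimulation, i, i + 1));

    int blue_wins = 0;
    for (auto &f : results)
        blue_wins += f.get();
    return blue_wins;
}
```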

Since October, it’s only been 30-90 minutes a day of tuning. We have a dedicated Dell desktop PC running Evolver 24/7. Every day or two, we terminate the Evolver and extract the two best players by fitness score (one each from the red and blue player populations). Then we plug them into the game, play them back, and watch how they play each other and decide how to tune it based on that gameplay session. Then we make the code changes and run it again, and only occasionally dig deeper and make tweaks to Evolver itself.

So that’s very handy. It’s like having a virtual playtesting team that works for free. It cost me a few days of development, but when I compare that to the salary I’d have to pay somebody to do that full-time, it’s a no-brainer.

Of course, it’s not a magic bullet. You still have to do manual playtesting, and it’s still a good idea to plug your parameters into Excel every now and then and fit them to a curve with Solver. But those are now small, secondary tasks, not the core of the balancing process itself.

So once I get Evolver’s output, I can plug it back into my game and watch the red and blue players duke it out. If I see a pattern where both players are over-using certain buildings or upgrades or combinations of buildings, then I probably need to “nerf” those features somehow – either I need to make them weaker, or I need to increase the resource cost to buy them in the first place.

Similarly, if there are certain buildings or upgrades that they never buy at all, I need to stop and ask myself why. Are they too expensive? Are they not useful enough? Or is there something else going on? Why is the Evolver not finding it useful enough to survive evolution? So I’ll increase its utility or decrease its cost to compensate.

And finally, I look at the context in which everything is being used and see if it matches my design goals for each unit. Sometimes I find the Evolver adapts the strategies I expected it to; other times, it surprises me, and I have to figure out whether I should tune the game to stop it from happening, or embrace it as an unexpected new strategy.

Some of the AI researchers I’ve been speaking with have raised the issue that my approach can really only identify one dominant strategy at a time. For example, you might have two strategies, A and B, that are both dominant, and with the way I’m doing it, the computer can only identify one of those dominant strategies.

And that’s true, but you have to keep in mind that tuning is an ongoing process – it’s not something you do in a single pass! I’m running the Evolver continuously and re-tuning every day or two based on its output. So if it only happens to find that A is the dominant strategy today, that’s fine, because it means I’m going to respond to that, and A will get nerfed so that B has a much greater chance to stick out like a sore thumb the next time I run it.

Q: Can you give us some examples of interesting results Evolver has given you?

This is more of a bug than a balancing issue, but I had a really interesting phenomenon on October 30 where the best red player script was always winning the game by a huge margin. And looking through the evolved script, it didn’t seem to be doing anything really unusual; the only unusual thing I could see was that it was building a Laser tower at a particular spot fairly early in the game.

Now, the Laser tower is a very unusual building in City Conquest. It’s the only defensive tower that doesn’t have a range limit – it can fire all the way across the map. It was designed to do a very small amount of damage to a very large number of units. The trick is to set it up the right way to ensure that it hits as much of the enemy army as possible.

So when I started playing back the scripts in the game, I was floored – the red player’s Laser was positioned in just the right spot where it could target the blue player’s Collectors from all the way across the map! It took 8-9 shots to take them out, but once it did, the blue player’s Collectors were destroyed with no way to rebuild them, so it would be starved of resources and it would lose every time.

Again, it’s more of a bug than a feature – Collectors were supposed to be invulnerable, and Lasers shouldn’t even be able to target them anyway, but invulnerability and targeting tweaks hadn’t been implemented yet and were scheduled for a later date. But it goes to show you how computational intelligence can find a path through the design space that a human designer might never even imagine.

Here’s a different example that shows some really good behaviors that I think closely mirror what a smart City Conquest player would do. These are two screenshots from a recent run that I discussed in Update 4 of my Kickstarter campaign. Again, keep in mind that everything you see in these screenshots (excluding the pyramid-shaped Mining Facilities on the crystals) was evolved entirely by the computer.

Here’s the blue player’s base:

There are five really interesting things going on with this base:

* Single point of entry in the back of the base: The blue player forces all the enemy units to funnel around to a single attack point at the rear of its Capitol. This is the best possible positioning for the attack point, because it forces enemy units to go all the way around before they can attack, and it gives the blue player many more opportunities to defeat them.

* Dividers to separate enemy forces: The blue player divides and conquers the red player’s forces, splitting them between a northern and a southern route around his city. Since one route is longer than the other, enemy units will reach the Capitol at different times, which makes them much easier to deal with.

* A well-placed Disruptor (the floating sphere with yellow stripes) near the top entrance allows him to interrupt the red army’s cloaking, shielding, and healing effects at exactly the right time, just as they’re beginning to get truly pounded by the blue player’s defenses.

* A fully-upgraded Shield Generator unit toward the upper left (the dropship pad with three aqua-colored lights) allows the blue player to protect nearby units from damage.

* A set of 3 Lightning Towers cleverly dispersed at the front, side, and rear of the blue player’s base, ensuring that these expensive buildings rarely waste time targeting the same units simultaneously.

Here’s a screenshot of the red player’s base. Notice how it employs many of the same intelligent strategies as the blue player, which are exactly the behaviors I intended to be good strategies in City Conquest.

* Single point of entry: Like the blue player, the red player forces the opposing army to attack at the rear of his Capitol. The design of the base forces blue’s units around to the north, above the Mining Facilities, and then back through the rear entrance on the left side of the image.

* Dividers to separate enemy forces: Like the blue player, the red player splits the opposing army into two different paths (east and west paths at the front of his base) to upset their arrival timing.

* A perfectly-placed Disruptor (the floating sphere with yellow stripes above the Capitol) allows the red player to interrupt cloaking, shielding, and healing effects all along the front and side of its base (all along the north edge of the five Mining Facilities you see), as well as when enemy units circle around to the rear and finally close in on the Capitol.

* A Lightning Tower at just the right spot (black spiral just above and to the right of the Capitol) gives excellent coverage, allowing the red player to zap enemy units at the front, side, and rear of its base. This placement ensures that this relatively expensive tower will see as much utilization as possible.

* A pair of Ground Slammers, one at the front of the red player’s base and another at the rear (shaped like tall pillars wearing metallic thumpers), allow the red player to create earthquakes and give area-effect damage to all ground-based units that pass by.

* A pair of Grenade Tossers (deep holes in the ground on the left and right sides) is also nicely distributed in a very similar way, with an upgraded Grenade Tosser in the front and another non-upgraded Grenade Tosser in the rear. These allow the red player to hit all ground units in a wide area with a moderate amount of area-effect damage — enough to weaken them considerably and allow the other towers to take them out.

Q: What type of evolutionary approach and individual representation did you use? Simply put, what did you evolve?

In this particular case, we’re just trying to evolve highly competent players to see what strategies good players use. We essentially evolve scripts of build orders.

City Conquest is a tower defense variant, so it only has three player verbs: “build,” “upgrade,” and “sell” (I’m excluding the verbs for “game effects,” which are more like spells that players can cast). Also, selling buildings is usually a bad idea since you can’t get a full refund, so there’s no reason for Evolver to do it.

So an Evolver player script just boils down to a sequence of build and upgrade commands – build a building of type T at (X,Y), or upgrade the building at position (X,Y). There are no timing values associated with the commands; Evolver will ensure that each player executes a command as soon as it can afford to do so.

Also, since every building in City Conquest costs either gold or crystals, but never both, Evolver effectively processes the build order as two separate lists in parallel, one for each resource.
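To make that representation concrete, here is a minimal Python sketch of a build script and the per-resource split. The `Command` fields and the `split_by_resource` helper are my own illustrative assumptions, not the shipped data structures:

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class Command:
    verb: Literal["build", "upgrade"]     # the only two verbs Evolver uses
    building: str                         # building type T
    x: int
    y: int
    resource: Literal["gold", "crystal"]  # each building costs one resource, never both

def split_by_resource(script):
    """Treat one flat script as two parallel queues, one per resource,
    so a shortage of gold never stalls a crystal-funded command."""
    gold = [c for c in script if c.resource == "gold"]
    crystal = [c for c in script if c.resource == "crystal"]
    return gold, crystal

# A toy three-command script:
script = [
    Command("build", "Laser", 3, 7, "gold"),
    Command("build", "Skyscraper", 1, 1, "crystal"),
    Command("upgrade", "Laser", 3, 7, "gold"),
]
gold_queue, crystal_queue = split_by_resource(script)
```

With no timing values in the commands, the executor simply pops the next command from each queue whenever the player can afford it.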

We start by generating 500 random build scripts for each of the red and blue players. Then we run four tournaments, each one pitting all 500 red player scripts against all 500 blue player scripts, using a random offset so that we’re pitting each blue player script against a different red player script in every tournament.

It’s probably easiest just to show you the C++ pseudo-code:
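The original listing isn’t reproduced here, but based on the description in the next paragraph, the outer loop might look roughly like this Python sketch (the dict-based scripts, the stub game, and the function names are stand-ins for illustration):

```python
import random

POP = 500        # scripts per side, from the interview
TOURNAMENTS = 4  # full passes per generation

def run_generation(red_scripts, blue_scripts, play_game):
    """One Evolver generation: TOURNAMENTS tournaments pairing each blue
    script against a randomly offset red script, then sorting by fitness
    so the genetic operators can act on the results."""
    for _ in range(TOURNAMENTS):
        offset = random.randrange(len(red_scripts))
        for i, blue in enumerate(blue_scripts):
            red = red_scripts[(i + offset) % len(red_scripts)]
            play_game(red, blue)  # simulates a game, updates both fitness scores
    red_scripts.sort(key=lambda s: s["fitness"], reverse=True)
    blue_scripts.sort(key=lambda s: s["fitness"], reverse=True)
    # ...then apply replacement, mutation, and crossover (described below)...

# Toy demo with 4 scripts per side and a stub "game":
reds = [{"id": i, "fitness": 0} for i in range(4)]
blues = [{"id": i, "fitness": 0} for i in range(4)]
def stub_game(red, blue):
    red["fitness"] += red["id"]   # pretend higher ids play better
    blue["fitness"] += blue["id"]
run_generation(reds, blues, stub_game)
```

The random offset per tournament is what guarantees each blue script faces a different red opponent in every pass.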

We do 4 tournaments of 500 games each (pitting each red player script against a random blue player script) to make sure we get good coverage and reasonably accurate fitness values at the end of each tournament. Once the tournament is completed, we sort the scripts for each player by their fitness score, and apply standard genetic operators – random replacement of the lowest-scoring scripts (falling off linearly to a fixed minimum value over the first few hundred tournaments); random mutation of each script; and performing “crossover” to combine two scripts (biased toward crossover with higher-scoring scripts rather than lower-scoring ones).

A “mutation” has a random chance to change a building’s type, change a command’s location, swap the ordering of two commands in the script, or copy a command on top of an existing command.
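The four mutation operators can be sketched as follows. The per-command rate, the building type list, and the grid size are illustrative guesses, not City Conquest’s actual values:

```python
import random

def mutate(script, rate=0.05,
           building_types=("Laser", "Lightning Tower", "Grenade Tosser"),
           grid=(20, 20)):
    """Apply the four mutation operators, each with an independent
    per-command chance."""
    script = [dict(cmd) for cmd in script]  # never modify the parent script
    for i in range(len(script)):
        if random.random() < rate:  # change the building's type
            script[i]["building"] = random.choice(building_types)
        if random.random() < rate:  # change the command's location
            script[i]["x"] = random.randrange(grid[0])
            script[i]["y"] = random.randrange(grid[1])
        if random.random() < rate:  # swap the ordering of two commands
            j = random.randrange(len(script))
            script[i], script[j] = script[j], script[i]
        if random.random() < rate:  # copy this command on top of another
            j = random.randrange(len(script))
            script[j] = dict(script[i])
    return script

parent = [{"building": "Laser", "x": 3, "y": 7} for _ in range(10)]
child = mutate(parent, rate=0.2)
```

Note that all four operators preserve script length, so population members stay comparable across generations.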

We also added additional safeguards to ensure that scripts would stay flexible. For example, if Evolver tries to execute a build command and there’s already a building there, in most cases it will sell the existing building to allow it to build the new one and continue executing the script.

Of course, this is a relatively simple application, and it’s made easier by the fact that City Conquest lends itself to this as a TD game. Not every game can get away with a non-reactive build script like this. Most games are more complicated, and you need an entire AI system to get a reasonable simulation of a player.

For our next project, we’re planning to evolve entire behavior trees to allow us to generate more complex reactive behaviors rather than just a fixed build order. We’re also planning to move the Evolver into a cloud computing cluster and separate the populations out using an “evolution islands” approach with metadata.

Q: How do you determine the fitness of an individual? Evolutionary algorithms are often very sensitive to the definition of the fitness function.

Yes, they are. It depends on the particular game and what you’re trying to evolve.

In City Conquest, it’s back-and-forth tower defense gameplay, and the Capitol is the only attackable building. The game ends when one player’s Capitol gets to zero health. So the ratio of the health of the two Capitols is actually a VERY good indicator of who’s winning. And when the game ends, the health of the winning player’s Capitol tells you a lot about how close a game it was. In an evenly-matched game, both players will do a lot of damage to each other’s Capitols; in an uneven game, the winner will destroy the opposing player’s Capitol without allowing much damage to his own.

Each script has a fitness score associated with it. After each game, Evolver takes the health of the winning player’s Capitol and adds that to the score of the winning player script and subtracts it from the score of the losing player script. This ensures that the greater the victory, the greater the effect on both scripts’ fitness scores.
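That zero-sum update is only a couple of lines; here is a Python sketch of it (the dict field name is an assumption):

```python
def score_game(winner, loser, winner_capitol_health):
    """The winning Capitol's remaining health is added to the winner's
    fitness and subtracted from the loser's, so a crushing victory moves
    both scores much more than a narrow one."""
    winner["fitness"] += winner_capitol_health
    loser["fitness"] -= winner_capitol_health

close = {"fitness": 0}
blowout = {"fitness": 0}
victim = {"fitness": 0}
score_game(close, victim, 50)      # narrow win: small fitness swing
score_game(blowout, victim, 900)   # one-sided win: large fitness swing
```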

When I started development on Evolver, I discussed my plans with a number of the top researchers in AI-based design and automated game balancing – Phillipa Avery of the University of Nevada – Reno, Gillian Smith and Adam Smith of UCSC, Alex Jaffe of UW, and a few others.

A few of them suggested that I set up “archetype” scripts using standardized “known” player strategies to ensure that I’d evolve against a broad set of possible human strategies. I recorded my own gameplay as I executed half a dozen standard strategies, and then for a few weeks I tried using Evolver to test against all six of those archetype scripts.

But I found that as I rebalanced my game every day based on the feedback from Evolver, it would invalidate all of those “archetype” scripts. I couldn’t commit to re-recording all those scripts every week or two as my balancing changed, so I had to abandon that approach.

Later in the development cycle, I also tried adding additional fitness qualifiers to encourage the evolved scripts to do more of what I felt a human player should do. Some of these didn’t work out, but one that did was adding a fudge factor to the fitness function for building and upgrading Skyscrapers. This is critical to success in City Conquest, since Skyscrapers expand your buildable territory, and upgrading them not only consolidates your hold over your territory but also gives you additional resources over time. But it’s the type of thing that isn’t necessarily immediately rewarded in a fitness function, because an evolved script won’t capitalize on the additional territory and resources right when the Skyscraper is first added.

The Skyscraper fitness adjustment helps compensate for the natural delay between building and upgrading a Skyscraper and then evolving everything else you need to actually reap the benefits from it. This is purely a way of telling Evolver, “trust me, you want to be doing this, and we don’t want to wait for evolution to prove it every time.”
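One simple way to express such a fudge factor is a flat bonus per Skyscraper built or upgraded; the bonus values below are illustrative, not the shipped numbers:

```python
def skyscraper_adjusted_fitness(raw_score, skyscrapers_built, skyscraper_upgrades,
                                build_bonus=25.0, upgrade_bonus=15.0):
    """Add a flat 'trust me' bonus per Skyscraper built or upgraded, so
    scripts that invest in territory survive selection long enough for
    the real economic payoff to show up in raw win/loss scores."""
    return (raw_score
            + build_bonus * skyscrapers_built
            + upgrade_bonus * skyscraper_upgrades)
```

The bonus only needs to be large enough to keep Skyscraper-building scripts alive for a few extra generations, not to dominate the win/loss signal.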

Final Thoughts

Q: What advice would you give to other developers thinking about using machine learning algorithms in their games?

I was the founder of my startup, so I already had the buy-in I needed from the upper management to run the experiment. And I undeniably realized major benefits from it.

But I can’t imagine trying to push something like this at any of the studios I’ve ever worked at.

Machine learning will take time to prove itself. It’s not going to happen overnight. The (often justified) skepticism toward machine learning in the industry makes it a tough sell to launch this kind of initiative internally. It’s probably going to have to enter from the outside, disruptively, rather than from inside studios that would offer enormous cultural resistance.

I’m reminded of when I worked on Thief: Deadly Shadows and Deus Ex: Invisible War at Ion Storm Austin. I had to fight tooth and nail to gain acceptance for the use of navigation meshes for pathfinding. I think almost every single programmer and designer in the studio argued against it at one point or another. Some of them treated me like I was crazy; others treated me like I had to be some kind of pointy-haired academic who ought to be off in a research lab somewhere if he cared so damned much about obscure academic issues like good pathfinding.

And now, they’re very common. I can make a list as long as my arm of successful AAA products that used navigation meshes.

But at the time, it was open warfare, and it was very difficult to make the case for something that seemed to me blindingly obvious.

It will be the same for machine learning and AI-based game design for some time to come.

So my advice is to make sure you have buy-in. Don’t put yourself through the pain if you can’t get full support and commitment at every level. Otherwise, you’re just painting a target on your back.

And if you can’t get it, and you see the possibilities, consider quitting your job. Launch a startup, or join mine! (Source: aigamedev)

