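Long read: Developers on 12 principles of behavior design for interactive characters

Published: 2021-10-20 14:44:07 Tags: development principles, game interaction logic

Original authors: Wendelin Reich & Werner Schirmer. Translator: Willow Wu

(The two authors of this article, together with Sophie Peseux, founded Virtual Beings, a startup focused on behavior design that aims to create deeply interactive NPCs.)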

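Introduction

There is a fascinating scene in the science-fiction film Her: the protagonist, Theodore, is playing an AR game when an NPC suddenly starts hurling insults at him. A previously unremarkable bystander character suddenly shows real personality, and this unexpected turn makes Theodore laugh and also makes him think. He realizes that the behavior is a puzzle. So he insults the NPC right back; in the end he solves the puzzle and the adventure continues.

Eight years later, the most advanced interactive characters still cannot deliver an experience like this. Professionals and players agree that character AI has not seen a qualitative breakthrough since F.E.A.R. was released in 2005. Worse, recently published game AI textbooks state explicitly that innovation in character AI has essentially stalled. Today the field is more interested in areas such as AI-driven art creation and system-level applications of AI.

Most popular games today have NPCs of some kind. We therefore believe that the lack of innovation in character AI will become a creative bottleneck for future games, but for those willing to try new approaches it may be a huge opportunity. We therefore think it makes sense for this article to open our series on interactive characters. If agents such as NPCs are created so that they act in particular ways that immerse players in the game, then we first have to ask ourselves what agent behavior actually is (here ‘agent’ can refer to animals as well as humans).

Our academic roots lie in psychology and artificial behavior (AB). After many years of work on an AB engine, we have identified 12 properties of behavior. We call them ‘principles’, as a tribute to Disney’s famous twelve principles of animation.

Some are obvious; others are less easy to think of. What matters is that, taken together, these 12 principles let you rule out things that merely resemble behavior. More importantly, if you want to build an AB engine that makes interactive characters feel more ‘alive’, you have to make sure the engine supports all 12.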

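Principle 1: Behavior is observable

At first, you might think that behavior is about the muscles being used. Suppose you are sitting in an upscale restaurant, waiting for your date. Your fingers keep tapping the table and your heart (also a muscle) is pounding.

Does this mean you are doing two things at once? Not really. The human body is full of complicated things doing complicated work, but most of it cannot be perceived from outside. For our purposes, behavior includes only events that can be observed without special instruments (such as an MRI scanner). So if your racing heart makes you so tense that you knock over a glass of orange juice and stain your clothes, that is observable, and it is what we mean by behavior.

Final Fantasy 12 (from polygon.com)

For an AB engine, this principle brings a very welcome simplification: we do not need to recreate living beings, only their appearance. Disney called this ‘the illusion of life’; we go one step further and call it ‘the illusion of interactive life’, which later articles in the series will cover in detail.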

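Principle 2: Behavior is continuous

Living beings behave from birth until death. Human language gives us plenty of terms for someone who is making no movement or audible sound. For example, we can say that the person is sleeping, sitting still, holding their breath, playing dead, and so on.

But doesn’t this conflict with principle 1? No. Even when an agent is doing nothing, we can still observe something. Take the image of Stephen Colbert sitting in a shower of balloons: you can easily tell that staying seated is a skilled (and probably rehearsed) display of behavior. Simply sitting upright requires the coordinated use of dozens of muscles. More generally, we can say that agents produce a continuous behavior stream. The problem for AB is therefore how to generate such a stream from individual behaviors, each of which must connect to what comes before and after it.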

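Principle 3: Behavior is interactive

There is no behavior in real life that is not interactive. Playing with friends means responding to what they do, and rock climbing means adjusting your grip. Even the most self-contained behavior involves a context and has to interact with it. Take breathing: the breathing rate depends on the density of oxygen in the atmosphere. If we remove the context (oxygen), the behavior (breathing) no longer makes sense.

Agents relate to the world through behavior, which is why all behavior has to be interactive. It also means that there is no difference between interactive, adaptive, and responsive behavior; all of these words essentially say that behavior must be contextual. For an AB engine, this means all behavior needs to be procedurally generated. Unfortunately, most games today do the exact opposite: their developers tend to assemble behavior streams from pre-configured behavior packages such as stand-loop, walk-loop, jump, and so on, with stiff transitions between them.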

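Principle 4: Behavior is constrained

Context imposes many constraints on behavior, shaping it through conditions of various kinds. By far the most important is the physical makeup of the world: the resistance it offers to the agent’s body, the way it lets sound propagate, and so on.

Constraints themselves can be active or passive, so they can steer an agent’s behavior dynamically and in unexpected directions. AB must therefore go beyond mere procedural selection of behavior and provide full support for procedural animation, allowing the behavior stream to adapt to constraints dynamically.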

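Principle 5: Behavior is sequenced

AI textbooks often distinguish ‘scripted’ from ‘unscripted’ behavior, implying that the latter is better and more natural. To me this makes little sense, because in practice most agent behavior is a combination of the two.

The human brain has dedicated circuitry (notably the cerebellum) that stores a huge database of parametric motion sequences. These sequences make it easier for the brain to deploy standard forms of behavior. At the same time, they adapt well to concrete environments and dynamic contexts. This makes for a powerful combination. To produce, say, a tango, the brain can use a template that needs only a few parameters filled in at runtime, instead of deciding afresh each time which muscles to move, when, and by how much. Besides reducing complexity, this approach also promotes behavioral synchronization between individuals, which helps explain why real behavior sometimes feels scripted. AB engines such as Rascal take inspiration from neuroscience and build parametric, adaptive sequences into their architecture.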

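Principle 6: Behavior is interruptible

Even the best-planned behavior will not necessarily go smoothly on its first contact with reality. Agents change their minds all the time, and their behavior has to change with them. This is a trivial detail of the real world but a formidable challenge for AB, mainly because of principle 2. An interruption cannot simply cut off the behavior stream and start an entirely new one.

The demands for continuity and for instant interruptibility pull against each other like the two sides in a tug of war, creating tension. In addition, this sense of lost control may be exactly the effect the engine’s user wants (for comic effect, for example), which is another challenge AB engines face. Our engine, Rascal, handles this with a layered control structure, a design inspired by robotics that we may describe in detail in a later article.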

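Principle 7: Behavioral variation is patterned

You cannot step into the same river twice, and you cannot display the same behavior twice either. Some variation, however small, is always present, and that is part of what makes behavior look natural.

Importantly, these different expressions of the ‘same’ behavior tend to be both random and structured. In the Harry Potter series, for example, when the Weasley twins say ‘What’ in unison, they may lift their heads and open their mouths in slightly different ways, but they will not, say, close their mouths when they need to be open, or vice versa. Evolutionary biologists call this phenomenon patterned variation. Wherever it occurs, you can tell that the variations stem from underlying generative principles or rules, for example the rules governing how the vocal apparatus produces the word ‘what’. This does not mean that an AB engine needs to simulate (say) the entire vocal apparatus to produce more natural variation. In practice, the dimensionality of the possible variations is usually limited and can be approximated with shallower methods.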

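Principle 8: Behavior is hierarchical

When an agent is displaying behavior, a close look reveals that many things usually happen at the same time. This principle and the ones that follow help bring some order to that.

Behavior is almost always hierarchically structured; let us start from the kinematic point of view. A handshake illustrates this well. Although it is called a ‘handshake’, this small ritual involves the coordination of many body parts, all of which stand in hierarchical relationships in which subordinate parts are affected by the parts above them.

The whole movement starts with the torso, which positions the arms (subordinate to the torso) and leans forward during the shake. Meanwhile the head (also attached to the torso) turns toward the other person, and the eyes (attached to the head) first have to look down to adjust the initial grip. They then look up and meet the other person’s eyes.

Once we start looking for hierarchies in behavior, we find them everywhere. What makes things trickier is that they change rapidly over time (recall principle 5).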

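Principle 9: Behavior is parallel

You see Peggy Olson in Mad Men walking and smoking at once; doing several things at the same time is extremely common. These things need not even stand in a hierarchical relationship (you can smoke without walking, and vice versa). Even so, principles 8 and 9 have the same implications for AB. They require that the behavior stream be composed of multiple sub-behaviors that can be hierarchically organized. These sub-behaviors can control distinct or overlapping motor domains of the body (eyes, mouth, limbs, ...), which raises the complexity of AB another level. As an example of distinct domains, take Peggy Olson: her smoking does not affect her walking at all. For overlapping domains, imagine Peggy walking while trembling with fear; the two behaviors affect the same body parts, but in different and possibly complex ways.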

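Principles 10-12: Behavior is cognitively caused, monitored, and readable

We can discuss the last three together, because they concern the relationship between cognition and behavior.

Agents have central nervous systems that control their behavior. When they see the behavior of other agents, they interpret it automatically. Through evolution, we have become hardwired to read some (unobservable) cognitive cause into observable behavior and to give it intentionality and meaning.

So in the GIF above, you do not just see an anchorwoman raising and then lowering her arm. You see her trying to high-five a colleague, failing to get the colleague’s attention, and finally feeling embarrassed by the failure. A large body of psychological research shows that such attributions are spontaneous and irrepressible. For AB, this means we cannot separate artificially created behavior from the meaning it evokes. Behavior always expresses something, whether you intend it to or not. Developing an AB engine is therefore not only an engineering challenge but also a psychological one. It is about convincing players of the (artificial) meaning of generated behavior, a topic we plan to continue in several future articles.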

This article was compiled by GamerBoom (游戏邦). Please credit the source when reposting, or inquire via WeChat: zhengjintiao

This post was co-authored by Wendelin Reich and Werner Schirmer. Together with Sophie Peseux we’re founders of Virtual Beings, an artificial behavior-startup that develops mobile games with deeply interactive non-player characters (NPCs). See here for posts #2, #3 and #4

Introduction

Clip from ‘Her’ (2013)

There’s a delightful little scene in the science-fiction movie Her (2013) where Theodore, the protagonist, plays a video game in augmented reality. At one point, an NPC turns towards him and starts insulting him. This sudden display of real personality from an otherwise bland character is so unexpected that it makes Theodore laugh as well as think. He realizes this behavior is a puzzle. By insulting the NPC back, he ends up solving it and the game continues.

Eight years later, the state of the art in interactive characters still doesn’t provide anything close to Theodore’s experience. Among professionals and players alike, there’s a strong consensus that character AI hasn’t seen qualitative breakthroughs since about 2005, the year F.E.A.R. was released. Even worse, recent textbooks on game AI state explicitly that innovation in character AI has essentially come to a halt, the field nowadays being more interested in AI-driven art production, system-level AI and so on.

A clear majority of popular games today features NPCs of some kind. We therefore believe that this dearth of innovation in character AI is both a creative bottleneck for future games and an immense opportunity for folks who are willing to approach the problem with a fresh look. This first article of our four-part series on the future of interactive characters therefore starts at what we see as the logical beginning. If the purpose of artificial agents such as NPCs is to behave in ways that engage players, we need to ask ourselves just what agent behavior is in the first place (where ‘agent’ refers to both animals and humans). Our academic roots are in both psychology and artificial behavior (AB). Over many years of development on Rascal, our AB-engine, we have found that behavior is characterized by twelve properties. We call them ‘principles’ as a small homage to Disney’s famous twelve basic principles of animation.

Some of them are obvious, others less so. What’s important is that these twelve principles, taken together, sharply delimit behavior from anything that resembles it without matching it. More importantly, if you would like to create an AB-engine which allows interactive characters to feel ‘alive’, you’d want to make sure that it supports all twelve.

Principle 1: Behavior is observable

At first sight, behavior seems to be about muscles that move. Let’s say you’re sitting in a fancy restaurant, waiting for your date to show up. Your fingers are nervously tapping on the table and your heart (also a muscle) is racing.

Does that mean you’re doing two things at the same time here? Not quite. Living bodies are full of complicated stuff doing complicated things, but most of this isn’t perceivable from the outside. For our purposes, behavior includes only events that are observable without special instruments (such as an MRI scanner). So if your racing heart contributes to your overall nervousness and you end up knocking over your glass of orange juice and ruining your shirt – that would be observable, hence behavior.

For AB, this first principle entails a welcome simplification: we don’t have to try to recreate life itself, just its appearance. Disney called this ‘the illusion of life’. We’ll go one step further and call it the illusion of interactive life – something we’ll cover in a later blog post.

Principle 2: Behavior is continuous

Stephen Colbert

Living beings behave all the time, from birth all the way to their death. Our language recognizes this by providing us with an arsenal of terms we can apply to someone who isn’t showing any movement or making any audible sound. For example, we may say that this person is sleeping, sitting still, holding their breath, playing dead, and so on.

Doesn’t this conflict with principle 1? No, because even when an agent is seemingly doing nothing, we can observe something. In the GIF on the left, you can tell effortlessly that sitting perfectly still under a shower of balloons is a skilled (and probably rehearsed) display of behavior. The mere act of sitting straight requires coordinated use of dozens of muscles. In a more general vein, we may say that agents emit continuous behavior streams. The problem of AB is thus to generate such streams from individual behaviors that are connected to preceding and subsequent behavior.

Principle 3: Behavior is interactive

Cat poke

There is no real-life behavior that is not interactive. For example, playing with a friend involves responding to their actions, and climbing a rock requires adapting one’s hands to its shape. Even the most self-involved behavior takes place in a context and needs to interact with it. Take breathing as an example, where the respiration rate depends (among other things) on the density of oxygen in the atmosphere. If we take away the context (oxygen), the behavior (breathing) ceases to make sense.

Behavior is how agents relate to the world, and that is why all behavior needs to be interactive. This also means that there is no difference between behavior that is interactive, adaptive or responsive – these words just add different flavors to the fact that behavior is necessarily contextual. For AB, this means that all behavior needs to be procedurally generated. Unfortunately, most games today do the exact opposite: they tend to assemble behavior streams from canned packages of pre-configured behavior (stand-loop, walk-loop, jump and so on), with awkward transitions between them.

Principle 4: Behavior is constrained

Agility dog

Context imposes lots of constraints on behavior, in the form of conditions that shape it in various ways. By far the most important one is the physical makeup of the world – the resistance it offers to the agent’s body, the way it allows sound to propagate, and more.

Constraints can be passive or active themselves, thereby directing an agent’s behavior dynamically and somewhat unpredictably. AB must hence go beyond mere procedural selection of behavior and offer full-fledged support for procedural animation, allowing the behavior stream to adapt to constraints on the fly.

Principle 5: Behavior is sequenced

Tango

AI textbooks often distinguish ‘scripted’ from ‘unscripted’ behavior, implying that the latter is somehow better and more organic. This seems a bit pointless to me, because real agent behavior is always a combination of both. In fact, our brains have dedicated circuitry (notably the cerebellum) to store gigantic databases of parametric motion sequences.

These sequences make it much easier for the brain to deploy standard forms of behavior. At the same time, such sequences are highly adaptable to concrete environments and dynamic context. This makes for a powerful combination. Instead of having to decide freshly each time exactly which muscles to move, when, and how much, to produce, say, a tango, it can use templates that leave only a few parameters to be filled in at ‘runtime’, so to speak. Apart from reducing complexity, this approach also facilitates synchronization of behavior between several individuals, and it explains in part why real behavior can sometimes feel scripted. Modern AB engines such as Rascal take their inspiration from neuroscience and incorporate parametric, adaptive sequencing into their architecture.
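To make the idea of a parametric sequence concrete, here is a minimal sketch in Python. It does not describe how Rascal works; `StepTemplate`, `TANGO_BASIC`, `instantiate`, and the parameters `tempo_bpm` and `stride_m` are hypothetical names invented for this illustration, and the step pattern is not real choreography. The template fixes the order and relative timing of the steps (the ‘scripted’ part), while tempo and stride are supplied only at runtime (the adaptive part).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepTemplate:
    """One step of a stored sequence (hypothetical structure)."""
    name: str
    beats: float    # duration relative to the beat of the music
    forward: float  # displacement as a multiple of the dancer's stride


# An illustrative fixed pattern, not a real choreography.
TANGO_BASIC = (
    StepTemplate("slow", 2.0, 1.0),
    StepTemplate("slow", 2.0, 1.0),
    StepTemplate("quick", 1.0, 0.5),
    StepTemplate("quick", 1.0, 0.5),
    StepTemplate("slow", 2.0, 0.0),
)


def instantiate(template, tempo_bpm, stride_m):
    """Fill in the runtime parameters; return (name, seconds, metres) per step."""
    seconds_per_beat = 60.0 / tempo_bpm
    return [
        (step.name, step.beats * seconds_per_beat, step.forward * stride_m)
        for step in template
    ]


if __name__ == "__main__":
    # The same template adapts to a different song and a different dancer.
    for name, duration, distance in instantiate(TANGO_BASIC, tempo_bpm=120, stride_m=0.6):
        print(f"{name:5s} {duration:.2f} s {distance:.2f} m")
```

Because two dancers can instantiate the same template with a shared tempo, a structure like this also hints at why templates make synchronization between individuals easier.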

Principle 6: Behavior is interruptible

LeBron James

Even the most perfectly planned behaviors won’t always survive first contact with reality. Even if they do, agents change their minds all the time and their behaviors will have to follow suit. This is an almost trivial observation about the real world but a hard challenge for AB, mostly because of principle 2. Interruptions can’t just break off the behavior stream and start a fresh one.

The requirements for continuity and for rapid interruptibility pull in opposite directions, creating a tension that even an athlete like LeBron James can’t always resolve gracefully. AB engines are faced with the added challenge that such a perceived lack of control may be precisely what the user of the engine wants to achieve (e.g., for comic effect). Rascal achieves this via a layered control architecture that’s inspired by robotics – something we might discuss in a future post.
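The tension between continuity and interruptibility can be illustrated with a toy example. The sketch below is not the layered control architecture mentioned above; it is a much simpler stand-in, and `BehaviorStream`, `interrupt`, `tick`, and `max_step` are hypothetical names. The goal can change instantly, but the observable angle may only move by a bounded amount per frame, so the stream never jumps.

```python
class BehaviorStream:
    """Tracks one joint angle (degrees) that always moves continuously."""

    def __init__(self, angle=0.0, max_step=5.0):
        self.angle = angle
        self.target = angle
        self.max_step = max_step  # largest change allowed per tick

    def interrupt(self, new_target):
        """Switch goals immediately; the current angle is kept as the start point."""
        self.target = new_target

    def tick(self):
        """Advance one frame toward the current target without jumping."""
        delta = self.target - self.angle
        step = max(-self.max_step, min(self.max_step, delta))
        self.angle += step
        return self.angle


def run_demo():
    """Start raising an arm, then change plans after four frames."""
    stream = BehaviorStream()
    stream.interrupt(90.0)   # begin raising the arm
    frames = [stream.tick() for _ in range(4)]
    stream.interrupt(0.0)    # change of mind mid-motion
    frames += [stream.tick() for _ in range(6)]
    return frames
```

In `run_demo`, the arm reverses from wherever it happens to be when the interruption arrives, rather than snapping back to a rest pose and starting over.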

Principle 7: Behavior shows patterned variation

Weasley Twins

You cannot step twice into the same river, and you cannot display the same behavior twice. Some difference, however small, will always persist – and that’s part of what makes natural behavior, well, natural.

Importantly, these different expressions of one and the ‘same’ behavior tend to be both random and structured. The Weasley twins may hold their heads and open their lips in slightly different ways when they ask ‘What’, but they cannot go so far as to, say, close their mouth when it needs to be open, or vice versa. Evolutionary biologists call this phenomenon patterned variation. Whenever it’s found, it indicates that the variations are due to underlying generative principles or rules – for example, rules governing how the vocal apparatus can produce the word ‘what’. That doesn’t mean that AB engines need to simulate (say) an entire vocal apparatus to produce believable variations. In practice, the dimensionality of possible variations is often limited and can be approximated in more superficial ways.
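One of the ‘more superficial ways’ to approximate patterned variation is to jitter a handful of parameters and clamp them to ranges that encode the rules. The sketch below assumes exactly that; the function names, parameter names, and numeric ranges are all invented for illustration and are not taken from any engine.

```python
import random


def vary(base, spread, lower, upper, rng):
    """Jitter `base` with Gaussian noise, then clamp it to the allowed range."""
    return max(lower, min(upper, rng.gauss(base, spread)))


def say_what(rng):
    """Return one randomized but rule-abiding performance of the word 'what'."""
    return {
        # The mouth has to be open for the vowel, so the lower bound is above 0.
        "mouth_open": vary(base=0.6, spread=0.1, lower=0.3, upper=1.0, rng=rng),
        # Head tilt (degrees) may differ freely within a small range.
        "head_tilt": vary(base=0.0, spread=4.0, lower=-10.0, upper=10.0, rng=rng),
    }


if __name__ == "__main__":
    rng = random.Random(7)  # seeded so that runs are repeatable
    fred, george = say_what(rng), say_what(rng)
    print(fred)
    print(george)
```

The randomness supplies the small differences between the two performances, and the bounds supply the structure: no sample can close the mouth when it needs to be open.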

Principle 8: Behavior is hierarchical

Handshake

The closer we look at an agent’s body while it’s displaying behavior, the more we see that several things usually occur at once. This and the following principle help to establish some order here.

Let’s start with the observation that from a kinematic point of view, behavior is almost always hierarchically organized. A handshake illustrates this nicely. Despite its name, this little ritual involves coordination of many body parts that are in hierarchical relationships, where subordinate parts are affected by superior ones.

In the GIF on the left it all starts with the torso, which positions the arms (which are subordinate to the torso) and leans forward during the shake. Meanwhile, the head (which also depends on the torso) orients towards the other party and the eyes (which depend on the head) need to look downwards initially to coordinate the initial grip. They then look up and connect with those of the other.
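The dependency between body parts described here resembles a kinematic parent chain. As a rough sketch, assuming each part has a single pitch angle in degrees (a strong simplification), the world-space angle of a part is its local angle plus those of all its ancestors. The `PARENT` table and `world_angle` function are hypothetical names used only for this example.

```python
# Each body part points to the part it depends on (None marks the root).
PARENT = {"torso": None, "arm": "torso", "head": "torso", "eyes": "head"}


def world_angle(part, local):
    """Accumulate local angles up the parent chain to get a world-space angle."""
    angle = 0.0
    while part is not None:
        angle += local[part]
        part = PARENT[part]
    return angle


if __name__ == "__main__":
    # Torso leans forward 10 degrees; eyes look down relative to the head.
    local = {"torso": 10.0, "arm": 30.0, "head": -5.0, "eyes": -20.0}
    for part in PARENT:
        print(part, world_angle(part, local))
```

Changing only the torso entry shifts the arm, head, and eyes together, which is the sense in which subordinate parts are affected by superior ones.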

Once we start looking for hierarchies in behavior, we find them everywhere, and to make things worse, they evolve rapidly over time (recall principle 5). The consequences for AB are significant, but (fortunately) identical to those of the next principle.

Principle 9: Behavior is parallel

Peggy from Mad Men

What is Peggy Olson from ‘Mad Men’ doing? She is walking. She is smoking. The fact that there are (at least) two perfectly good answers bothers no one because it’s normal to do several things in parallel.

These things don’t even have to be in a hierarchical relationship (litmus test: you can smoke without walking, and vice versa). Still, the consequences of principles 8 and 9 for AB are identical. They entail that the behavior stream must be composed from multiple sub-behaviors that can be hierarchically organized. As an added complication, these sub-behaviors can control distinct or overlapping motor domains of the body (eyes, mouth, limbs, …). For an example of distinct domains, look no further than Peggy, whose smoking behavior doesn’t interfere at all with her walk. For overlapping domains, imagine that Peggy were walking as well as shaking from fear – two behaviors that will affect the same body parts, but in distinctive and potentially complex ways.
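A small sketch can show the difference between distinct and overlapping motor domains. It assumes that each sub-behavior reports an offset per domain and that overlapping contributions are simply added; additive blending is an assumption made here for brevity, and a real engine would likely need something more careful. All function and domain names are hypothetical.

```python
from collections import defaultdict


def walk(t):
    """Leg swing for walking; touches only the 'legs' domain."""
    return {"legs": 10.0 if t % 2 == 0 else -10.0}


def smoke(t):
    """Holding up a cigarette; touches only the 'arm' domain."""
    return {"arm": 45.0}


def tremble(t):
    """Fear tremor; overlaps with both walking and smoking."""
    jitter = 1.0 if t % 2 == 0 else -1.0
    return {"legs": jitter, "arm": jitter}


def compose(sub_behaviors, t):
    """Sum the contributions of every active sub-behavior per motor domain."""
    pose = defaultdict(float)
    for behavior in sub_behaviors:
        for domain, offset in behavior(t).items():
            pose[domain] += offset
    return dict(pose)
```

With `compose([walk, smoke], t)` the two behaviors never touch the same domain, so neither interferes with the other. Adding `tremble` to the list makes both domains receive two contributions, which is where the blending rule starts to matter.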

Principles 10-12: Behavior is cognitively caused, monitored, and readable

High-five

The final three principles can be discussed together for the purposes of this overview, as they are about the relationship between behavior and cognition.

The things that emit behavior (i.e., agents) are also the things that have central nervous systems which control their behavior. And the things that see this behavior (i.e., other agents) also automatically interpret this behavior. We have been hardwired by evolution to ‘read’ (unobservable) cognitive causes into observable behavior and thereby give it intentionality and meaning.

Thus, in the GIF above, you don’t just see an anchorwoman who is lifting and then lowering her arm – you see a lady who is trying to high-five her colleague, failing to solicit her attention, and ultimately ashamed about her failure. Tons of psychological studies have shown that such attributions are automatic and irrepressible. For AB, this implies that it’s impossible to separate the behavior emitted by artificial agents from the meaning it elicits. Behaviors always express something, whether you want them to or not. AB engine development is therefore not just an engineering challenge, but also (and foremost) a psychological one. It’s about convincing the player of the (artificial) meaningfulness of generated behavior, which is a topic we plan to talk about in several future posts.

(source: Game Developer)
