Too Many Clicks! Unit-Based Interfaces Considered Harmful
by Phil Goetz
Computer games traditionally have a player control one or more units on the screen. In early games, each player controlled one unit. As CPU power grew, players controlled more and more units. Today, a player might have hundreds of units, each one of which they must control individually. The unit-based user interface (UI) is no longer sufficient. This article will suggest a different way of thinking about UIs, and will discuss how to compare one UI to another, or one UI to the theoretical maximally efficient UI, to tell if your game can be improved. I’ll use examples primarily from strategy games, but it applies to UIs for programs of all kinds.
Approaching a Thousand Clicks a Turn
My favorite game of all time is Civilization, in all its incarnations. When I introduce friends to the game, they’re enthusiastic for one or two thousand years – which, in Civ, means about a hundred turns. By then the cities, the units, and the waiting have multiplied so much that it becomes, for the novice, a chore more than a game.
Even for a Civ addict like me, the game isn’t much fun after about 1800. Too many clicks. I counted the clicks, mouse movements, and keystrokes that it took me to get through one move of Civilization III in the year 1848. Many hours later, when that turn was done, I’d counted 422 mouse clicks, 352 mouse movements, 290 key presses, 23 wheel scrolls, and 18 screen pans to scroll the screen. This was making full use of all the Civ shortcuts, automation, and group movements. I probably would have made twice as many movements if I hadn’t been counting.
You may wonder why I’m talking about Civ III, when Civ IV has been out for months. I never bought Civ IV. I’d been waiting and hoping for a more playable Civ. What finally arrived was a Civ that takes just as many clicks, but with a new animated 3D UI.
Don’t get me wrong – Civ IV has important new gameplay aspects. Firaxis did far better than companies who create some new units, artwork, and cut scenes, and call it a new version. But I didn’t stop playing Civ III because I was tired of the game, or because it wasn’t pretty enough. I stopped because it takes too long to play a game. Civ didn’t need a prettier interface – it needed a more efficient one.
Overclick isn’t limited to Civilization. Real-time strategy games will leave you with even worse carpal tunnel. That’s why I don’t play Warcraft or its descendants online. In terms of clicking skills, I’m over the hill. Strategy is irrelevant in today’s real-time strategy games when you’re playing against a fourteen-year-old who can click twice as fast as you.
The RTS user interface hasn’t improved since Total Annihilation (1997), which had more useful unit automation than many current games. Meanwhile, the number of objects our computers can control and animate has increased, and continues to increase, exponentially. The old UI model isn’t at the breaking point – it’s broken.
This article is about how to design a UI that lets players communicate their intent with fewer clicks. I’m not going to address UI ergonomics (physical ease of use) or cognitive ergonomics (issues such as eyestrain and human memory and processing requirements). The energetic reader should incorporate those as well into their UI evaluation, but it’s too complex for this short article.
The Rule of Seven
Strategy games typically place the user in the role of something like a battalion commander. But think about the job of a real battalion commander. He doesn’t sit watching a map and barking orders to each tank on the battlefield. In combat, he has a small number of people he gives direct orders to. A squad leader in the US Army leads about nine soldiers, with the help of two fire team leaders. A platoon leader typically commands four squad leaders plus three or four other staff. A company commander typically gives orders to four platoon leaders plus staff; a lieutenant colonel might command a battalion of four companies plus staff. And so on, up the chain of command.
This is no accident. It is a fundamental principle of military organization doctrine that a commander can effectively manage only a limited number of subordinates. The number of directly-reporting subordinates that a commander has is known as his span of control. The most efficient span of control is believed to be the same for a platoon leader as for a theatre commander. 19th-century European armies settled on seven as the maximum span of control, and that number hasn’t changed much since.
The rule of thumb in the US military today is that span of control should be from 5 to 7. A supervisor in FEMA is supposed to oversee no more than seven subordinates during a disaster-relief effort. According to US House Report 104-631, the average span of control across the entire US government bureaucracy in 1997 was seven. Not coincidentally, seven is also the number of items that the average person can keep in memory at the same time (Miller 1956). This is why phone numbers within an area code are seven digits.
This rule should apply to strategy game design as well. A player who is controlling more than seven entities can’t effectively supervise any one of them. (A corollary is that, if a turn takes a minute, and a player makes a move more than about once every 10 seconds, that player probably isn’t focused and isn’t getting an opportunity for the kind of deep, strategic thought that is supposed to be the source of enjoyment in a strategy game. Games that require a click per second are arcade games, regardless of their complexity.)
The seven subordinates that a field commander controls often include one or two who don’t actually act in combat, but who merely relay information to the commander. This means that different information displays (such as the city screen and the technology screen in Civ) count towards the seven.
There are complications to this rule. Take chess – each player controls 16 pieces, and must be aware of 16 enemy pieces. That’s why chess is so frustrating to a beginner. But chess experts can do it. Are they breaking the rule of seven?
Well, yes and no. Gaining expertise in chess doesn’t consist of learning to keep track of more and more pieces in your head. It involves learning to break board positions down into separate, familiar structures – a pawn structure, a castled king behind three pawns and a knight, a set of pieces exerting influence on the center squares – in order to bring the number of concepts down to a manageable number (say, oh, seven).
The span of control in the US military is intended to be such that a commander can command and supervise one level down (his subordinates), and keep track of everything happening two levels down (his subordinates’ subordinates). The chess player is responsible for two levels of control: first, choosing the subgoal for each of these structures (e.g., “control the center” or “break up his defensive pawn structure”); second, choosing a move to implement each subgoal, and choosing just one of those moves. Thus, it isn’t keeping track of all the chess pieces, but having to choose every move, that violates the rule of seven. If you want your game to play faster than chess, and still involve strategy, you must observe the rule of seven.
You Can’t Have Your Cake and Eat It Too
A game designer might think she can have the best of both worlds by making a game in which the player can control every unit, but doesn’t have to. This, unfortunately, is not so. There’s a rule in economics called Gresham’s Law of Money: Bad money drives out good money. Gresham explained why, when a country tries to use both metal coins that have real inherent value, and paper bills that don’t, the paper money drives the coins out of the marketplace, until everyone is using only paper money.
In gaming, bad players drive out good players. In roleplaying games, the bad roleplayers, who emphasize accumulating wealth and power over playing a role well, advance faster and eventually drive out the good roleplayers. In a game which allows control of individual units, adrenaline-filled 14-year-olds who can make three clicks a second will beat more thoughtful players who rely on the computer to implement their plans, because we’re still a long way from the day when a computer can control units better than a player.
There is a player demographic that enjoys click-fests and micromanagement, and it may be the same 14-year-old males that the game industry’s magazines, advertisements, and distribution channels are aimed at. Trying to step outside that familiar demographic is always hazardous. (I believe games won’t be mainstream until they’re sold at Borders; however, that’s a separate rant.) No producer would want to lose this market share, so it might be good to have individual unit control available as an option.
However, the market of players who do not enjoy carpal tunnel, which I suspect is much larger than the market of 14-year-old males, is not just underserved by today’s games; it is completely unserved. If the game is to also allow control of individual units, it must be a separate game option, and players should be able to set up multi-player games that disallow individual unit control.
By now, you’re probably questioning my sanity and the wisdom of the Gamasutra editors. Am I saying that strategy games should only allow a player to build seven units? Not at all. I am saying that the player shouldn’t control them all directly. We need to conceptualize an intervening level of control. It isn’t hard to do, but is hindered by a common misconception about object-oriented programming.
Objects Don’t Have To Be Objects
Smalltalk users called objects “objects”, and, what’s worse, they called methods “verbs”. Ever since, many object-oriented programmers have interpreted the word “object” as something like “noun”. I had arguments with other adventure programmers in the 1980s who insisted that a game wasn’t object-oriented unless the physical objects in the game were OO objects in the code. When I suggested organizing the code so that verbs in the game were objects in the code, thus enforcing a consistent physics on the game, they said, “Objects are objects; verbs are verbs.” To this day, we organize our game code, and the user interface, around the physical objects in the game.
There’s no need to do so. Objects, in the OO sense, can be any abstraction you choose. In the case of Civ III, an object could be a military action or an engineering project. Consider Figure 1. In this figure, my civilization had recently developed the technology for railroads. I was attempting to construct a railway line connecting the north end to the south end of my civilization.
With the unit-centered interface of Civ III, this requires clicking on individual worker units and assigning them to individual sections of the railway line. Each worker must be assigned to a short section of the line, because if you start one unit from up north toward the south, and another unit from the south toward the north, they compute the path they will follow at the beginning of their assignment, and don’t adjust it to account for the work done in the meantime by other units – leading to a multitude of non-overlapping parallel railway lines, and armies that can’t get to your borders before you’re overrun by Roman legions.
I needed to assign about a hundred workers to building the railway line in order to get it built before being overrun. For each worker, I had to click on it once to bring it into focus; then type ‘g’ to begin a movement, scroll to its starting point on the railway line, and click again. Later, when it reached that point, I would have to type “ctrl-r” to build a railroad, scroll to the end of that unit’s portion of the railway, and click again. That’s three mouse movements, three keystrokes, and three mouse clicks per unit. I tried to keep the workers in groups of three, although this was possible only about half the time. So it probably took me 600 clicks, keystrokes, and scrolls to build that railway.
Imagine if I’d been able to say that I wanted to build a railroad, click on its start, and click on its end. The computer would then have directed workers, as they became available, to work on sections of the railway. The entire railroad could have been constructed with the same amount of supervision that it took me to direct one worker.
The railway was needed to move troops to my borders to defend against the Romans. Again, each new unit built had to be individually routed to some point along the border to defend. Imagine how much less pain my wrists would be in if I could simply define the border, cities, or points to defend, to which the computer would direct surplus troops as they were built. But to implement this cleanly, the programmers would have to have conceived of railroads and borders as first-class objects.
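As a rough illustration of what a railroad as a first-class object might look like, here is a minimal Python sketch. Everything in it is hypothetical: the class name `RailroadProject`, the tile coordinates, and the assumption that a worker finishes one tile per turn are invented for the example and have nothing to do with how Civ III is actually coded. The player supplies only the path (two clicks plus a pathfinder, not shown); the project hands out tiles to whichever workers are idle.

```python
from dataclasses import dataclass, field


@dataclass
class RailroadProject:
    """A player-level order: 'build a railroad along this path'."""
    path: list                                   # ordered (x, y) tiles, start to end
    built: set = field(default_factory=set)      # tiles already finished
    claimed: dict = field(default_factory=dict)  # worker id -> tile being worked

    def assign(self, worker_id):
        """Give an idle worker the first tile nobody has built or claimed."""
        busy = set(self.claimed.values())
        for tile in self.path:
            if tile not in self.built and tile not in busy:
                self.claimed[worker_id] = tile
                return tile
        return None  # nothing left to hand out

    def complete(self, worker_id):
        """Mark the worker's claimed tile as built and free the worker."""
        self.built.add(self.claimed.pop(worker_id))

    @property
    def done(self):
        return set(self.path) <= self.built


# The player specifies the line once; the loop below is the game's job.
project = RailroadProject(path=[(10, y) for y in range(6)])
idle_workers = ["w1", "w2", "w3"]
turns = 0
while not project.done:
    for worker in idle_workers:
        project.assign(worker)
    for worker in list(project.claimed):
        project.complete(worker)   # assumes one tile per worker per turn
    turns += 1
print(turns)  # 2
```

Because every worker asks the project for its next tile, no two workers can lay parallel track, and the player's supervision cost stays the same whether three workers or a hundred are available.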
Off-Line Vs. On-Line Control
Part of the reason that a commander can get by with commanding only seven subordinates is prior preparation. He has drawn up scenarios in advance of any action, and can cause a quick and dramatic change in his battalion’s actions by ordering a switch from one scenario to another. His service branch has a standard library of tactics, from the squad level on up, which he can use during an action to explain his intent to his subordinates. His subordinates have rules of engagement to help them decide how to respond to a wide variety of enemy and non-combatant actions without his intervention. He can add to these rules prior to entering combat. He has many field exercises, and after each one, he tells his subordinates what they did right and wrong, and his superior tells him what he did right and wrong. This reduces the amount of direct supervision needed in combat.
If you design your game AI using a uniform formalism, such as a rule-based system or finite-state automata, you can open it up to your players as another way of directing their units. Creating rules of engagement for semi-autonomous units is considered necessary in real military wargame simulators such as JSAF or OneSAF. SAF, in fact, stands for “semi-automated forces,” which are units that can be given fairly sophisticated missions and rules of engagement, so that an operator can supervise many of them and intervene as little as possible, while still providing a realistic training environment for the real humans controlling each of the opposing units.
Some games, like Quake, allow players access to the AI to program enemies; others, descended from Robotwar, give players units that must be completely programmed and that cannot be directed during gameplay. None provide an interface for semi-automated forces. Providing two user interfaces – one to be used off-line to provide rules of engagement, and another to be used on-line – could reduce the stress of handling individual units.
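A minimal sketch of the off-line half might look like the following, assuming a simple priority-ordered rule list. The `Situation` fields, the rule names, and the action strings are all invented for illustration; a real game would expose whatever state its own AI already reasons about. The player edits the rule list before play, and during play each unit consults it without asking the player.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Situation:
    """Hypothetical snapshot of what one unit can see."""
    enemy_strength: int   # strength of the nearest visible enemy, 0 if none
    own_strength: int
    health: float         # 0.0 (dead) to 1.0 (undamaged)


@dataclass
class Rule:
    name: str
    applies: Callable[[Situation], bool]
    action: str


def decide(rules, situation, default="hold"):
    """Return the action of the first matching rule, in priority order."""
    for rule in rules:
        if rule.applies(situation):
            return rule.action
    return default


# Authored off-line by the player, before the battle.
rules_of_engagement = [
    Rule("retreat when badly hurt", lambda s: s.health < 0.3, "retreat"),
    Rule("avoid stronger enemies", lambda s: s.enemy_strength > s.own_strength, "fall back"),
    Rule("attack weaker enemies", lambda s: 0 < s.enemy_strength <= s.own_strength, "attack"),
]

# Evaluated on-line by each unit, with no clicks from the player.
print(decide(rules_of_engagement, Situation(enemy_strength=5, own_strength=10, health=0.9)))
# attack
```

The same rule list could drive the computer's own units, which is the point of using one uniform formalism for both.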
User Interface Profiling
To detect areas where your user interface is inefficient, you can play-test your game with a user-interface profiler. A UI profiler is like a code profiler, but instead of reporting CPU cycles, it reports user interface events. It should show you exactly how many clicks, keypresses, and mouse moves the player made within each part of your code. User interface profilers can present more sophisticated information, such as graphs showing the sequences of actions users took (Ivory & Hearst 2001).
You may be able to use your IDE’s profiler to count I/O events or function calls, but this won’t usually tell you what the player spends most of her time doing. You can get more information if you roll your own UI profiler by having your code call a routine to report UI events.

If the developers of Civ III had used a UI profiler, they would have found an inordinate number of clicks and keystrokes being used in the negotiation portion of the game. This would have revealed a bug in the game – a missing scrollbar in the city-selection menu – as well as the need for the game to recommend a minimally acceptable number of gold pieces to offer in trade, rather than making the user conduct a binary search to find it. These two tasks accounted for over a hundred of the UI events in my count.
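A home-grown profiler of this kind can be very small. The sketch below is one possible shape, not a description of any existing tool: the class name `UIProfiler`, the `report` routine, and the context labels such as "negotiation" are all assumptions made for the example. The game's input handlers would call `report` with the part of the game the player is in and the kind of event.

```python
from collections import Counter, defaultdict


class UIProfiler:
    """Counts UI events per part of the game (hypothetical sketch)."""

    def __init__(self):
        # context name -> Counter of event types
        self.counts = defaultdict(Counter)

    def report(self, context, event_type):
        """Called by input-handling code each time the player does something."""
        self.counts[context][event_type] += 1

    def totals(self):
        """Event counts summed over every context."""
        total = Counter()
        for per_context in self.counts.values():
            total.update(per_context)
        return total

    def hotspots(self):
        """Contexts sorted by total number of UI events, busiest first."""
        sums = [(ctx, sum(c.values())) for ctx, c in self.counts.items()]
        return sorted(sums, key=lambda item: item[1], reverse=True)


profiler = UIProfiler()
for _ in range(12):
    profiler.report("negotiation", "click")   # e.g. hunting for an acceptable offer
profiler.report("negotiation", "key_press")
profiler.report("city_screen", "click")
profiler.report("map", "screen_pan")

print(profiler.hotspots()[0])  # ('negotiation', 13)
```

Sorting by context is what turns a raw event count into a to-do list: the busiest context is where a UI change is likely to save the most effort.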
To compare different potential UIs, you need a way of keeping score. Going back to my Civ III UI counts, I can come up with a total UI score by assigning a point value to each type of UI action. How you assign points depends on what you want to measure. For an arcade game, speed might be the primary criterion, and so you might count a mouse click as being nearly as fast as a keystroke – faster, if the user is already using the mouse anyway.
For a game such as Civ, which takes hours for a single game, you should weigh wear and tear on the player more heavily. Clicking a mouse button takes more than an order of magnitude more force than pressing a key, and uses the same finger each time, so that it causes a great deal more stress to ligaments in the wrist. (Pain that people blame on typing is usually caused by mouse use.) So I’ll count each mouse movement and keystroke as 1 UI point; each mouse click as 3 UI points; each wheel scroll as 6 UI points; and each mouse pan (scrolling the map) as 9 UI points. This gives a total of 2208 UI points for the turn. Different UIs can then be compared by score.
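The score is just a weighted sum, which is easy to check. The snippet below uses the point values above and the event counts from my 1848 turn; the dictionary keys and the function name `ui_score` are made up for the example.

```python
# Point values from the text: heavier weights for actions that strain the wrist more.
UI_POINTS = {
    "mouse_move": 1,
    "key_press": 1,
    "mouse_click": 3,
    "wheel_scroll": 6,
    "screen_pan": 9,
}


def ui_score(counts, weights=UI_POINTS):
    """Weighted sum of UI events; lower is better."""
    return sum(weights[event] * n for event, n in counts.items())


# Counts for one turn of Civilization III in the year 1848.
civ3_turn_1848 = {
    "mouse_click": 422,
    "mouse_move": 352,
    "key_press": 290,
    "wheel_scroll": 23,
    "screen_pan": 18,
}

print(ui_score(civ3_turn_1848))  # 2208
```

Swapping in a different weight table (say, one tuned for speed in an arcade game) changes the ranking of candidate UIs without changing the measurement itself.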
A UI profiler can be used to evaluate and refine a UI that’s already been built. With a little math, though, you can evaluate a UI before writing any code. I’ll touch briefly on that next.
You can compute the work W that a player must do to specify a move using your user interface. If you measure W in terms of game variables, such as the size of the gameplay area and the number of player units, you can then compare different possible user interfaces, even if you can only estimate W.
For an example, consider a UI for sculpting virtual clay. You have a voxel array of size $n \times n \times n$, and each voxel can be on or off. We’ll consider four possible user interfaces.

In the first interface, the user types in the coordinates of each voxel that she wants turned on. The number of keystrokes it takes to specify a number from 1 to $n$ is proportional to $\log_2 n$, so this takes work proportional to $3\log_2 n$ to specify each on-voxel. (From now on, I’ll drop the phrase “proportional to” and write $W = f(x)$, with the understanding that it means $W = O(f(x))$.) If we suppose the artist will efficiently specify only a surface, and not fill in the entire inside of the sculpture, then the total work will be $W = 3n^2\log_2 n$. (The size of the surface is proportional to $n^2$.)
In the second user interface, the user chooses a point in three-space with a three-dimensional mouse (such as 3DConnexion’s Spaceball), and clicks the mouse to toggle its on/off state. The work needed to go to a particular point in 3-space is then the combination of 3 movements; each movement is on an axis ranging from 1 to $n$, with (let’s suppose) an average value of $n/2$. This seems as if it would then take work $W = n^5/8$ to create a sculpture: $(n/2)^3$ for each of the $n^2$ surface voxels. However, let’s suppose that after turning on one voxel, the user moves to one of the 26 neighboring voxels. We will say that this takes work proportional to the information needed to specify one choice out of 26, which is $\log_2 26$. We’ll approximate it as $\log_2 27 = \log_2 3^3 = 3\log_2 3$, because the constant 3 in both this and in our previous value for $W$ comes from the three-dimensional nature of the sculpture. The total work is then $W = 3n^2\log_2 3$. This interface looks like an improvement over the first.
In our third interface, we’ll start the user off with a sphere of clay of about the same volume as the desired sculpture. The user will use an ordinary 2D mouse to move a cursor around on the surface of the clay, clicking the left button to push the voxel under the cursor down (perpendicular to the surface) and the right button to pull it up. We’ll use an accelerating push/pull interface in which the number of voxels pushed or pulled doubles when the clay is moved in the same direction as the last click, and halves when it is moved in the opposite direction, so that the proper position can be found with a binary search taking time proportional to $\log_2 n$. Suppose again the user only needs to work each point on the surface once. The work needed to move to the next voxel is $2\log_2 3$, because this movement is in 2 dimensions. The total work is then $W = n^2 \times 2\log_2 3 \times \log_2 n = 2n^2\log_2 n\,\log_2 3$. This is worse than the previous UI, even though we’re restricting movement to be on a surface, because of the number of clicks that it takes to push and pull the virtual clay.
In our fourth interface, the user will move around the surface with a 2D mouse, and push and pull points in and out as before. However, the surface of the clay will have tension, so that pushing a point in or out will drag all the neighboring voxels along. The result is that a surface can be sculpted in a way similar to the way you can define a curve using control points and a spline. Then $W = 2c^2\log_2 n\,\log_2 3$, where $c$, the number of control points, is now a function of the irregularity of the sculpture, not of the number of voxels. For very high-resolution modeling, $n \gg c$, and $W = O(\log_2 n)$. This is a vastly superior user interface.
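To see how far apart these four estimates are, it helps to plug in numbers. The sketch below evaluates each expression for a sample resolution. The function names are hypothetical, the constants are kept exactly as derived above even though only the growth rates matter, and the fourth formula is read as 2c² log2(n) log2(3); the sample values n = 256 and c = 16 are arbitrary.

```python
import math


def w_typed_coordinates(n):
    """Interface 1: type three coordinates for each surface voxel."""
    return 3 * n**2 * math.log2(n)


def w_3d_mouse(n):
    """Interface 2: step to one of ~27 neighbours for each surface voxel."""
    return 3 * n**2 * math.log2(3)


def w_push_pull(n):
    """Interface 3: 2D move plus binary-search push/pull per surface voxel."""
    return 2 * n**2 * math.log2(n) * math.log2(3)


def w_control_points(n, c):
    """Interface 4: same push/pull, but only at the control points."""
    return 2 * c**2 * math.log2(n) * math.log2(3)


n, c = 256, 16   # arbitrary sample resolution and control-point parameter
work = {
    "typed coordinates": w_typed_coordinates(n),
    "3D mouse": w_3d_mouse(n),
    "push/pull": w_push_pull(n),
    "control points": w_control_points(n, c),
}
for name, w in sorted(work.items(), key=lambda item: item[1]):
    print(f"{name:>18}: {w:12.0f}")
```

At this resolution the control-point interface comes out well over an order of magnitude cheaper than the 3D mouse, and the gap widens as n grows with c held fixed.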
To incorporate cognitive ergonomics into W, you would also count the amount of memory the player needs to remember the meaning of each keyboard shortcut, clickable icon, etc., and incorporate a measure of the work done to convert displayed information into relevant usable information (this is the tricky part). You could also add a term for memory retrieval time. Retrieval time estimates for different types of memories are given in (Anderson 1974); for remembering one of a list of options (say, possible commands for a unit), the time is $K + an$, where $n$ is the number of options and $K$ and $a$ are constants. Cognitive terms may be summed separately if you don’t know how to scale them so as to be comparable with the non-cognitive terms.
Theoretical Interface Efficiency
It is possible, although usually difficult, to compute how efficient a user interface is relative to its theoretical optimum. Information theory shows how to compute how much information is present in a series of numbers or other symbols, given the current situation, the history, and knowledge of what symbols are likely. If you can estimate the probabilities of all of the possible player moves, you can estimate the information I that is present, according to information theory, in a move. The best your UI can possibly do is for W to be of the same order of computational complexity as I, meaning that O(W) = O(I).
I/W is a measure of the efficiency of your interface; it can be at most O(1). For every variable in the expression for I, it will have the same or a larger exponent in W, so it is easier to think in terms of W/I, which is a measure of how badly you abuse your players. Estimating I is much harder than estimating W; you can often obtain only an upper bound on I. In order for W/I to be meaningful, then, it should be a lower bound.
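For readers who have not met the information-theoretic quantities before, here is a small sketch of the two basic calculations. The move probabilities are invented, and the function names are only labels for the example; note also that W is counted in UI events and I in bits, so the ratio is only meaningful as a growth rate, in the O() sense used above.

```python
import math


def self_information(p):
    """Bits of information carried by one move that had probability p."""
    return -math.log2(p)


def entropy(probabilities):
    """Expected bits per move, given the probability of each possible move."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def abuse_ratio(work, information):
    """W/I: how much more the UI asks for than the move actually contains."""
    return work / information


# Hypothetical unit with four possible orders, one of them very likely.
move_probabilities = [0.5, 0.25, 0.125, 0.125]

print(self_information(0.5))        # 1.0
print(entropy(move_probabilities))  # 1.75
```

A predictable move carries little information, so a UI that charges the same number of clicks for the obvious order as for the surprising one is wasting the player's effort on the obvious one.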
What you find, in a game with many units, is that for unit-based user interfaces, your lower bound on W grows much faster with the number of units than your upper bound on I does. This is because, most of the time, a lot of the units do pretty much the same thing, and knowing what a small fraction of a player’s units are doing would enable a skilled player to predict with good accuracy what the rest of their units are doing. This is exactly analogous to our user interface for sculpting clay: Most of the points on the surface of the sculpture have a surface tangent almost the same as do the points near them. An interface that requires you to move every unit individually is the equivalent of a sculpting interface that makes you move every voxel. The use of information theory allows you to calculate I, and learn the true dimensionality required of your UI, even when you don’t have a simple way to visualize the connection between the units in play.
To explain how to compute W and I in general would take another article. You can figure it out from a book on probability and information theory. A good primer is the 1949 book version of the 1948 paper that defined information theory, The Mathematical Theory of Communication, reprinted in 1998 by the University of Illinois Press. In difficult cases, I can be estimated by a combination of theory and statistics gathered during playtesting.
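As one example of the playtesting route, the simplest possible estimate treats logged moves as independent draws and measures the entropy of their observed frequencies. This is only a sketch under that assumption, with an invented log format (a flat list of order names). Because it ignores context and history, it will generally overstate the information in a move once those are taken into account, provided the log is long enough for the frequencies to be reliable.

```python
import math
from collections import Counter


def empirical_entropy(moves):
    """Bits per move, estimated from observed move frequencies in a log."""
    counts = Counter(moves)
    total = len(moves)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical playtest log: the orders one player gave to one unit type.
logged_orders = ["fortify"] * 6 + ["move", "attack"]

bits = empirical_entropy(logged_orders)
print(f"about {bits:.2f} bits per order")
```

If the UI needs several clicks and keystrokes for each of those orders, comparing that count with the estimate gives a first, rough reading of W/I for that part of the game.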
Computers can now animate more units than any player could reasonably want to control, and the number will continue to increase exponentially. This leads to player frustration rather than fun. In a good user interface design, no player should control more than seven game entities. To enable this, the UI may let the player control something more abstract than an on-screen unit. This requires object-oriented developers to think of code objects as abstractions beyond the mere units on the screen. The UI may also give the player a chance to specify behaviors off-line in order to reduce the amount of on-line supervision needed.
Game developers can evaluate their user interfaces using a user-interface profiling tool, and by computing the work involved in different interfaces. They can even estimate their theoretical efficiencies, to know for sure whether there’s room for improvement. The ultimate goal of game design is to increase the game’s FPS – fun per second. The easiest way to do that is to pack the same action into fewer seconds, and the easiest way to do that is usually to improve the user interface.