
A Games User Researcher on Why He Dislikes Fun Scores

Published: 2019-12-06 09:10:59


Original author: John Hopson. Translator: Willow Wu

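User researchers in the games industry are often asked to score games that are still in development, producing so-called “fun scores”: an overall grade for the game that also gives the team a sense of the review scores to come. Some publishers keep the scores of every game they have ever tested, so they can make statements like “this game is more fun than 95% of other games.” Many people treat the fun score as a key reference metric, plotting the scores from successive stages as a hopefully rising curve that shows the game’s overall quality and the results of polish.

As a professional games user researcher with 15 years of experience, I have asked this kind of question countless times, and my games have generally scored well. Even so, I have come to wonder whether fun scores harm the process of shipping a good game. Don’t get me wrong: I also see fun as the most important quality of a game. It is why I make games and why people are willing to play them. But I do not think fun scores are especially valuable in most circumstances.

I am not making this argument because fun is hard to measure or because we should not measure it. Asking players how fun they find the game gives you a rough first picture of the user experience, just as asking someone whether they are too cold or too hot lets you infer the temperature. Individual players’ scores can help you understand which parts of the design work and which do not. The problem in our industry is that we add up all those individual scores, divide, and let a single average stand for the game.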

C.A.T.S.: Crash Arena Turbo Stars (from pocketgamer.biz)


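Here is why I dislike these average scores:

1. It stops the conversation

Give the game an overall score and the conversation ends right there. If the score is high, the team feels very proud and ignores the other issues; with a score that high, how bad could they be? If the score is low, the team immediately starts looking for reasons to throw out the study: maybe the target audience was wrong, maybe the players did not play long enough to reach the fun part, and so on. A score that low cannot be real, so the test must be flawed.

When you give someone’s game an overall score, you are judging everything they did in the game at once; you are attacking (or flattering) their self-image as developers. With more granular scores, creators are less likely to feel that way.

The real purpose of testing a game in development is to open a conversation, and the overall score is what shuts it down. We do not just want to see how others rate the game; we want a productive conversation with many participants that shows us how to make the game better.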

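2. It is not actionable

When players say the game is not fun, that feedback alone does not give you a clear, effective response. Of course we want our games to be fun. But merely knowing that a game is not fun does not determine what to do next. Is it not fun because it is too hard or too easy? Are players bored or confused? My keyboard has no “add more fun” key.

When you learn that a game is not fun, the right response is to dig into the other data and look for clues to the problem.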

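3. It is especially prone to bias

Face to face, people will generally say your game is fun unless it is truly worthless. Because the public sees fun as the most important measure of a game, anyone who wants to please the developers will at the very least say “your game is really fun.”

This is especially true for in-house playtests like mine, where participants know they are talking to the people who made the game. They walk into a building marked with the studio’s name and enter a lobby that is usually hung with large prints of the game’s art and lined with display cases full of awards. Expecting them to stay neutral in that setting is unrealistic.

A little bias certainly does not disqualify a playtester. People who are well disposed toward the game will still struggle with the controls and with finding their way around. But an overall fun rating is exactly the kind of vague judgment that bias affects most.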

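4. Multiplayer fun does not count

As long as you play it with other people, even a game as simple as tic-tac-toe can be full of fun. Even the most boring activity in the world becomes fun when another person is involved. That holds for any game that can be played together. If your game is only fun in multiplayer, it is not truly fun.

This is an especially thorny problem for games that build their PvP mode first. Because AI tends to be added late in development, most games that plan to include both PvE and PvP build PvP first, and the team can easily fool itself into thinking the game is fun when the fun really comes from playing with co-workers. To limit this effect, you can require that developers not talk to each other during internal playtests, but the lift from playing with people you know well remains.

I admit that I am an outlier on this topic. People in the games industry sincerely want their work to be fun, and they expect their researchers to ask players about it and deliver an average.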

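Here are some suggestions for handling fun scores better:

1. Ask, then set it aside

There really is pressure to ask for a fun rating. The players expect the question and so does the team; if we do not ask it, one of them may raise it in a less helpful way that could even undermine the study’s other goals.

So sometimes the smartest move is to have playtesters give an overall rating first and then not take the number to heart. Even if we do nothing in response to it, asking for an overall rating helps players give clearer, more specific feedback afterwards. They may even feel more comfortable raising problems, since they have just given the game a good overall rating.

Asking for the score does not mean we have to give the number much weight later. Ask, set the result aside, and focus on the details that can actually make the game better. Do not let the score become the centerpiece of the study.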

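2. Ask later

Games in development are generally not that fun to play, and many parts of a game, such as menus, are not meant to add fun in themselves. In many studies fun is not the issue, and we can simply push the rating to a later stage of development.

Saying “we are not testing the game’s fun right now; we will in a few months” has the advantage of being sincere (we do eventually need to ask the question), and it avoids the conflict of telling someone outright that the question is pointless. The team is not wrong to ask for it; the timing is just not ideal.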

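3. Use it to help the team face the real issues

User research can sometimes feel like a flood: one negative comment after another, continually pointing out what is wrong with the game. To counter this, open the presentation of results with something positive. Before digging into everything that needs fixing, make it clear that we are all partners in the same boat.

Of course, the fun score may be higher than the game deserves, but that is fine. Even though the number is not accurate, you can still put it to good use. Show the attractive score first to put everyone in a good mood, then go into the problems that need solving.

For a development team, starting the next round of work from “people think the game is fun, but some things still need improving” is much easier than taking the head-on blow of “everything is awful.” In short, an inflated score can still serve the game by helping everyone meet the hard problems in a better frame of mind.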

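4. Encourage the “but”

Participants often say “the game is pretty fun, but…”, for example “but it is a little confusing” or “but it is not for me.” The second half of the sentence can be very important for improving the game.

We have to guide them toward more detailed opinions. Ask the fun question first, then ask about the details. Having given a good score, participants may be more willing to talk about what they dislike in the game.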

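5. Narrow the focus

As with the previous point, asking about one specific part of the game makes it easier for players to voice their complaints without feeling that they are criticizing the developers. Players are willing to say “the game is pretty fun, but the missions are dull” or even “the missions are interesting, but the fight at the end felt boring.” We need to offer the chance to rate the whole and the chance to rate the details.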

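6. Analyze the players

To get value from the overall rating, do not analyze the average. Look instead at what differs between the players who said the game was very fun and those who found little to like. Did the players who had fun use a particular weapon? Had they played games in the same genre before? Did they explore the other side of the map? Each difference is a clue to which parts of the game work and which need improvement.

Mixing everyone’s ratings together into a single average misses the point. Different players have different experiences, and those differences are the key to making the game better.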

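7. Details matter

Although we have been talking about fun, this argument also applies to other kinds of overall rating. Whether the game as a whole is fun, well balanced overall, or easy to pick up overall does not matter: people keep playing or quit based on the experience they are having right now. We have to earn every second of the player’s time. A game can be good overall and players will still run into bugs they cannot get past along the way. “It gets better later” is no help at all to a player whose fun has just been interrupted.

User research can do a great deal when it focuses on helping designers realize their vision for the game; grading that vision directly goes against the spirit of the profession. We get the best results when we take the ultimate value of that vision as a given and organize our work around finding every small detail, carefully crafted by the developers, that players are not experiencing. Nobody is rewarded for giving a game the correct score, but improving the game brings tangible rewards to both the development team and the players.

It may sound like a paradox, but the best way to make a game fun as a whole is to ignore its overall fun during development. By setting the big goal aside and concentrating on perfecting the game’s great moments and mechanics, we create more than just fun.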

This article was compiled by Gamerboom (游戏邦). Please credit the source when reprinting, or inquire via WeChat: zhengjintiao

User researchers in the games industry are often asked to produce “fun scores” for games in development — overall grades for how well a game is doing and an indicator of review scores to come. Some publishers keep extensive databases of the fun ratings for every game they’ve tested, so they can say things like “this game is more fun than 95% of games.” It’s thought of as a key metric that can be plotted in a hopefully upward curve towards release, showing the improvement and quality of the game as a whole.

As a professional games user researcher for more than 15 years, I’ve asked this question many times and my games have generally scored pretty well. In spite of that, I’ve come to consider overall fun ratings to be detrimental to the process of shipping a good game. Don’t get me wrong, fun is absolutely the most important attribute of my games; it’s why I make games and it’s why people play my games. But I don’t believe measuring the fun of a game as a whole is especially valuable in most circumstances.

My objection is not that fun is difficult to measure or that we shouldn’t measure it. Simply asking players how much fun they’re having is a reasonable first approximation of their experience, the same way asking someone if they’re too cold or too hot is a good approximation of temperature. And scores for individual users can be useful in understanding what’s working or not working in a game. But where we as an industry get into trouble is averaging player scores together into a single number to represent the game as a whole.

Here’s why I hate average fun scores:

It stops the conversation

Once you give a game an overall score, the conversation ends. If it’s a high score, the team pats themselves on the back and tunes out all the other issues, because how bad could they be? If it’s a low score, the team immediately starts looking for reasons to dismiss the entire study — this wasn’t the target audience, they didn’t play long enough to get to the fun part, etc. The bad score can’t be true, so the entire test must be bad.

When you give an overall score to someone’s game, you’re judging everything they did on the game simultaneously; you’re attacking (or buffing) their self-image as a developer. With smaller, more granular scores, it’s easier for a creator to separate themselves from the project.

Overall fun scores block the conversation that is the real purpose of testing a game in development. We don’t just want to grade the game; we want to have an engaged, productive conversation about how it can be made better. Fun scores can inhibit that discussion.

It’s not actionable

When players say a game isn’t fun, there isn’t a clear next step based on that statement alone. Sure, we want the game to be fun, that’s what my games are for. But just knowing it isn’t fun doesn’t dictate a particular course of action. Does that mean it’s too hard or too easy? Are players confused or bored? There’s no “add more fun” button on my keyboard.

The correct response to knowing a game isn’t fun is to dig down into other data, looking for clues as to what the problem is. Since you’re going to be doing that anyway, the fun question becomes redundant.

It’s particularly subject to bias

People always tell you to your face that the game is fun unless it’s absolutely awful. Because fun is perceived to be the most important thing about a game, players who want to please their hosts will at a bare minimum say the game is fun.

This is especially true of the kind of in-house playtesting I do, where the participants know they’re talking to the people who made the game. They’ve just walked up to a building with the studio’s name on it, entered a lobby that’s usually covered in gigantic prints of art from the game and a case full of awards. It’s unrealistic to expect a neutral position after that kind of build up.

A bit of bias certainly isn’t disqualifying in a playtester. People who are inclined to like a game will still have trouble figuring out the controls or navigating the game world. But an overall fun rating is exactly the sort of vague, holistic question that will be most affected by bias.

Multiplayer fun doesn’t count

Even Tic-Tac-Toe is fun if it’s played with other people. The most boring activities in the world can be fun if there’s another human involved. The fun generated by other people is a given, something that would be equally true of any other game that group of friends could be playing. If your game is only fun when played with fun people, then it’s not actually fun.

This is a particular challenge for games where the PvP mode is built first. Because AI tends to be a later addition to games, most games that intend to include both PvE and PvP tend to build PvP first and can easily fool themselves into thinking the game is fun when they really just enjoy playing with their co-workers. You can somewhat offset this effect by making sure the developers can’t talk to each other during an internal playtest, but there’s still an inherent lift from playing with good people you know.

Now, I recognize that I’m an outlier on this topic. People in the games industry honestly want their games to be fun and expect their researchers to ask about it and provide an averaged metric.

Here are a few suggestions for ways to better handle overall fun scores:

Ask it then set it aside

There can be a lot of pressure to ask about fun. The players expect it, the team expects it, and if we don’t ask it one of them will bring it up in a way that might sabotage the other goals of the study.

Therefore, sometimes the smartest thing we can do is ask upfront for an overall evaluation of the game and then ignore it. Even if we do nothing with that data, asking it clears those general impressions out of the way and the participants’ subsequent feedback will be much cleaner. They may even feel more comfortable about bringing up problems now that they’ve given a good overall grade.

Just because we asked it doesn’t mean we have to emphasize the data from the question. Ask it early, then set it aside and dive into the details where the real work of making a better game happens. Don’t make it the centerpiece of the study.

Ask it later

Games simply aren’t fun to play at many points in development, and there are many parts of a game such as menus that aren’t intended to be fun on their own. There are a lot of studies where fun isn’t at issue, and we can simply put off asking about it until later in the development process.

Saying “we’re not testing for fun yet, that’s something we’ll do in a few months” has the advantage of being true (we do need to ask the question eventually) and less confrontational than straight up telling someone that asking about fun isn’t helpful. The person making the request of you is correct, just not right now.

Use it to help the team accept the real issues

User research on a game can sometimes feel relentlessly negative, continually pointing out what’s wrong with the game. To offset this, presentation of results should open with something positive and establish that we’re all buddies on the same team before digging down into everything that needs to be fixed.

Sure, the overall fun score will be higher than it probably deserves to be, but that’s fine; just because it’s not an accurate metric doesn’t mean it can’t be deployed to good effect. Put that nice, high fun score upfront to get everyone in a good mood, then go into all the details of what needs more work.

It’s much easier for a development team to operate from a mindset of “it’s fun but we still need to fix these things” than from “everything is awful.” An inflated fun score can still serve the game by putting everyone in the right mental space for absorbing the harsher details.

Enable the “but”

Participants often say “it’s fun but…” as in “it’s fun but a little confusing” or “it’s fun but it’s not for me.” The second half of those statements can be extremely useful in making a better game even if the first part isn’t.

We have to make it easy for people to qualify their overall statement. Ask them the big fun question first, then ask all the little detail questions. Having given a good overall score, participants may be much more willing to criticize specific aspects of the game.

Narrow the focus

Similarly to the “fun but…” concept, by asking about a narrow section of the game we make it easier for players to say critical things without feeling like they’re criticizing the game makers. Players are willing to say “the game was fun, but this mission wasn’t fun” or even “this mission was fun, but that one fight at the end wasn’t fun.” We just have to give them the chance to rate both the whole and the parts.

Score the player, not the game

The way we make fun scores valuable is not to analyze the overall average score, but to look at what differs between the players who said the game was very fun and those who said it was merely kinda fun. Did the ones who had fun use particular weapons? Did they have more prior experience in the genre? Did they travel around a different side of the map? Each of those differences is a clue into what parts of the game are working and what needs improvement.

Mushing all the ratings together into an overall score misses the point. Different players have different experiences, and those differences hold the key to making a better game.
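One way to picture this kind of comparison is a small script that splits playtesters into a high-fun and a low-fun group and ranks what differs between them. The sketch below is only an illustration under stated assumptions: the session records, the trait names (`used_shotgun`, `genre_veteran`, `explored_east`), the 1 to 5 rating scale, and the threshold of 4 are all hypothetical and do not come from any real study.

```python
# Hypothetical playtest records; every field and value is invented for illustration.
sessions = [
    {"player": "p1", "fun": 5, "used_shotgun": True,  "genre_veteran": True,  "explored_east": True},
    {"player": "p2", "fun": 5, "used_shotgun": True,  "genre_veteran": False, "explored_east": False},
    {"player": "p3", "fun": 4, "used_shotgun": True,  "genre_veteran": True,  "explored_east": False},
    {"player": "p4", "fun": 3, "used_shotgun": False, "genre_veteran": True,  "explored_east": False},
    {"player": "p5", "fun": 3, "used_shotgun": False, "genre_veteran": False, "explored_east": False},
    {"player": "p6", "fun": 2, "used_shotgun": True,  "genre_veteran": True,  "explored_east": False},
]
TRAITS = ["used_shotgun", "genre_veteran", "explored_east"]


def split_by_fun(records, threshold=4):
    """Split records into (high, low) groups by fun rating (threshold is an assumption)."""
    high = [r for r in records if r["fun"] >= threshold]
    low = [r for r in records if r["fun"] < threshold]
    return high, low


def trait_rates(group, traits):
    """Share of players in the group for whom each boolean trait is true."""
    if not group:
        return {t: 0.0 for t in traits}
    return {t: sum(1 for r in group if r[t]) / len(group) for t in traits}


def trait_gaps(records, traits, threshold=4):
    """Rank traits by how much more (or less) common they are among high-fun players."""
    high, low = split_by_fun(records, threshold)
    high_rates = trait_rates(high, traits)
    low_rates = trait_rates(low, traits)
    gaps = {t: high_rates[t] - low_rates[t] for t in traits}
    return sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True)


# Print each trait with its gap between the high-fun and low-fun groups.
for trait, gap in trait_gaps(sessions, TRAITS):
    print(f"{trait}: {gap:+.2f}")
```

A large gap is only a lead to follow up on in the session notes, not proof that the trait caused the fun; with a handful of playtesters the differences can easily be noise.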

Details matter

While we’ve been talking about fun, this argument applies to every type of global rating for a game. It doesn’t matter if the game is fun as a whole, well balanced on average, or easy to learn in general. People keep playing or quit based on their current experience right now. We have to earn every second of ongoing player time, moment by moment. A game can be good overall while still having giant potholes in the player journey. “It gets better later” is cold comfort to a player who isn’t having fun now.

Games user research is at its best when it focuses on helping designers achieve their vision for the game; directly grading that vision is counter to the spirit of the profession. Instead, we succeed best when we take the ultimate value of that vision as a given and frame our work in terms of discovering all the tiny places where players aren’t experiencing the designers’ intent. There’s no prize for correctly grading the game, but making the game better has tangible rewards for the development team and the player.

It can seem paradoxical, but the best way to make a game fun overall is to ignore the overall fun of the game while it’s in development. By setting aside the larger goal and perfecting the individual moments and mechanics, we produce something more than the fun of its parts.

(source: gamesindustry.biz)

