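The Playtesting Process: Deciding What to Measure

Published: 2011-08-03 17:14:44 Tags: deciding what to measure, qualitative data, quantitative data, playtesting process

Author: Paul Sztajer

The third step in the playtesting process is deciding what to measure (the first two are clarifying the purpose of the test and choosing where and with whom to test). Measurement comes in many forms, so today we look at what to measure.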


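Measurement is a crucial part of playtesting. It is one of the key differences between a good playtest and a bad one: you can set everything up perfectly, but if you do not collect the right data, you will miss something important.

The data you need depends largely on the purpose of the test. As a general rule, early-stage, on-location testing benefits more from qualitative data (partly because it is hard to draw conclusions from a small amount of quantitative data), while later-stage online testing benefits more from quantitative data (because working through qualitative data takes much more time).

This is not to say that you should rely only on qualitative or only on quantitative data at any stage: a question such as "What was your favourite/least favourite part?" is never wasted, and monitoring the number of deaths is never useless. In the end, you will want a mix of both.

So, here is a long list of things you might want to measure, starting with the qualitative: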

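* What players found fun and what they found boring

* What players found difficult and what they found easy

* What was frustrating

* Did the player learn everything they needed to?

* What did the player not understand?

* Did the difficulty curve rise too steeply?

* Do hardcore gamers find it boring or too easy?

* Do novice gamers find it too difficult?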

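Next comes the quantitative:

* Number of deaths (and where/in which level the player died)

–A good way of gauging challenge (alternatively, tracking where players lose health can be useful)

* Player positions

–Use these to build heat maps and determine whether a winning strategy exists.

* Number of uses of power-ups/weapons

–Helps uncover winning strategies, or whether a weapon is useful at all

* When upgrades/experience points/points/energy are gained

–Another way of monitoring challenge

* Time taken to finish a level

–Very important for working out how long the game actually takes and for deciding on timing. It also gives an idea of challenge (though other measures are probably somewhat better for that).

* Options chosen

–Important for nonlinear games. You probably do not want everyone choosing the same branches.

* When players quit

–Most useful in online testing, because people are less likely to quit when you are standing right next to them. It tells you whether a particular level is really frustrating or difficult.

* Ratings

–Having players rate things (in terms of difficulty, fun, understanding and so on) is a quick and easy way to gather data, and it turns qualitative questions into more quantitative ones. A single rating is usually of little value, because everyone uses a different scale (which is influenced by how the question is worded): collect ratings for every level so that you can compare them.

* Rankings

–Have players rank levels/mechanics/game objects from easiest to hardest, least fun to most fun, and most understandable to most confusing.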
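
Several of the quantitative measures listed above (deaths per level, quit points, position heat maps) reduce to counting logged events. The following is a minimal Python sketch of that idea, not the author's tooling: the event tuple layout, the event type names, the sample data and the `cell_size` grid parameter are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical event log: (player_id, level, event_type, x, y).
# The layout and the "death"/"quit" labels are assumptions for this sketch.
EVENTS = [
    ("p1", 1, "death", 12.0, 3.5),
    ("p1", 2, "death", 40.2, 8.1),
    ("p2", 2, "death", 41.7, 8.9),
    ("p2", 2, "quit", 41.7, 8.9),
    ("p3", 2, "death", 39.5, 7.6),
    ("p3", 3, "quit", 5.0, 1.0),
]


def count_by_level(events, event_type):
    """Count how many events of one type occurred in each level."""
    return Counter(level for _, level, kind, _, _ in events if kind == event_type)


def heat_map(events, event_type, cell_size=10.0):
    """Bin event positions into square grid cells with side length cell_size."""
    return Counter(
        (int(x // cell_size), int(y // cell_size))
        for _, _, kind, x, y in events
        if kind == event_type
    )


deaths = count_by_level(EVENTS, "death")   # deaths per level
quits = count_by_level(EVENTS, "quit")     # level in which each quit happened
death_cells = heat_map(EVENTS, "death")    # grid cell -> number of deaths
```

With this sample data, level 2 stands out in both the death counts and the heat map, which is the kind of signal that would prompt a closer look at that level. A real log would likely carry timestamps as well, so that time-to-finish and time-of-upgrade could be derived from the same events.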

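As the two lists show, much of the qualitative data can be covered, at least partially, by quantitative data. Quantitative data, however, is best at helping you guess when a player is happy, sad or frustrated, while qualitative data usually gives a more reliable read on how the player is feeling, along with better suggestions for fixing the problem. (Compiled by Gamerboom/gamerboom.com; contact Gamerboom for permission to repost.)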

Playtesting 103: What to Measure

by Paul Sztajer

This is part 3 in a series on how to playtest games (part 1).

The next step in our playtesting journey is deciding what you should measure. Measurement comes in many forms, and so today we’ll mostly be looking at what you’ll be measuring. We’ll look at how to collect that data next time.

Measuring during playtesting is pretty damned important. It’s one of the major differentiators between a good playtest and a bad one – you can set everything up perfectly, but if you aren’t collecting the right data, you’ll miss something important.

The data you need will (shock horror!) depend a lot on what your test purpose is. As a general rule, early stage, on-location testing will benefit more from qualitative data (partly because it’s harder to come to conclusions from quantitative data when the sample is small), while later stage, online testing will benefit more from quantitative data (because wading through that qualitative data will take much more time).

This isn’t to say that you should have only qualitative or only quantitative data at any point: there’s no point at which a ‘What was your favourite/least favourite part?’ question will be wasted, and no point at which monitoring the number of deaths won’t be supremely useful. In the end, you’ll probably want a mix of each.

So, now for a long list of things you might want to measure, starting with the qualitative:

* What they found fun and boring

* What they found difficult and easy

* What parts were frustrating

* Did the player learn everything they needed to?

* What didn’t the player understand?

* Did the difficulty curve increase too dramatically?

* Do hardcore gamers find it boring/too easy?

* Do novice gamers find it too difficult?

…and continuing with the quantitative:

* Number of deaths (and where/what level they were in)

–A good way of determining challenge (alternatively, where they lose health can be useful)

* Player positions

–Use these to create heat maps in order to determine if there are any winning strategies

* Number of uses of powerups/weapons

–Great to find out winning strategies, or if your weapons are even useful…

* When upgrades/experience points/points/energy are gained

–Another way of monitoring challenge

* Time taken to finish a level

–This is really important for working out how long your game will actually take, deciding on timing etc. Also gives an idea of challenge, though other measures are probably a bit better.

* What options were chosen

–Valid for nonlinear games etc. You probably don’t want everyone taking the same branches.

* When they quit

–Most useful for online testing, as people are less likely to quit when you’re standing right there. Gives you an idea of whether a certain level is really frustrating/difficult

* Ratings

–Get players to rate things (in terms of difficulty, fun, understanding, etc.) for a quick and easy data grab, and a way of turning qualitative questions into more quantitative ones. A single rating is usually rubbish as everyone has a different scale (which is influenced by how you word the question): you want to get ratings per level to be able to compare them (how you compare these is something we’ll discuss later)

* Rankings

–Get players to rank levels/mechanics/game objects from easiest to hardest, least fun to most fun, understandable to confusing.
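
The point about ratings (everyone uses a different scale) and the rankings item can be sketched in code. The example below is one possible approach, not the comparison method the author promises to discuss later: it rescales each player's ratings to z-scores on that player's own scale, and averages ranking positions across players. The function names, the 1 to 5 scale and the sample data are assumptions for illustration.

```python
from statistics import mean, pstdev

# Hypothetical data: player -> {level: difficulty rating on a 1-5 scale}.
RATINGS = {
    "alice": {"L1": 1, "L2": 2, "L3": 3},  # tends to rate everything low
    "bob": {"L1": 3, "L2": 4, "L3": 5},    # tends to rate everything high
}


def normalise_per_player(ratings):
    """Convert each player's ratings to z-scores on that player's own scale."""
    result = {}
    for player, by_level in ratings.items():
        mu = mean(by_level.values())
        sigma = pstdev(by_level.values())
        result[player] = {
            # A player who gives every level the same score carries no signal.
            level: (score - mu) / sigma if sigma else 0.0
            for level, score in by_level.items()
        }
    return result


def mean_rank(rankings):
    """Average position of each item across players' orderings (1 = first)."""
    positions = {}
    for order in rankings:
        for pos, item in enumerate(order, start=1):
            positions.setdefault(item, []).append(pos)
    return {item: mean(p) for item, p in positions.items()}


normalised = normalise_per_player(RATINGS)

# Hypothetical rankings, easiest to hardest, one list per player.
RANKINGS = [["L1", "L2", "L3"], ["L2", "L1", "L3"]]
average_rank = mean_rank(RANKINGS)
```

In this sample the two players give different raw scores but identical normalised scores, so they agree on the relative difficulty of the levels even though their scales differ. Per-level ratings are what make this rescaling possible; a single overall rating per player would leave nothing to compare against.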

What you’ll notice from these two lists is that much of the qualitative data can be covered, at least partially, by the quantitative data. However, quantitative data is best at helping you to guess when someone is happy, sad, frustrated etc., while with qualitative data, you usually get a much more reliable read on how the player is feeling, and some better suggestions on how to actually fix the problem.

Again, I’ve probably left some of the data you might want to consider out – this is supposed to be a formative list to give you some ideas to start with rather than an exhaustive one. If there’s anything major missing, however, I’ll be sure to update the post.

Next time, we look at how you’re gonna collect this precious, precious data: starting with the quantitative. (Source: Gamasutra)
