The Playtesting Process: How to Measure Qualitatively

Published: 2011-08-08 11:16:26 Tags: qualitative testing, observation, interviews, questioning, questionnaires

Author: Paul Sztajer

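This is part 5 of a series on the playtesting process. Earlier parts covered why to test, who to test with and where, what to measure, and how to measure quantitatively.

This part covers how to measure the things that cannot easily be quantified (gamerboom note: including player reactions, sticking points and interface feedback). It has two parts: the different ways of measuring qualitative data, and how to write good interviews and questionnaires.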

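1. Ways of measuring the qualitative

Qualitative data usually covers player reaction and emotion: whether players are frustrated or happy, where they get stuck and where they are most engaged. It also includes gathering players' responses and suggestions.

There are two ways of gathering this information: observation and questioning.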

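Observation means watching players as they play and noting down their emotional state, or what they say, as the game progresses. A notable technique in HCI circles is to ask players to 'think aloud' during the test, so that you know what they are actually thinking (this is especially useful for discovering 'perceived affordances', that is, what an interface tells the user about its function). The problem with thinking aloud, particularly in games, is that it alters response times: in an action game it makes the game harder, while in puzzle games it is often helpful, because players who think aloud can work out solutions more quickly.

Think-aloud is therefore a useful technique that should be used with caution. It gives much deeper insight into the player's mindset during play, but it can also fundamentally affect that mindset.

The alternative, observing the player's emotions directly, is difficult, partly because players react to being observed. Recording players is valuable because it lets you review their reactions without watching in person, but it only works if you set aside enough time to go through all the footage.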

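The other way to get qualitative information from players is to ask them, through interviews and questionnaires. It is important to start a playtest with a questionnaire, because it tells you about the testers themselves: their gaming experience and knowledge, and their familiarity with the specific aspects being tested. You should then ask players about their experience after each level or section, so that you get the freshest impression of each.

So should you use an interview or a questionnaire? Questionnaires are good for asking direct, specific questions; they are quick to run and do not need you to be present. Interviews are better suited to exploratory tests: you can adapt your questions to find the cause of a particular complaint. They take longer, though, and require you to be there.

Both methods require you to design questions (you can improvise an interview, but it is better to prepare some basic questions so that things get going smoothly and everything is covered).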

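2. Designing questions

There are two basic types of question: the 'make a choice' kind and the 'describe' kind.

'Make a choice' questions come in three basic forms: rankings, ratings and Likert scales. Rankings have the player order levels or powerups by challenge, usefulness, fun and so on. Ratings and Likert scales ask the player to rate something from 1 to 5 (gamerboom note: for example from 'Very Easy' to 'Very Hard').

Likert scales and ratings should have at least 5 options (for example 'Very Hard', 'Hard', 'Neutral', 'Easy' and 'Very Easy'), and the number of options should be odd (so the player can take a middle ground). A Likert scale lets the player choose a point between two extremes, while a rating gives a magnitude to a specific property. It is often useful to build lists of these (asking the player to rate several aspects in a row, for instance how well they understood each one, or how well the design informs the player of the metrics they need to track to play effectively).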

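In both of these, take care that your wording does not lead the player in a particular direction. Use 'Rate the challenge of the level from 1-5' rather than 'Rate how hard the level is', because 'hard' is one end of the scale (and so nudges the player towards that end). It is also important to label the ends of the scale (is 1 'easy' or 'hard'?) and to keep the labels consistent throughout the questions (gamerboom note: switching 1 from 'easy' to 'hard' between questions confuses players). Swapping the labels is useful in certain circumstances, but these are rare.

Rankings, on the other hand, help pinpoint which aspects of the game deserve the most attention: which parts are most in need of work.

These choice-based measures do only half the job: they are good at pinpointing what needs work, but they cannot explain why. This is why most questionnaires add a 'Why' after each rating question, to find out what led to a less than perfect score. A very effective technique here is to combine a questionnaire with an interview: players answer the 'make a choice' questions on paper and then explain their reasons in the interview.

Next come the 'describe' questions. Here you want the player to give more freeform comments. Avoid packing a questionnaire with too many of these (or make them optional), especially for online tests. Some questions worth borrowing are: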

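* What was your favourite aspect of XX? Why?

* What was your least favourite aspect of XX? Why?

* Did you find any aspects of the game frustrating? (note how this question is phrased: it is deliberately leading)

* Did you find the powerups/skills/units useful? Which was the best, and why? Which was the worst, and why?

* Any other comments? (This should appear in every questionnaire)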

(This article was compiled by 游戏邦/gamerboom.com. To reprint, please contact 游戏邦.)

Playtesting 105: How to measure Qualitatively

by Paul Sztajer

This is part 5 in a series on how to playtest games (part 1).

This time, we’ll be looking at how to measure those things that can’t easily be put into numbers: from player reactions and sticking points to interface feedback.

This is going to be a longer post, as there are really two things that I’ll be going through: different ways of measuring qualitative data, and how to write good interviews and questionnaires.

1. Ways of measuring the Qualitative

Qualitative data usually involves player reaction and emotion: whether they are frustrated or happy, where they have trouble and where they are engaged. It also involves gathering responses and suggestions by the player.

There are two main ways of gathering this data: observation and questioning.

Observation involves watching the player as they play. It can involve noting down their emotional state as the game progresses, or what they say. In HCI circles, a notable technique is to ask the user to ‘Think Aloud’ during the test, so that you know exactly what the user is thinking (this is especially useful in discovering ‘perceived affordances’ – that is, what an interface tells a user about its function). The issue with the think-aloud technique, particularly in the realm of games, is that it alters response times: in an action game, it will likely make the game prohibitively difficult; while the technique is often helpful in puzzle games, where the player thinking aloud will allow them to deduce the solutions more quickly.

Think-aloud is, therefore, a very useful technique which you should use with caution. It can give you a much deeper insight into the player mindset as they play, but can fundamentally affect this mindset as well.

The alternative, straight observation of emotion, is fraught with difficulty, not least because the player will react to the observation itself. Recording the player (audio and video) can help, as it allows you to review their reactions without watching them personally. This only works, however, if you have time to go over all the footage.

The other way to get qualitative data from players is by asking them: through interviews and questionnaires. It’s important to start off every playtest with a questionnaire that will tell you important data about the tester themselves: their experience and knowledge in gaming and the specific aspects you are testing. You should then ask the players questions about their experiences with the game after each level or section, so as to get the freshest view on each of these.

So do you use an interview or a questionnaire? Questionnaires are good at asking direct, specific questions. They are often faster to perform, and can be done without your presence. Interviews are generally better for exploratory tests, and allow you to adapt your questions to better find the cause of a specific gripe or issue. They do, however, take longer to perform, and require your presence.

Both of these techniques require you to design questions (you can perform an interview on the fly, but it’s better to have some basic questions to both start things rolling and ensure you cover everything you wanted to).

Which nicely segues us to part 2 of this mega-post:

2. Designing Questions

To continue the theme of ‘two things to talk about’ that seems to be threading its way through this post, there are two basic types of questions you can ask: the ‘make a choice’ kind and the ‘describe’ kind.

Making a choice comes in three basic forms: Rankings, ratings and Likert scales. Rankings involve the player ordering levels or powerups in terms of challenge, usefulness, fun or any other number of aspects. Likert scales and Ratings are about asking the player to rate a property from 1-5, ‘Very Easy’ to ‘Very Hard’, and so on.

Likert scales and ratings should always have at least 5 options (for instance, ‘Very Hard’, ‘Hard’, ‘Neutral’, ‘Easy’ and ‘Very Easy’) and should always have an odd number of options (this always allows the player to take a middle ground). Likert scales allow the player to choose a value on a scale between two extremes, while ratings are used to give magnitude to a specific property. It’s often useful to create lists of each of these (either asking the player to rate a number of aspects in a row in, for instance, how well they understood them; or asking them to rate how well the design informs the player of different metrics they might need to keep track of for effective play).

In both of these, it’s very important that the language you use to describe the choice doesn’t lead the player in a particular direction. Use ‘Rate the challenge of the level from 1-5’ rather than ‘Rate how hard the level is’, as ‘hard’ is one end of this scale (thus pointing the player towards that end of the scale). It’s also important to label the ends of the scale (is 1 ‘easy’ or ‘hard’?), and to be consistent about your labels throughout the questions (changing 1 from ‘easy’ to ‘hard’ between questions only confuses the player). There are certain circumstances in which it’s useful to swap these, but these are incredibly rare.
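The scale rules above (at least five options, an odd count, and labels that keep the same direction from question to question) are easy to check mechanically before a questionnaire goes out. The following is only a rough sketch of such a check; `ScaleQuestion` and `check_questionnaire` are hypothetical names made up for illustration, and storing one label per point on the scale is an assumption rather than anything the article prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleQuestion:
    """One 'make a choice' question answered on a numbered scale (hypothetical type)."""
    prompt: str
    labels: tuple  # one label per point on the scale, lowest number first


def check_questionnaire(questions):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for q in questions:
        n = len(q.labels)
        if n < 5:
            problems.append(f"{q.prompt!r}: fewer than 5 options")
        if n % 2 == 0:
            problems.append(f"{q.prompt!r}: even number of options, so no middle ground")

    # Questions that reuse the same set of labels should list them in the same order,
    # so that 1 does not mean 'easy' in one question and 'hard' in the next.
    first_order = {}
    for q in questions:
        expected = first_order.setdefault(frozenset(q.labels), q.labels)
        if expected != q.labels:
            problems.append(f"{q.prompt!r}: scale direction differs from an earlier question")
    return problems


if __name__ == "__main__":
    easy_first = ("Very Easy", "Easy", "Neutral", "Hard", "Very Hard")
    draft = [
        ScaleQuestion("Rate the challenge of level 1 from 1-5", easy_first),
        ScaleQuestion("Rate the challenge of level 2 from 1-5", easy_first[::-1]),
    ]
    for problem in check_questionnaire(draft):
        print(problem)
```

A check like this cannot catch leading wording such as ‘Rate how hard the level is’; that still needs a human read-through.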

Rankings, on the other hand, are about being able to pinpoint which of a number of aspects of a game requires the most attention: which levels are most in need of work, for instance.
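One simple way to turn a stack of ranking sheets into a single ordering is to average each item's position across testers. This is an illustrative sketch only: the article does not say how to combine rankings, the mean-position approach is an assumption chosen for its simplicity, and `mean_rank` and the level names are hypothetical.

```python
from collections import defaultdict


def mean_rank(rankings):
    """Average each item's position across testers.

    rankings: a list of orderings, one per tester, with position 1 first.
    Returns (item, mean position) pairs sorted so the lowest mean comes first.
    """
    positions = defaultdict(list)
    for order in rankings:
        for pos, item in enumerate(order, start=1):
            positions[item].append(pos)
    means = {item: sum(p) / len(p) for item, p in positions.items()}
    return sorted(means.items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Three testers each ordered the levels from most to least frustrating.
    sheets = [
        ["Level 3", "Level 1", "Level 2"],
        ["Level 3", "Level 2", "Level 1"],
        ["Level 1", "Level 3", "Level 2"],
    ]
    for level, avg in mean_rank(sheets):
        print(f"{level}: average position {avg:.2f}")
```

With only a handful of testers the averages are rough, so they are better read as a hint about where to look than as a precise measurement.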

All of these choice-based measures only give you half of the story – they’re usually good at pinpointing what you need to work on, but are much less useful for working out why. This is why most questionnaires will include a ‘Why’ after each rating question, to find out what, particularly, led to a less than perfect score. A great technique to use here is to combine a questionnaire with an interview: get the players to answer ‘make a choice’ questions on paper, and then get them to justify these in the interview.
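If the rating and its ‘Why’ are stored together, the low-scoring questions and the reasons testers gave for them can be pulled out in one pass. The sketch below assumes ratings from 1 to 5 where higher is better, uses the median as the summary, and flags anything below an arbitrary threshold of 3; the field names, the threshold and the function `needs_attention` are all hypothetical choices for illustration.

```python
from statistics import median


def needs_attention(responses, threshold=3):
    """Group 'Why' comments under each question whose median rating is below threshold.

    responses: list of dicts with keys 'question', 'rating' (1-5, higher is better)
    and 'why' (free text, possibly empty).
    """
    by_question = {}
    for r in responses:
        by_question.setdefault(r["question"], []).append(r)

    flagged = {}
    for question, rows in by_question.items():
        if median(r["rating"] for r in rows) < threshold:
            # Keep only the non-empty explanations.
            flagged[question] = [r["why"] for r in rows if r["why"]]
    return flagged


if __name__ == "__main__":
    answers = [
        {"question": "Tutorial clarity", "rating": 2, "why": ""},
        {"question": "Tutorial clarity", "rating": 1, "why": "Didn't know how to jump"},
        {"question": "Tutorial clarity", "rating": 4, "why": "Fine once I found the hint"},
        {"question": "Controls", "rating": 4, "why": ""},
        {"question": "Controls", "rating": 5, "why": "Felt responsive"},
    ]
    for question, reasons in needs_attention(answers).items():
        print(question, reasons)
```

The collected comments then make natural starting points for the follow-up interview.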

Thus, we reach the ‘describe’ form of question. This is where you want the player to make some more freeform comments. It’s probably good not to clog up your questionnaires with too many of these (or to make them optional), especially if players are testing online. Some good questions you can use here include:

* What was your favourite aspect of *? Why?

* What was your least favourite aspect of *? Why?

* Did you find any aspects of the game frustrating? (note that this question is deliberately leading)

* Did you find the powerups/skills/units useful? Which was the best one and why? Which was the worst and why?

* Any other comments? (This should be in EVERY questionnaire)

And that’s about it. I had to cover a lot of stuff here in a single post, and so some of it might be a little generalised (please let me know if you find this the case – I can return to some of these topics later to explore the subtleties of them if needed). Next time we’ll look at the actual running of the playtest, and what you do on the day! (Source: gamasutra)
