Analyzing the Most Frequent Words in Apple App Store Reviews

Published: 2013-05-28 17:28:41

By Igor

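Appfigures recently took a closer look at user reviews in the app stores and found some interesting results.

The data

Although reviews exist in many languages, this analysis covers only reviews written in English. Most text analysis tools are built around English, which makes English reviews easier to analyze, and it is also the language Appfigures knows best.

We decided to analyze iOS and Mac App Store reviews from the following English-speaking countries: the US, Canada, the UK, Australia, and New Zealand. Our sample comes to roughly 25 million reviews.

Visualizing all the data

Next we had to decide how to analyze and present that much content. After some brainstorming we settled on a simple but effective method: word clouds. We threw all 25 million reviews into the blender, and this is what came out: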

all_ios_reviews_word_cloud1(from appfigures)

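We were surprised at how positive this word cloud is. As app developers we know how picky users can be, and we expected reviews to be less glowing and more critical. So we ran the numbers again, but the result was the same. Words like 'great', 'love', 'fun', and 'good' appear far more often than negative words like 'poor', 'useless', 'waste', and 'sucks'.

Other findings

A word being positive or negative on its own does not mean that no other words in the sentence modify it. Evolution has equipped humans to pick up on such nuances of language, but computers cannot yet do so easily. So we went back to tinkering with the data to see whether we could learn more about the context around each word.

We started by splitting the reviews by star rating, reasoning that the star rating (1 to 5) usually reflects a review's overall sentiment well.

We turned to the blender once more and created a word cloud from 5-star reviews only: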

five_star_ios_reviews_word_cloud(from appfigures)

We then compared it with a word cloud built from 1-star reviews only:

one_star_ios_reviews_word_cloud(from appfigures)

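Clearly, words like 'love' and 'beautiful' are unlikely to show up in very negative reviews, where negative words like 'crashes' and 'waste' are common instead. We did the same breakdown for star ratings 2 through 4 and, as expected, saw a gradual shift in the use of positive and negative words.

Adding color

Then we tried something a little crazy: we gave each word a 'positivity' score based on how often it appears in positive (highly rated) reviews versus negative (low-rated) reviews. We then recreated the original word cloud, coloring high-scoring words green, low-scoring words red, and everything in between gray.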

all_ios_reviews_colored_word_cloud(from appfigures)

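We were not sure what to expect from this experimental analysis, but it turned out to be remarkably accurate. We were surprised at how well the algorithm worked: it colored the words with negative connotations red and the prominent positive words green.

So our original belief that reviewers are picky and harsh was wrong, and the first word cloud was already telling: positive sentiment in iOS and Mac App Store reviews far outweighs the negative. (Compiled by 游戏邦/gamerboom.com. Reposting without the copyright notice is not permitted; to repost, please contact 游戏邦.)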

Data Bits: App Reviews Are Warmer And Fuzzier Than You May Think

By Igor

We recently introduced a brand new section to the blog called Data Bits. This is where we choose an interesting set of data, analyze it, and turn it into bite sized blog posts for your reading pleasure.

Last time we looked at app reviews and how many of them a typical developer can expect to have. This time we’re diving a bit deeper into the data to find out exactly what sort of message those reviews are trying to communicate.

The data

While there are tons of reviews in almost every language, in this post we’ll be focusing specifically on ones written in English. Most text analysis tools are built around English which makes those reviews easier to analyze. It also happens to be the language we’re most familiar with.

After some internal discussion over the best way to slice the data by language, we decided to grab a slice of iOS and Mac App Store reviews from these major English-speaking countries: US, Canada, UK, Australia, and New Zealand. Our sample comes out to roughly 25 million individual reviews–more than enough to give us an idea of what’s going on in there.

Visualizing it all

Next we needed to decide how to analyze and present so much content. After much brainstorming we settled on a method which is both simple and effective: word clouds. Once all the pieces were in place, we threw all 25 million reviews into the blender and here’s what came out:

We were pretty surprised at how positive this word cloud seems to be. Being app developers ourselves, we’re quite familiar with how picky reviewers tend to get, and we assumed that reviews would be a bit less glowing and slightly more critical. So we ran the numbers again but the same results came out. ‘Great’, ‘love’, ‘fun’, and ‘good’ are used way more often than words like ‘poor’, ‘useless’, ‘waste’, and ‘sucks’.
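A word cloud is driven by word counts, so the first step can be sketched as a simple frequency tally. This is only an illustration in Python, not the pipeline we actually ran: the tokenizer, the tiny `STOP_WORDS` list, and the three sample reviews are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would use a much larger one.
STOP_WORDS = {"the", "a", "an", "and", "is", "it", "this", "to", "i", "of", "for", "my"}


def tokenize(text):
    """Lowercase the text and return its runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(reviews):
    """Count how often each non-stop word appears across all reviews."""
    counts = Counter()
    for review in reviews:
        counts.update(w for w in tokenize(review) if w not in STOP_WORDS)
    return counts


# Made-up sample reviews, for illustration only.
sample_reviews = [
    "Great app, love it!",
    "Love this game, so much fun.",
    "Crashes all the time. Waste of money.",
]

freqs = word_frequencies(sample_reviews)
# The larger a word's count, the larger it would be drawn in the cloud.
print(freqs.most_common(5))
```

The resulting counts can be handed to any word cloud renderer, which scales each word by its frequency.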

And that’s it… NOT

Just because a word is positive or negative on its own doesn’t mean there aren’t other words in the sentence modifying it. While evolution has fine-tuned us humans to identify such language nuances, it’s not so easy for a computer. So we started tinkering with the data to see if there’s anything clever we can do to get a better idea of the context around each word.

We started off by sectioning the reviews according to their star rating. We figured that the star rating (1 – 5) of a review is usually a good indication of its overall sentiment.

We turned to the blender once more, this time creating a cloud of words from only 5-star reviews.

Compare that with a word cloud of all 1-star reviews:

There’s a definite contrast here, showing that words like ‘love’ and ‘beautiful’ aren’t thrown around as much in very negative reviews, while words like ‘crashes’ and ‘waste’ aren’t very popular in positive ones. We did the same breakdown with star ratings 2 through 4, and, as expected, there was a gradual change in the use of positive and negative words.
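The per-rating breakdown amounts to keeping one word counter per star rating. A minimal Python sketch, assuming each review is available as a `(stars, text)` pair; the function names and the sample data are hypothetical and not taken from our actual tooling:

```python
import re
from collections import Counter, defaultdict


def words(text):
    """Lowercase the text and return its runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def frequencies_by_rating(rated_reviews):
    """Build one word counter per star rating from (stars, text) pairs."""
    by_rating = defaultdict(Counter)
    for stars, text in rated_reviews:
        by_rating[stars].update(words(text))
    return by_rating


# Made-up sample data, for illustration only.
rated_sample = [
    (5, "Love it, beautiful app"),
    (5, "Great fun, love the design"),
    (1, "Crashes every time, waste of money"),
    (1, "Useless, crashes on launch"),
]

clouds = frequencies_by_rating(rated_sample)
print(clouds[5].most_common(3))
print(clouds[1].most_common(3))
```

Each counter can then be rendered as its own word cloud and compared side by side.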

Adding some color

Armed with this new information we decided to try something crazy: we assigned a ‘positivity’ score to each word depending on how often it appears in positive (highly rated) reviews and how often it appears in negative (low rated) reviews. We then recreated the original word cloud, this time coloring words with a high score green, those with a low score red, and everything in the middle gray.

We weren’t sure what to expect out of this experimental analysis method, but it turned out to be pretty spot-on. We were surprised at how well the algorithm does at coloring words with a negative connotation (such as ‘crashes’, ‘waste’, and ‘useless’) red, while highlighting the positive ones (like ‘great’, ‘love’, and ‘good’) in green.
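One way such a score could be computed is sketched below in Python. The post does not spell out the exact formula, so everything specific here is an assumption made for illustration: 4 and 5 stars count as positive, 1 and 2 stars as negative, the score is `(pos - neg) / (pos + neg)` on a scale from -1 to 1, and the color cutoff of 0.3 is arbitrary.

```python
import re
from collections import Counter


def words(text):
    """Lowercase the text and return its runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def positivity_scores(rated_reviews):
    """Score each word from -1 (only in low-rated reviews) to 1 (only in high-rated ones).

    Assumption: 4-5 stars count as positive, 1-2 stars as negative,
    and 3-star reviews are ignored.
    """
    pos, neg = Counter(), Counter()
    for stars, text in rated_reviews:
        unique_words = set(words(text))  # count each word once per review
        if stars >= 4:
            pos.update(unique_words)
        elif stars <= 2:
            neg.update(unique_words)
    return {
        w: (pos[w] - neg[w]) / (pos[w] + neg[w])
        for w in pos.keys() | neg.keys()
    }


def color_for(score, threshold=0.3):
    """Map a score to a cloud color; the threshold is an arbitrary choice."""
    if score > threshold:
        return "green"
    if score < -threshold:
        return "red"
    return "gray"


# Made-up sample data, for illustration only.
rated_sample = [
    (5, "Love this app, great fun"),
    (4, "Great game, love it"),
    (1, "Crashes constantly, waste of money"),
    (2, "Crashes on launch, this app is useless"),
    (3, "It is ok"),
]

scores = positivity_scores(rated_sample)
for word in ("love", "crashes", "app"):
    print(word, scores[word], color_for(scores[word]))
```

Words that show up about equally in both groups, like 'app' in this sample, land near zero and stay gray, which is the behavior we wanted for neutral vocabulary.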

So it looks like what we suspected originally about the critical and picky reviewer was wrong, and that the first word cloud above was pretty telling on its own: there are way more positive things being said about iOS and Mac apps than negative. Who would have thought? (source: appfigures)

