
Independent Researchers Want Facebook to Open Up Its User Data

Published: 2010-12-24 11:48:41

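In recent years, the social network Facebook has drawn wide attention from users around the world. At the same time, studying anonymized user data from this popular site can yield many interesting findings about people.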

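Recently, in a blog post titled “What’s on your mind?”, Facebook published the results of an analysis of 1 million anonymized messages. The main findings include: 1) young people swear more often than older people; 2) older people tend to talk about other people rather than themselves; 3) popular users more often talk about other people, movies and TV, swear, and use religious words; 4) less popular users more often talk about work, sleeping, eating and thinking. Beyond these findings, though, Facebook's study raises a bigger question: when will independent researchers be able to take part in studying Facebook's data?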

Facebook wordgraphs


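(Above: words at the top of the left chart are used more often by older users, while words at the top of the right chart are used more often by more popular users.)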

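As far as Gamerboom understands, independent data researchers have long hoped to one day be able to study Facebook's data. It would also be a new business opportunity for Mark Zuckerberg.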

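There is historical precedent for why data researchers place such value on social network user data. When U.S. census data and bank home loan data were first made available for cross-referencing and computer analysis, independent researchers used them to discover that African American families in many major U.S. cities faced discrimination in home lending. It was precisely because the data was shared that researchers could spot such social problems and push the government to address them early.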

So although Facebook itself analyzes its user data, sharing that data with the wider community of independent researchers would undoubtedly allow a more complete study of it.

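Oliver Chiang of Forbes likewise agrees that Facebook should open up its user data. Otherwise, researchers will get only superficial conclusions of little use, and a black market for Facebook data may even emerge. Chiang argues that Facebook's user data should be put to uses that benefit society.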

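Last month, Michael Agger of Slate.com also discussed Facebook user data in an article. He wrote that complaints about traffic found in Facebook data could help transportation planners identify congested stretches of road, and that comments about local schools could help educators spot problems in time and act on them.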

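On this question, Bernardo Huberman, a social technology researcher at HP Labs (who had access to some Facebook user data years ago, before Facebook drew such wide attention), said that Facebook's user data is extremely valuable commercially, but that Zuckerberg is a businessman, not a researcher, and that Twitter's situation is roughly the same. In his view, research on social network user data is currently at a standstill: in recent years researchers have been unable to get much access to user data, and Zuckerberg and others see the matter differently from data researchers. For now, researchers can only wait for the social networks to relax this restriction.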

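Gamerboom hopes the Facebook data team will continue its research on Facebook user data and give users a chance to understand themselves more deeply. (This article was compiled by Gamerboom/gamerboom.com; please credit the source when reposting: Gamerboom)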

Facebook’s data team proved once again today that when you analyze a large set of anonymous user data from the world’s biggest social network, you can learn some very interesting things about the state of humanity.

In a blog post titled What’s on your mind?, the company disclosed the results of its text analysis of 1 million anonymized messages. Among the findings: Young people swear more than older people and older people talk about other people more than just themselves. Popular people are more likely to talk about other people, TV and movies, to swear and use religious words. Less popular people are more likely to talk about work, sleeping, eating and thinking. These are but a few of the many observations made by the in-house data team. The biggest question about the data remains unanswered, though: what could a world of independent researchers discover in this data?

Above: Facebook found that the words on top of the left chart appeared more in profiles from older people, and those on top of the right chart in profiles from more popular people. The company’s blog post contains 5 more graphs concerning other word correlations.

I have long hoped that Facebook would make bulk, anonymized data available to independent researchers, and I’ve argued for how important an opportunity this is all the way up to Mark Zuckerberg himself.

My favorite example of how data like this can be important is from history. When U.S. census data and bank home loan data were both made available for computer analysis and cross referencing for the first time, independent researchers unearthed a pattern of discrimination against African American families seeking to buy homes in big sections of major U.S. cities. This practice was called Real Estate Redlining and it was exposed thanks to aggregate data analysis. I am of the belief that social injustices of comparable significance, as well as opportunities for significant economic development, could be discovered in the patterns hidden across millions of Facebook status updates, friend connections, Likes and more.

It’s great that Facebook is investing some of its resources into analyzing this data itself, but great opportunity is lost if the company fails to allow outside researchers to analyze this data as well.

Oliver Chiang, at Forbes, agreed with my argument in an article this month: “But really, what Facebook should do… is open up its data for research. Because they don’t, we get highly sanitized findings (like these top trends, or the finding that being active on Facebook leads to increased happiness), and even, reportedly, a black market for Facebook data. The company collects the thoughts, images and content of more than half a billion users – that data could be used for good.”

Slate.com’s Michael Agger wrote last month in an article discussing the opportunities latent in Facebook’s data, “It would be helpful for transportation planners to know the places where people complain the most about traffic. Educators could see the data and sentiment analysis around how a community feels about its local schools.”

Bernardo Huberman, a social technology researcher at HP Labs who was able to gain access to bulk Facebook data years ago, before the site was as large, controversial and armed with lawyers as it is today, is both understanding and hopeful.

“This data is amazingly important from a commercial point of view,” Huberman told me in a telephone interview last week.

“But [Zuckerberg], he’s not a researcher, he’s just a businessman. I have a feeling that Twitter’s situation is roughly the same; all this research stuff and so on is gravy. [In recent years] I’ve had very little traction in terms of getting access to their data. They are busy with other things, with keeping their business viable.

“They have a different view of it. Perhaps in a few years, Zuckerberg will relax and say ‘I want to be the kind of public figure that wants to release data’... but right now I don’t think that will motivate these people.”

I hope that’s not correct. I hope that every time the Facebook Data Team performs another batch of analysis on anonymized, bulk Facebook data and gives us an opportunity to look into our own souls, the potential that lies untapped in that data will be taken all the more seriously. That potential will never be realized if analysis of it is limited to the eyes, minds, interests, skills and perspectives of the company’s own researchers. (Source: ReadWriteWeb)

