# An analysis of how much Zynga makes per paying user



What is Zynga making per paying user? Nobody, not even Zynga, will ever know.

DARIUS KAZEMI

Last week a news site ran an article by Louis Bedigian quoting an analyst (Arvind Bhatia) who claimed that “Zynga loses \$150 on every new paying customer.” I read it, thought to myself, “That’s absurd linkbait,” and then assumed that nobody would take the bait. I was wrong: it got picked up everywhere. Sigh.

This morning, Andrew VandenBossche alerted me to an article by Dylan Collins, quoting industry CEO Torsten Reil, that responds, no, you idiots! Your methodology is wrong! “Zynga is probably MAKING \$30 on every paying user!” So… here’s what I think: nobody knows what the fuck is going on. (For those of you wondering why I’m writing about this, before I did HTML5 stuff full time, I spent 6 years as a data analyst for game studios, both MMO and Facebook games.)

Surface analysis

Collins/Reil are absolutely right to call the original analysis oversimplified. It was based on a model that completely failed to account for attrition — they’re correct when they state that Zynga certainly gained far more than 400k paying users for their marketing money.

Unfortunately, Collins/Reil pick a number out of thin air (20% attrition rate) which results in a rough estimate where Zynga spends \$120 per paying user, and makes \$150 per paying user, resulting in a net profit of \$30 per paying user. I say unfortunately because if the number is 10%, then by Reil’s metric they’re losing \$21 on every paying user. If it’s 30% they’re earning \$57 per acquired paying user. It all hinges on their attrition rate, which we don’t know! Some games see 10%. Some games see 90%. 20% seems like a roughly correct ballpark for a mix of successful and unsuccessful games, but honestly we have no idea what it is because we’re on the outside looking in. But the truly weird thing to consider is: Zynga doesn’t know what their attrition number is either.
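To see how much the answer swings with the attrition guess, here is a toy version of that arithmetic. The \$150 revenue per payer and the 400k net new payers come from the figures quoted above. The \$120M marketing spend and the 3M existing payers are not stated anywhere in this post: they are hypothetical inputs, chosen so that the 20% case lands on \$120 per acquired payer. The function name and parameters are likewise made up for illustration.

```python
def net_per_payer(attrition_rate,
                  marketing_spend=120e6,     # ASSUMPTION: back-solved, not from the post
                  net_new_payers=400_000,    # net growth in paying users (quoted above)
                  payer_base=3_000_000,      # ASSUMPTION: back-solved, not from the post
                  revenue_per_payer=150.0):  # revenue per paying user (quoted above)
    """Revenue minus acquisition cost per acquired paying user (toy model)."""
    # Payers who left had to be replaced before the base could grow at all,
    # so gross acquisitions = net growth + replaced payers.
    acquired = net_new_payers + attrition_rate * payer_base
    cost_per_payer = marketing_spend / acquired
    return revenue_per_payer - cost_per_payer


for rate in (0.10, 0.20, 0.30):
    print(f"{rate:.0%} attrition: {net_per_payer(rate):+.2f} per payer")
# 10% attrition: -21.43 per payer
# 20% attrition: +30.00 per payer
# 30% attrition: +57.69 per payer
```

With these assumed inputs the sketch lands on the same loss of about \$21 at 10% and gain of about \$57 at 30%. That only shows the toy model is consistent with the quoted numbers, not that it is the model Collins and Reil actually used.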

Models and black boxes

All numbers like this are built on models that analysts put together, and models are built on assumptions. Simple example: when we talk about attrition, what phenomenon do we refer to? Typically we mean “the moment when someone is no longer a player of the game.” Yet in the context of a social game, how do you define that? Facebook users don’t typically uninstall an app — they usually just stop using it. So you have to pick an arbitrary cutoff point. Does someone fall into an “attrition” bucket after 1 week of inactivity? 2 weeks? A month? Remember, this number is arbitrary, so you can adjust that number all you like (within reason, you’re not going to pick 100 years) until you come up with an attrition percentage that meets your criteria. Whether those criteria are “seems more realistic” or “would appease our shareholders” is another question!
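A minimal sketch of how much the cutoff alone moves the number. The player data is made up, and `attrition_rate` is a hypothetical helper, not anyone's production definition.

```python
def attrition_rate(days_inactive, cutoff_days):
    """Share of players whose current inactivity streak is at least cutoff_days."""
    churned = sum(1 for d in days_inactive if d >= cutoff_days)
    return churned / len(days_inactive)


# Made-up data: days since each of ten players last opened the game.
days_inactive = [0, 1, 2, 3, 8, 10, 16, 20, 35, 60]

for cutoff in (7, 14, 30):
    print(f"cutoff {cutoff:>2} days -> {attrition_rate(days_inactive, cutoff):.0%} attrition")
# cutoff  7 days -> 60% attrition
# cutoff 14 days -> 40% attrition
# cutoff 30 days -> 20% attrition
```

Same players, same behavior, and the reported attrition is 60%, 40% or 20% depending on a definition nobody outside the studio ever sees.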

But regardless, this attrition percentage then affects all of your other calculations. Now, ideally you want to remain internally consistent once you pick this number, but a dirty secret is that even if you maintain perfect internal consistency in always using “2 weeks of inactivity” as your cutoff for attrition, there will always be dozens of other fiddly and less directly consequential definitions that you can tweak. And the thing is, on some level you have to tweak these numbers! Otherwise you might find yourself stuck with a model that doesn’t reflect what looks like the reality of your game.

To put it another way: the internal game studio analyst’s job is to assemble a black box known as the concept of “attrition.” To the CEOs and CFOs and shareholders and external analysts and pundits at home, this concept seems pretty straightforward: it’s the people who leave your game. End of story. The black box behaves and does its job, reporting a number between 0% and 100%, and presumably you panic if the number is closer to 100%.

But the internal studio analyst needs to assemble this concept from a variety of sources. They might ask the game designers what they see as a “normal” or “natural” amount of time away from the game — if a game is designed to be played during the work week, then you shouldn’t sweat it when someone isn’t playing over the weekends or on Christmas. They might look at historical data for the game and notice that 80% of players who are inactive for 12 days never come back. And 90% of players inactive for 15 days never come back. So maybe we pick 90% and say 15 days is our cutoff. But of course we’re looking at historical data for the current game, which is different today than it was back then, so it’s not a perfect analogy! So maybe we want to rely on data from the last 30 days, when the game was most similar — but now our definition of “never come back” really means “people who were inactive for 15 of the last 30 days and haven’t been back.” But of course, those people “haven’t been back” for a maximum of 15 days since we’re looking at a 30 day window. So now that our historical data is more representative of the current state of the game, our very definition of “never” comes into question!
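The window problem can be made concrete with a small sketch. It is one possible way to operationalize “never come back,” not a standard definition: count every stretch of at least `gap_days` inactive days inside an observation window, and report the share of those stretches that had not ended by the end of the window. The activity logs and the function name are invented for illustration.

```python
def never_returned_share(activity, gap_days, window_start, window_end):
    """Among lapses of >= gap_days seen in the window, the share with no observed return."""
    lapses = 0
    open_lapses = 0
    for days in activity:
        seen = sorted(d for d in days if window_start <= d <= window_end)
        if not seen:
            continue  # player is invisible in this window
        # Interior gaps: the player was away, then came back.
        lapses += sum(1 for a, b in zip(seen, seen[1:]) if b - a - 1 >= gap_days)
        # Trailing gap: still away when the window ends, so no return was observed.
        if window_end - seen[-1] >= gap_days:
            lapses += 1
            open_lapses += 1
    return open_lapses / lapses if lapses else float("nan")


# Made-up logs: the day numbers (0..89) on which each of four players was active.
activity = [
    [0, 5, 40, 88],  # lapsed twice, came back both times
    [60, 61, 62],    # gone for the last 27 days
    [10, 70],        # came back once, then gone for the last 19 days
    [85, 86, 89],    # recent regular player
]

print(never_returned_share(activity, 15, 0, 89))   # full 90 days of history: 0.4
print(never_returned_share(activity, 15, 60, 89))  # last 30 days only: 1.0
```

Restricting the same logs to the last 30 days turns “40% of 15-day lapses had not ended” into “100% never came back,” simply because the shorter window leaves almost no room to observe a return.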

An infinite regress of assumptions

In summary: games are very complex systems, and the numbers that get thrown around in the media are built on black-box-style assumptions. These black boxes can always be broken down into components, and those components into subcomponents, forever and ever into an infinite regress. If this seems mind-bogglingly weird, well: it is. On some level you need to stop digging into the infinite and come up with assumptions about the way the game works that become the foundation for your models. There’s nothing wrong with that in principle: science does this all the time, and manages to come up with some great models to describe the world. But there’s a huge difference between science and analyzing the metrics for social games. Scientists do not work in a vacuum within their universities or corporations. Scientists do not work with “proprietary data” and they do not run the risk of getting fired for sharing their results and even their methodologies with other scientists. An internal game studio data analyst does in fact work in a vacuum, and will get fired for sharing with outside analysts. This means that the chances that our assumptions are off-base are pretty good. And it means that the numbers that different companies throw around can’t even be compared. “Average revenue per user,” which sounds straightforward, can be based on entirely different foundational assumptions at different companies and on different games.

This whole mess is one of the main reasons I stopped being a data analyst for games. I did not feel comfortable coming up with assumptions that weren’t, on some level, complete bullshit. Now, the level on which these assumptions operated was often very low-level, fiddly stuff. But it was an art, not a science. Which, again, nothing wrong with that — except that the black boxes that I generated were being treated as science rather than as art.

In the end, for the purposes of arguments about how much money a company is making, the only numbers that matter are: how much money is coming into the company each month? How much money is leaving the company each month? Everything else should be viewed with utmost suspicion. (Source: Tiny Subversions)