Monday, September 7, 2015

Data obsession and the politics of facts

The use and abuse of data of all kinds in the games industry is a fascinating topic for me. As developers, we naturally need to consider a wide range of data when working on any project: data on our players, our game, our development process, our industry trends, our competition, and much more. With technology improvements we now have access to more data than ever before, but also more noise to dig through. Many have pointed out that immediate access to data can help focus and democratize development. Where before you may have had a political environment where the ideas that won were those of the person who was the most senior, loud, charismatic, or well connected, now it’s possible to evaluate ideas using data and have the best ideas naturally win. Or at least that’s the theory, and like many theories it works in certain contexts and fails in others.
In this post I’d like to talk about situations I’ve seen where excitement about the potential of data caused multiple issues on otherwise good teams, including reinforcing the very environment of bad politics that data is supposed to prevent. From observations and conversations with colleagues, I believe such issues are relatively common in the industry overall, and thus worth talking about.

Data obsession pitfall #1: Relying exclusively on data to decide what to do next

"We don't want opinions here. We just want facts."

The above phrase will always stay with me from my time in the games industry. I heard it from the moderator at a high profile series of meetings, where a mix of executives and developers were discussing whether it makes sense for the company to start making games for a new platform. It was a completely unexpected response to my suggestion that we should have everyone openly state and support their opinion. As a result of this mandate, during the rest of those meetings I witnessed an often comical attempt by everyone to pretend they are being “objective” by leaving out personal opinions and preferences and presenting random pieces of data that would somehow magically tell us what the right answer was. The presented data had to be hard numbers or generally accepted truths (what the quote is referring to as “facts”), otherwise it would most likely have been disqualified as an opinion. So if someone said “There were x# of companies last year releasing games on this platform, and they made y# of dollars in revenue,” that was a good data point/fact that was considered progress. On the other hand, saying, “We shouldn’t go into this platform because we have nobody on the team who understands it” would have been an opinion, and those were explicitly judged irrelevant early on.
No matter what good intentions may have been behind it, this attempt to find the truth by promoting data and suppressing the expression of opinion was a true disaster. The motivation for doing such a thing is usually to ensure everyone maintains “objectivity”. But the idea that there is a single well defined “objective” attitude in questions that involve high uncertainty in a dynamic and complex environment, like the question we faced in that group, is a faulty one. Similarly faulty is the idea that once all data related to such a complex question is available, it can be dissected in a standard way to give a single “correct” answer.
I define bad politics in the workplace as any attempt to promote personal or department goals at the expense of the overall project or company. I’m not looking to demonize anyone with the term. Of the bad politics I’ve seen, none were caused by genuinely evil or manipulative people. But for complex reasons, it’s possible for various people on a team, especially bigger teams, to prioritize goals that simply do not benefit the team. An environment that suppresses opinion in favor of “facts” is the worst kind of environment for addressing this kind of bad politics. Whether they know it or not, most people in such an environment are still biased by their opinions, and this affects how they present and discuss data. This causes a mess of issues:
  • The people who are causing bad politics are free from having to actually explain anything about their behavior. They just cherry pick the data that supports their goal. Without an open discussion of opinion, there’s no hope of uncovering unhelpful biases.
  • Some people, worried that their opinion may cloud their judgment, overcompensate by presenting data just because that data is opposed to their own opinion, even if they don’t think that data is as relevant. Because the data is presented as relevant, other people who are not as familiar with the problem will give it too much weight.
  • In hopelessly number-obsessed cultures, people ignore data points that cannot have a number associated with them (e.g. when judging the quality of people on a team). Just because you can’t measure something, that doesn’t automatically mean it’s not important.

The biggest tragedy of this “data is king” attitude is that numbers are put ahead of people. Zynga refers to its process of making data-driven decisions as its “secret sauce”, a proprietary method of making successful games by following some sort of predefined steps. Think what that means for the people who work there: They are only hired to dig through the data and perform this magical ritual that will lead to success. As long as they can do that, they are completely interchangeable. It’s a dangerous way to run a company, one that aims to eliminate any kind of diverse thinking that strays from the recipe.
I understand that the notion that there’s some higher truth that will tell you the right answer if only you work hard enough to find hidden patterns in the data must be comforting in environments of uncertainty, such as any creative endeavor. Don’t fall for it. In any creative industry, there are risks. Instead of blindly following where you think the data takes you, and pretending that practice removes the risk, you’re better off using data as one of many counsels to face those risks in a calculated way that won’t be fatal in the likely case of failure. Unfortunately in some companies the political environment doesn’t tolerate any kind of risk and failure in the first place, causing everyone to take cover behind the comfort of data.

Data obsession pitfall #2: Losing focus and wasting time on small data-driven improvements

Jesse Hull talks about a pretty common idea, especially in social network and mobile game companies: That A/B testing can be used to grow your business to levels of success that would not have otherwise been possible. And a requirement for that is that such A/B testing is “holistic”, meaning everyone in the organization is focused and in the mindset of doing such tests all the time and around every intended feature big or small. A particular example given is that if you manage to find 10 ideas that each improve a key metric by 4%, you end up with a big 50% improvement.
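As a quick check on that arithmetic, here is a minimal sketch of how ten 4% lifts combine. It assumes the lifts are independent and stack multiplicatively on the same metric, which is my assumption for illustration and not something the original talk spells out. The variable names are hypothetical.

```python
# Illustrative only: ten improvements of 4% each on one key metric.
# Assumption: the lifts are independent and multiply rather than add.
lift_per_idea = 0.04
ideas = 10

# Each idea multiplies the metric by 1.04, so the lifts compound.
compounded = (1 + lift_per_idea) ** ideas - 1

# Naive sum of the lifts, shown for comparison.
additive = lift_per_idea * ideas

print(f"compounded lift: {compounded:.1%}")
print(f"additive lift:   {additive:.1%}")
```

Under these assumptions the compounded figure lands a little under 50%, so the "50%" in the example is a rounded number that holds only if every one of the ten tests delivers its full lift.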

This idea has become so common that many developers and press have adopted it as the one truth that is so obviously true it doesn’t even need debate: Everybody seems to just know it and present it as the truth. Some random examples out of many:
“Wooga is a new type of game developer, one that emphasises metrics over creativity. Its core discipline is A/B or split testing, in which new features are introduced to a selection of users, and their reactions measured. Features remain only if users engage with them. If they don't respond, Wooga tries new features until they do.”
Eli Hodapp from TouchArcade says in a Time magazine interview:
"A lot of the games that make the most money are quite literally scientifically engineered with the help of actual psychologists to design things down to the color of a button, which is then A/B tested based on what makes the most money."
When I ask for more clarification, he expresses the belief that any game near the top end of the top grossing charts is there because they A/B test everything, including trivial details like the color of buttons (emphasis mine).
The shape and color of buttons seems to be a popular example, as it’s also found in this ESPN report.
“Employees with job titles like "data scientist" study whether a player is more likely to click on a button if it's square or round”
It continues:
“The tiniest improvement can have fortune-changing effects for a game studio. “If you can make a change to, say, a menu color that results in your 10 million players spending an average of just a penny more every month, it adds up fast," one analyst tells me.”
I consider myself extremely lucky to have been given an opportunity to work with some of the most successful teams in mobile games, because that experience allows me the clarity to see all the above statements as the garbage that they are.
After I joined the Clash of Clans team, one of my first questions about A/B testing received the following answer from the team lead: “I don’t believe in A/B testing. Just do what’s right for the game.” I was also informed that no A/B test was ever run on that game. This was in late 2013, when the game was already so popular that it was hard for many to imagine how it could possibly get any more successful. For many other data-obsessed companies, the next step would have been simple: A/B test trivial details like colors of buttons, slightly different gameplay, or tweaked virtual currency prices on the massive user base. To use the ESPN report line of thinking, even a menu color change that would result in the millions of players spending a penny more every month would have been a meaningful improvement to the game.
Instead of doing all that not very fun-sounding work, the team chose to focus on Clan Wars, a new, relatively complex feature that they believed would make a significant improvement to the overall experience. During the time that feature was under development, I didn’t see anyone make a decision solely based on the wealth of available data on existing player behavior, though such data was sometimes used on-demand to validate assumptions (mainly around clan sizes and play patterns throughout the day). While all of us were excited about the feature as something we’d want to experience ourselves with our clan mates, none of us really knew if it would be successful or not. Having come straight from the highly political environment described in the opening section, I couldn’t help but think that such an effort would never get any support in many other companies. Why risk messing with something that works, especially when there is a total absence of data that would confirm that the new feature will indeed be successful?
Contrary to popular belief, not all successful mobile games run A/B tests
After Clan Wars came out, it was a massive success. But let’s do a thought experiment and imagine the team instead had opted to put all of that work on a robust A/B testing framework that allows them to A/B test anything in game (yes, including the apparently very important shape of buttons). After months or years of figuring out small experiments and collecting data, they may have gotten lucky and hit the 10 small changes that each increased a key metric by 4%, for a total improvement of ~50%. Everyone would pat themselves on the back for being so data driven and celebrate the big improvement. The massive opportunity cost of not implementing Clan Wars would have been completely missed: No kind of data anyone could collect would tell them what they had missed out on.
It would seem to me that in this case, ignoring all the orthodox advice about A/B testing was exactly what led to huge success.
I have no doubt there are creatively bankrupt games companies that are obsessing over whether a square or round button is better for monetization, or rapidly shoveling shoddily made games they don’t actually believe in into their A/B testing pipeline. And I have no doubt some of them are extracting short term benefits from this approach. But I know for a fact that suggesting that such obsessive data usage is the only way to achieve any meaningful success is a lie. Make sure you recognize it as such when you see it. If you make the choice to work for a company that puts data before anything else and painfully A/B tests every single detail, make sure you’re not doing it because you think that’s how everyone does it.

Data obsession pitfall #3: Not understanding the short-term predictive nature of most data

Zynga has been famously driven by numbers. That’s pretty easy to see for any outsider, from their CEO’s infamous way of thinking about achieving numbers by any means possible, to their open admission that they care more about numbers than games. Though I don’t have firsthand experience, I’ve heard stories about how their teams wouldn’t consider important tasks done until they deploy the change and see the desired change in numbers in real time.

A lot of the data analysis we do in games is predictive in nature. “If we implement feature X, we expect to see a Y% improvement in retention.” “If we add multiplayer to our game, we can expect it will have a longer lifetime.” When done properly, an A/B test will successfully predict the aggregate behavior of millions from a much smaller sample. But what does that tell us about how sustainable this behavior is over the long run? Not much.
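To make the "much smaller sample" point concrete, here is a minimal sketch of one common way to read an A/B test, a two-proportion z-test. The function name and all the numbers are hypothetical, and this is only one of several valid approaches, not a description of how any particular studio does it.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical sample: 10,000 players per group, day-7 retention as the metric.
z, p_value = two_proportion_z_test(conv_a=400, n_a=10_000, conv_b=460, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A result like this only says that the variant outperformed the control during the window in which the sample was collected. Nothing in the calculation speaks to whether the difference will still be there months later.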
Anyone involved in such predictions should understand that predicting the future is hard. The further out in the future you try to predict, the more inaccurate your predictions will be. Having a lot of data helps make more accurate predictions, but the amount of data (not to mention the complexity of the analysis) that you need to make predictions even weeks ahead, let alone months or years, grows exponentially.
This isn’t such a hard concept to get, and isn’t specific to games. It’s the reason the weather man can’t seem to ever get right what the weather will be like 6 days from now. In that light, I am not surprised at all that a company like Zynga, which relied almost exclusively on data to grow, had very good short term results but is struggling to sustain that success in the long run. What is very surprising is how often I still see developers assume that Zynga-style short term improvements will naturally extrapolate indefinitely into the future.
Good developers know that the long term effect of any data driven change is not necessarily tied to its short term performance. The long term effect needs to be evaluated separately, and very often, it’s very hard or impossible to do that evaluation using hard numbers.
One of the most obvious reasons to reject a data driven change, regardless of how enticing its benefit may seem, is that the change actually degrades the player experience while improving revenue or another company-focused metric. Yet there are many examples of companies not culling such changes: Zynga-style virality in the early Facebook days did nothing but frustrate existing players but was really good for getting new signups. A full screen banner with a special offer is very annoying to the majority of the players that are not interested in the offer, regardless of whether revenue may go up after such a change. A UI change that causes the users to accidentally spend more premium currency will obviously increase statistics around premium currency usage but is hurtful to building trust and getting loyal long term customers.
This doesn’t seem like a particularly user-friendly feature
In a political data-driven environment, it’s very hard to speak against such changes. If you do, you will be asked “Where’s your data?”, and your numbers-free answer about the long term will be marked as wishy-washy, easily defeated by the hard short-term numbers the data driven change has to back itself up.
Unless you are a company that wants to make a quick buck any way possible, you want to be successful in the long run. Many teams say they only care about the long run. The ones that actually mean it will never implement a change suggested by the data unless they are certain it will also bring success in the long run. If they are not certain of this, they will not focus on such changes, no matter how enticing the short term benefit might seem. So when making any changes inspired by data, make sure to evaluate the potential long term effects of such a change separately. Often, you have to rely solely on the good judgment and opinions of the people you hired.

Use data to support your passion for games, not the other way around

My biggest concern with the obsessive data focus that is taking over some companies is that we may be growing an entire generation of game developers to believe that the only way to make successful games is to forget everything they are passionate about and focus instead on collecting, overanalyzing, tweaking, obsessing over numbers, until they get the desired result. I regularly see new developers who are just now entering the gaming industry go straight into such environments, growing in the echo chambers of their data obsessed employer. The way they think and talk scares me.

My prior experience working with successful teams has given me a big advantage: proof that obsessing about numbers is not a prerequisite to success. I am planning to use data to support my passion for the games I want to make, not replace it. I will use data to reduce the risk of those games in today’s very uncertain landscape, not to tell me what games to make in the first place. I will use data to test important features and validate key assumptions, not waste time playing with different menu colors or shapes of buttons. I will never let data get in the way of a good user experience, even when such data may indicate that what my game really needs is a full screen ad banner.
I hope others will do the same.
