Who exactly is Data Guy?
We know he’s the numbers wizard behind Author Earnings—a collaboration between himself and self-published mega-author Hugh Howey. And we know that he’s anonymous. But that’s pretty much it.
In the past two years, Data Guy’s Author Earnings reports have become an increasingly popular resource for authors, shedding light on aspects of the publishing industry that previously went unreported.
But the reports have also spurred a great deal of controversy. While some within the industry think they are vital tools for authors everywhere, there are others who criticize the data and think the conclusions resulting from them are worthless. There are of course many in the middle who believe the reports are admittedly far from perfect, but necessary nonetheless.
And yet we know little about the man principally responsible for crafting these reports and birthing this discussion in the first place. Why did he start this venture? What are his methods? And why does he choose to remain anonymous?
To find out the answers to some of these questions, I spoke with Data Guy in a wide-ranging interview. Below, we discuss the genesis of the Author Earnings project, how self-publishing is affecting traditional publishing, the criticism that he and his project receive, and more.
At this year’s Digital Book World Conference, Data Guy will sit down with Michael Cader, the founder of Publishers Lunch, for a discussion titled “Outside the Data Box: Taking a Fresh Look at Ebook Sales, the Indie-Publishing Market, and a Fast-Changing Publishing Business.”
Why do you choose to keep your identity anonymous, and is there anything that you’re willing to divulge personally?
I think the anonymity kind of goes back to where I was at when we started. Much by happenstance I discovered there are some advantages to staying anonymous. Back when I first pulled the data, it was really for my own information. I had just been approached by one of the top imprints in my genre, and they were making an offer on one of my books. It had done really well as an indie published release, and they could see it was selling well, ranking high on Barnes & Noble, Amazon, featured on devices, racking up reviews, etc. So they approached me, and I was negotiating with them and I was pretty excited. But, you know, I’m a numbers guy by my other career—my non-writing career—and I was looking for data to help me make my decisions. And there really wasn’t any data out there on what I needed. The official industry stats were kind of blind to half of the story. They didn’t cover indie publishing at all. And so I’m in the middle of negotiating with that publisher, when I pulled this data, and I look at it. I share it with Hugh, and we decide to publish it. But I didn’t want my involvement with Author Earnings to interfere with the discussion that I was having with the publishers.
And since then, I’ve found there are some advantages to being anonymous. The first is, it keeps all of the discussions focused on the data itself and what it says instead of distracting everyone with, “Who collated it? What are their qualifications?” That is kind of what we want. The data speaks for itself, and we make it all available for download for free. I want every author thinking for themselves and downloading raw data, and saying, “You know, I don’t agree with Author Earnings. My opinion is this, and in fact I’m going to publish my opinion of what the data says.” We’d love to get more authors doing that; remember, our focus, up until DBW, has always been the authors.
Instead of relying on industry experts to guide them in their careers, when those experts’ financial interests don’t always align with authors’, it sort of encourages them to make their own career decisions and do their own homework. The other thing it lets me do is keep my fanbase, my readers, separate from the industry advocacy stuff.
Hugh has done an amazing job with blending those two things where they don’t interfere with each other. But when I interact with my readers and fans, or they Google me or connect on Facebook, they don’t really care about any of this industry insider stuff. It’s a distraction. It may fascinate me, but it just bores them to tears. They want to talk about books. I want to talk about books with them.
What I found was that my SEO results were combining these two things. Early on I was like, “No, that’s not what I want.”
And the third reason I like being anonymous is it lets me only do as much of this as I want, and to focus on my fiction writing. Hugh is out there. He is a public guy. It boggles my mind how approachable he is. I used to do this stuff for a living in a different industry, the video game industry, which is a lot more data-centric and a lot more numbers-driven. I love it. I enjoy it. But I like writing fiction more. There’s a reason I’m writing fiction now. I also still do the same kind of data-driven competitive analysis from time to time for big video game publishers as a consultant. Writing was going to be my separate thing, and now here I am doing data analytics work for a completely different industry for free. [Laughs]
You sort of alluded to it with negotiating with publishers, and how you ended up pulling the data initially, but can you tell me anything else about the genesis of the Author Earnings project, and what went into getting this off the ground a couple years ago?
It was one of those serendipity things. They happen more often in the age of social media and the Internet and an open source mindset like Wikipedia. You see more and more of that, but it still really comes down to serendipity. Hugh had already had this incredible success as an indie; he’s visible, and they’re covering him in the media. And they’re talking about this handful of million-sellers, and how no one else in self-publishing is making any money. In the meantime, he’s traveling around the country talking to all of these authors that no one has ever heard of, who are paying a few bills, earning day-job-quitting money, and life-changing money through indie publishing. He’s running into more and more of them. It’s completely anecdotal, but he’s seeing this whole new indie mid-list starting to be able to make a living with their writing. But it’s boring.
No media coverage touches it because it’s just kind of a boring story, whereas million-sellers are sexy. And he’s doing that, and in the meantime I’m a newbie writer—I just published my second novel. The first one had done ok, but then I put out the second one, and they both kind of go zooming up the charts together. So I’m looking at this, and I’m closing in on $10k a month—pretty awesome revenue with just the two books out. But I’m also going, “You know what? This isn’t a fluke. I’m no outlier.” Because I’m looking at all of the people on the Barnes & Noble bestseller list around me, and the Amazon bestseller list, and a third of these people are indies. I’m talking to the publisher, and I’m trying to make a career decision. I’m torn. I’m like, “Which way should I go?” And then I go pull the data. And that’s when I reached out to Hugh.
He and I had already exchanged some emails. He’s such a friendly guy that—you know, here I am, some annoying newbie writer going, “Hey, dude, I’m right behind you on the sci-fi list.” He’s like, “Awesome. Send me a screenshot when you’ve passed me.” And then two days later I sent him that screenshot going, “Look. I’m number one in sci-fi right now.” [Laughs] So we’d chatted a little bit in email about this phenomenon that he was seeing. And as soon as I saw the data—I email it to Hugh, and I go, “Look at this. No one outside of Amazon knows this yet. Here’s this untold story of self-publishing that you can’t get the media to cover. It’s all right here in black and white. It’s numbers.”
So he’s excited and we’re scratching our heads, because we were seeing this big change happening, and we knew a lot of writers were in the same place I had been, where others were trying to make these career decisions, and half of the data they needed just wasn’t out there.
So we just said, “You know what? Let’s just put it out there ourselves. We’ll launch a site and throw it out there.” We debated about author privacy. Should we anonymize or put it out there non-anonymized? That was literally the only edit we did on the data—was to anonymize it.
Speaking of the data, who else has a hand in gathering it and crunching the numbers that lead to the conclusions you come to in the report? Is it just you, or are there other people involved?
Yeah, it’s pretty much a shoestring volunteer operation. No one else is involved on a regular basis. Occasionally, we’ll reach out to someone we know and go, “Hey, what do you think this means? What’s your opinion?” We do have a few people that will preview reports and go, “Here’s what we think we’re seeing. Does that even make sense?”
When you say “we,” you mean you and Hugh?
Hugh and I, yeah. Basically, I pull the data. I crunch the numbers. The software is mine. And then Hugh and I look at it. We’ll pass it back and forth, spitball on it and go, “Hey, what about this? Should we look at that?” Then when we have a sense of what we want to focus on for the report, we generate the graph and then write it up together. So depending on our schedules, we bounce it back and forth. He’ll do an edit pass, and I’m like, “Eh, I don’t really agree with that.” I’ll make a change, and he’ll go, “No. You’re boring everybody with this part.” And then when we think it’s good enough, we toss it out there and move on.
So based on all the research that you’ve done into ebook sales and where the money is going, is there one piece of strategic advice that you’d offer to Big Five publishers to do things differently than they do now?
There definitely is, and I think that DBW may be an opportunity to dig into some of these trends in more detail. In general, my observation is not something that Hugh and I alone are saying. High ebook prices don’t really hurt mega-selling authors with long-established careers whose books are in all of the airport bookstores and Walmart, but they do damage the discoverability and earnings of mid-list authors, and particularly of the vast majority of debut authors, who are brand new. No one knows who they are. They need to first find their own audience and fanbase among avid readers before their publisher will put a significant amount of marketing and funding behind pushing them to a more casual, broader audience. The industry’s changed, and the dynamics are not the same as they were when today’s traditionally published mega-sellers first came up a decade or more ago.
Most avid readers today read digitally. When you look at who’s reading 50 books a year, 100 books a year, those are the folks who are giving new authors a shot. I’m not talking about the seven-figure advance, Pulitzer Prize, one-of-them-a-year mega-debut author; I’m talking about the vast majority of traditionally-published debut authors who are trying to build a name for themselves. And the digital readers, these avid readers, are basically bypassing those authors, because they don’t recognize the names, and the price is off-putting to them.
Strategically, if you’re a Big Five publisher, supporting those authors now with lower ebook pricing would mean you’re building a healthy, sustainable pipeline of intrinsic revenue streams you control down the road. But by not doing so, instead you become increasingly reliant on being able to make opportunistic acquisitions of these big blockbuster properties that originate outside of the traditional publishing industry. You have to do that quarter after quarter, year after year, without fail. It just doesn’t seem like a sustainable strategy long-term. So that’s the piece of strategic advice that I’d offer.
Have you seen any migration of authors, either from indie to published or the other way around? And if so, do you know if this switch has worked out for them?
You know, there are probably as many migration stories as there are authors. Some have moved from self-publishing to traditional publishing. A recent example is Andy Weir. Some go the other way, like Cornelia Funke. A lot of authors are doing both at the same time: they’re building what’s known as a hybrid career. Some of the ones that have made the switch in either direction—things worked out great and they love it. Some are disappointed, and many have even switched back. It’s hard to say. Any particular hypothesis you form about which direction things are going, you’re going to find a lot of stories that support that and land in that category. I think it really comes down to the individual author, and what kinds of contracts they are being offered. What are their expectations for their career? Do they want literary acclaim? Do they want money? Do they want to build an audience and publish prolifically? It’s different for each author.
There have been pieces written and comments posted that take issue with one or more elements of Author Earnings. What do you say to the people who are critics of your data and the methodologies that you use?
We welcome it, basically. I personally welcome criticism, particularly when it’s intelligent, analytical, well-reasoned, and provides measurable counterpoints that all authors can look at and verify for themselves. In general, better data and greater transparency don’t just help authors; they help everybody in the industry. I make a point of trying to incorporate criticism into future reports and say, “Oh, well ok, here’s a valid criticism. Let’s go look at that factor.” Some of the more significant improvements we’ve made over the last two years, both in the methodology and in the accuracy of the reports, have stemmed directly from not just public criticism of what we were doing, but also private collaborations with industry pros who pointed out some things that we were missing and said, “You know what? This assumption here is way off. Here’s what you’re missing.” That’s helped us kind of put things into a broader industry context.
Do you feel that with every new iteration of this report you guys are getting more and more accurate and getting closer to the actual truth, if there is one?
Absolutely. In fact, one of the biggest jumps in accuracy came when we replaced our rank-to-sales conversion method, which had gotten a bit long in the tooth. As we headed into the end of last year, we were looking at it and I go, “You know what? This is out of date and it’s going to be too conservative.” We have mostly avoided making statements about the absolute sales that Amazon is doing, and instead said, “Hey, looking at relative measures of sales, this is how the pie breaks down.” That kind of analysis doesn’t have to be precise, but at the same time I wanted to upgrade our methods so that we could look at quarter to quarter sales and say, “What’s happening to the size of the pie? Is it growing? Is it shrinking? How fast?” And so with the February report that we did, we upgraded our approach significantly.
Now it’s based on real sales data—raw sales data—from exactly that time period provided by about a dozen authors, and an increasing number every day. And these include very high-selling authors, as well as authors who aren’t selling well. And so we pretty much have real-time data points up and down all the different sales-ranks, from one or two of the absolute top-selling books on Amazon down to books that are hardly selling at all. Hundreds of books. So we factor that in.
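For readers curious what a rank-to-sales conversion of this kind could look like, here is a minimal Python sketch. It is not Author Earnings’ actual software or method, which the interview does not describe in detail. It assumes a simple log-log interpolation between calibration points, and the calibration numbers, function names, and category labels are all hypothetical placeholders chosen for illustration.

```python
import math
from bisect import bisect_left

# Hypothetical calibration points: (sales rank, daily unit sales).
# These numbers are invented for illustration; they are NOT real
# Amazon or Author Earnings data.
CALIBRATION = [
    (1, 5000.0),
    (100, 500.0),
    (10_000, 15.0),
    (1_000_000, 0.05),
]


def estimate_daily_sales(rank, points=CALIBRATION):
    """Estimate daily unit sales for a sales rank.

    Interpolates linearly in log(rank) / log(sales) space between the
    two nearest calibration points. Ranks outside the calibrated range
    are clamped to the nearest endpoint.
    """
    pts = sorted(points)
    ranks = [r for r, _ in pts]
    if rank <= ranks[0]:
        return pts[0][1]
    if rank >= ranks[-1]:
        return pts[-1][1]
    i = bisect_left(ranks, rank)
    (r0, s0), (r1, s1) = pts[i - 1], pts[i]
    # Fraction of the way from r0 to r1 on a logarithmic scale.
    t = (math.log(rank) - math.log(r0)) / (math.log(r1) - math.log(r0))
    return math.exp(math.log(s0) + t * (math.log(s1) - math.log(s0)))


def sales_share(books, points=CALIBRATION):
    """Return each category's share of estimated unit sales.

    `books` is an iterable of (category, rank) pairs.
    """
    totals = {}
    for category, rank in books:
        totals[category] = totals.get(category, 0.0) + estimate_daily_sales(rank, points)
    grand_total = sum(totals.values())
    return {category: units / grand_total for category, units in totals.items()}


if __name__ == "__main__":
    sample = [("indie", 10), ("big5", 100), ("indie", 10_000)]
    for category, share in sales_share(sample).items():
        print(f"{category}: {share:.1%}")
```

Because the result is a share of the total rather than an absolute count, a curve that is off by a constant factor still yields the same breakdown. Comparing the size of the pie from quarter to quarter, by contrast, depends on the calibration points themselves being current.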
Your talk at DBW is called “Outside the Data Box: Taking a Fresh Look at Ebook Sales, the Indie-Publishing Market, and a Fast-Changing Publishing Business.” Can you give me a preview of some of the issues that you will discuss? You were saying that this is going to be more publisher-centric in a way.
Exactly. I think it’s actually a pretty exciting opportunity to take a step back from our usual focus on author advocacy and author earnings and say, “Let’s look at this, the raw data, and look at it from a completely different perspective. Let’s look at it from a publisher earnings perspective.” Where are there opportunities here? Where are there risks? What genres are particularly conducive to publisher earnings where money is left on the table? How does ebook pricing as a policy affect hardcover sales and print sales? How do those two things interact, and are there any things that smaller publishers, who maybe don’t have as large a dataset to work with as the big guys, can learn? Can they get insights from the data that help them go back and crank their own earnings up, or change their policies? Again, it does slide back into author advocacy, because ultimately we think that will help traditionally published author earnings as well as publisher earnings.
This is a bit of a long-winded question, but it’s the one that I’m most curious about. Your feelings or anyone’s feelings toward the Big Five publishers aside, how do you personally think the rise of self-publishing has affected our literary culture as a whole? Not too long ago, we had gatekeepers who let only a minority of potential authors past. Now with self-publishing and further avenues to get a book out there to an audience, literally anyone can be an author, and as a result, the number of books published per year has, frankly, exploded. For individual authors, this is great news: they can now achieve their dreams and publish a book. But taking a step back, with the gatekeepers not holding all the power, and a surge in books published, how do you feel this has changed the culture surrounding books? To put it another way, is the value of a book at all watered down now that anyone can be an author?
This is a question that I’m not going to be particularly good at answering. After all, I’m known as “Data Guy,” not “Literary Subjective Opinion Guy.” [Laughs] But with that said, first off, I have no particular feelings about the Big Five publishers, positive or negative. And I think this makes me a little different than a lot of the folks we hear from on various author groups. I’m a brand new author and a new entrant into this industry. I’ve never submitted a query to anyone. I hear a lot of this angst, and there seems to be bad blood one way or another. It’s just lost on me. I don’t get it. I get that some people in this industry feel very strongly about the things that have happened in the past, but for me it’s just a brand new, wide-open field. Let’s see what there is to learn.
With that said, I do think that today’s wide-open, democratic world of publishing is a good thing. It’s been a tremendous boon for literary culture and freedom of expression. The gatekeepers were an economic necessity in the past. It wasn’t so much about quality, although these two concepts tend to get tangled a lot, because nobody wants to think of themselves as just serving an economic function alone when working in the arts. It was more about choosing which manuscripts were worth taking a financial risk on. Well, today that risk is largely mitigated by the fact that you don’t have to take a big risk to get a book out there in the public eye. At the end of the day, the only gatekeepers that matter are readers.
If they like what’s out there, the books will tend to do well, gain visibility, spread through word of mouth. And if they don’t, essentially it’s irrelevant in the market, and yet it may not be an irrelevance for that author. That author may have achieved their dreams, and they have finally been able to put their book out, and the three people who read it will be the ones who shared that experience. Maybe that’s all that matters to them.
So I think on the whole it’s a positive thing. Readers benefit from a far greater wealth and diversity of high quality books and ideas, all of it available to them and, more importantly, now affordable to them. With democratization, greater diversity, and more feelings expressed, it’s kind of hard to see any downside. Not that I have any strong opinions about this or anything. [Laughs]
What do you see happening to publishing in five or 10 years down the road? Do you think that’s too short of a timeframe, or based on the data and the trends that you see, do you think that we are looking at a pretty big shift in how authors choose to publish going forward?
This is going to sound a bit like a weaselly answer, but it isn’t: I think as industry insiders, all of us (publishers, agents, retailers, pundits and analysts, and even authors) tend to overestimate our own importance and our influence on the way things are going to shape up. In the end, it’s about readers and what they want. They’re going to define the shape of the industry in the future. Not us. We’re going to basically be playing catch-up. We’re going to be trying to adapt to what the readers are telling us they want: how they want it, the format, the pricing. Those of us who are able to adapt and find ways to add value in a way that the reader recognizes, we’re going to find ways to be around. And those of us who don’t are going to have a hard time doing so. So I think that’s a very long-winded way of saying I just don’t know. But I do know the answer’s not going to be based on what we choose to do. It’s going to be based on what readers choose to do.