The Fear of Data

Expert publishing blog opinions are solely those of the blogger and not necessarily endorsed by DBW.

Change leads to anxiety, and there has been a lot of change in publishing in recent years. One fear, though, looms larger in publishers’ minds than any other: the fear of data.

While sales data has been with us for a long time and is used extensively, new kinds of data are now becoming available. Social media campaigns can reveal who clicked on a link. Email campaigns can be tracked for opens, forwards and reactions to calls to action. But the availability of reading data is probably causing more angst than any other kind because it strikes at the heart of publishing: everything from acquisition to editorial to marketing to author care.

Reading data is becoming available in aggregated form from conventional pay-per-ebook retailers, like Kobo, but also from all-you-can-read subscription services, like Scribd and 24Symbols—two companies that presented some of their data at the recent IDPF/BEA conference in New York.

My own company, Jellybooks, also presented some of what we’ve collected. But our data is different because we were able (through the opt-in consent of readers) to show measures at the level of the individual, not in aggregated, anonymized forms. This reveals a whole new level of detail and granularity as to how books are actually read.

One fear is that reading data will influence what gets published. This is a somewhat strange notion, as self-publishing has already removed almost any barrier to market, and more books than ever before are getting published. What the doubters really mean is that data could influence what the big publishing houses will publish in the future, such as more celebrity biographies, tales from YouTube stars and vampire novels.

The fear is that, in the future, worthy books of high literary quality will be shunned. Yet titles are getting acquired by editors even when sales data point toward the reality that they are never going to deliver a positive return. What those who have reservations about data fail to see, however, is that data will make it easier to find the audience that appreciates these books. Rather than support an expensive marketing campaign across mass retail, publishers can tailor their campaigns to the relevant audience by virtue of an improved understanding of who likes to read a certain kind of book. And just as important, publishers will also discover the optimal approach to reach that audience.

Audiences are diverse, and each book has one. Some audiences are larger than others, but even small ones can be profitably served.

Another fear is that reading data will influence the editorial process. Will books be edited toward the lowest common denominator? Well, let us look at nonfiction books. Doesn’t it make sense to write a textbook, an educational text or any nonfiction text, for that matter, so that it is easier for readers to absorb? The desired outcome is for readers to acquire new knowledge, not to show off the brilliance of the author.

It’s important, of course, to use data responsibly. Data from a beginner’s reading of an advanced textbook would give the wrong picture. A book for a beginner needs to be written, tested and measured differently than a book written for an intermediate or advanced reader. More than ever, it will be important to know who the intended reader is. But now we can measure whether we are actually reaching that reader.

For fiction books, the issues are a bit trickier. We could say that our mission is to entertain and that we have failed if we are not able to retain the reader’s attention. Yet at the same time, fiction writing is a creative process, and to the extent that the writer is an artist, the work may push readers beyond their comfort zone. So if the data shows that we are pushing the reader beyond that comfort zone, is that good or bad?

Well, a DJ will gauge the response based on whether people are dancing or not, and a stage actor can measure what the audience thinks based on the applause. To date, authors have had only monthly sales data as a similar barometer, which in many ways can be a very poor indicator. There are professional reviews, but those are the opinions of an elite few. Goodreads may give a broader measure, but those who review books are often at the extremes: people who love a book and people who hate it. The vast majority of readers, some 98 percent, never review books. Reading data provides feedback on that silent majority, and it can do so before a book is even published by using advance reader copies with tracking software. The data therefore gives authors the opportunity to understand what the audience reaction might be. It might encourage them to be even bolder, or it might make them realize they overstretched and pushed too far ahead of their audience.

Data only informs; it doesn’t decide. Data merely helps us humans make better decisions. Even in situations where machines make decisions, it is rules developed and coded by humans that teach the machines how to decide. Most of the time, we don’t have data to help us make good decisions, so when it is available, we should welcome it. The need to weigh different inputs and make choices will never go away.

The greatest crime, admittedly, is the abuse of data. As the saying goes, if you torture the data long enough it will confess. The selective use of data to confirm a decision already made can be extremely dangerous. When we follow this path, not only are we deluding ourselves, but we are misleading others, too. Sadly, it happens all too often.

With sales of ebooks plateauing, people might think the digital revolution in publishing is over. In fact, it may be in its infancy, and we are merely witnessing the end of the beginning, not the beginning of the end.


2 thoughts on “The Fear of Data”

  1. carmen webster buxton

    This assumes the editors and publishers know the reading data, but right now it’s the retailer who knows which books are never finished, or never started, and which are read as soon as they’re downloaded. I think publishers are worried that Amazon, Apple, and Barnes & Noble know so much more than they do.


