A Plea: Let Some Ebook Data Flow

Expert publishing blog opinions are solely those of the blogger and not necessarily endorsed by DBW.

This content was provided by Aptara. 

Historically, when all others are concentrating on lowering costs, quality wins.

Publishers with a laser focus on improving the reader experience win over those focused on saving pennies per page. There is an appetite for quality enhanced ebooks at a premium (e.g. audio books, companions).

But it’s not necessarily about simply improving the multimedia experience. The best way to improve revenues, stickiness, and loyalty has always been to ask the customer. Customer data is the best indicator of what does and doesn’t work.

Imagine being able to factor in the subtleties of your readers’ experience. What parts of the book did they like and at what parts did they struggle to keep reading? Which sample of the book led to more sales?  Did they finish the book? If so, how long did it take and what were the sticking points? Who is my audience?

The problem with surveying customers is that the sample size is typically too small to warrant rewrites. But ebooks afford us the opportunity to capture this data automatically. In fact, the EPUB3 format allows for embedded JavaScript, so we can bring to ebooks the same kind of detailed analytics that web pages have used, and refined, for years.
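
To make that concrete, here is a minimal sketch in Python of what could be done with such data once it has been collected. It assumes a hypothetical feed in which each anonymous reading session reports only the furthest chapter it reached; the function names and the data shape are illustrative assumptions, not part of EPUB3 or of any platform's API.

```python
from collections import Counter


def chapter_reach(furthest_chapters, total_chapters):
    """Share of sessions that reached each chapter (chapters are 1-indexed).

    furthest_chapters: one integer per anonymous session (hypothetical format),
    giving the last chapter that session got to. No reader identifiers needed.
    """
    if not furthest_chapters:
        return {}
    stopped_at = Counter(furthest_chapters)
    sessions = len(furthest_chapters)
    reach = {}
    still_reading = sessions
    for chapter in range(1, total_chapters + 1):
        reach[chapter] = still_reading / sessions
        # Sessions whose furthest chapter is this one never reach the next.
        still_reading -= stopped_at.get(chapter, 0)
    return reach


def biggest_drop(reach):
    """Chapter after which the largest share of sessions stopped reading."""
    chapters = sorted(reach)
    # The final chapter is excluded: stopping there means finishing the book.
    drops = {c: reach[c] - reach[c + 1] for c in chapters[:-1]}
    return max(drops, key=drops.get) if drops else None
```

With the made-up sessions `[1, 3, 3, 3, 5, 5]` for a five-chapter book, `biggest_drop(chapter_reach([1, 3, 3, 3, 5, 5], 5))` points to chapter 3, the kind of sticking point an author would want to know about.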

Yes, there are privacy concerns. Yes, there are data-ownership questions. Yes, there are platform wars. These are strong forces that have brought down laudable efforts to bring this data to authors, such as Hiptype (a short-lived startup that cracked the problem but was strategically blocked by larger forces).

Platforms are not to be blamed, nor are privacy activists. Their assertions and efforts on behalf of the data and readers are valid. But there is a common understanding that our written word could be improved by what is effectively the best possible peer review system available – a mass contingent of actual consumers. And that little “e” in ebooks allows us to dynamically make changes.

So what’s the answer?

There is common ground between data-driven publishing geeks (such as myself), privacy activists, authors, and platform owners. For example, we can all agree that if most students are incorrectly answering the questions at the end of a lesson, changes likely need to be made. Customer data does not have to include personal information, nor does it have any particular value by itself. However, an author/editor would find it invaluable. The ebook could be improved, the lesson would be more valuable, and scores of students would understand trigonometry better than I.
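
The lesson example could be sketched roughly as follows. This is an illustration only: it assumes a hypothetical stream of `(question_id, was_correct)` pairs that carries no reader identifiers at all, and the threshold and minimum-answer values are arbitrary choices rather than recommendations.

```python
from collections import defaultdict


def flag_hard_questions(responses, threshold=0.5, min_answers=30):
    """Return {question_id: wrong_rate} for questions most readers get wrong.

    responses: iterable of (question_id, was_correct) pairs (hypothetical
    format). Only aggregate counts are kept; nothing identifies a reader.
    """
    totals = defaultdict(int)
    wrong = defaultdict(int)
    for question_id, was_correct in responses:
        totals[question_id] += 1
        if not was_correct:
            wrong[question_id] += 1

    flagged = {}
    for question_id, answered in totals.items():
        # Skip questions with too few answers to say anything meaningful.
        if answered < min_answers:
            continue
        wrong_rate = wrong[question_id] / answered
        if wrong_rate > threshold:
            flagged[question_id] = wrong_rate
    return flagged
```

An author or editor would only ever see the flagged question IDs and their error rates, which is enough to know where a lesson needs rework.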

If we can all agree on sharing some of the most basic data elements (perhaps just with the Publisher and Author for the express use of improving quality and conversion), all parties win. Readers will have a better experience, authors will have created a better product, and publishers will increase sales. Best of all, platforms that make such data available would attract more authors and publishers.

The data-driven publishing movement is a strong current that we can control by defining what data is shared, with whom and for what purpose. Building a dam to stop all data is a detriment to readers, students, publishers, authors, and platform owners.

It’s time to open the floodgates and let some data flow.

9 thoughts on “A Plea: Let Some Ebook Data Flow”

  1. Karen Fredericks

    Thank you. You make extraordinary points here. This data must be shared with the authors, who are, as indies, full partners in the publishing process. As they are handing over 30 – 70% of their revenue, they are entitled to this valuable data. No doubt it can be shared in a way that protects readers’ private information. It’s a win-win for all.

  2. Jennifer Stevenson

    I’ve often wondered why Facebook, to name only one of the big data collectors, is shooting itself in the foot by trying to raise pitiful revenues on ads that don’t work, when they could be raising MUCH MORE revenue by selling their immense supply of data, scrubbed of personal ID to protect privacy and crunched to demand.

    FB loses revenue when they sell ads that don’t work…and the ad buyer walks, never to waste that money again. Instead of waiting for another to be born in the next minute, they could have long-term repeat business by selling the data they already use and sort. And then every customer would repeat, because the data is worth far more than the ad.

  3. Jennifer Stevenson

    Of course, Amazon’s goal of world domination prevents them from sharing data–even for a price. But the same point applies to them. Do they *know* how many times an author checks their author rank or the rank of a specific book? That data is so crude, it’s almost worthless…almost. But I’d seriously pay for more refined data.

  4. Michael W. Perry

    Two remarks seem fitting.

    First, ebooks aren’t software. They don’t go through a formal beta testing process for bugs or hitches in the user interface. Does the author of this article really intend to create a second edition to rewrite a book’s first few chapters if this data finds many readers abandoning it after chapter 3? I doubt many authors are going to be happy, halfway through their next book, to be told that they need to rewrite one they’re already done with. At best, publishers might want to recruit some readers to comment on books being written. And for that, intelligent suggestions (“I found that I didn’t really like your main character.”) matter far more than mere digitized metrics about reading time and stopping points. Big data isn’t all it’s made out to be. And any fixes to books need to come before publication.

    Second, what really matters is ongoing sales. The data that publishers really need is sales data. Did the weeklong ad on that scifi website trigger more sales? They can’t figure that out if the only data they get comes a month later and is spread over an entire month. If the web ad appears on Monday, they need to see what happens to online sales at the major retailers on Monday, and they need to know it by mid-week or so. If that ad is working, they might want to extend it.

    In short, publishers need the leverage to get Amazon, Apple, B&N and a host of other retailers to give them a continuous stream of sales data. How can they do that?

    By pointing out a glaring flaw in how ebook and direct-from-the-retailer print-on-demand sales are made. The publisher isn’t supplying printed copies by truck. The publisher is supplying a digital file that’s then used to make the sale. That is fraught with a potential for abuse.

    Here’s an example. Later today and tomorrow I’ll be sending print and digital files of my latest book to Amazon’s CreateSpace and KDP, to Apple’s iBookstore, to Lightning Source, and to Smashwords (who will send it to a host of other ebook retailers).

    Those files are all a retailer needs to sell hundreds of copies of that book but only inform me that a few dozen copies were sold. There’s absolutely nothing in how the distribution system is structured to keep me or any other publisher from being cheated. Some of the major publishers worry about readers breaking DRM and bootlegging a few copies for friends. They’d be better advised to suspect that a retailer, who has a pre-DRM copy of that book, might be stealing copies in the hundreds or thousands. And that theft doesn’t have to have the complicity of the corporate CEO. A department head might tweak the system to up his division’s profits and get a promotion.

    What’s the answer to that potential for fraud? Reporting of each and every sale to publishers with enough specifics to make fraud obvious. Names or addresses wouldn’t have to pass from the retailer to publisher, but the date/time of sale and the city, state and zip would.

    Why would that help? Because publishers (or authors) could make test sales. Get someone in a little town in Oregon to buy the ebook and see if a sale in that town at the right time appears in the data stream. If it didn’t, that alone would be grounds for a full internal audit, with the retailer picking up all the costs. And note that it wouldn’t just catch deliberate fraud. It’d catch flaws in the system where sales are being passed along to royalty payment software. It’d be like Pres. Reagan’s comment about arms control, “Trust but verify.” Digital sales need a way to verify.

    And since this involves the potential for fraud, it’d be quite legitimate for it to be part of federal anti-fraud legislation intended for our new digital age. In fact, it’d make sense to include digital music and software applications in that mandated reporting scheme. It’s merely fortunate that a necessary piece of legislation for fraud prevention would also become a useful tool for book, music, and software marketing. And the data that’d come in would be far more useful and not come encumbered with all sorts of privacy-violating implications.

  5. Theresa M. Moore

    One of the reasons I don’t work with Amazon or CreateSpace anymore is precisely their lack of attention to reporting hard sales numbers to authors and publishers. I really don’t care if the sales data is being reported on discussion sites, but knowing how many of a specific title sold and where they were being sold is what I needed to accurately project how successful the title was. Too many titles of ebooks were made available by others for free, so there was no sales data to be reported to them. Amazon likes it that way because it does not have to be accountable to its suppliers, who depend on the dissemination of ebooks to attract actual paying buyers of their printed books. I have still kept track of the comments on the Kindle forums, and many authors report the same thing: no sales or very small sales, extremely low ranking numbers, and a lack of response to their queries about the reporting system. Amazon’s KDP was especially closed to any improvement of their data reporting structure, so many, as I was, are kept in the dark about their book sales. My solution to the whole problem is to stop supplying ebooks to KDP and begin selling them direct or through distribution from another party. The traditional publishers are doing it that way even now, not only to improve their sales, but to get an accurate picture of how well a specific title will sell. The retailers like Amazon only care about their end of the bargain, not working with their suppliers to help them improve sales of their books. Even if that happens, I’m not sure I can trust any large retailer for the future.

  6. Helene Byrne

    The bigger problem is how are authors/publishers even to know that the sales data sent by ebook sellers are accurate? Somehow we are expected to “trust” that X number of ebooks were downloaded. At least with hard copies, we know how many were printed, how many are in stock, and how many were given away as promos, and can therefore determine the number of books sold.
    As my first publisher gave me fictional royalty statements, I am no longer naive enough to go on “trust.”

  7. Ilan

    As both a retailer and a publisher, I have major issues with this.
    First, we sell our books with social DRM only. A book should be read privately, without collecting data on when, where, and how I read my ebooks.
    Just as when buying a printed book the retailer does not know much more than the name of the customer, so it should remain with ebooks.

    As a publisher I do want to know what sells, what ads drive sales, etc. But I don’t believe you can profile a reader in such a way that you would be able to identify what would be the next best seller.

    The ebook and the devices should only be used to deliver the content as an alternative to a printed book, nothing more.

    1. Ilan

      I am rewriting my comment as it looks like it did not come across properly when I published it through my tablet.

      I am both a retailer and a publisher, and I have major issues with this.

      First, we sell our books with social DRM only. A book should be read privately, without collecting data on when, where, and how I read my ebooks.
      Just as when buying a printed book the retailer does not know much more than the name of the customer, so it should remain with ebooks.

      As a publisher I do want to know what sells, what ads drive sales, etc. But I don’t believe you can profile a reader in such a way that you would be able to identify what would be the next best seller or what should be improved in the next book.

      The ebook and the devices should only be used to deliver the content as an alternative to a printed book, nothing more. A book is not a commodity; the only difference between an ebook, a book, and a scroll is the reading format.

  8. carmen webster buxton

    Dream on! Why would Amazon, B&N, and Apple all give publishers the advantage they have in assessing reader behavior for specific titles? I can see each of those companies providing that info to self-published authors as an incentive to use their platform, but that’s about it. In fact, I would not be surprised if Amazon made that the next incentive for KDP Select. They save their best carrots for exclusivity, which is generally not in an author’s best interest.


