What Code Is Revealing About Readers

Expert publishing blog opinions are solely those of the blogger and not necessarily endorsed by DBW.

There’s a brave new world in book publishing, and it’s being shaped by and around audience insights. Not only are publishers becoming more adept at using data to work smarter, but code and algorithms are also getting better at gathering information and executing tasks without the help of humans.

At Jellybooks, we recently developed a piece of code called candy.js, which is embedded inside an ebook to track how users actually read. Penguin Random House UK was among our earliest partners in a pilot program of the technology, and the insights we gathered were fascinating. The question now becomes what story this data tells us and what impact it might have.

To explore, I’m kicking off a new monthly column called “Audience+Insight” that will detail how acquiring, editing, positioning, promoting and marketing are all being reshaped by data—collected by Jellybooks and a growing list of others—about the ways consumers actually read books.

It’s not simply about verifying or disproving notions about how people read, but rather understanding the diversity of readers and audiences out there, and how we in the publishing industry can best cater to their interests and needs.

Not surprisingly, there is a lot of fear that the influx of data will lead publishers to only publish more celebrity biographies and vampire novels. But I think that fear is unfounded and obscures the potential of a technology that first and foremost allows us to judge what kind of audience a book appeals to.

Technology does not tell us if a book is good or bad; it merely identifies the types of readers who are engaging with particular types of books. Every book has its audience—some are bigger and some smaller—but reading analytics is above all else about understanding what that audience for a specific book looks like.

There are also ample opportunities for exploring how readers themselves might leverage their own reading data. Inevitably, there will be exciting new products and countless dead ends as we stumble through this still unfamiliar world.

The column will also examine new trends in machine learning and how they apply to publishing. I will attempt to dissect the algorithms and code behind what Apple, Amazon, Google and others are doing. Recommendation engines are not some kind of mystical black box; they follow rules and work with a particular type of data. Some are considerably less sophisticated than people might imagine, but they are getting smarter and smarter all the time.
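To make the "not a black box" point concrete, here is a minimal sketch of one of the simplest rule-based approaches: counting which titles turn up together in the same purchase history, then suggesting the most frequent companions. This is an illustration only, not a description of what Apple, Amazon, Google or Jellybooks actually run; the function names and the sample baskets are hypothetical and invented for this example.

```python
from collections import Counter
from itertools import permutations


def build_cooccurrence(baskets):
    """Count how often each ordered pair of titles appears in the same basket."""
    counts = {}
    for basket in baskets:
        # set() so a title listed twice in one basket is only counted once
        for a, b in permutations(set(basket), 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def recommend(counts, title, k=3):
    """Return up to k titles most often seen alongside `title`."""
    companions = counts.get(title, Counter())
    # Highest count first; ties broken alphabetically so results are stable
    ranked = sorted(companions.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]


# Hypothetical purchase histories, made up purely for illustration.
baskets = [
    ["Book A", "Book B", "Book C"],
    ["Book A", "Book B"],
    ["Book B", "Book D"],
    ["Book A", "Book C"],
]

counts = build_cooccurrence(baskets)
print(recommend(counts, "Book A"))
```

The rule is nothing more than "readers who bought this also bought that," and the only data it needs is a list of who bought what. More sophisticated engines add further signals, but the basic shape of rules applied to a particular type of data stays the same.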

What opportunities are there for growth hacking—that is, marketing on a shoestring budget using data and code? This should be natural for many publishers for whom limited monetary resources are the mother of ingenuity. I’ll review a few ideas I’ve written and spoken about before but with a new twist: how we can use data and code to tackle the book discoverability (read marketing) challenge.

And finally, I’ll consider what all this might mean for authors. Some publishers think data is for them alone to interpret and analyze, but what roles can authors play—and what can they gain in the process—as reader analytics grows up? What are the opportunities and what are the threats? Publishing isn’t becoming simpler, but it’s sure becoming more exciting.


4 thoughts on “What Code Is Revealing About Readers”

  1. Miss M

    You may want to clarify under what circumstances/how/if this code is being used by Penguin, or any other publishers, in items for sale to the general public.
    I’ve seen your article referenced in a couple of e-reader communities today re privacy concerns, with the inference that this is going wide in books for sale, without notifying purchasers. I see that in your second paragraph you hyper-link to a previous article stating this is being used in focus groups by volunteer readers, but your current piece doesn’t make clear whether that is still the only use.

    1. no

      Keep your @@@ virus out of my ebooks. The only thing that a publisher needs to know is that someone paid for the ebook.

    2. Andrew Rhomberg

      The post did contain a link to an earlier article, and there was a presentation at the DBW conference earlier this year.

      The technology has been and continues to be used exclusively in Advance Reader Copies (ARCs) and complimentary review copies, but instead of being asked to write a review, the recipient of the free book is asked to share their reading data.

      The volunteer gets a free ebook in exchange for promising to share their data (though they cannot be forced to share that data, and their active participation is still required or the data remains locked up in the reading app).

      Readers who are uncomfortable can simply purchase the book upon publication. Of course, all the major retailers collect reading data for their own internal purposes through their reading apps (Amazon’s Whispersync technology works on the basis that Amazon always knows which page a reader is on).

      To remain 100% anonymous, readers should walk into a bookshop wearing sunglasses, pay with cash and purchase a printed book. They are still widely available.

      Readers participating in the Penguin Random House focus groups (and those of other publishers) were clearly informed (and reminded) about the program. Each reader explicitly opted into the program.


