Expert publishing blog opinions are solely those of the blogger and not necessarily endorsed by DBW.
If you published a book before 2008, its ebook edition was probably created using optical character recognition (OCR). And if your ebook was created using OCR, it probably has typos in it. That’s the bad news.
The good news: you don’t have to accept this situation.
What’s special about the year 2008? Nothing, really. I just chose 2008 because the first Kindle came out in late 2007. So 2008 is the earliest year I can imagine a significant number of publishers adopting a single-source workflow: a workflow in which the ebook is created from the same files used to create the paper book. For example, nowadays Adobe InDesign can create an ebook and a paper book (well, a PDF) from the same file. A single-source workflow avoids OCR and OCR-caused typos. It doesn’t avoid all problems, but it goes a long way toward making higher-quality ebooks.
Many publishers continued to use OCR for books published more recently than 2008. On the other hand, commendably, some publishers used single-source workflows for books published before 2008. Since files may be available for books published as long ago as the 1970s, single-source workflows are possible (though unlikely) for books published while Jeff Bezos was still a child.
The bottom line for authors is this: regardless of its year of paper publication, ask your publisher whether OCR was used to create the ebook edition of your book.
If OCR was used, your ebook probably has typos in it. It was probably spellchecked, but not carefully. The whole conversion, including spellchecking, was probably outsourced to inexpensive workers who, even if their English skills were good, were probably working under severe time constraints. And even the most careful spellchecking, as you know, is no substitute for good old proofreading. Your ebook was almost certainly not proofread.
So what can you do?
Ask your publisher to tell you what efforts they made to have your ebook match your paper book. If you are not impressed, try to get them to make more of an effort and fix it now, after the fact. You should not be impressed with an answer like “we outsourced the conversion to one of the most popular conversion houses used by the industry,” because in this case “popular” means low-cost, not high-quality. And you definitely should not be impressed with an answer like “we spellchecked it.”
You should only be impressed with one answer: “We proofread it just as carefully as we did the paper version.”
By the way, if you have an agent, ask your agent to take up these ebook quality issues with the publisher. After all, it is your agent who should have been advocating for the proper treatment of your work in the first place. By and large, agents have either not understood what has happened to their authors’ work in ebooks, or they have understood but have been unwilling or unable to prevent it. Therefore, they share quite a bit of the blame here. For-profit publishers are just that: for profit. Where profit is aligned with authors’ interests, great. Where it is not aligned, that’s where agents need to advocate. Ebook quality may be one of those “unaligned” areas.
Regardless of who asks the publisher—you or your agent—what can you ask your publisher to do if you suspect a low-quality ebook conversion has been performed?
Some good remediation options include asking them to:
1. reconvert it using a single-source workflow.
2. proofread it carefully.
3. have the book re-typed and compared to the OCR conversion.
The publisher may claim that #1 (single-source reconversion) is impossible. This is certainly true if the book is old enough to have been typeset using pre-computer technology. But as long as it was typeset with a computer, there is (or should be!) a file somewhere, and this file should be decodable. If they’ve lost the file (incredibly, this happens!), or won’t pay to have it decoded (by the way, this is my specialty), then perhaps #2 (proofreading) or #3 (re-typing) would work.
Re-typing, when combined with OCR, is a laborious but high-quality strategy. The reason it results in high quality is that the types of errors that a (human) typist makes are very different from the types of errors that OCR software makes. When you combine two noisy (error-containing) versions of the same signal (the book), if the two noises are uncorrelated, you can recover a surprisingly high-quality version of the original.
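To make the comparison step concrete, here is a minimal sketch in Python of how two versions of the same passage might be checked against each other. It is an illustration only, not a description of any conversion house's actual process. The function name `find_disagreements` and the sample sentences are hypothetical; the only library used is Python's standard `difflib`. The idea is that wherever the OCR text and the re-typed text agree, the word is very likely correct, and wherever they disagree, a human checks the printed page.

```python
import difflib


def find_disagreements(ocr_text, typed_text):
    """Return (ocr_words, typed_words) pairs where the two versions differ.

    Both arguments are plain strings. Words are split on whitespace,
    which is a simplifying assumption for this sketch.
    """
    ocr_words = ocr_text.split()
    typed_words = typed_text.split()
    # autojunk=False turns off a heuristic that can skip frequent words
    # in long sequences, which is not wanted for book-length text.
    matcher = difflib.SequenceMatcher(a=ocr_words, b=typed_words, autojunk=False)
    disagreements = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            disagreements.append((ocr_words[i1:i2], typed_words[j1:j2]))
    return disagreements


# Hypothetical sample: the OCR version has two OCR-style errors
# ("modem" for "modern", "comer" for "corner"); the typed version
# has one typist-style error ("teh" for "the").
ocr = "The modem reader met the writer at the comer cafe."
typed = "The modern reader met teh writer at the corner cafe."

for ocr_part, typed_part in find_disagreements(ocr, typed):
    print(ocr_part, "<->", typed_part)
```

Each flagged pair still needs a person to decide which version is right, but the proofreader's attention goes only to the places where the two versions disagree. An error slips through only if both versions happen to make the same mistake at the same spot, which is rare when the two kinds of error are uncorrelated.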
Finally, it is reasonable to ask on what basis you (the author) or your agent can make such ebook quality demands of the publisher. Even if you don’t ask yourself this question, your publisher almost certainly will!
The answer, as is so often the case, is that it depends on your contract. Before diving into questions of ebook quality, it might be worth stepping back for a second and asking if your publisher even has the rights to make an ebook. If your original contract explicitly included ebook rights, then of course there is no issue. Similarly, there is no issue if you signed a new contract giving your publisher ebook rights.
The only case in which there might be an issue is if your original and only contract was interpreted by your publisher to include ebook rights. This means one of two things. If your publisher’s interpretation was correct, then, in my opinion, your original contract was over-broad, but that’s water under the bridge. The other possibility, however, is that your publisher’s interpretation went beyond their rights, in which case you could of course sue. But it probably makes more sense to just demand a new explicit contract for ebooks. You should demand pretty favorable terms as compensation for their going beyond their original contractual bounds.
So assuming the basic issue of rights is resolved, let’s get back to the much trickier question of what ebook quality obligations are implied by the contract. I say “implied” because even in a contract that explicitly lays out ebook rights, it is very rare that ebook quality obligations are explicitly laid out.
So you’ll have to rely on whatever paper-edition quality obligations are explicit in the contract. Hopefully there was a final correction phase in your paper publication process, as specified by your contract. This would typically be the correction of page proofs. Your publisher was hopefully obligated to print—and hopefully did print—something that more or less perfectly matched the corrected page proofs.
My argument, which I urge you to make your own, is that, lacking any specific ebook quality obligations, your contract implies that your ebook, like your paper book, should have been produced to match the corrected page proofs. This is the basis on which I think you can make ebook quality demands of your publisher.
I hope this article has alerted you to what might have happened to your book during conversion to ebook, and what you can do about it. Feel free to post any thoughts or questions in the comments below.