Audiobook Production Has Never Been Easier

bradleymetrock
Sep 28, 2021

The publishing industry includes more than 30,000 publishers and millions of independent self-published authors around the globe. Excluding translations, they produce at least 2.2 million unique titles a year.

The audiobook market isn't new, but it is growing by dozens of percentage points each year. It has the potential to grow much faster, yet the industry has not experienced a revolution analogous to those that have transformed other tech fields. Most publishers are unaware of many of the opportunities they already have: only about 4.5% of all new titles are converted into audiobooks.

Why don't publishers take the chance to turn this 4.5% into 10%, 20%, or even 50%? Is it that they lack the rights to produce audiobooks? No. Publishers have the rights to produce audiobooks from all their titles, but the unit economics don't work with traditional methods of audiobook production, which are slow, expensive, and highly complex. Just imagine how many idle titles could be heard, how many readers and listeners could find the book they need in a modern, convenient audio format, and how much the industry could accelerate if publishers had a fast, cheap, and simple audiobook production method.

Back-catalogs are no longer an idle resource

Meet the company that aims to disrupt the industry. Speechki (speechki.io) is an audiobook recording platform that uses synthetic voices to multiply publishers' audiobook inventories several times over.

The company aims to expand the audiobook industry as it already exists, not to replace it. No technology will ever superannuate the skills of a talented voice actor. But actors deserve to be paid and have only so much time, so at the current rate they record only a small fraction of available books. Their skills will continue to be in demand: professional narration is an art, not a science. Speechki's services, meanwhile, allow publishers to present a huge new variety of titles to the listening public in a pleasant, easily accessible form.

Speechki is trusted by major players in its industry. The company has been backed by Greycroft, Alchemist Accelerator, and the Co-Founder and CEO of Blinkist. They saw Speechki as the purveyor of an impressive technology that has the potential to revolutionize the audiobook recording industry.

There's No Time Like the Present

The time is ripe for automated audiobook production because of advances in the technologies that are used to create them. Synthetic voices now sound so good that listening to them for great lengths of time is a pleasant experience for almost every audiobook consumer.

Computer voices, a.k.a. text-to-speech (TTS) voices, once seemed robotic, monotonous, and lifeless. They have now been transformed into natural-sounding, realistic voices that not only mimic human speech but can also generate full-length, near-perfect audio narrations.

There is no need to cheat or dissemble. You can be absolutely upfront about the fact that you are using a synthetic voice. For example, 750 audiobooks were recorded for Storytel, Eksmo, and other publishers using AI voices. All of them were marked "narrated by AI," and listeners were surveyed about them. In essence, they said, "We love it, because now these books are available in the audio format. The quality is good enough for comfortable listening. And sometimes, human-narrated audiobooks even sound worse." Not every actor has a voice that's pleasant to every listener. But mainly, it's about availability.

Many technology innovators like to mystify the AI-facilitated audiobook production process. They describe it as very complex, dwelling on the byzantine inner workings of neural networks, recondite voice model point settings, and obscure manual intonation controls. Sure, it sounds very cool, technological, and scientific, but this is the root of the problem: it creates an unnecessarily high barrier to entry and stops many publishers from just diving into the process. In fact, it's extremely simple to create 10 AI-narrated audiobooks as a pilot program to test whether the technology is a good fit.

Easy as Pie

Speechki wants to break down the barrier to entry that prevents people from getting started with creating machine-generated audiobooks from existing texts. Actually, the process of creating AI-narrated audiobooks with Speechki looks remarkably similar to the traditional way -- only vastly streamlined and speed-optimized.

Step 1: Just upload your book to Speechki's easy-to-use system. The interface is extremely intuitive and should look familiar to anyone who has used Google Docs or Microsoft Word.

Step 2: Choose which of seventy languages the book is written in, and pick one of twenty ultra-realistic AI narrator voices to read your book. Then press "START."

Step 3: Speechki's AI-powered backend system synthesizes the audiobook. While that happens, do nothing. Sit back and relax for a little while: synthesis takes about fifteen minutes for an approximately eight-hour audiobook.

Step 4: When the audiobook is ready, you can "proof-listen" through the book to catch any errors the AI might have made. Even human narrators stumble sometimes, and machine-generated narrators are no different. At this stage you can also add sound effects or music cues if you think the book might be enhanced by the sound of gulls on the beach or some moody strings in the background.

If you don't have time to proof-listen to the audiobook and finalize it yourself, it needn't be an obstacle. For an additional fee, Speechki can take care of the proof-listening process with its own in-house team.

Step 5: Once you have proof-listened to the audiobook, the system lets you download an audiobook prepared for sale. Done!
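As a rough check on the turnaround quoted in Step 3, the arithmetic can be sketched in a few lines of Python. The fifteen-minute and eight-hour figures come from the text above; the function names are illustrative, and the batch estimate assumes books are synthesized one after another, which may not match how Speechki actually schedules work.

```python
def speedup(audio_hours: float, synthesis_minutes: float) -> float:
    """How many times faster than real time the synthesis runs."""
    return (audio_hours * 60) / synthesis_minutes


def sequential_batch_hours(num_books: int, synthesis_minutes: float) -> float:
    """Total synthesis time in hours, assuming books are processed one at a time."""
    return (num_books * synthesis_minutes) / 60


# Figures quoted in the article: ~15 minutes for an ~8-hour audiobook.
factor = speedup(audio_hours=8, synthesis_minutes=15)

# A 10-title pilot, assuming every title is about the same length.
pilot_hours = sequential_batch_hours(num_books=10, synthesis_minutes=15)

print(f"Synthesis runs about {factor:.0f}x faster than real time")
print(f"A 10-title pilot needs about {pilot_hours:.1f} hours of synthesis")
```

Proof-listening is not included in this estimate; it still happens at human speed unless it is handed off.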

Demonstrating the maturity of the solution, Speechki is already working with Storytel, the Swedish audiobook streaming service, and with imprints of Hachette Book Group, the third-largest trade and education publisher in the world, providing them with high-quality audiobooks using synthetic voices.

The secret sauce

The synthetic voices Speechki uses are provided by Microsoft, IBM, Google, AWS, and other big players in the field, and Speechki can adapt its process to all synthetic voices, no matter who produced them. The focus is on controlling the best available voices. This offers customers the flexibility to choose from over 200 ultra-realistic voices in more than 70 languages, to find what best fits their use cases.


Speechki's main advantage is its focus on the audiobook market and audiobook production. Customers are impressed not only by the naturalness of the speech but also by Speechki's ease of use and by how long its audio remains pleasant to listen to. While other services can produce human-sounding speech for five- or ten-second clips, Speechki can produce outstanding audio for long-form content.

All existing computerized voices are adequate for recording short samples and clips. Speechki's product adds unique value through automatic text formatting, processing, and voice diversity. The adaptation differs from voice to voice, because each voice runs on its own model and has its own requirements. This adaptation makes these improved AI voices suitable for long, easily listenable texts, including lengthy audiobooks.
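One common piece of this kind of text processing is splitting a long manuscript into pieces small enough for a given voice to handle. The sketch below is only an illustration of that general idea, not Speechki's actual method: the function name, the voice names, and the character limits are all hypothetical.

```python
import re

# Hypothetical per-voice request limits, in characters (illustrative values only).
VOICE_LIMITS = {"voice_a": 3000, "voice_b": 5000}


def split_into_chunks(text: str, max_chars: int) -> list[str]:
    """Greedily pack whole sentences into chunks of up to max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk
    rather than being cut mid-sentence.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


chapter = "It was a quiet morning. The gulls circled the bay. Nobody noticed the ship."
for voice, limit in VOICE_LIMITS.items():
    print(voice, len(split_into_chunks(chapter, limit)), "chunk(s)")
```

Breaking at sentence boundaries keeps each request self-contained, so the intonation of one chunk does not depend on a sentence that was cut in half.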

Text-to-speech platforms typically require professional developers to write speech synthesis markup (SSML) and to work through an API. Speechki instead provides a simple interface that resembles a standard word processor. The system applies the markup itself, without showing the source code to the proof-listener.
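To give a sense of what that hidden markup looks like, here is a minimal sketch that wraps plain paragraphs in SSML (Speech Synthesis Markup Language, a W3C standard) using Python's standard library. The function name and the default pause length are assumptions made for illustration. The output is a simplified fragment: each vendor accepts its own SSML dialect and may require extra attributes, and this is not a description of what Speechki generates internally.

```python
import xml.etree.ElementTree as ET


def paragraphs_to_ssml(paragraphs: list[str], pause_ms: int = 600, lang: str = "en-US") -> str:
    """Wrap plain-text paragraphs in a simplified SSML document.

    Each paragraph becomes a <p> element, and a <break> is placed between
    consecutive paragraphs so the narrator pauses briefly.
    """
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": lang})
    for index, text in enumerate(paragraphs):
        paragraph = ET.SubElement(speak, "p")
        paragraph.text = text  # ElementTree escapes &, < and > when serializing
        if index < len(paragraphs) - 1:
            ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    return ET.tostring(speak, encoding="unicode")


ssml = paragraphs_to_ssml(["Chapter One", "Tom & Jerry ran to the beach."])
print(ssml)
```

A word-processor-style interface can generate this kind of markup behind the scenes whenever the editor adjusts a pause or an emphasis, which is why the proof-listener never has to see it.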

Ready to create or expand your audiobook library and rapidly increase your audiobook revenue stream? Check out Speechki's website for more information. Samples of Speechki-produced audiobooks are available on SoundCloud to give you a feel for what to expect from your next audiobook. To get started, contact Speechki today, or write directly to Dima, co-founder and CEO of Speechki, at dima@speechki.org.