
When voice actor Heath Miller sits down in his boatshed-turned-home studio in Maine to record a new audiobook narration, he has already read the text through carefully at least once. To deliver his best performance, he takes notes on each character and any hints of how they should sound. Over the past two years, audiobook roles, like narrating the popular fantasy series He Who Fights With Monsters, have become Miller's main source of work. But in December he briefly turned online detective after he saw a tweet from UK sci-fi author Jon Richter disclosing that his latest audiobook had no need for the kind of artistry Miller offers: It was narrated by a synthetic voice.
Richter's book listing on Amazon's Audible credited that voice as "Nicholas Smith" without disclosing that it wasn't human. To Miller's surprise, he found that "Smith" voiced a total of around half a dozen titles on the site from multiple publishers, breaching Audible rules that say audiobooks "must be narrated by a human." Although "Smith" sounded more expressive than a typical synthetic voice, to Miller's ear it was plainly artificial and offered a worse experience than a human narrator. It made giveaway mistakes, like pronouncing Covid as "kah-viid" when referring to the pandemic.
Miller tracked down "Smith": the voice matched a sample posted to SoundCloud by Speechki, a San Francisco startup that offers more than 300 synthetic voices for audiobook publishing across 77 dialects and languages. He and other narrators and audio fans who discussed the artificial audiobooks online reported the titles to Audible, which eventually removed them. Although it wasn't a large number, discovering that synthetic voices were good enough for some publishers to put them to work prompted Miller to wonder about the future of his art and income. "It's a little terrifying because it's my livelihood and that of many people I respect," he says.
Richter says he chose an artificial voice because the concept and its "uncanny valley" sound suited his book, which has a piece of intelligence software as one of its main characters, and that he was unaware of Audible's policies. "My intention was never to upset or offend anyone," he says. Speechki says it recommends publishers identify that narrations are synthetic and that it informs them of Audible's policies. Will Farrell-Green, a senior director at Audible, said in an emailed statement that the company uses automated and manual processes to enforce its rules but that "due to the volume of content on our service, titles that are not compliant do slip through from time to time." Audible's "humans only" policy dates back to at least 2014, when synthetic voices were much less convincing, and the company has said the rule helps provide listeners the performances they expect.
Synthetic voices have become less grating in recent years, in part due to artificial intelligence research by companies such as Google and Amazon, which compete to offer virtual assistants and cloud services with smoother artificial tones. Those advances have also been used to make reality-spoofing "deepfakes." Speechki is one of several startups developing speech synthesis for audiobooks. It analyzes text with in-house software to mark up how to inflect different words, voices it with technology adapted from cloud providers including Amazon, Microsoft, and Google, and employs proof listeners who check for mistakes. Google is testing its own "auto-narration" service that publishers can use to generate English audiobooks for free, using more than 20 different synthetic voices. Audiobooks published through the program include an academic history of theater and a novelist's exploration of cultural attitudes to sex. Google spokesperson Dan Jackson says its auto-narrated books supplement rather than replace professionally narrated books. "Our goal with auto-narration is to make it possible to create a low-cost audiobook for any ebook title and increase content accessibility for those that are unable to read via ebook," he says.
Some publishers see synthetic voices as a way to tap the growing demand for audiobooks, a segment healthier than other parts of the book business. Total US book publisher revenue declined slightly between 2015 and 2020 and ebook revenue shrank, but audiobook revenue surged by 157 percent, according to the Association of American Publishers. Consumers have steadily grown more comfortable with the format, helped along by technical improvements to mobile apps, smart speakers, and wireless headphones. But due to the cost of a narrator and audio production, most titles never become audiobooks, particularly at smaller publishers, says Brian Carroll, rights manager at Indiana University Press.
IU Press licenses a fraction of its catalog for traditional audio production but is now a customer of Speechki. It plans to release its first synthetically narrated audiobooks later this year. "All the other books at last have a chance of becoming audiobooks now," Carroll says.
Speechki's technology has been impressive in tests so far, Carroll says, navigating the academic language of titles on paleontology and philosophy. One book chosen for production is Around the World in 80 Toasts, in which the software has to handle text sprinkled with words from other languages. "We thought if it can do this it will probably be able to do anything, and it did a pretty good job," Carroll says.






