Leading Authors of Today's Magazine

Microsoft AI Records 5,000 Audiobooks for Project Gutenberg

July 28, 2024


October is National Book Month, and this Halloween there’ll be something new. In an evolutionary leap for the free ebook site Project Gutenberg, readers can now hear the tales of Edgar Allan Poe — or Frankenstein, or Shakespeare’s Macbeth with its spooky witches — magically read out loud by a 21st-century synthesized AI voice.

Researchers from Microsoft, Google, and MIT have teamed up with Project Gutenberg’s executive director Greg Newby to create 5,000 open-license audiobooks (roughly 35,000 hours of audio) read by a surprisingly human-sounding voice.

It’s a vast and varied collection containing both fiction and non-fiction — classic literature, plays, and even biographies. There’s something for everybody — from The Return of Sherlock Holmes by Sir Arthur Conan Doyle to The Return of Tarzan by Edgar Rice Burroughs. “We hope this contribution can provide value to both the research community, and the broader community of audiobook listeners,” the researchers wrote in a pre-print paper at arXiv.org. Titled “Large-Scale Automatic Audiobook Creation,” it argues that audiobooks “can dramatically improve a work of literature’s accessibility” — for the visually impaired, young children, and even new learners of a language.

“Reactions have generally been positive,” Project Gutenberg’s executive director Greg Newby told us in an email interview. “Audiobooks are quite popular, even our older ones from 2004 that have relatively low quality. People appreciate having a variety of literary works available as audiobooks, and of course many of the new audiobooks that Microsoft made from Project Gutenberg texts were not otherwise available as audiobooks — they are not popular enough for major platforms.”

Newby remembers one negative reaction, from someone who called the whole endeavor “inappropriate” — taking a human work of literature and feeding it into an unfeeling machine for the sole purpose of then artificially mimicking both human voices and intonations. But “This seemed to me like a general reaction,” Newby says, “not from someone who was going to listen to any audiobooks or who had prior knowledge of Project Gutenberg.”

“From my point of view, the work they completed (with my input and collaboration) is excellent, and Project Gutenberg is in favor of any activities that make literature more accessible to a broader audience at little or no cost.

“The Microsoft effort certainly ticks those boxes.”

Excited about Tech Philanthropy

Their paper notes it can take hours of work to produce and publish an audiobook. Actor Stephen Fry has recounted his tribulations accurately recording the text of the Harry Potter series.

The process is also expensive. But more importantly, the paper points out that audiobooks with a synthesized voice have “historically suffered from the robotic nature of text-to-speech systems.” In an explanatory video from Microsoft Cloud, Newby says that there’s always been a high demand for audiobooks, but “What we discovered, though, is that we weren’t really good at it, and so we ended up abandoning audiobooks.

“Until Microsoft said, ‘Hey we have some new technology for automated text-to-speech production.’”

In a video on the official Microsoft Developer channel on YouTube, Brendan Walsh summarized their stack for the ambitious project. “Fortunately, we’ve developed some tools, and we used some open source tools online that make it way, way easier… Specifically, we use Synapse ML with Apache Spark on Azure Synapse Analytics to generate a bunch of audiobooks.”

The end result was “The Project Gutenberg Open Audiobook Collection” — made available on the major podcast and streaming platforms, and also available in a single .zip file for researchers.

In the video, Walsh described himself as “excited about working on tech philanthropy.”

And lead researcher Mark Hamilton just sounded happy to be saying that their tech will “make these audiobooks really sound like a human’s reading them, instead of a robot!”

How Does It Sound?

The audiobooks have their own pages on Spotify, Apple Podcasts, Google Podcasts, and the Internet Archive. “Thank you for listening to this free audiobook,” each recording begins, “created by Project Gutenberg and Microsoft AI.”

And yes, although they lack the effusive human warmth of Stephen Fry, the voices could still easily be mistaken for a human’s. But they’re not perfect. The AI knows how to read Roman numerals, but gets confused by stand-alone letters like “I” and “V”. (So when reading Shakespeare’s Macbeth it reads the designation of the first scene, Scene I, as “scene eye,” while the fifth scene becomes “scene vee.”) And when one of Macbeth’s witches talks about tormenting the sea captain who’s the “master o’ th’ Tiger” (presumably a ship named the Tiger), the AI just gives up and spells out the letters, saying “master O T H Tiger.”
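To make those two stumbles concrete, here is a minimal Python sketch of the kind of text clean-up that could run before a passage is handed to a text-to-speech voice. It is an illustration only, not the researchers’ pipeline: the function names, the list of heading words, and the single elision rule are all assumptions made for this example.

```python
import re

# Assumption: only numerals that follow one of these heading words are converted,
# so the pronoun "I" elsewhere in the text is left alone.
HEADING_RE = re.compile(r"\b(Act|ACT|Scene|SCENE|Chapter|CHAPTER)\s+([IVXLCDM]+)\b")

# A well-formed uppercase Roman numeral between 1 and 3999.
ROMAN_RE = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Hypothetical rule for the one elision quoted in the article (straight or curly apostrophes).
ELISION_RE = re.compile(r"\bo[’'] th[’']")


def roman_to_int(numeral: str) -> int:
    """Convert a valid Roman numeral by scanning right to left."""
    total = 0
    largest_seen = 0
    for ch in reversed(numeral):
        value = ROMAN_VALUES[ch]
        if value < largest_seen:
            total -= value  # a smaller symbol left of a larger one is subtracted
        else:
            total += value
            largest_seen = value
    return total


def normalize_for_speech(text: str) -> str:
    """Rewrite heading numerals as digits and expand the "o' th'" elision."""

    def replace_heading(match: re.Match) -> str:
        word, numeral = match.group(1), match.group(2)
        if ROMAN_RE.fullmatch(numeral):
            return f"{word} {roman_to_int(numeral)}"
        return match.group(0)  # leave malformed numerals untouched

    text = HEADING_RE.sub(replace_heading, text)
    return ELISION_RE.sub("of the", text)


print(normalize_for_speech("ACT I. SCENE V. Master o’ th’ Tiger."))
```

The trade-off is that expanding an elision changes what the listener hears compared with the printed line, so a real system might prefer a pronunciation hint over rewriting the text.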

Although perhaps more disappointing is how it reads every part in the exact same voice. Macbeth and Lady Macbeth are the same male narrator, as are the three witches, Banquo, and King Duncan. Newby says he’s heard that feedback as well. “Someone else commented that there don’t seem to be any female-sounding voices, and asking why not. I’ve passed that comment to Microsoft, and agree there should be a variety of voices.”

The researchers’ paper also talks about their work on an “automatic speaker and emotion inference system” which would scan the context of passages and then “dynamically change the reading voice and tone” to make dialogue “more life-like and engaging”, even predicting the appropriate emotion to use in their dialogue. (In 2020 some of the same researchers had worked on a more natural-sounding text-to-speech system by first building a “spontaneous conversational speech corpus” for training, and then equipping their system with a “conversational context encoder” for selecting the appropriate tone for responses.)
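As a rough picture of what switching voices could look like, the Python sketch below splits a passage into narration and quoted dialogue and wraps each piece in a standard SSML voice element. It is a simple rule-based stand-in, not the learned speaker and emotion inference the paper describes, and the voice names “narrator-voice” and “dialogue-voice” are hypothetical placeholders rather than real voices.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Assumption: dialogue is whatever sits between a pair of curly or straight double quotes.
QUOTE_RE = re.compile(r"“[^”]*”|\"[^\"]*\"")


def split_narration_and_dialogue(passage: str) -> list:
    """Return (role, text) pairs, where role is "narration" or "dialogue"."""
    segments = []
    cursor = 0
    for match in QUOTE_RE.finditer(passage):
        before = passage[cursor:match.start()].strip()
        if before:
            segments.append(("narration", before))
        segments.append(("dialogue", match.group(0)))
        cursor = match.end()
    tail = passage[cursor:].strip()
    if tail:
        segments.append(("narration", tail))
    return segments


def to_ssml(segments: list, voices: dict) -> str:
    """Wrap each segment in an SSML <voice> element chosen by its role."""
    parts = ['<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">']
    for role, text in segments:
        parts.append(f"<voice name={quoteattr(voices[role])}>{escape(text)}</voice>")
    parts.append("</speak>")
    return "".join(parts)


# Hypothetical voice names; a real speech service defines its own.
voices = {"narration": "narrator-voice", "dialogue": "dialogue-voice"}
segments = split_narration_and_dialogue("She said, “Hello.” Then she left.")
print(to_ssml(segments, voices))
```

A system like the one in the paper would also need to decide which character is speaking and in what mood, which quote matching alone cannot do.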

Looking to the future, Newby says that “Eventually it would be great if people could select their own preferences for voice, speed, etc. and get an audiobook made just for them!” Newby says he has seen a demo of Microsoft’s technology which does swap in different voices for different characters, but unfortunately this feature “didn’t make it into the current collection.”

This is the first time I’ve heard AI audio narration referred to as synthetic speech…

Project Gutenberg puts 5,000 audiobooks online for free using synthetic speech | TechCrunch https://t.co/mGEtTUfZgG pic.twitter.com/8VeENWotgc

— Terri Nakamura (@terrinakamura) September 22, 2023

The Shape of Things to Come

The project’s lead researcher even told Popular Science they hope to create free audiobooks for all 60,000 of the ebooks available on Project Gutenberg, possibly even translating them into different languages. “We’ll see if we can scale this out,” Hamilton said in the YouTube interview on the Microsoft Developer channel.

And their paper also talks of a demonstration app that “allows conference attendees to create a custom audiobook, read aloud in their own voice, using only a few seconds of example sound.” In essence, the system “clones” each participant’s voice using a speedy technique known as “zero-shot text-to-speech.” (Although attendees will also have the option of just selecting another pre-synthesized voice.) The attendees will no doubt also be amazed that the audiobook is generated in just a few seconds. In a video on YouTube, lead researcher Mark Hamilton creates an audiobook of Alice in Wonderland in 15 seconds.

And then users can even create a custom dedication, which the AI-speaking-in-their-voice will read before the text of the ebook. “Once the pipeline finishes we will email the user a link to download their custom-made audiobook.”

But Project Gutenberg’s Newby applauds what’s perhaps the most important feature of all: that the work is all open source. The code is available on Microsoft’s Synapse ML site.

“The great thing about the work that Microsoft completed is that not only are the books completely free, so is the software. This could be leveraged by others interested in pursuing their own enhancements or in just using the software as it currently exists.”



David Cassel is a proud resident of the San Francisco Bay Area, where he’s been covering technology news for more than two decades. Over the years his articles have appeared everywhere from CNN, MSNBC, and the Wall Street Journal Interactive…






Credit goes to @thenewstack.io
