Leading Authors of Today's Magazine

OpenAI Destroyed AI Training Data. Staff Who Collected It Are Gone.

May 25, 2024

Newly unsealed documents in the class-action lawsuit brought by the Authors Guild against OpenAI show the startup deleted two huge datasets, named “books1” and “books2,” that had been used to train its GPT-3 artificial-intelligence model.

Lawyers for the Authors Guild said in court filings that the datasets probably contained “more than 100,000 published books” and were central to its allegations that OpenAI used copyrighted materials to train AI models.

For months the Guild has been seeking information from OpenAI about the datasets. The company initially resisted, citing confidentiality concerns, before ultimately disclosing that it had deleted all copies of the data, according to the legal filings reviewed by Business Insider.

High-quality training data is an essential ingredient of the powerful AI models that are taking the tech world by storm. OpenAI and other companies used data from the internet, including many books, to build these models. Many of the companies that created this material want to be paid for supplying the knowledge behind these new AI products, while tech companies don’t want to be forced to pay. The dispute is now being fought in court through several lawsuits.

In a 2020 white paper, OpenAI described the “books1” and “books2” datasets as “internet-based books corpora” and said they made up 16% of the training data that went into creating GPT-3. The white paper also says “books1” and “books2” together contained 67 billion tokens of data, or roughly the equivalent of 50 billion words. For comparison, the King James Bible contains 783,137 words.
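
To put those figures in perspective, the comparison can be worked through in a few lines of Python. This is only a rough sketch using the rounded numbers quoted above; the variable and function names are illustrative and do not come from OpenAI or the court filings.

```python
# Rounded figures as quoted in the article (attributed to OpenAI's 2020 white paper).
BOOKS_TOKENS = 67e9    # tokens in "books1" and "books2" combined
BOOKS_WORDS = 50e9     # approximate word equivalent given in the article
KJV_WORDS = 783_137    # words in the King James Bible


def words_per_token(words: float, tokens: float) -> float:
    """Average number of words represented by one token."""
    return words / tokens


def bible_equivalents(words: float, bible_words: int = KJV_WORDS) -> float:
    """How many King James Bibles the word count corresponds to."""
    return words / bible_words


ratio = words_per_token(BOOKS_WORDS, BOOKS_TOKENS)
bibles = bible_equivalents(BOOKS_WORDS)

print(f"{ratio:.2f} words per token")
print(f"about {bibles:,.0f} King James Bibles")
```

On these rounded inputs, a token works out to roughly three-quarters of a word, and the two datasets together are on the order of 64,000 King James Bibles.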

The unsealed letter from OpenAI’s lawyers, which is labeled “highly confidential – attorneys’ eyes only,” says that the use of “books1” and “books2” for model training was discontinued in late 2021 and that the datasets were deleted in mid-2022 because of their nonuse. The letter goes on to say that none of the other data used to train GPT-3 has been deleted and offers attorneys for the Authors Guild access to those other datasets.

The unsealed documents also disclose that the two researchers who created “books1” and “books2” are no longer employed by OpenAI. OpenAI initially refused to share the identities of the two employees.

The startup has since identified the employees to lawyers for the Authors Guild but hasn’t publicly disclosed their names. OpenAI has petitioned the court to keep the names of the two employees, as well as information about the datasets, under seal. The Authors Guild has opposed this, arguing for the public’s right to know. The dispute is ongoing.

“The models powering ChatGPT and our API today were not developed using these datasets,” OpenAI said in a statement on Tuesday. “These datasets, created by former employees who are no longer with OpenAI, were last used in 2021 and deleted due to non-use in 2022.”




