
Meta used pirated books to train its AI models, and there are emails to prove it



Facepalm: Authors have sued Meta, alleging that it used unauthorized copies of their books to train its generative AI models. Although Meta has denied any wrongdoing, newly unveiled messages suggest that executives knew their actions violated copyright law.

The lawsuit filed against Meta by Sarah Silverman, Richard Kadrey, and other writers and rights owners may have entered its most critical stage. The authors obtained internal Meta emails in which employees openly discussed “torrenting” well-known archives of pirated content to train more powerful AI models.

Meta has previously admitted to using controversial datasets, arguing that this should be considered fair use. The company admitted to downloading “LibGen,” a massive dataset containing millions of pirated ebooks. The newly released emails reveal deeper concerns within Meta about acquiring and distributing this data via the BitTorrent network.

The emails reveal that Meta downloaded and distributed at least 81.7 terabytes of data across multiple datasets, including 35.7 terabytes derived from Z-Library and LibGen archives. The plaintiffs claim that Meta engaged in an “astonishing” torrenting scheme, distributing pirated ebooks on an unprecedented scale.

In September 2023, Bashlykov said that he had consulted Meta’s legal department because torrenting and “seeding” terabytes of pirated data were clearly illegal.

According to the emails, Meta knew that its engineers were illegally torrenting data to train AI models, and Mark Zuckerberg was reportedly aware of LibGen as well. To hide this activity, Meta ran the torrenting and seeding on servers outside Facebook’s main network. In a separate internal message, Meta employee Frank Zhang described this approach as “stealth mode.”

Meta, like other major tech companies, is investing massive amounts of cash in generative AI development. The company, which wants to populate its older social networks with AI-generated personas and bots, recently filed a motion to dismiss the lawsuit brought by Silverman and others. The newly revealed emails detailing Meta’s involvement in torrenting and distributing pirated ebooks could complicate its legal defense.

www.aiobserver.co
