Court filings reveal Meta paused its efforts to license books for AI training

New court filings in an AI copyright case against Meta lend credibility to reports that the company “paused” talks with book publishers about licensing deals to provide some of its generative AI models with training data.

The filings relate to Kadrey v. Meta, one of many cases pending in U.S. courts that pit AI companies against authors and other intellectual property owners. In most of these cases, the defendants (the AI companies) claim that training on copyrighted material is “fair use.” The plaintiffs (the copyright holders) vociferously disagree.

According to the new filings, which include partial transcripts of Meta employee depositions taken by plaintiffs’ attorneys in the case, certain Meta staff believed that negotiating AI training data licenses for books might not be scalable. According to one transcript, Sy Choudhury, who leads Meta’s AI partnership initiatives, said that Meta’s outreach to publishers was met with a “very slow uptake of engagement and interest.” Choudhury also stated that some publishers, namely fiction book publishers, did not have the rights to the content Meta was considering licensing.

“I would like to point out that — in the fiction categories, we quickly learned that the majority of publishers we were speaking to, they themselves represented that they didn’t have, actually, the rights to license the data to us,” Choudhury said. “It would take a very long time to get to know all of their authors.”

Choudhury stated during his deposition that Meta had paused licensing activities related to AI development on at least one other occasion. He said he was aware of other licensing efforts, such as an attempt to license 3D worlds from different game engine and game manufacturers for Meta’s AI research team. “And just as I’m describing for fiction and textbooks, we got very few engagements to have a discussion […] In that case, we decided — in that case we decided to build a solution.”

The latest amended complaint filed by plaintiffs’ attorneys alleges that Meta, among other offenses, compared certain pirated copies of books with copyrighted copies available for licensing to determine whether it made sense to pursue an agreement with a publishing house.

The complaint accuses Meta of using “shadow libraries” of pirated e-books to train its AI models, including its popular Llama series of “open” models. According to the complaint, Meta may have obtained some of the libraries through torrenting. Torrenting is a method of distributing files over the internet that requires torrenters to simultaneously “seed,” or upload, the files they are trying to obtain. The plaintiffs claim this was a form of copyright infringement.

Kyle Wiggers is a senior reporter at TechCrunch with a special interest in artificial intelligence. His writing has appeared in VentureBeat, Digital Trends, and a variety of gadget blogs, including Android Police, Android Authority, Droid-Life, and XDA-Developers. He lives in Brooklyn with his partner, a piano teacher, and occasionally plays the piano himself, if mostly unsuccessfully.


