Reddit Sues Perplexity and Data Scrapers Over Alleged Copyright Infringement

'Would-Be Bank Robbers': Reddit Sues Perplexity, Data Firms Over AI Scraping
CNET

Key Points

  • Reddit sued Perplexity and data‑scraping firms Oxylabs UAB, AWMProxy, and SerpApi in federal court.
  • The complaint alleges illegal scraping of Reddit content, bypassing technical barriers.
  • Reddit claims the firms accessed nearly three billion search‑engine result pages in a two‑week period.
  • Perplexity was previously sent a cease‑and‑desist letter and remains a SerpApi customer.
  • Reddit hosts over 110 million daily active users and more than 22 billion posts and comments.
  • The platform has licensing deals with OpenAI and Google and has sued Anthropic over data use.
  • Perplexity faces a separate copyright lawsuit from Encyclopedia Britannica.
  • The case reflects broader industry disputes over AI training data and copyright law.

Reddit has filed a lawsuit in the U.S. District Court for the Southern District of New York against AI search developer Perplexity and three data‑scraping firms—Oxylabs UAB, AWMProxy and SerpApi—accusing them of illegally harvesting Reddit content and violating the platform’s copyright protections. The complaint alleges the firms bypassed technical barriers, accessed billions of search‑engine result pages, and traced the scraped data back to Perplexity, which had previously received a cease‑and‑desist letter. Reddit, which hosts over 110 million daily active users and more than 22 billion posts and comments, has previously licensed its data to OpenAI and Google and has taken legal action against other AI companies over similar concerns.

Lawsuit Details

Reddit filed a complaint in the U.S. District Court for the Southern District of New York, naming Perplexity and three data‑scraping companies—Oxylabs UAB, AWMProxy and SerpApi—as defendants. Reddit alleges the firms illegally scraped its content, circumventing Reddit’s and Google’s technical safeguards to access nearly three billion search‑engine result pages over a two‑week period in July. The filing describes the defendants’ methods as “masking their identities and locations” to evade detection.

Reddit traced the scraped material back to Perplexity, prompting a prior cease‑and‑desist letter that has not been heeded. Perplexity remains listed as a customer of SerpApi, alongside Meta, Samsung and Nvidia. The lawsuit claims the scraping violates Reddit’s copyright protections and seeks remedies for the alleged infringement.

Context and Industry Impact

Reddit is one of the internet’s largest platforms, reporting over 110 million daily active users and more than 22 billion posts and comments. Its vast repository of human‑generated content makes it a valuable source for AI training data. The company has entered licensing agreements with OpenAI and Google and has previously sued Anthropic for similar data‑use concerns.

Perplexity itself faces a separate copyright lawsuit from Encyclopedia Britannica, which owns the Merriam‑Webster dictionary. These cases highlight the broader legal debate over whether AI companies must obtain licenses for copyrighted material or can rely on “fair use” defenses. Recent court decisions have granted fair‑use victories to Meta and Anthropic, while other high‑profile cases, such as Ziff Davis’s suit against OpenAI, continue to shape the legal landscape.

The Reddit lawsuit underscores growing tensions between content platforms and AI developers as the industry grapples with the need for massive datasets and the rights of copyright holders.

#Reddit#Perplexity#Oxylabs#AWMProxy#SerpApi#copyright#AI training data#lawsuit#OpenAI#Google#Anthropic#Encyclopedia Britannica#Ziff Davis
Generated with  News Factory -  Source: CNET

Also available in: