BBC Challenges AI Firm Over Content Use
The British Broadcasting Corporation (BBC) is escalating its efforts to protect its content, threatening legal action against Perplexity, a US-based artificial intelligence firm. The core of the dispute lies in allegations that Perplexity's chatbot is reproducing BBC material "verbatim" without authorization.
BBC Demands Action and Compensation
In a formal letter to Perplexity, the BBC, a global news powerhouse, has demanded an immediate halt to the use of its content. The corporation also seeks the deletion of any stored BBC material and financial compensation for content already used. It is the first time the BBC has taken such direct action against an AI company over the scraping of its content.
Perplexity's response to these serious allegations was brief. In a statement, the AI firm said, "The BBC's claims are just one more part of the overwhelming evidence that the BBC will do anything to preserve Google's illegal monopoly." The company did not elaborate on the connection it sees between Google and the BBC's copyright concerns.
Copyright Infringement and Reputational Damage
The BBC's legal communication to Perplexity's CEO, Aravind Srinivas, asserts that the AI firm's actions constitute "copyright infringement in the UK and breach of the BBC's terms of use."
Furthermore, the BBC referenced its own research from earlier this year, which highlighted inaccuracies in news summaries generated by several popular AI chatbots, including Perplexity AI. The research found significant issues with how Perplexity AI represented BBC content, stating that such outputs did not meet the BBC's stringent Editorial Guidelines for impartial and accurate news. "It is therefore highly damaging to the BBC, injuring the BBC's reputation with audiences - including UK licence fee payers who fund the BBC - and undermining their trust in the BBC," the letter added.
Intensifying Scrutiny on Web Scraping
The rise of generative AI tools like chatbots and image generators, capable of producing content from simple prompts, has been meteoric since OpenAI's ChatGPT launch in late 2022. However, this rapid advancement has brought to the forefront critical questions about the unauthorized use of existing material to train these models.
Much of the data fueling generative AI is sourced from a vast array of websites through automated bots and crawlers that extract site data—a practice known as web scraping. This activity has prompted UK media publishers to advocate for stronger copyright protections, joining calls from other creatives.
The Professional Publishers Association (PPA), representing over 300 media brands, voiced strong support for the BBC's stance. The PPA stated it was "deeply concerned that AI platforms are currently failing to uphold UK copyright law." It added that bots were being used to "illegally scrape publishers' content to train their models without permission or payment," which "directly threatens the UK's £4.4 billion publishing industry and the 55,000 people it employs."
The Robots.txt Controversy
Many organizations, including the BBC, employ a "robots.txt" file on their websites. This file is intended to instruct bots and web crawlers not to access certain pages or materials for AI data extraction. However, compliance with robots.txt directives is voluntary, and reports suggest that some bots disregard these instructions.
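For a sense of how the mechanism works in practice, below is a minimal sketch using Python's standard-library robots.txt parser; the crawler name, rules and URL are illustrative examples rather than the BBC's or Perplexity's actual configuration.

    # Minimal sketch: the robots.txt check a compliant crawler performs
    # before fetching a page. Crawler names and rules are examples only.
    from urllib.robotparser import RobotFileParser

    EXAMPLE_ROBOTS_TXT = """\
    User-agent: ExampleAIBot
    Disallow: /
    User-agent: *
    Allow: /
    """

    parser = RobotFileParser()
    parser.parse(EXAMPLE_ROBOTS_TXT.splitlines())

    # The disallowed crawler is told not to fetch; other agents are permitted.
    # Nothing in the protocol itself enforces the answer; compliance is voluntary.
    print(parser.can_fetch("ExampleAIBot", "https://example.com/news/story"))  # False
    print(parser.can_fetch("SomeOtherBot", "https://example.com/news/story"))  # True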
In its letter, the BBC stated that while it had disallowed two of Perplexity's crawlers, the company "is clearly not respecting robots.txt". Perplexity's CEO, Aravind Srinivas, denied accusations of ignoring robots.txt in a June interview with Fast Company. Perplexity also maintains on its website that it does not use website content for AI model pre-training because it does not build foundation models.
Perplexity: An "Answer Engine" Under Fire
Perplexity positions itself as an "answer engine" and has become a popular tool for users seeking answers to a wide range of questions. Its website explains that it achieves this by "searching the web, identifying trusted sources and synthesising information into clear, up-to-date responses."
Like many AI chatbots, Perplexity advises users to double-check its responses for accuracy, acknowledging the potential for AI to "hallucinate" or present false information convincingly. This is not the first time AI-generated content related to the BBC has caused problems. In January, following complaints from the BBC, Apple suspended an AI feature that had generated false headlines when summarising BBC News app notifications for iPhone users.