Brave Browser Under Fire For Alleged Sale Of Copyrighted Data

Brave browser faces criticism for allegedly selling copyrighted data for AI training, sparking debates on ethical data usage and transparency.

Brave is alleged to be selling copyrighted data for AI training.
The company's practices raise ethical and legal questions about fair use and copyright infringement.
The controversy highlights the need for greater transparency in the tech industry.

Matt G. Southern, Senior News Writer at Search Engine Journal

July 17, 2023


Brave, a privacy-focused web browser, has come under fire for supposedly selling copyrighted data to train artificial intelligence models.

This has sparked debates around the ethical use of data and the need for transparency.

An article by Alex Ivanovs of Stack Diary brought the allegations against Brave to light.

Ivanovs raised concerns that Brave may be collecting copyrighted web content and selling access to it, without permission, to companies developing AI systems.

Though Brave touts strong privacy protections, its alleged sale of copyrighted material for AI training raises questions about data practices that may violate user trust and expectations of privacy.

The brewing controversy highlights the tension between using data to advance AI capabilities and respecting data privacy and ownership rights. It underscores the need for clear communication and consent when information is shared.

The situation calls into question whether Brave truly prioritizes user privacy and data control, as claimed.

Unpacking The Allegations

Ivanovs claimed that Brave enables access to copyrighted content through its Brave Search API, allowing third parties to use this data for AI training without proper licensing.

He argued that Brave’s disregard for copyright and its monetization of data access are ethically questionable.

Ivanovs writes:

“Brave lets you ingest copyrighted material through their Brave Search API, to which they also assign you ‘rights.’”

Brave’s Response

The accusations led Josep M. Pujol, the head of search at Brave, to defend the company’s actions. Pujol said the rights issues were related to Brave’s search engine results, not the content itself.

Pujol explains:

“Brave Search has the right to monetize and put terms of service on the output of its search-engine.”

Pujol also stated that the data supplied is always attributed to the URL of the source content.

The Investigation

Ivanovs noted that Brave Search provides lengthy “Extra alternate snippets” similar to Google’s Featured Snippets. He questioned whether these long snippets, ranging from 150 to 260 words, comply with fair use copyright principles.

Additionally, Ivanovs criticized Brave for not revealing details about its web crawler, which indexes website content. He argued this prevents website owners from blocking Brave from potentially selling their content.

Brave countered that its crawler respects the robots.txt standard websites use to control crawlers.
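For readers unfamiliar with robots.txt, the sketch below shows how such rules are evaluated, using Python's standard-library parser. It is an illustration only, not a description of how Brave's crawler behaves: the robots.txt content, the `ExampleBot` and `UnnamedBot` user-agent tokens, and the URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named crawler entirely and
# keeps every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The named crawler is blocked everywhere.
blocked_named = is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/article")

# A crawler with no rule of its own falls back to the wildcard rules.
other_public = is_allowed(ROBOTS_TXT, "UnnamedBot", "https://example.com/article")
other_private = is_allowed(ROBOTS_TXT, "UnnamedBot", "https://example.com/private/page")
```

A site owner can only write a crawler-specific rule if they know the crawler's user-agent token. Otherwise, the wildcard `*` rules are the only ones that apply to it, which is the practical concern behind Ivanovs's criticism.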

The Implications

In closing his report, Ivanovs noted that the consequences of Brave’s practices extend beyond the search engine itself.

He worried that the system could be misused and pointed to the ambiguity surrounding the lawfulness of Brave’s methods.

Additionally, he challenged Brave’s stance that, as a search engine, it is entitled to scrape and resell data verbatim.

Ivanovs warns:

“I don’t see a world in which this cannot be abused.”

As of now, the debate continues.

This issue prompts important questions about the ethical use of data, making money from others’ content, and the level of openness displayed by major technology companies.

The tech industry will closely follow these conversations as they evolve.

