An evolving AI project from Mi3 | Automation with Editor curation. And oversight. Always.
In partnership with
Salesforce
Posted 29/07/2024 10:31am

hAIku

Reddit's content locked,
Only Google holds the key,
AI's training ground.

Microsoft confirms it's no longer crawling Reddit, other search engines stop showing content

Commentators around the world say they are not surprised by news that Reddit has blocked every search engine except Google from crawling its content. The platform is looking for more ways to generate revenue from its content, while tech companies are seeking ever more content sources to train their AI models.

The block was first flagged in a report from 404 Media. Searches over recent weeks show Reddit content blocked on Microsoft's Bing search engine, along with DuckDuckGo and Mojeek, among others, a situation 404 Media reports will remain unless a search engine pays Reddit for its content.

As the team at 404 Media describe it, the situation reflects Google's "near monopoly on search now actively hindering other companies' ability to compete at a time when Google is facing increasing criticism over the quality of its search results".

In a statement made to The Verge, Reddit spokesperson Tim Rathschmidt insisted the changes were not a direct response to Reddit's recent licensing deal with Google, which became public in February this year and sees Google paying Reddit a reported US$60 million per year to train its AI models on the platform's content. At the time, Google confirmed it had access to Reddit’s Data API, which delivers real-time, structured, unique content from their large and dynamic platform.

"This is not at all related to our recent partnership with Google," Rathschmidt told The Verge. "We have been in discussions with multiple search engines. We have been unable to reach agreements with all of them, since some are unable or unwilling to make enforceable promises regarding their use of Reddit content, including their use for AI."

In a statement to Mi3, Microsoft confirmed it had stopped crawling Reddit after the platform implemented its latest robots.txt file on 1 July, which it said "prohibits all crawling of their site".

"Microsoft respects the robots.txt standard and we honor the directions provided by websites that do not want content on their pages to be used with our generative AI models. Bing stopped crawling Reddit after they implemented their updated robots.txt file on July 1, which prohibits all crawling of their site," the Microsoft spokesperson stated in a comment sent to Mi3.
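For readers unfamiliar with the mechanism, here is a minimal sketch of how a well-behaved crawler might check a robots.txt file before fetching a page, using Python's standard `urllib.robotparser` module. The file contents, the `ExampleBot` user agent, the `may_crawl` helper and the example.com URL are all illustrative assumptions; this is not a copy of Reddit's actual robots.txt or of how Bing's crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that tells every crawler to stay away from the
# whole site. Illustrative only; not Reddit's actual file.
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that places no restrictions on crawlers.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://www.example.com/some/page"
    print(may_crawl(DISALLOW_ALL, "ExampleBot", page))
    print(may_crawl(ALLOW_ALL, "ExampleBot", page))
```

Compliance with robots.txt is voluntary: the file only states the site's wishes, and it is up to each crawler to run a check like this and respect the answer.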

In September last year, Microsoft announced support for Bing webmaster controls that allow publishers to prevent their content from being used for model training and display within Chat answers, or just from being used for model training.
