During the Q&A portion of John Mueller’s 09/17/2021 hangout, one webmaster asked about a steady increase in random 404s for URLs that were never part of their website.
The pages don’t exist in the sitemap, and they weren’t generated by the site’s internal search. Because of this, the webmaster suspected that Google was appending search queries to their URLs and then trying to crawl the results.
The webmaster wanted to know how to make sure these URLs don’t impact the site’s overall crawlability and indexability.
John explained that Google doesn’t make up URLs. It’s likely that these are random links they found on the web—probably from some scraper site that’s scraping things in a bad way.
When they find these links, they crawl them, see that they return a 404, and start ignoring them.
John said that it’s not something that the webmaster actively has to take care of. If the URLs don’t exist on the actual website, that’s fine.
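One practical check, if you want that same reassurance for your own site: confirm that nonexistent URLs actually return a hard 404 status, rather than a “soft 404” (a 200 page that merely says “not found”), since the hard 404 is what tells Google to start ignoring those URLs. Below is a minimal sketch using Python’s standard library; the path shown is a hypothetical example, so substitute a made-up URL on your own domain.

```python
# Sketch: check the HTTP status code a URL returns, to verify that
# pages which don't exist on your site respond with a real 404.
from urllib.request import urlopen
from urllib.error import HTTPError


def status_of(url: str) -> int:
    """Return the HTTP status code for a GET request to `url`."""
    try:
        # urlopen raises HTTPError for 4xx/5xx responses
        return urlopen(url).status
    except HTTPError as e:
        return e.code


# Hypothetical usage -- replace with a nonsense path on your own domain:
# status_of("https://www.example.com/some-scraped-garbage-path")
# A value of 404 means Google will crawl it once, then drop it.
```

If such a check returns 200 for a clearly nonexistent path, that soft-404 behavior is worth fixing; the hard 404 described in the hangout is what lets Google discard these scraped URLs on its own.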
This conversation occurs at the 24:40 mark in the video.