An SEO professional asked John Mueller what they should be looking for in the Google Search Console Crawl Stats report.
They are trying to understand whether there is an issue on the technical side with Google crawling their site.
So their main question is: What are some of the signals to look for in Google Search Console's Crawl Stats report that may indicate Google is struggling to crawl something?
John explained that the Crawl Stats report will not be useful for the SEO pro in this case, because it shows an aggregate view of the data representing the crawling of your site.
That aggregate view usually makes more sense for larger sites. If you have something like a few hundred thousand pages, you can examine the report and say, "Oh, well, on average the crawling is slow."
Whereas if you have a website with only around 100 pages, then even if crawling is really, really slow, Google can still fetch them once a day, or in the worst case maybe once a week.
At that scale, it's not going to be a technical issue regarding crawling. It's more a matter of Google understanding that the website actually offers something unique and valuable that it needs to have indexed.
So it’s much less of an issue on the crawling side and more about the indexing side of things.
The exception here would be if there is a really big technical issue with your site. But that is something you would see right away, because you would probably check individual URLs and find that Google can't crawl them at all.
Perhaps an error is returned, or a noindex is returned. This will be very obvious.
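As a rough illustration of that kind of spot check, the sketch below fetches a single URL and reports the HTTP status code along with any noindex signal in the X-Robots-Tag header or the robots meta tag. The URL is a placeholder, and the real diagnostic tool for this is the URL Inspection feature in Search Console; this is only a quick first pass.

```python
# Minimal sketch of a per-URL spot check (hypothetical example URL).
# Reports the HTTP status plus any noindex signal found in the
# X-Robots-Tag header or the robots meta tag in the HTML.
import re
import urllib.error
import urllib.request

def spot_check(url: str) -> None:
    req = urllib.request.Request(url, headers={"User-Agent": "crawl-spot-check"})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            status = resp.status
            x_robots = resp.headers.get("X-Robots-Tag", "")
            html = resp.read(200_000).decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        print(f"{url}: HTTP error {err.code} returned")
        return
    except urllib.error.URLError as err:
        print(f"{url}: could not fetch ({err.reason})")
        return

    meta_noindex = bool(
        re.search(r'<meta[^>]+name=["\']robots["\'][^>]*noindex', html, re.I)
    )
    print(f"{url}: status {status}")
    if "noindex" in x_robots.lower() or meta_noindex:
        print("  noindex signal found (header or meta tag)")

spot_check("https://www.example.com/some-page")  # placeholder URL
```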
John's assumption here, especially for a smaller website, is that you first have to convince Google that it should try to crawl and index your site.
This happens at approximately the 25:40 mark in the video.
John Mueller Hangout Transcript
SEO Professional 7 25:40
Hello, I posted my question on YouTube. But I’m going to say it now just because I’m doing it anyway. We have looked at the crawl stats reports on the Search Console, and have been trying to identify if there might be some issue on the technical side with Google crawling our website.
What are some of the signals or things to identify that will point us if Google is struggling to crawl something? Or if Googlebot is distracted by, I don’t know, files that are irrelevant? And things that it’s trying to index that are irrelevant for us?
John 26:18
So I think in your question, you also mentioned that it’s a fairly small site, is that correct?
SEO Professional 7 26:25
Yes. Fairly small, maybe five or six months old?
John 26:28
Okay. So my guess is the crawl stats report will not be useful for you in that case, because with the crawl stats report, you're really looking at an aggregate view of the crawling of your website. And usually, that makes more sense if you have something like, I don't know, a couple hundred thousand pages.
Then you can look at that and say, Oh, well, on average, the crawling is slow. Whereas if you have a website that has, I don’t know, maybe around 100 pages or so then essentially, even if crawling is really, really slow, then those 100 pages, we can still get them, like once a day, worst case, maybe once a week.
It’s not going to be a technical issue with regards to crawling, it’s essentially more a matter of understanding that the website actually offers something unique and valuable that we need to have indexed. So less an issue about the crawling side and more about the indexing side. The exception there would be if there’s really a big technical issue with your website.
But that’s something that you would see right away, because you would probably check individual of these URLs and notice, Oh, Google can’t crawl them at all. Like there’s an error that’s returned, or there’s a noindex that’s returned. And that would be very obvious.
So my assumption there, especially for a smaller website, is that it’s really a matter of making sure that Google understands the value of the website, and that it knows that it makes sense to index as much as possible, because the crawling side is not going to be the limiting factor. It’s really more like, well, you have to convince Google first that actually it should try to crawl.
SEO Professional 7 28:16
Okay. And just to confirm, you said that the crawl stats report is not an exhaustive report. It's a sample and should be treated as a sample.
John 28:28
I don’t know. I think the number shown in the report would be complete, but maybe the URLs that are shown there, I believe that’s only limited to something like 1000. I don’t know offhand. But the graphs, that should be essentially complete.
The thing that sometimes throws people off with a crawl stats report is that it includes other systems within Google that use the same Googlebot infrastructure. So for example, if you run ads, then I believe like the ads landing page checks, they also go through that infrastructure. If you have product listing ads, those also go through that infrastructure.
So sometimes you’ll see a number that is much higher than if you look in your log files and just look for Googlebot, just because there are different things that run on the same infrastructure, and we treat them the same, but it’s not purely web search.
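For readers who want to make the log-file comparison John mentions, here is a minimal sketch that counts Googlebot requests per day in a combined-format access log. The log path and the simple user-agent substring match are assumptions, and verified Googlebot traffic should really be confirmed with a reverse DNS lookup, which this sketch skips.

```python
# Minimal sketch: count requests whose user agent mentions Googlebot,
# grouped by day, from a combined-format access log.
# The log path and format are assumptions; adjust for your server.
import re
from collections import Counter

LOG_PATH = "access.log"  # hypothetical path
# Combined log format: ... [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 ... "user agent"
LINE_RE = re.compile(r'\[(?P<day>[^:]+):[^\]]+\].*"(?P<agent>[^"]*)"\s*$')

per_day = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        match = LINE_RE.search(line)
        if match and "googlebot" in match.group("agent").lower():
            per_day[match.group("day")] += 1

for day, hits in sorted(per_day.items()):
    print(f"{day}: {hits} Googlebot requests")
```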