One SEO professional was curious about why they were having issues with their CDN implementation.
Their site is mainly based out of India, and their users are also from India.
A few days earlier, they moved all of their dynamic traffic behind a CDN (content delivery network). Since then, they have observed in Search Console's crawl stats that their response times went up from 300 milliseconds to around a second.
They don't understand why their crawl rate dropped from around 2 million requests per day to around 80,000, or how moving to a CDN would affect how Google crawls their website.
They also asked whether the network latency introduced by the CDN impacts the crawl rate and their rankings.
John explained that from a ranking point of view, the CDN implementation would not change anything.
That covers the first question. Crawling, however, can be affected: if you change your hosting significantly, Google will first move the crawl rate to a more conservative level, because it saw a bigger change in hosting.
This could include a move to a CDN, a move away from a CDN, or a switch from one CDN to another. Then, over a couple of weeks to a month or so, Google will increase the crawl rate again to see where it settles.
So the overall drop in crawl rate after moving to a CDN, or changing CDNs, can be normal.
The lower crawl rate doesn't in itself mean there is a problem: if Google was crawling 2 million pages per day before, it is unlikely that the site has 2 million pages that change every single day.
Google will not necessarily miss the new content on the site. It will simply re-prioritize and figure out which pages it actually needs to recrawl on a daily basis.
So a drop in the crawl rate alone is not necessarily a reason for concern.
What worries John more is the overall change in the average response time.
Google chooses its crawl rate based on the average response time, as well as on server errors.
If the average response time goes up significantly, then Google will stick to a lower crawl rate.
This happens at approximately the 26:26 mark in the video.
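As a loose illustration of that relationship, the toy sketch below scales a crawler's request budget down as the measured response time and error rate go up. This is not Google's published algorithm; it only mirrors the principle John describes, and the baseline values in it are made up.

```python
# Toy illustration only -- not Google's actual crawl-scheduling logic.
# It scales a request budget down as average response time and error rate rise.
def target_crawl_rate(current_rate, avg_response_s, error_ratio,
                      healthy_response_s=0.3):
    """Return a conservative requests-per-day target (all thresholds are made up).

    current_rate       -- requests/day used in the previous period
    avg_response_s     -- measured average response time, in seconds
    error_ratio        -- share of requests that returned 5xx errors
    healthy_response_s -- response time treated as "fast enough" (assumption)
    """
    # Slower responses shrink the budget roughly in proportion: with a fixed
    # number of concurrent connections, fewer requests fit into a day when
    # each one takes longer.
    rate = current_rate * min(1.0, healthy_response_s / avg_response_s)
    # Server errors push the budget down further.
    rate *= 1.0 - min(error_ratio, 0.9)
    return int(rate)

# With the numbers from the question, 300 ms -> ~1 s alone would cut a
# proportional budget to roughly a third of its previous value:
print(target_crawl_rate(2_000_000, avg_response_s=1.0, error_ratio=0.0))  # ~600000
```

The actual drop described in the question, from 2 million requests to 80,000, is much larger than a purely proportional backoff, which fits John's point that Google also switches to a deliberately conservative rate for a while after it detects a hosting change.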
John Mueller Hangout Transcript
SEO Professional 6 26:26
Hi, John. So our website is based out of India, mainly. And our users are also from India. So like a few days back, what we did was we moved all our dynamic traffic behind the CDN.
But what we have been observing is that, like, in the Search Console under crawl stats, our response times have gone up from like, 300 milliseconds to like, around a second. So we are not understanding. And also the crawl rate has dropped.
It dropped from around 2 million requests a day earlier to around 80,000 requests. So we are not understanding how moving to a CDN impacts how Google crawls our website.
And, for the network latency that is introduced due to the CDN service being present close to where Google is crawling our site from, like does that impact the crawl rate as well as our search ranking? And also, yeah, Prathik from my team is also there, if he wants to add something.
John 27:40
Yeah. So from a ranking point of view, this would not change anything.
So maybe that’s kind of the first question ahead. But it can, like if you change your hosting significantly, what will happen on our side is the crawling rate will move into a more conservative area first, where we’ll say we’ll crawl a little bit less first, because we saw a bigger change in hosting, which could be a move to a CDN, or a move from a CDN, or from one CDN to a different CDN.
And then over time, over, I don’t know, a couple of weeks, maybe a month or so, we will increase the crawl rate again to kind of see where we think it will settle down.
So essentially, that drop in crawl rate overall, for moving to a CDN or change of a CDN, that can be normal. The crawl rate itself doesn’t necessarily mean that there’s a problem. Because like, if we were crawling 2 million pages of your website before, it’s unlikely, I assume, that you would have 2 million pages that change every day.
So it’s not necessarily the case that we would miss all of the new content on your website, we would just try to prioritize again, and figure out which of these pages we actually need to recrawl on a day to day basis. So just because the crawl rate drops is not necessarily a sign for concern.
What would worry me more is the change in the average response time. Because the crawl rate that we choose is based on the average response time, it’s also based on server errors, those kinds of things.
And if the average response time goes up significantly, we will kind of stick to a lower crawl. So that’s something where I assume if the average is around one second per URL on your website, I feel that’s…
SEO Professional 6 29:54
Uh, actually, so about that, the average response time has actually improved for our users. It has only increased for the Google crawlers.
John 30:08
Yeah, I don’t know, it depends on how you have your CDN set up. So that’s, I think, always a tricky part. If you focus kind of your website on users in one country and Googlebot’s crawling usually happens from the US and you don’t kind of see what users in the US would see, then, that’s sometimes a little bit tricky to diagnose, and to figure out what exactly is causing that delay.
But, from our point of view, the response time is critical for the amount of crawls that we do per day, because we want to limit the number of active connections that we have to your server. And if the response time is high, then like, we run into that limit fairly quickly. Whereas if we can crawl quickly, then we don’t have that much of a problem with the number of concurrent connections.
SEO Professional 6 31:08
We tried running tests like from U.S. locations, like with CDN, without CDN, the latencies that we are observing are the same. So is there anything else that we can look into? Why the…
John 31:24
I don’t know, it’s hard to say. I would maybe use some testing tools that are running within Google’s network. So if, for example, you have something, I don’t know, if you can set up a VM on Google Cloud, for example, and use that to test the latency to your server.
That might be an approach that you could take. What you could also do is look at your server logs to see which URLs Googlebot is explicitly requesting, and check to see if there’s a general pattern there.
So in particular, it might be happening that Googlebot is crawling a lot of URLs that take a lot of resources on your side, and which are not representative of the rest of your website.
And because of that, it might be that the URLs that Googlebot is requesting, they take a lot of time. And maybe there are ways for you to either optimize those URLs or make it so that Googlebot finds fewer of those URLs, or maybe even block them completely if they’re not useful for your website.
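Following up on John's first suggestion, testing latency from within Google's network (for example from a VM on Google Cloud), here is a minimal sketch of that kind of check. Nothing in it comes from the hangout itself: the URLs are placeholders, and it simply times a few fetches from whatever machine it runs on.

```python
# Minimal latency check: run this from a test machine (e.g. a Compute Engine
# VM in a US region) to approximate what Googlebot sees when fetching the
# site through the CDN. The URLs below are placeholders.
import time
import urllib.request

TEST_URLS = [
    "https://www.example.com/",
    "https://www.example.com/some-dynamic-page",
]

def average_fetch_time(url, attempts=5):
    timings = []
    for _ in range(attempts):
        req = urllib.request.Request(url, headers={"User-Agent": "latency-check"})
        start = time.monotonic()
        with urllib.request.urlopen(req, timeout=30) as resp:
            resp.read()  # include the body transfer in the measurement
        timings.append(time.monotonic() - start)
    return sum(timings) / len(timings)

for url in TEST_URLS:
    print(f"{url}: {average_fetch_time(url) * 1000:.0f} ms average")
```

Comparing the same URLs with and without the CDN in front, as the questioner already did from US locations, but this time from a machine inside Google's network, helps narrow down where the extra second is coming from.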
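For the server-log suggestion, below is a rough sketch of the kind of analysis John describes: group the URLs Googlebot requests and look for patterns that are crawled heavily or respond slowly. It assumes an nginx/Apache-style combined access log with the response time appended as the last field (for example nginx's $request_time); adjust the regular expression to your own log format.

```python
# Rough sketch: summarize which URL patterns Googlebot requests and how slow
# they are, from an access log in combined format with response time appended.
import re
from collections import defaultdict

LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)" (?P<rtime>[\d.]+)'
)

def summarize_googlebot(log_path, top_n=20):
    stats = defaultdict(lambda: {"hits": 0, "total_time": 0.0})
    with open(log_path) as log:
        for line in log:
            match = LOG_LINE.match(line)
            if not match or "Googlebot" not in match.group("agent"):
                continue
            # Group by the first path segment so parameterized URLs roll up together.
            path = match.group("path").split("?", 1)[0]
            key = "/" + path.lstrip("/").split("/", 1)[0]
            stats[key]["hits"] += 1
            stats[key]["total_time"] += float(match.group("rtime"))
    ranked = sorted(stats.items(), key=lambda kv: kv[1]["total_time"], reverse=True)
    for key, s in ranked[:top_n]:
        print(f"{key:<40} hits={s['hits']:>8} avg={s['total_time'] / s['hits']:.3f}s")

# Example: summarize_googlebot("/var/log/nginx/access.log")
```

URL groups that show up with many hits and high average times are the candidates John mentions: optimize them, make them less discoverable, or, if they are not useful for search, block them in robots.txt. As a general note, the user-agent string can be spoofed, so for a strict analysis verify that the requests really come from Googlebot (for example via reverse DNS) before drawing conclusions.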