An SEO professional was curious about their indexed pages.
They had found that, from approximately January 13 onward, their indexed pages had dropped by over 90 percent. Both the indexed pages and the actual traffic dropped in the Google Search Console back end, and the drop also showed up in their log report.
So they checked the coverage report in the GSC back end. It turned out that almost 40,000 URLs were affected in total, and around 14,000 indexed pages had been dropped.
And the same 14,000 URLs also showed up as an increase in the “crawled, but currently not indexed” section of the back end.
The number was exactly the same, and they also checked the sample URLs for both reports: the indexed one as well as the crawled-but-not-indexed one.
On their site, they didn’t find anything unusual, such as incorrect canonical tags or meta robots tags set to noindex. Nothing indicated an obvious cause.
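Ruling out on-page directives like these across a batch of sample pages can be scripted. The sketch below, using only Python’s standard library, scans a page’s HTML for a robots meta tag and a canonical link element; the sample HTML and URL are illustrative placeholders, not from the site in question.

```python
from html.parser import HTMLParser

class IndexabilityChecker(HTMLParser):
    """Collects the robots meta directives and canonical URL from a page."""
    def __init__(self):
        super().__init__()
        self.robots = None     # content of <meta name="robots" ...>
        self.canonical = None  # href of <link rel="canonical" ...>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            self.robots = attrs.get("content", "")
        elif tag == "link" and attrs.get("rel", "").lower() == "canonical":
            self.canonical = attrs.get("href")

def check_page(html):
    """Return whether the page carries noindex, and its canonical URL (if any)."""
    parser = IndexabilityChecker()
    parser.feed(html)
    noindex = parser.robots is not None and "noindex" in parser.robots.lower()
    return {"noindex": noindex, "canonical": parser.canonical}

# Illustrative sample page
sample = """<html><head>
<meta name="robots" content="index, follow">
<link rel="canonical" href="https://example.com/page">
</head><body></body></html>"""

print(check_page(sample))
```

A page is only safely indexable if `noindex` comes back `False` and the canonical either is absent or points at the URL itself.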
So they couldn’t figure out how to identify the problem.
They were wondering if John could give some pointers and information about how to solve it.
John explained that he’s worried that Google needs to dig into this issue a bit more.
The one aspect he recommends that the SEO professional checks is whether or not they can actually get the crawl done properly.
John assumes they have already looked into that, but it’s a good idea to double-check there.
The SEO professional explained that yes, they have double-checked their URLs with the live test in Google Search Console, and they are fully crawlable and indexable.
They have checked GSC, their sitemap, and their robots.txt. Everything checked out, and there was nothing unusual.
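A robots.txt check like this can also be automated with Python’s standard `urllib.robotparser`, which evaluates rules against a given user agent. The sketch below parses an inline robots.txt rather than fetching a live one; the rules and URLs are illustrative assumptions, not the site’s actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt contents; in practice, fetch the live file
# from the site's /robots.txt instead.
robots_txt = """
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /private/
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Check a couple of example URLs against Googlebot's rules
for url in ("https://example.com/products/widget",
            "https://example.com/private/report"):
    allowed = parser.can_fetch("Googlebot", url)
    print(f"{url} -> {'crawlable' if allowed else 'blocked'}")
```

This only confirms robots.txt allowance; it says nothing about noindex directives, HTTP status codes, or rendering, which the live test in Search Console covers more fully.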
John then explained that Google always discovers a lot of URLs for websites.
If Google doesn’t think these URLs are important, it will keep them in a list.
And at some point, it will try to crawl them.
John suspects that these are just random URLs that Google discovered over time. And they try to crawl them again from time to time to see if there is anything that they are missing.
This happens at approximately the 23:50 mark in the video.
John Mueller Hangout Transcript
SEO Professional 7 23:50
Hi, John, can you hear me? Yes. Yes. Great. So this time, I only have one, like, urgent and important question. It’s about indexed pages. Like we just found that from, like January 13, that our indexed pages dropped like over 90%. Like these are indexed pages and actual traffic in the GSC backend and also in our log report. So we checked the coverage report in the GSC backend.
And like almost 40,000 URLs, and 14,000 indexed pages are dropped. And these 14,000 URLs also increased on the crawled but currently not indexed. So the number is exactly the same, and we also checked the samples. We also checked the samples that are indexed and also the crawled but not indexed. In our site, we didn’t find anything unusual, like the canonical and also the meta tag marked as noindex.
No, there’s not that issue. So we just couldn’t find a way to identify the problem. So we just hope that you can give us some recommendations on which aspects can we figure out and to identify the problem?
John 25:41
No, I, I don’t know, offhand. It sounds very similar to one of the previous questions, so I’m kind of worried that I need to dig into this a little bit more. If you can send me some sample URLs, then I’d love to take a look. Or maybe drop them here in the chat.
And then I can pick them up afterwards and check them out with the team. I think the one aspect that you probably also want to check is whether or not we can actually crawl them properly. I imagine you already looked into that. But it’s always good to kind of double check there. Again, if you can, if you can drop me some URLs, I’m happy to take a look at that with the team because like one site kind of going down in indexing is one thing.
But if there are like multiple people at the same time coming in with this kind of topic, then we should probably take a better look.
SEO Professional 7 26:39
Okay. So like you said, we have checked the URLs in the Google live test. Yes, they are all crawlable and also indexable. But the weird thing is they’re not. Yeah. And we also checked our sitemap, and robots.txt. All good, nothing unusual.
And also the mobile friendly test. Yes, I will drop the URLs to you later. And I also want to specify that, when we checked the samples, we noticed that the URLs that Google crawls have some unusual marks, like question marks, and also some plus marks in the URL. But our actual URLs don’t have these marks.
Okay. And yeah, that’s one thing unusual we spotted. We’re guessing that that’s the problem. But like I said, we already analyzed our log reports and found the actual rate of these URLs that Google crawled is very low, maybe 2% to 3%. Yeah, so it’s not a huge amount of these URLs.
John 28:11
Yeah, I mean, what always happens is we discover a lot of URLs for websites. And if we don’t think that they’re important, we will kind of keep them in our list. And at some point, we’ll try to crawl them.
And I suspect these are just random URLs that we discovered over time. And we try to crawl them from time to time to see if there’s anything that we’re missing. But it’s not a sign of a problem with a website, if we also crawl some random URLs. Oh, yeah.
SEO Professional 7 28:43
The question is, for these random URLs, we can’t find these URLs either in our website, internal links, anywhere. We just can’t find these URLs that Google crawled.
John 29:00
Yeah, I mean, that happens. And sometimes we collect these URLs for a longer period of time, and then try them maybe a year later. So it’s sometimes hard to track back. But it’s not a sign of a problem.
It’s really just our systems recognizing, oh, we have some extra space to crawl URLs. And we also happen to have this big collection of URLs that we don’t know anything about. So we will just try.
SEO Professional 7 29:28
Okay, great. So as for the indexed pages, do you have maybe some technical aspects in mind that could affect this, or result in this?
John 29:44
It’s hard to say. I mean, usually the main issue is really about overall the quality of a website, which kind of goes into the decision whether or not to index individual URLs. And that’s something that can also change over time, not so much that the quality of your website changes, but kind of our perception of the quality of the website can change over time.
And that’s usually the main element that comes into play there. And if you see these kinds of indexing changes, like happening over a short period of time, then it could be that our systems have just kind of changed the way that we evaluate quality for your website. And suddenly everything is in a slightly different bucket.
Whereas if you see them over a longer period of time, then it’s usually more that, like, over time, our systems are less and less confident about the website.