One SEO professional asked John about their sitemaps during a Question and Answer segment.
They explained that they are trying to make sure that their rankings don’t take a hit as they roll out a new search results page. For some context, their searches can result in 10,000 results and they have filtering and sorting functionality.
Their question is: how does Google treat these pages within websites? How do the search results affect the overall ranking for the site?
Is it enough to continue submitting sitemaps for ranking reasons? Or should they take additional measures to help Googlebot gather all of the returnable URLs?
John took a slightly different approach this time and answered the last question first: he said he would not rely on sitemaps to find all of the pages of your website.
Instead, sitemaps should be a way to give additional information about your website. They should not be the primary way that Google discovers the content on your website.
In particular, internal linking is super important. It's something you should definitely watch out for: make sure that however you set things up, anyone crawling your site can find all of your content without relying on the sitemap file.
From that perspective, being able to reach category pages and crawl through them to the individual products is also very important. Search results pages, John believes, are a bit of a unique area, because some sites use category pages essentially like search results pages.
Then you’re in that situation where search results pages are acting like category pages.
If this is the case for you, then John would watch out for everything that you would do with category pages.
The other thing with search results pages is that people can enter anything into search, and your site has to do all of the work to generate those results pages.
And this can easily result in an infinite number of URLs that are theoretically findable on your site, because people can search in a lot of different ways.
Because this basically creates an infinite set of pages on your site, it’s something that Google tries to discourage, where they would say: “either set the search results pages to noindex, or use robots.txt to block crawling of these pages.”
That way, Google can focus on the normal site structure and the normal internal linking.
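To make those two options concrete, here is a minimal sketch, assuming a Flask app and a /search path (neither of which comes from the hangout), of how an internal search results page could be marked noindex; the robots.txt alternative is noted in the comments.

```python
# Minimal sketch (assumed Flask app with an assumed /search path) of keeping
# internal search results pages out of Google's index.
#
# Option A - robots.txt: block crawling of the search path entirely, e.g.
#   User-agent: *
#   Disallow: /search
#
# Option B - noindex: let the pages be crawled, but tell Google not to index them.
from flask import Flask, request, make_response
from markupsafe import escape

app = Flask(__name__)

@app.route("/search")
def search_results():
    query = escape(request.args.get("q", ""))
    html = (
        "<html><head><meta name=\"robots\" content=\"noindex\"></head>"
        f"<body>Results for {query}</body></html>"
    )
    resp = make_response(html)
    # The same signal can also be sent as an HTTP response header.
    resp.headers["X-Robots-Tag"] = "noindex"
    return resp
```

Note that the two options behave differently: robots.txt blocks crawling, so Googlebot never fetches the page at all, while noindex requires the page to be crawled before the directive is seen. John presents them as alternatives, not something to combine.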
John believes those are the primary aspects of this problem.
If you do want to have your search results pages indexed, then John’s tip is to make sure that, on one hand, you have one primary sort order and filtering setup that is set as the canonical. So if, for example, you provide your pages sorted by relevance and also have a sort filter for price, he would set the rel=canonical of those sorted and filtered variants to the primary sort order.
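As a rough illustration of that tip, here is a minimal sketch, using assumed parameter names (“q”, “sort”, plus arbitrary filters) that are not from the hangout, of building the rel=canonical target so that sorted and filtered variants all point back at the primary version of the page.

```python
from urllib.parse import urlencode

# Assumed URL structure: /search?q=...&sort=...&<filters>.
# The primary version is the default-sorted, unfiltered page for the query.
def canonical_url(base_url: str, params: dict) -> str:
    """Build the rel=canonical target: keep only the query, drop sort and filters."""
    canonical_params = {"q": params["q"]} if params.get("q") else {}
    query_string = urlencode(canonical_params)
    return f"{base_url}?{query_string}" if query_string else base_url

# Every sorted or filtered variant would then emit, in its <head>:
#   <link rel="canonical" href="...">  with the URL built above.
print(canonical_url("https://example.com/search",
                    {"q": "red shoes", "sort": "price_asc", "brand": "acme"}))
# -> https://example.com/search?q=red+shoes
```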
This happens at approximately the 37:49 mark in the video.
John Mueller Hangout Transcript
John (Submitted Question) 37:49
I’m trying to make sure that our SEO rankings don’t take a hit while we roll out a new search results page. For some context, our searches can result in 10,000 results and have filtering and sorting functionality.
My question is, does Google treat these pages within web…or how does Google treat these pages within websites? How do the search results affect the overall ranking for the website? Is it enough to just be submitting sitemaps for ranking?
Or should we take additional consideration to help Googlebot to gather all of the returnable URLs?
John (Answer) 38:24
So I think maybe the last question first, I would not rely on sitemaps to find all of the pages of your website. Sitemaps should be a way to give additional information about your website. It should not be the primary way of giving information about your website. So in particular, internal linking is super important.
And something you should definitely watch out for and make sure that however you set things up, when someone crawls your website, they’re able to find all of your content, and not that they rely on the sitemap file to get all of these things.
And from that point of view, kind of being able to go to these kinds of category pages perhaps and being able to actually like find all the products that are in individual categories, I think it’s super useful. Being able to crawl through the category pages to the product is also very important. Search results pages, I think are a little bit of a unique area, because some sites use category pages, essentially like search results pages, and then you’re kind of in that situation where search results pages are essentially like category pages.
If that’s the case for you, I would kind of watch out for everything that you would do with category pages. The other thing with search results pages is that people can enter anything in search for something and your site has to do all the work to kind of generate all of these things. And that can easily result in an infinite number of URLs that are theoretically findable on your web site, because people can search in a lot of different ways.
And because that creates kind of this set of infinite pages on your website, that’s something that we try to discourage where we’d say, either set the search results pages to noindex or use robots.txt to block crawling of these search results pages, so that we can focus on the normal site structure and the normal internal linking.
So I think those are kind of the primary aspects there. If you do want to have your search results pages indexed, then my tip would be to make sure that on the one hand, you have one primary sort order and filtering setup set up as a canonical. So if you, I don’t know, choose to provide your pages by relevance, then if you have a sort filter for by price up or down, then I would set the rel=canonical of those filters to your primary sort order.
And similarly for filtering, I would perhaps kind of remove the filter with the rel=canonical. Doing this makes sure that we can focus more on the primary version of the pages and crawl those properly, rather than getting distracted by all of these variations of the search results pages.
And the other thing I would watch out for is that you create some kind of an allow list or some kind of a system on your site with regards to the type of search queries that you want to allow to be indexed or crawled. So for example, if someone goes to your website and searches for, I don’t know, Canadian pharmaceutical or something like that, and you’re not a pharmaceutical website, you probably don’t want that search page to be indexed.
Even if you don’t have any products that are available that match that query, you probably don’t want to have that indexed. So having a list of the allowed searches that you do allow to have indexed makes that a lot easier.
And make sure that you don’t accidentally run into this kind of spam situation where someone is spamming your search results.
And then you have to clean up like millions of pages that are indexed and get rid of them somehow.
There are so many things to keep in mind with search results pages.
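Picking up on John’s allow-list tip from the transcript, here is a minimal sketch (the example queries and the helper function are assumptions for illustration, not from the hangout) of gating which internal search queries are allowed to produce indexable pages, so off-topic or spammy queries stay out of the index.

```python
# Minimal sketch of an allow list for indexable internal searches.
# Queries are assumed to be normalized to lowercase; in practice the set would
# be generated from your own product or category data.
ALLOWED_SEARCHES = {
    "running shoes",
    "trail running shoes",
    "waterproof hiking boots",
}

def robots_directive(query: str) -> str:
    """Return the robots meta tag value for a search results page."""
    normalized = query.strip().lower()
    if normalized in ALLOWED_SEARCHES:
        return "index, follow"
    # Anything not explicitly allowed (including spammy queries that someone
    # else links to) stays out of the index.
    return "noindex, follow"

print(robots_directive("Running Shoes"))            # index, follow
print(robots_directive("canadian pharmaceutical"))  # noindex, follow
```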