During a hangout, an SEO professional asked John Mueller why their JavaScript site was not getting indexed.
Their question: they have a few customer pages using Next.js, without a robots.txt or a sitemap file. Simplified, theoretically, Googlebot can reach all of these pages.
But why is only the homepage getting indexed? There are no errors or warnings in Search Console. Why doesn’t Googlebot find the other pages?
John first explained that Next.js is a JavaScript framework, which means the entire page is generated with JavaScript.
So why isn’t Google indexing everything?
John said that it’s important to first recognize that Googlebot will never index everything across a website.
He also explained that he doesn’t think Google would go off and index everything completely on any non-trivial-sized website. From a practical point of view, it’s simply not possible to index everything across the whole web.
In his words: “So that kind of assumption that the ideal situation is ‘everything is indexed,’ I would leave that aside, and say, like, you really want Googlebot to focus on the important pages.”
John also said: “the other thing though, which became a little bit clearer when the person contacted me on Twitter and gave me a little bit more information about their website, was that the way the website was generating links to the other pages was in a way that Google was not able to pick up.”
So in particular, with JavaScript, you can take any element on an HTML page and say, “if someone clicks on this, then execute this piece of JavaScript.”
And that piece of JavaScript can be to navigate to a different page, for example, and Googlebot does not click on all elements to see what happens.
Instead, Google goes off and looks for normal HTML links, which is the traditional way you would link to individual pages on a website. With this framework, however, the site didn’t generate these normal HTML links.
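To illustrate the difference (a hypothetical sketch; the component names and URLs here are made up, not taken from the site in question), compare a click-handler “link” on a generic element with a normal HTML anchor:

```tsx
// Hypothetical sketch of the two linking patterns, not code from the site in question.
import React from 'react';

// A "link" Googlebot cannot follow: navigation only happens inside a click handler,
// so there is no <a href="..."> in the HTML for the crawler to discover.
export function ProductTeaser() {
  return (
    <div onClick={() => { window.location.href = '/products/widget'; }}>
      View the widget
    </div>
  );
}

// A link Googlebot can follow: a normal HTML anchor with a crawlable href.
export function ProductTeaserCrawlable() {
  return <a href="/products/widget">View the widget</a>;
}
```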
John explained that Googlebot could not recognize that there was actually more to crawl and more pages to look at. However, this problem can be fixed by changing the way links are implemented in the site’s JavaScript.
John continued: there are lots of creative ways to create links.
And Googlebot really needs to find those HTML links to make it work.
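For Next.js specifically, the usual fix is to render navigation as real anchor elements, for example through the framework’s Link component, which outputs an <a href> in the markup that Googlebot can discover. A minimal sketch, assuming a recent Next.js version and made-up routes:

```tsx
// Minimal sketch assuming a recent Next.js version; the routes are made up for illustration.
import Link from 'next/link';

export default function SiteNav() {
  return (
    <nav>
      {/* next/link renders a real <a href="/pricing"> in the HTML,
          so the crawler can find the URL without executing any click handler. */}
      <Link href="/pricing">Pricing</Link>
      <Link href="/docs">Docs</Link>
    </nav>
  );
}
```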
Additionally, John suggested checking out the JavaScript SEO videos on Google’s YouTube channel to get a sense of what else to watch out for when it comes to JavaScript-based websites. He reiterated that Google is able to process most kinds of JavaScript-based websites normally, but there are still some things to watch out for, like these links.
This happens at approximately the 04:20 mark in the video.
John Mueller Hangout Transcript
John (Question)
Alright, so first up, we have a few customer pages using Next.js without a robots.txt or a sitemap file. Simplified, theoretically, Googlebot can reach all of these pages. But why is only the homepage getting indexed? There are no errors or warnings in Search Console. Why doesn’t Googlebot find the other pages?
John (Answer)
So maybe taking a step back? Next.js is a JavaScript framework, which means that the whole page is kind of generated with JavaScript. But kind of as in kind of a general answer as well for all of these kinds of questions. Like why is Google not indexing everything?
It’s important to first say that Googlebot will never index everything across a website. I don’t think it happens to any kind of non-trivial-sized website, that Google would go off and index completely everything. It’s just, from a practical point of view, it’s not possible to index everything across the whole web. So that kind of assumption that the ideal situation is everything is indexed, I would leave that aside, and say, like, you really want Googlebot to focus on the important pages.
The other thing though, which became a little bit clearer when I think the person contacted me on Twitter and gave me a little bit more information about their website, was that this, the way that the website was generating links to the other pages was in a way that Google was not able to pick up. So in particular, with JavaScript, you can take any element on an HTML page and say, if someone clicks on this, then execute this piece of JavaScript.
And that piece of JavaScript can be to navigate to a different page, for example, and Googlebot does not click on all elements to see what happens. But rather, we go off and look for normal HTML links, which is the kind of traditional normal way that you would link to individual pages on a website. And with this framework, it didn’t generate these normal HTML links.
So we could not recognize that there’s actually more to crawl, more pages to actually look at. And this is something that you can fix in the way that you implement kind of your JavaScript site. We have a ton of information on the Search Developer documentation site, around JavaScript and SEO, in particular, on the topic of links, because that comes up every now and then. There are lots of creative ways to create links.
And Googlebot really needs to find those HTML links to make it work. Additionally, we have a bunch of videos on our YouTube channel. And if you’re watching this, since nobody’s here, you must be on the YouTube channel. If you’re watching us on the YouTube channel, go out and check out those JavaScript SEO videos on our channel to kind of get a bit of a sense of what else you could watch out for when it comes to JavaScript-based websites. We are able to process most kinds of JavaScript-based websites, normally, but some things you still have to watch out for, like, like these links.