The question was this: they have a few customer pages built with Next.js, with no robots.txt or sitemap file. In theory, then, Googlebot can reach all of these pages.
But why is only the homepage getting indexed? There are no errors or warnings in Search Console. Why doesn’t Googlebot find the other pages?
So why isn’t Google indexing everything?
John said that it’s important to first recognize that Googlebot will never index everything across a website.
He also explained that, for any non-trivially-sized website, he doesn't think Google would go off and index everything completely. From a practical point of view, it's simply not possible to index everything across the whole web.
John continued: "So that kind of assumption, that the ideal situation is 'everything is indexed,' I would leave that aside and say you really want Googlebot to focus on the important pages."
John also said: "The other thing, though, which became a little bit clearer when the person contacted me on Twitter and gave me a little bit more information about their website, was that the way the website was generating links to the other pages was in a way that Google was not able to pick up."
Instead, Google goes off and looks for normal HTML links (standard anchor tags with an href attribute), which is the traditional way to link to individual pages on a website. With this framework, however, the site didn't generate these normal HTML links.
John continued: there are lots of creative ways to create links, but Googlebot really needs to find plain HTML links for them to work.
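To illustrate the point, here is a minimal sketch, not Google's actual pipeline, of how a link extractor only discovers URLs exposed in `<a href>` attributes. Navigation wired up purely in JavaScript, such as a click handler calling a client-side router, leaves nothing for the parser to pick up. The `router.push` call and the `/products/widget` URL below are hypothetical examples, not taken from the site in the question.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags, roughly how a crawler discovers URLs."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def extract_links(html: str) -> list[str]:
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

# A crawlable link: a plain <a> tag with an href attribute.
crawlable = '<a href="/products/widget">Widget</a>'

# A non-crawlable "link": navigation handled entirely in JavaScript.
# There is no href attribute for a parser to extract.
non_crawlable = '<span onclick="router.push(\'/products/widget\')">Widget</span>'

print(extract_links(crawlable))      # ['/products/widget']
print(extract_links(non_crawlable))  # []
```

The same asymmetry applies to any framework: however the page is rendered, the navigation ultimately needs to surface as anchor tags with href attributes for a crawler to follow.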
This happens at approximately the 04:20 mark in the video.