An SEO professional asked about using robots.txt to block Google from crawling files such as *.css, php.ini, or .htaccess.
John explained that this sounds like a bad idea, even though the professional suggested a few use cases.
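To make the question concrete, a robots.txt of the kind being described might look like the following sketch (these rules illustrate the question, they are not a recommendation):

```
# Hypothetical robots.txt matching the question above (not recommended)
User-agent: *
Disallow: /*.css$      # blocks every CSS file from crawling
Disallow: /php.ini     # redundant if the server already denies access
Disallow: /.htaccess   # redundant; the server never serves this file
```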
For *.css – this would block all CSS files. Google has to be able to crawl CSS files so it can accurately render a site’s pages. If you block Google from crawling them, it cannot assess critical information, such as whether the site is mobile-friendly.
Even though CSS files are not generally indexable on their own, Google still needs to see them.
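If a broad existing rule accidentally catches CSS or JavaScript, one remedy is an explicit Allow, since Google resolves conflicting rules in favor of the most specific (longest) match. The paths below are hypothetical:

```
# Hypothetical fix: a broad Disallow also catches rendering resources,
# so longer, more specific Allow rules win for CSS and JS
User-agent: Googlebot
Disallow: /assets/
Allow: /assets/*.css$
Allow: /assets/*.js$
```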
For php.ini – this is a critical configuration file for PHP. It should already be locked down at the server level so that nobody can access it, and that includes Googlebot. If the server already denies access, a robots.txt rule adds nothing.
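If php.ini somehow sits inside the web root, the lockdown belongs in the server configuration rather than in robots.txt. A minimal sketch, assuming Apache 2.4 (other servers have equivalents):

```
# Apache 2.4: deny all HTTP access to php.ini, Googlebot included
<Files "php.ini">
    Require all denied
</Files>
```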
For .htaccess – this is a special control file that cannot be accessed externally by default. Just like other locked-down files, it does not need to be explicitly disallowed from crawling.
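That default comes from a rule like the one in Apache’s stock configuration, which refuses to serve any file whose name begins with .ht:

```
# Shipped in Apache's default httpd.conf: .htaccess and .htpasswd
# are never served over HTTP
<Files ".ht*">
    Require all denied
</Files>
```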
John said that copying someone else’s robots.txt file and assuming it will work is a dangerous strategy. Instead, you should examine your own site, think about which parts you don’t want crawled, and disallow only those in robots.txt.
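In practice, that usually leaves a short, site-specific file. The paths below are hypothetical examples of sections a site owner might decide not to have crawled:

```
# Hypothetical site-specific robots.txt: only the parts you chose to exclude,
# with CSS and JS left crawlable so Google can render pages
User-agent: *
Disallow: /cart/
Disallow: /internal-search/
```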