What is an XML Sitemap?
An XML sitemap file is essentially a roadmap of your website that tells Google about all of your important pages. Because those pages are listed in the sitemap, Google can find them much faster.
An XML sitemap is a file on your server that lists all the important pages (or all pages, if you choose) of your website. Sitemaps also help Google understand your site structure.
The goal is to get Google to crawl every important page of your site. Sometimes pages have no internal links pointing to them, which makes them difficult for Google to find. In these situations, a sitemap is a tool that can seriously speed up the discovery of your site’s content.
In an AskGooglebot session, one SEO professional asked John Mueller about removing sitemaps in Google Search Console.
John demonstrated a simple method for removing any sitemaps from Google Search Console.
Once you remove it, Google will stop tracking the sitemap within Google Search Console.
Alternatively, you can delete the sitemap file from your server so that its URL returns a 404 or 410 HTTP status code, which tells crawlers that the file is no longer available.
Over time, Google will stop using deleted sitemap files to crawl your site.
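As a rough illustration of the second approach, the sketch below checks which status code a sitemap URL returns. The function names and the example URL are assumptions made for this example, not part of any Google tool.

```python
import urllib.error
import urllib.request

# Status codes that signal the sitemap file is no longer available.
REMOVED_STATUSES = {404, 410}


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for `url` (this makes a network request)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return error.code


def looks_removed(status):
    """True when the status code says the file is gone (404 or 410)."""
    return status in REMOVED_STATUSES


# Hypothetical usage (requires network access):
# status = fetch_status("https://example.com/old-sitemap.xml")
# print(status, looks_removed(status))
```

A 200 response here would mean the file is still being served, so crawlers can still fetch it.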
Frequently Asked Questions About XML Sitemaps
When it comes to organizing your site into a sensible structure and giving search engines the right pages to crawl, XML sitemaps are one of the most reliable tools available.
In fact, creating an XML Sitemap file can help Google better navigate your site, and can be helpful in telling Google what your important pages are.
It aids in crawling and indexing by providing a single “source of truth” as to what the important pages of your website actually are.
What is an XML Sitemap?
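An XML Sitemap is simply a list of the pages on your site that you want search engines to know about. It’s a structured format that tells Google where each page is located and, optionally, when it was last modified and how often it tends to change. It does not record whether a page has been crawled or indexed; that information is reported in tools such as Google Search Console.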
Why Are XML Sitemaps Important?
XML Sitemaps are important because they give Google an at-a-glance crawling resource for your site. By listing your important URLs in one place (and, on larger sites, grouping them into separate sitemaps under a sitemap index), you help Google and other search engine crawlers discover your pages far more quickly than by following links alone.
By making sure that the list of URLs within this sitemap is accurate and up to date, you help Google crawl as much of your site as possible, if not everything.
Even though Google doesn’t always crawl everything, it still helps. All of the major search engines (Google, Bing, and Yahoo) work well when a sitemap is implemented.
A complete XML sitemap will also provide all the most important URLs of your site up front, so that search engines know about these when they crawl your sitemap.
What Does an XML Sitemap Look Like?
An XML sitemap is a standard format for describing web pages. Search engines read it to help index your site. You can use one of many free tools to generate and submit an XML sitemap to Google, Bing, Yahoo!, etc., but you don’t necessarily have to do that.
An XML sitemap is similar to a regular HTML sitemap, but the two serve different audiences. An HTML sitemap helps visitors find specific content on your site, while an XML sitemap contains additional information that allows search engines to crawl and index your site more efficiently, which in turn can help your pages appear in search results.
The best part is that you don’t need to write code to make one. If your site runs on WordPress, a plugin such as Yoast can generate the sitemap for you. In recent versions of Yoast, the sitemap is typically switched on through an XML sitemaps setting and served automatically, so there is usually no file to upload; the exact labels and menu locations vary by version, so check the plugin’s documentation.
Some sitemap generators can also produce a plain-text (.txt) sitemap, which lists one URL per line, instead of XML. Look for an output format setting in the tool you use.
What is XML Format?
What, exactly, is XML format? The W3C explains that Extensible Markup Language (XML) is “a simple text-based format for representing structured information.”
The XML file format uses this structure in order to present sitemap URLs in an easily-readable format.
The .xml extension indicates that the file is an XML document. The format itself is very simple: it consists of a series of tags, each containing data about a particular URL.
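To make the structure concrete, here is a minimal sketch in Python that builds a two-URL sitemap with the standard library. The helper name `build_sitemap`, the example.com URLs, and the date are assumptions made for this example; SEO plugins generate their own, richer output.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, lastmod) pairs; lastmod may be None."""
    # Serialize the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, lastmod in entries:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
        if lastmod:
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    ET.indent(urlset)  # pretty-print; available in Python 3.9+
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical example data.
print(build_sitemap([
    ("https://example.com/", "2024-01-15"),
    ("https://example.com/about", None),
]))
```

Each `url` element wraps one page: `loc` holds the address, and the optional `lastmod` holds the date the page last changed.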
For example, here’s a sample tag from a Rank Math sitemap, which holds the caption of an image on a particular URL:
<image:caption><![CDATA[Screenshot of domain name scam invoice]]></image:caption>
Each tag represents a different type of information about a given URL.
The lastmod tag, for example, provides the date that the URL was last modified.
If the URL hasn’t changed since the last time it was crawled, then the lastmod tag will continue displaying the date that it was last modified.
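As a small sketch of how these tags can be read programmatically, the Python below parses a sitemap and pulls out each URL with its lastmod date. The sample document and the function name `read_lastmod` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Map a short prefix to the sitemaps.org namespace for use in searches.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap used only for illustration.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
</urlset>""".encode("utf-8")


def read_lastmod(sitemap_bytes):
    """Return a dict mapping each <loc> to its <lastmod> (None if absent)."""
    root = ET.fromstring(sitemap_bytes)
    result = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        if loc:
            result[loc.strip()] = url.findtext("sm:lastmod", namespaces=NS)
    return result


print(read_lastmod(SAMPLE))
```

Because lastmod is optional, the second entry simply maps to `None`.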
Here’s another example:
<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="//iloveseo.com/main-sitemap.xsl"?>
This line is the opening of a sitemap file. The first part, the XML declaration, identifies the file as an XML document and states which version of the XML standard (1.0) and which text encoding (UTF-8) a parser should use. The second part, the xml-stylesheet instruction, points to an XSL stylesheet that controls how the sitemap is displayed when opened in a browser.
In addition to the above tags, the sitemap protocol defines a few optional ones, such as changefreq and priority, and extensions exist for images, video, and news. Note that robots directives do not belong in a sitemap: to tell Google not to follow links, or to keep certain pages out of the search results, use a meta robots tag (nofollow or noindex) on the page itself.
Which Types of Websites Need an XML Sitemap?
Google recommends a sitemap for sites that are large, new, poorly linked internally, or heavy on rich media, because it helps Googlebot discover the pages on your site quickly and efficiently. However, some websites don’t follow best practices and don’t provide a good-quality sitemap.
For example, some websites have hundreds of thousands of pages spread across many subdirectories, while others have just a handful of pages. These websites aren’t necessarily breaking any rules, but they might not be getting the full benefit of an XML sitemap.
If you are unsure whether your website qualifies for an XML sitemap, here are a few questions to ask yourself:
- Is my website really “large”?
- Do I have lots of unique pages?
- Are my pages well linked internally?
- Does my website have a lot of images?
- Am I using rich media like videos or audio files?
Google’s documentation says sitemap files help crawlers “find all the pages on your site.” A sitemap works best alongside good internal linking, not in place of it: you should still link your pages together properly, so that related pages within a category point to one another rather than to random pages elsewhere on your website.
However, some websites choose to ignore this recommendation.
The problem arises when you have a lot of content on your website. If you’ve ever tried to keep a list of your pages by hand, you probably found it difficult to keep track of all those URLs. Luckily, a generated XML sitemap solves this. The file tells Google where each page is located on your website and when it was last updated. For example, if you update a blog post once per week, the date in that URL’s lastmod entry would change weekly. The section above on what an XML sitemap looks like covers the format in more detail.
How to Avoid Lack of Consistency in Your XML Sitemaps
One of the most common mistakes that we see clients make is lack of consistency in the messaging to the search engines about a given page. They think that if they put something in one place, it must be OK everywhere else.
If you use robots.txt to block a page, and then you use an XML Sitemap to list that page, you are literally just teasing the search engine. Your sitemap is telling Google, “Hey, here’s a great page you really should crawl! But wait, there’s more! Here, Google… here, Google… here, here, here…”
And then your robots.txt takes it away.
The same goes for meta robots: don’t include a page in an XML Sitemap and then set its meta robots tag to “noindex,nofollow.” Likewise, don’t add a canonical tag to a page and then set that page to noindex,nofollow. Those are conflicting signals. A canonical tag says “this is the version to index,” while noindex says “don’t index this,” and you cannot count on Google resolving the conflict the way you intended.
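One way to catch the robots.txt conflict described above is to test every sitemap URL against your robots.txt rules. The sketch below uses Python’s standard `urllib.robotparser`; the function name, the sample rules, and the URLs are assumptions for illustration. Note that this parser may not match Googlebot’s handling of wildcard patterns, so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser


def blocked_sitemap_urls(robots_txt, sitemap_urls, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt disallows for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in sitemap_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt rules and sitemap URLs.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""

SITEMAP_URLS = [
    "https://example.com/",
    "https://example.com/blog/xml-sitemaps",
    "https://example.com/private/report",
]

# Any URL printed here is listed in the sitemap but blocked from crawling.
for url in blocked_sitemap_urls(ROBOTS_TXT, SITEMAP_URLS):
    print("Conflict:", url)
```

Each URL it reports should either be removed from the sitemap or unblocked in robots.txt, so that both files send the same message.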
What If Google Search Console Says XML Sitemaps Have Errors?
If your XML sitemap is throwing errors, or isn’t valid XML at all, then you have a problem. First, figure out which type of error Google Search Console (GSC) is reporting and why. If search engines are not reading the XML sitemap at all, make sure that it has actually been submitted to both Bing Webmaster Tools and Search Console. For invalid sitemaps, work through each reported error and make sure the underlying issue is resolved.
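Before digging into reported errors, it can help to run a few basic checks yourself. The following Python sketch tests a sitemap for well-formed XML, the expected root element, the protocol’s limit of 50,000 URLs per file, and absolute URLs in each loc tag. The function name and the checks chosen are assumptions for this example; this is not a reimplementation of Search Console’s validation.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file limit in the sitemaps.org protocol


def find_sitemap_problems(sitemap_bytes):
    """Return a list of human-readable problems found in a sitemap file."""
    try:
        root = ET.fromstring(sitemap_bytes)
    except ET.ParseError as error:
        return [f"not well-formed XML: {error}"]

    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append(f"unexpected root element: {root.tag}")

    urls = root.findall(f"{{{SITEMAP_NS}}}url")
    if len(urls) > MAX_URLS:
        problems.append(f"too many URLs: {len(urls)} (limit {MAX_URLS})")

    for index, url in enumerate(urls, start=1):
        loc = (url.findtext(f"{{{SITEMAP_NS}}}loc") or "").strip()
        parsed = urlparse(loc)
        # Sitemap URLs must be absolute, e.g. https://example.com/page
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"entry {index}: missing or non-absolute <loc>: {loc!r}")
    return problems
```

An empty list means the file passed these basic checks, although Search Console may still report other issues, such as URLs it could not fetch.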
How Do I Update My Own XML Sitemap By Hand?
Although it is possible to update your own XML sitemaps by hand, we do not recommend it. It can be labor-intensive and exhausting, especially if you have a large or rapidly expanding website. That’s why we recommend using SEO plugins like Rank Math or Yoast to keep your XML Sitemap accurately and consistently updated.
How Do I Access My Sitemap?
The simplest way to access your sitemap is to enter its URL into your browser’s address bar:
If you use Rank Math, the URL would be: https://example.com/sitemap_index.xml
If you use Yoast, the URL is the same: https://example.com/sitemap_index.xml.
Common SEO plugins tend to use similar sitemap URLs, and these rarely change. Still, check the documentation for whichever plugin you use to confirm the current URL.
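A file named sitemap_index.xml is usually a sitemap index, meaning it lists other sitemap files instead of pages. As a rough sketch, the Python below extracts those child sitemap URLs from an index. The sample index and the child file names are hypothetical, and your plugin’s actual file names may differ.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap index; real child sitemap names depend on the plugin.
SAMPLE_INDEX = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/post-sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/page-sitemap.xml</loc>
  </sitemap>
</sitemapindex>""".encode("utf-8")


def child_sitemaps(index_bytes):
    """Return the URLs of the sitemaps listed in a sitemap index file."""
    root = ET.fromstring(index_bytes)
    return [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", NS)
        if loc.text
    ]


print(child_sitemaps(SAMPLE_INDEX))
```

Opening each of those child URLs in the browser shows the actual page entries.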
A Good XML Sitemap is Critical for Site Health
Keeping a good, well-maintained XML sitemap is one of those SEO activities that always pays off. More than that, it’s a fundamental of good technical SEO.
When do you next anticipate going through your existing sitemaps with a fine-toothed comb?