XML Sitemaps Explained: Why They Matter and How to Create One

An XML sitemap tells search engines which pages exist on your site and when they were last updated. Here's everything small businesses need to know to create and maintain one correctly.

By RnkRocket Team
May 25, 2026
12 min read

Key Takeaways

  • XML sitemaps are not a ranking factor, but they are a critical crawlability signal — sites with properly formatted sitemaps get indexed faster and more completely (Google Search Central)
  • Google recommends submitting sitemaps via Search Console, particularly for large sites, new sites with few external links, and sites with frequent content updates
  • A sitemap should only include pages you want indexed — do not include noindex pages, redirect chains, or soft-404s
  • RnkRocket's site intelligence tool audits your sitemap automatically, flagging missing, broken, or incorrectly formatted entries as part of its technical SEO checks

If you have ever wondered why a page on your website is not showing up in Google, a missing or broken sitemap is often a contributing factor. A well-maintained XML sitemap is one of the simplest things you can do for your site's technical health — and one of the most frequently neglected.

This guide covers what XML sitemaps are, why they matter for SEO, how to create one, and how to avoid the common mistakes that cause sitemaps to work against you rather than for you.


What Is an XML Sitemap?

An XML sitemap is a structured file that lists the URLs on your website along with optional metadata about each one: when it was last modified, how frequently it changes, and its relative priority compared to other pages.

The format looks like this at its simplest:

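```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/services/</loc>
    <lastmod>2026-03-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```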

The `<loc>` tag is required. Everything else is optional, and Google has confirmed publicly that it largely ignores `<changefreq>` and `<priority>` values, preferring to determine crawl frequency itself based on signals like traffic and update patterns.

What sitemaps primarily do is act as a discovery mechanism. They help search engine bots find pages that might otherwise not be linked to from anywhere obvious on your site — deep product pages, for instance, or blog posts buried in a paginated archive.

Sitemap Element Reference

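| Tag | Purpose | Required | Example |
|---|---|---|---|
| `<urlset>` | Root element wrapping all URL entries | Yes | `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">` |
| `<url>` | Container for a single URL entry | Yes | `<url>...</url>` |
| `<loc>` | The full URL of the page | Yes | `<loc>https://example.com/services/</loc>` |
| `<lastmod>` | Date the page content was last modified (ISO 8601) | No | `<lastmod>2026-03-01</lastmod>` |
| `<changefreq>` | How often the page typically changes (Google largely ignores this) | No | `<changefreq>monthly</changefreq>` |
| `<priority>` | Relative priority within your site, 0.0 to 1.0 (Google largely ignores this) | No | `<priority>0.8</priority>` |
| `<sitemapindex>` | Root element for sitemap index files referencing multiple sitemaps | Yes (for index) | `<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">` |
| `<sitemap>` | Container referencing a child sitemap within a sitemap index | Yes (for index) | `<sitemap>...</sitemap>` |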

For a complete reference on valid sitemap tags and formatting, see the official Sitemaps.org protocol specification and the Moz guide to XML sitemaps.

XML vs HTML Sitemaps

There are two types of sitemaps, and they serve different purposes.

XML sitemaps are for search engines. They are machine-readable files typically located at `/sitemap.xml` and are submitted to Google Search Console.

HTML sitemaps are for users. They are human-readable pages on your site that list major sections and links, helping visitors navigate large sites. They can also have some secondary SEO value through internal linking, but they are not a substitute for an XML sitemap.

This guide focuses on XML sitemaps.


Why XML Sitemaps Matter for SEO

Google's crawlers discover pages primarily through links. If a page exists on your site but has no internal links pointing to it — sometimes called an orphan page — crawlers may never find it. A sitemap acts as a safety net, ensuring Google is at least aware the page exists.

This is particularly important in several scenarios.

New Websites

A brand new website has no external links, no historical crawl data, and possibly very few internal links. Without a sitemap, it can take weeks or even months for Google to discover and index all your pages. Submit a sitemap immediately when you launch a new site.

Large Sites with Complex Navigation

E-commerce sites, directories, or content-heavy sites with hundreds or thousands of URLs often have pages that are functionally orphaned — linked to through filters or search functionality rather than standard navigation. Sitemaps ensure Google can find and index these pages regardless of navigation structure.

Case study: sitemap fix resolves indexing gap for a 500-page e-commerce site. A UK homeware retailer with approximately 500 product pages noticed that roughly 180 of them were not appearing in Google at all. Their CMS generated a sitemap, but the sitemap only included the first 50 products per category — everything beyond page one of the category listing was missing. After regenerating the sitemap to include all product URLs and submitting it through Search Console, 160 of those 180 pages were indexed within three weeks. The remaining 20 had thin content issues that needed separate attention, but the sitemap fix alone recovered indexing for over 30% of the catalogue. Google's own documentation on building and submitting a sitemap confirms that sitemaps are particularly valuable for large sites where pages may not be discoverable through links alone.

Sites with Frequently Updated Content

If you publish blog posts, news articles, or product updates regularly, a sitemap with accurate `<lastmod>` dates tells Google which pages have changed recently. This helps the crawler prioritise those URLs in its next crawl pass.

Sites with Poor Internal Linking

If your internal linking structure is weak — a common issue on small business websites where most pages are reached only from the homepage navigation — a sitemap compensates by ensuring all pages are surfaced to crawlers.

What Sitemaps Do Not Do

It is equally important to understand the limits. A sitemap does not guarantee indexing. Google may choose not to index a URL even if it appears in a sitemap — typically because the page has thin content, is a near-duplicate of another page, or lacks sufficient signals to justify inclusion in the index. Submitting a URL in a sitemap is a request, not a command.

Sitemaps also do not improve rankings directly. They improve crawlability, which is a prerequisite for ranking — but having a sitemap will not push a weak page above a stronger one.

For a broader view of technical SEO, see our Technical SEO Explained guide.


How to Create an XML Sitemap

Option 1: Use Your CMS or Platform Plugin

If your site runs on WordPress, Shopify, Squarespace, or a similar platform, a sitemap is likely already generated for you.

WordPress has generated a basic sitemap at `yourdomain.com/wp-sitemap.xml` since version 5.5. If you have installed an SEO plugin such as Yoast SEO or Rank Math, the plugin replaces it with its own sitemap, which you can typically find at `yourdomain.com/sitemap_index.xml`. The Yoast sitemap breaks into separate files for posts, pages, categories, and other content types.

Shopify generates sitemaps automatically. Your sitemap lives at `yourdomain.com/sitemap.xml` and includes products, collections, blog posts, and pages by default.

Squarespace similarly auto-generates a sitemap at `yourdomain.com/sitemap.xml`. Note that password-protected pages are excluded.

If you are using a CMS, check whether a sitemap already exists before creating one manually.

Option 2: Generate One with a Dedicated Tool

For smaller static sites or sites where the CMS does not auto-generate a sitemap, tools like XML-Sitemaps.com can crawl your site and generate a valid sitemap file you can upload to your root directory.

Screaming Frog is another option for generating sitemaps from a crawl — particularly useful because it gives you control over what is included, allowing you to exclude noindex pages, redirect chains, and other problematic URLs before the sitemap is created.

Option 3: Build One Programmatically

For developers, sitemap generation is straightforward. Most web frameworks have sitemap libraries: `next-sitemap` for Next.js, `gatsby-plugin-sitemap` for Gatsby, and equivalents for most other major frameworks.

The key rule is to automate sitemap regeneration whenever content changes — not to maintain a static file manually.
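
As a rough illustration of programmatic generation, here is a minimal Python sketch that uses only the standard library. The `build_sitemap` function and the `pages` list are hypothetical names for this example; in practice the page list would come from your CMS or database rather than being hard-coded.

```python
from datetime import date
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (bytes) for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if last_modified is not None:
            # lastmod is optional; emit it only when a real date is known
            ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical page list standing in for a CMS or database query.
pages = [
    ("https://example.com/", date(2026, 3, 15)),
    ("https://example.com/services/", date(2026, 3, 1)),
    ("https://example.com/contact/", None),
]
sitemap_xml = build_sitemap(pages)
```

Calling a function like this from a publish hook or a scheduled job, and writing the result to `/sitemap.xml`, is one way to keep the file in step with your content.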

Sitemap Index Files

If your site is large (roughly more than 50,000 URLs), you will want to split your sitemap into multiple files and reference them from a sitemap index. The index file itself looks like this:

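```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2026-03-15</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2026-03-15</lastmod>
  </sitemap>
</sitemapindex>
```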

Google supports individual sitemap files of up to 50,000 URLs or 50MB uncompressed. Use a sitemap index whenever you approach these limits.
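
To show how the split might work, the following Python sketch chunks a URL list into groups of at most 50,000 and builds an index that references one child file per group. The function names and the `sitemap-products-N.xml` naming scheme are assumptions made for this example, and the sketch checks only the URL count, not the 50MB size limit.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(child_sitemap_urls, last_modified):
    """Return sitemap index XML (bytes) referencing each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for child_url in child_sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = child_url
        ET.SubElement(entry, "lastmod").text = last_modified
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# 120,000 hypothetical product URLs, which need three child files.
urls = [f"https://example.com/products/{n}/" for n in range(120_000)]
groups = chunk(urls)
children = [
    f"https://example.com/sitemap-products-{i + 1}.xml"
    for i in range(len(groups))
]
index_xml = build_index(children, "2026-03-15")
```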


How to Submit Your Sitemap to Google

Generating a sitemap is only half the job. You also need to tell Google where to find it.

Via Google Search Console

The most direct method:

  1. Go to Google Search Console
  2. Select your property (your website)
  3. Click "Sitemaps" in the left-hand menu
  4. Enter the URL of your sitemap (e.g. `sitemap.xml` or `sitemap_index.xml`)
  5. Click Submit

Google will then attempt to fetch and process your sitemap. The Search Console report will show how many URLs were submitted versus how many were indexed — a discrepancy between these numbers is a signal worth investigating.

Via robots.txt

You can also declare your sitemap location in your `robots.txt` file, which crawlers fetch before crawling your site:

```
Sitemap: https://yourdomain.com/sitemap.xml
```

This is a good belt-and-braces approach alongside Search Console submission, and it helps other search engines (like Bing) discover your sitemap without requiring a separate submission.
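
If you want to confirm that the declaration is readable by a standards-following parser, Python's built-in `urllib.robotparser` can extract it. This is a small sketch; the `robots_txt` content below is an invented example, not a recommended configuration.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content; the Disallow rule is purely illustrative.
robots_txt = """\
User-agent: *
Disallow: /admin/

Sitemap: https://yourdomain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns None when no Sitemap line is present (Python 3.8+).
declared = parser.site_maps() or []
```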

For a complete guide to `robots.txt`, see The Complete Guide to robots.txt for SEO.


Common Sitemap Mistakes to Avoid

We have audited hundreds of small business sites through RnkRocket, and the same sitemap issues appear again and again.

Including Noindex Pages

Any page with a `noindex` meta tag or HTTP header should not appear in your sitemap. A page that says "do not index me" appearing in a sitemap creates a contradiction. Google will generally honour the noindex directive, but the inconsistency spends crawl requests on pages that will never be indexed.

Common noindex pages that accidentally appear in sitemaps include thank-you pages, account login pages, admin areas, and duplicate content pages.

Including Redirected URLs

Sitemaps should only list canonical, indexable URLs. If a URL in your sitemap responds with a 301 or 302 redirect, remove it and replace it with the final destination URL. Redirect chains in sitemaps waste crawl budget and signal poor site hygiene.

Incorrect `<lastmod>` Dates

The `<lastmod>` tag should reflect the last time the content of the page genuinely changed, not an automated timestamp that updates every time the server restarts. Artificially inflating `<lastmod>` dates may prompt Google to re-crawl pages unnecessarily; eventually it will learn to distrust your lastmod signals entirely.
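
One way to keep these dates honest is to tie them to a hash of the page content, so the date moves only when the content does. The sketch below assumes a simple in-memory `store` dictionary; `update_lastmod` is a hypothetical helper, and a real site would persist the hashes somewhere durable.

```python
import hashlib
from datetime import date


def update_lastmod(url, content, today, store):
    """Return the lastmod date for `url`, changing it only when content changes.

    `store` maps url -> (content_hash, lastmod) and is updated in place.
    """
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    previous = store.get(url)
    if previous is not None and previous[0] == digest:
        return previous[1]  # unchanged content keeps its old date
    store[url] = (digest, today)
    return today


store = {}
url = "https://example.com/services/"
first = update_lastmod(url, "<h1>Services</h1>", date(2026, 3, 1), store)
same = update_lastmod(url, "<h1>Services</h1>", date(2026, 3, 9), store)
edited = update_lastmod(url, "<h1>Our services</h1>", date(2026, 3, 15), store)
```

Here the second call returns the original date because nothing changed, while the third returns the new date because the heading was edited.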

Forgetting to Update After Major Site Changes

If you delete pages, change URL structures, or publish new content, your sitemap must reflect those changes. A sitemap full of 404 errors — pages that were removed but still listed — creates unnecessary crawl budget waste and signals to Google that your site is poorly maintained.
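
The checks described so far (noindex pages, redirects, and dead URLs) can be combined into a single pass over the sitemap. The following is a sketch under stated assumptions: `fetch` is a hypothetical function that returns an HTTP status and a noindex flag for a URL, and canned responses stand in for real HTTP requests.

```python
def classify(status, noindex=False):
    """Decide whether a URL belongs in the sitemap based on its response."""
    if status in (301, 302, 307, 308):
        return "replace with redirect target"
    if status in (404, 410):
        return "remove: page is gone"
    if status == 200 and noindex:
        return "remove: noindex"
    if status == 200:
        return "keep"
    return "investigate"


def audit(urls, fetch):
    """Map each URL to a verdict; `fetch(url)` must return (status, noindex)."""
    return {url: classify(*fetch(url)) for url in urls}


# Canned responses used in place of live requests for this example.
fake_responses = {
    "https://example.com/services/": (200, False),
    "https://example.com/old-page/": (301, False),
    "https://example.com/thank-you/": (200, True),
    "https://example.com/jobs/123/": (404, False),
}
report = audit(fake_responses, fake_responses.get)
```

Passing the fetcher in as an argument keeps the classification logic testable without network access; a real audit would swap in a function that issues HEAD or GET requests.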

Not Including All Valuable Pages

The inverse problem is also common: important pages are missing from the sitemap entirely. This often happens when a CMS filter or custom post type is excluded by default, or when a developer manually excluded a section during setup.

Audit your sitemap regularly against your actual list of important pages. RnkRocket's technical SEO audit checks for missing sitemap entries as part of its crawl analysis.
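
A basic version of this audit is a set comparison between the URLs in the sitemap and a list of pages you consider important. In the sketch below, both the inline `sitemap_xml` and the `important_pages` set are invented for illustration.

```python
from xml.etree import ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)}


sitemap_xml = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/services/</loc></url>
</urlset>"""

# Hypothetical list of pages the business wants indexed.
important_pages = {
    "https://example.com/",
    "https://example.com/services/",
    "https://example.com/pricing/",
}

missing = sorted(important_pages - sitemap_urls(sitemap_xml))
```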

Case study: stale sitemap entries causing crawl budget waste. A regional recruitment agency had a sitemap listing 1,200 URLs, but over 400 of those were expired job listing pages returning 404 errors. Google was spending a significant portion of its crawl budget re-fetching these dead pages on every crawl pass. After cleaning the sitemap to include only active job listings and core pages — reducing it from 1,200 to 780 URLs — Search Console data showed a measurable increase in crawl frequency for the remaining active pages, and several previously under-crawled service pages began appearing in search results for the first time.


Sitemap Best Practices Summary

| Practice | Why It Matters |
|---|---|
| Only include indexable URLs | Avoids contradictions and crawl budget waste |
| Use accurate `<lastmod>` dates | Helps Google prioritise genuinely updated content |
| Submit to Search Console | Gives direct visibility into indexing status |
| Declare in robots.txt | Helps all crawlers discover the sitemap |
| Split large sitemaps into index files | Avoids file size limits |
| Keep sitemap dynamically generated | Ensures it stays current without manual effort |
| Exclude 404s, redirects, and noindex pages | Keeps the sitemap clean and accurate |

Monitoring Your Sitemap Health

Submitting a sitemap is not a one-time task. You need to monitor it over time, particularly:

Submitted vs Indexed ratio in Search Console. If you submit 200 URLs but only 140 are indexed, that 30% gap needs investigation. The "Why pages aren't indexed" section of the Page indexing report will show you why. Common reasons include "Crawled - currently not indexed" (often a thin content signal), "Duplicate without user-selected canonical", or "Excluded by 'noindex' tag".

Sitemap errors in Search Console. If Google cannot fetch or parse your sitemap, it will report an error. These errors appear in the Sitemaps report and should be fixed promptly.

New content appearing in the sitemap promptly. If you publish a new blog post and it takes more than 48 hours to appear in your sitemap, your CMS or sitemap generation process has a problem.
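
The submitted-versus-indexed check is simple arithmetic, and a few lines of Python make it easy to track over time. The 10% review threshold here is an arbitrary assumption for illustration, not a figure from Google.

```python
def indexing_gap(submitted, indexed):
    """Return the share of submitted URLs that are not indexed, as a percentage."""
    if submitted == 0:
        return 0.0
    return 100 * (submitted - indexed) / submitted


gap = indexing_gap(submitted=200, indexed=140)

# Hypothetical threshold: flag the sitemap for review above a 10% gap.
needs_review = gap > 10
```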

RnkRocket monitors your sitemap as part of its ongoing site intelligence — flagging pages that are in the sitemap but returning errors, pages that are indexed but missing from the sitemap, and sitemap format issues. See what is SEO for broader context on how crawlability fits into overall search performance.


Frequently Asked Questions

Does having a sitemap improve my Google rankings?

Not directly. A sitemap improves crawlability and indexing speed, which are prerequisites for ranking — but they do not themselves influence where you rank. Think of a sitemap as ensuring your pages are considered for ranking; the ranking itself depends on content quality, relevance, and authority.

How often should I update my sitemap?

Ideally, your sitemap should update automatically every time you publish or modify content. If your CMS does not do this automatically, regenerate and resubmit your sitemap whenever you make significant content changes — at minimum, monthly if you publish regularly.

Do I need a sitemap if my site is small?

Google's documentation notes that small sites with thorough internal linking may not need a sitemap at all. For a small five-page brochure site with good internal linking, a sitemap is still a good practice but less critical. That said, there is no downside to having one, so generate and submit it regardless.

Can I have multiple sitemaps?

Yes — in fact, for larger sites this is recommended. Use a sitemap index file to reference multiple smaller sitemaps organised by content type (pages, posts, products, etc.). Google supports up to 50,000 URLs per sitemap file.

What should I do if Google is not indexing pages in my sitemap?

First, check the Page indexing report (formerly the Coverage report) in Google Search Console to see why those pages are excluded. Common causes include thin content, duplicate content issues, manual actions, or soft-404 errors. Fixing the underlying issue is more important than resubmitting the sitemap.

Should I include images and videos in my sitemap?

Google supports image and video sitemap extensions. Including images in your sitemap can help Google discover images that are loaded via JavaScript or that appear on pages with few inbound links. For image-heavy businesses (photographers, estate agents, e-commerce), this is worth implementing.
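
As a sketch of what the image extension involves, the Python below adds `image:image` entries under each page URL using Google's image sitemap namespace. The `build_image_sitemap` function and the example URLs are hypothetical.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Register prefixes so the output uses xmlns and xmlns:image declarations.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """Return sitemap XML (str) for a mapping of page URL -> list of image URLs."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(entry, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Invented example: one property listing page with two photos.
image_sitemap = build_image_sitemap({
    "https://example.com/listings/12-high-street/": [
        "https://example.com/img/12-high-street-front.jpg",
        "https://example.com/img/12-high-street-kitchen.jpg",
    ],
})
```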


Want to see how your sitemap stacks up? RnkRocket audits your sitemap automatically as part of its technical SEO analysis — spotting missing pages, redirect errors, and indexing gaps in minutes. See pricing and get started.
