XML Sitemaps Explained: Why They Matter and How to Create One
An XML sitemap tells search engines which pages exist on your site and when they were last updated. Here's everything small businesses need to know to create and maintain one correctly.

Key Takeaways
- XML sitemaps are not a ranking factor, but they are a critical crawlability signal — sites with properly formatted sitemaps get indexed faster and more completely (Google Search Central)
- Google recommends submitting sitemaps via Search Console, particularly for sites with more than a few dozen pages or those with frequent content updates
- A sitemap should only include pages you want indexed — do not include noindex pages, redirect chains, or soft-404s
- RnkRocket's site intelligence tool audits your sitemap automatically, flagging missing, broken, or incorrectly formatted entries as part of its technical SEO checks
If you have ever wondered why a page on your website is not showing up in Google, a missing or broken sitemap is often a contributing factor. A well-maintained XML sitemap is one of the simplest things you can do for your site's technical health — and one of the most frequently neglected.
This guide covers what XML sitemaps are, why they matter for SEO, how to create one, and how to avoid the common mistakes that cause sitemaps to work against you rather than for you.
What Is an XML Sitemap?
An XML sitemap is a structured file that lists the URLs on your website along with optional metadata about each one: when it was last modified, how frequently it changes, and its relative priority compared to other pages.
The format looks like this at its simplest:
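```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/services/</loc>
    <lastmod>2026-03-01</lastmod>
  </url>
</urlset>
```
The `<loc>` tag is the only required child of each `<url>` entry; tags such as `<lastmod>` are optional.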
What sitemaps primarily do is act as a discovery mechanism. They help search engine bots find pages that might otherwise not be linked to from anywhere obvious on your site — deep product pages, for instance, or blog posts buried in a paginated archive.
Sitemap Element Reference
| Tag | Purpose | Required | Example |
|---|---|---|---|
| `<urlset>` | Root element wrapping all URL entries | Yes | `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">` |
| `<url>` | Container for a single URL entry | Yes | `<url>...</url>` |
| `<loc>` | The full URL of the page | Yes | `https://example.com/services/` |
| `<lastmod>` | Date the page content was last modified (ISO 8601) | No | `2026-03-01` |
| `<changefreq>` | How often the page typically changes (Google largely ignores this) | No | `monthly` |
| `<priority>` | Relative priority within your site, 0.0 to 1.0 (Google largely ignores this) | No | `0.8` |
| `<sitemapindex>` | Root element for sitemap index files referencing multiple sitemaps | Yes (for index) | `<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">` |
| `<sitemap>` | Container referencing a child sitemap within a sitemap index | Yes (for index) | `<sitemap>...</sitemap>` |
For a complete reference on valid sitemap tags and formatting, see the official Sitemaps.org protocol specification and the Moz guide to XML sitemaps.
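To make the element reference concrete, here is a minimal Python sketch that reads a sitemap and pulls out the `<loc>` and `<lastmod>` values. The sample document and the `read_sitemap` helper are illustrative assumptions, not part of any particular tool; only the namespace URI comes from the Sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample document, used only for illustration.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/services/</loc>
    <lastmod>2026-03-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/contact/</loc>
  </url>
</urlset>
"""


def read_sitemap(xml_text):
    """Return one dict per <url> entry; a missing optional tag becomes None."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        entries.append({
            "loc": url.findtext("sm:loc", default=None, namespaces=SITEMAP_NS),
            "lastmod": url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS),
        })
    return entries


for entry in read_sitemap(SAMPLE_SITEMAP):
    print(entry["loc"], entry["lastmod"])
```

Note that every lookup has to use the protocol namespace; searching for a bare `url` tag would find nothing.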
XML vs HTML Sitemaps
There are two types of sitemaps, and they serve different purposes.
XML sitemaps are for search engines. They are machine-readable files typically located at `/sitemap.xml` and are submitted to Google Search Console.
HTML sitemaps are for users. They are human-readable pages on your site that list major sections and links, helping visitors navigate large sites. They can also have some secondary SEO value through internal linking, but they are not a substitute for an XML sitemap.
This guide focuses on XML sitemaps.
Why XML Sitemaps Matter for SEO
Google's crawlers discover pages primarily through links. If a page exists on your site but has no internal links pointing to it — sometimes called an orphan page — crawlers may never find it. A sitemap acts as a safety net, ensuring Google is at least aware the page exists.
This is particularly important in several scenarios.
New Websites
A brand new website has no external links, no historical crawl data, and possibly very few internal links. Without a sitemap, it can take weeks or even months for Google to discover and index all your pages. Submit a sitemap immediately when you launch a new site.
Large Sites with Complex Navigation
E-commerce sites, directories, or content-heavy sites with hundreds or thousands of URLs often have pages that are functionally orphaned — linked to through filters or search functionality rather than standard navigation. Sitemaps ensure Google can find and index these pages regardless of navigation structure.
Case study: sitemap fix resolves indexing gap for a 500-page e-commerce site. A UK homeware retailer with approximately 500 product pages noticed that roughly 180 of them were not appearing in Google at all. Their CMS generated a sitemap, but the sitemap only included the first 50 products per category — everything beyond page one of the category listing was missing. After regenerating the sitemap to include all product URLs and submitting it through Search Console, 160 of those 180 pages were indexed within three weeks. The remaining 20 had thin content issues that needed separate attention, but the sitemap fix alone recovered indexing for over 30% of the catalogue. Google's own documentation on building and submitting a sitemap confirms that sitemaps are particularly valuable for large sites where pages may not be discoverable through links alone.
Sites with Frequently Updated Content
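If you publish blog posts, news articles, or product updates regularly, a sitemap with accurate `<lastmod>` dates helps Google notice new and updated pages sooner and prioritise them for recrawling.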
Sites with Poor Internal Linking
If your internal linking structure is weak — a common issue on small business websites where most pages are reached only from the homepage navigation — a sitemap compensates by ensuring all pages are surfaced to crawlers.
What Sitemaps Do Not Do
It is equally important to understand the limits. A sitemap does not guarantee indexing. Google may choose not to index a URL even if it appears in a sitemap — typically because the page has thin content, is a near-duplicate of another page, or lacks sufficient signals to justify inclusion in the index. Submitting a URL in a sitemap is a request, not a command.
Sitemaps also do not improve rankings directly. They improve crawlability, which is a prerequisite for ranking — but having a sitemap will not push a weak page above a stronger one.
For a broader view of technical SEO, see our Technical SEO Explained guide.
How to Create an XML Sitemap
Option 1: Use Your CMS or Platform Plugin
If your site runs on WordPress, Shopify, Squarespace, or a similar platform, a sitemap is likely already generated for you.
WordPress (version 5.5 and later) generates a basic sitemap automatically at `yourdomain.com/wp-sitemap.xml`. If you use an SEO plugin such as Yoast SEO or Rank Math, the plugin replaces it with its own sitemap, typically at `yourdomain.com/sitemap_index.xml`. The Yoast sitemap breaks into separate files for posts, pages, categories, and other content types.
Shopify generates sitemaps automatically. Your sitemap lives at `yourdomain.com/sitemap.xml` and includes products, collections, blog posts, and pages by default.
Squarespace similarly auto-generates a sitemap at `yourdomain.com/sitemap.xml`. Note that password-protected pages are excluded.
If you are using a CMS, check whether a sitemap already exists before creating one manually.
Option 2: Generate One with a Dedicated Tool
For smaller static sites or sites where the CMS does not auto-generate a sitemap, tools like XML-Sitemaps.com can crawl your site and generate a valid sitemap file you can upload to your root directory.
Screaming Frog is another option for generating sitemaps from a crawl — particularly useful because it gives you control over what is included, allowing you to exclude noindex pages, redirect chains, and other problematic URLs before the sitemap is created.
Option 3: Build One Programmatically
For developers, sitemap generation is straightforward. Most web frameworks have sitemap libraries: `next-sitemap` for Next.js, `gatsby-plugin-sitemap` for Gatsby, and equivalents for most other major frameworks.
The key rule is to automate sitemap regeneration whenever content changes — not to maintain a static file manually.
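If no library fits your stack, generating the file yourself is only a few lines. The sketch below uses Python's standard library; the `build_sitemap` function and the page list are hypothetical, and in practice the list would come from your database or CMS rather than being hard-coded.

```python
from datetime import date
import xml.etree.ElementTree as ET

SITEMAP_XMLNS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified_date_or_None) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_XMLNS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # special characters are escaped on output
        if lastmod is not None:
            ET.SubElement(url, "lastmod").text = lastmod.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Illustrative data only.
pages = [
    ("https://example.com/", date(2026, 3, 1)),
    ("https://example.com/services/", None),
]
print(build_sitemap(pages))
```

Calling a function like this from your publish or deploy step is one way to keep the sitemap in sync with your content without manual edits.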
Sitemap Index Files
If your site is large (roughly more than 50,000 URLs), you will want to split your sitemap into multiple files and reference them from a sitemap index. The index file itself looks like this:
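```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2026-03-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
  </sitemap>
</sitemapindex>
```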
Google supports individual sitemap files of up to 50,000 URLs or 50MB uncompressed. Use a sitemap index whenever you approach these limits.
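As a rough sketch of the splitting step, the Python below chunks a URL list at the 50,000-URL limit and builds an index pointing at each child file. The function names and the `sitemap-N.xml` naming scheme are assumptions for illustration; it does not check the 50MB size limit, which you would need to handle separately.

```python
import xml.etree.ElementTree as ET

SITEMAP_XMLNS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(child_sitemap_urls):
    """Build a <sitemapindex> document pointing at each child sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_XMLNS)
    for child in child_sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = child
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Illustrative data: 120,001 product URLs need three child sitemaps.
urls = [f"https://example.com/products/{i}" for i in range(120_001)]
chunks = chunk_urls(urls)
children = [f"https://example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
print(len(chunks), "child sitemaps")
print(build_sitemap_index(children))
```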
How to Submit Your Sitemap to Google
Generating a sitemap is only half the job. You also need to tell Google where to find it.
Via Google Search Console
The most direct method:
- Go to Google Search Console
- Select your property (your website)
- Click "Sitemaps" in the left-hand menu
- Enter the URL of your sitemap (e.g. `sitemap.xml` or `sitemap_index.xml`)
- Click Submit
Google will then attempt to fetch and process your sitemap. The Search Console report will show how many URLs were submitted versus how many were indexed — a discrepancy between these numbers is a signal worth investigating.
Via robots.txt
You can also declare your sitemap location in your `robots.txt` file, which crawlers fetch before crawling your site:
```
Sitemap: https://yourdomain.com/sitemap.xml
```
This is a good belt-and-braces approach alongside Search Console submission, and it helps other search engines (like Bing) discover your sitemap without requiring a separate submission.
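To confirm the declaration is being picked up, you can parse the file the way a crawler would. This sketch uses `urllib.robotparser` from Python's standard library (the `site_maps()` method needs Python 3.8 or later); the sample `robots.txt` content and the `declared_sitemaps` helper are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Sitemap: https://yourdomain.com/sitemap.xml
"""


def declared_sitemaps(robots_txt):
    """Return the sitemap URLs declared in a robots.txt body (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []  # site_maps() returns None when nothing is declared


print(declared_sitemaps(ROBOTS_TXT))
```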
For a complete guide to `robots.txt`, see The Complete Guide to robots.txt for SEO.
Common Sitemap Mistakes to Avoid
We have audited hundreds of small business sites through RnkRocket, and the same sitemap issues appear again and again.
Including Noindex Pages
Any page with a `noindex` meta tag or HTTP header should not appear in your sitemap. Listing a page that says "do not index me" creates a contradiction. Google will generally honour the noindex directive, but the mixed signal wastes crawl budget on URLs you never wanted indexed.
Common noindex pages that accidentally appear in sitemaps include thank-you pages, account login pages, admin areas, and duplicate content pages.
Including Redirected URLs
Sitemaps should only list canonical, indexable URLs. If a URL in your sitemap responds with a 301 or 302 redirect, remove it and replace it with the final destination URL. Redirect chains in sitemaps waste crawl budget and signal poor site hygiene.
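Incorrect `<lastmod>` Dates
The `<lastmod>` value should reflect when the page content last changed in a meaningful way, not when the sitemap file was regenerated. If every URL shows today's date regardless of whether anything changed, the field stops being a useful signal, and Google may disregard it for your site.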
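One way to keep `<lastmod>` honest is to change the date only when the page content itself changes. The sketch below compares a hash of the content against the previously stored one. The function names and the idea of storing a fingerprint alongside each URL are assumptions, not a feature of any specific CMS.

```python
import hashlib
from datetime import date


def content_fingerprint(html):
    """Hash the page content so changes can be detected cheaply."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def next_lastmod(previous_fingerprint, previous_lastmod, html, today):
    """Return (fingerprint, lastmod); lastmod moves only if the content changed."""
    fingerprint = content_fingerprint(html)
    if fingerprint == previous_fingerprint:
        return fingerprint, previous_lastmod
    return fingerprint, today


# First publication, then a rebuild with no change, then a real edit.
fp, lastmod = next_lastmod(None, None, "<p>Version 1</p>", date(2026, 3, 1))
fp, lastmod = next_lastmod(fp, lastmod, "<p>Version 1</p>", date(2026, 3, 10))
print(lastmod.isoformat())  # still the original publication date
fp, lastmod = next_lastmod(fp, lastmod, "<p>Version 2</p>", date(2026, 3, 10))
print(lastmod.isoformat())  # now the date of the edit
```

In a real setup you would probably hash only the main content area, so that template or footer changes do not bump every date on the site.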
Forgetting to Update After Major Site Changes
If you delete pages, change URL structures, or publish new content, your sitemap must reflect those changes. A sitemap full of 404 errors — pages that were removed but still listed — creates unnecessary crawl budget waste and signals to Google that your site is poorly maintained.
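The three exclusion rules above (noindex pages, redirects, and dead pages) can be applied as a simple filter before the sitemap is written. This is a minimal sketch that assumes you already have crawl data for each URL; the `CrawlResult` structure and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CrawlResult:
    url: str
    status: int            # HTTP status code returned for the URL
    noindex: bool = False  # True if a noindex meta tag or header was seen


def split_sitemap_candidates(results):
    """Return (keep, drop): URLs safe to list, and (url, reason) pairs to exclude."""
    keep, drop = [], []
    for result in results:
        if result.status == 200 and not result.noindex:
            keep.append(result.url)
        else:
            reason = "noindex" if result.status == 200 else f"status {result.status}"
            drop.append((result.url, reason))
    return keep, drop


# Illustrative crawl data only.
results = [
    CrawlResult("https://example.com/services/", 200),
    CrawlResult("https://example.com/thank-you/", 200, noindex=True),
    CrawlResult("https://example.com/old-page/", 301),
    CrawlResult("https://example.com/deleted/", 404),
]
keep, drop = split_sitemap_candidates(results)
print(keep)
print(drop)
```

For redirected URLs, remember that the fix is to list the final destination instead, not just to drop the old address.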
Not Including All Valuable Pages
The inverse problem is also common: important pages are missing from the sitemap entirely. This often happens when a CMS filter or custom post type is excluded by default, or when a developer manually excluded a section during setup.
Audit your sitemap regularly against your actual list of important pages. RnkRocket's technical SEO audit checks for missing sitemap entries as part of its crawl analysis.
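If you keep your own list of important pages, a basic version of this audit is a set comparison. The sketch below is an assumption-laden illustration (the `audit_sitemap` function and both URL lists are made up) and compares URLs exactly, so a trailing-slash mismatch will show up as a difference.

```python
def audit_sitemap(sitemap_urls, important_urls):
    """Compare the sitemap against the pages you expect it to contain."""
    sitemap_set = set(sitemap_urls)
    important_set = set(important_urls)
    return {
        "missing_from_sitemap": sorted(important_set - sitemap_set),
        "not_on_important_list": sorted(sitemap_set - important_set),
    }


# Illustrative data only.
sitemap_urls = [
    "https://example.com/",
    "https://example.com/services/",
    "https://example.com/tag/misc/",
]
important_urls = [
    "https://example.com/",
    "https://example.com/services/",
    "https://example.com/case-studies/",
]
report = audit_sitemap(sitemap_urls, important_urls)
print(report["missing_from_sitemap"])
print(report["not_on_important_list"])
```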
Case study: stale sitemap entries causing crawl budget waste. A regional recruitment agency had a sitemap listing 1,200 URLs, but over 400 of those were expired job listing pages returning 404 errors. Google was spending a significant portion of its crawl budget re-fetching these dead pages on every crawl pass. After cleaning the sitemap to include only active job listings and core pages — reducing it from 1,200 to 780 URLs — Search Console data showed a measurable increase in crawl frequency for the remaining active pages, and several previously under-crawled service pages began appearing in search results for the first time.
Sitemap Best Practices Summary
| Practice | Why It Matters |
|---|---|
| Only include indexable URLs | Avoids contradictions and crawl budget waste |
| Use accurate `<lastmod>` dates | Helps Google prioritise genuinely updated content |
| Submit to Search Console | Gives direct visibility into indexing status |
| Declare in robots.txt | Helps all crawlers discover the sitemap |
| Split large sitemaps into index files | Avoids file size limits |
| Keep sitemap dynamically generated | Ensures it stays current without manual effort |
| Exclude 404s, redirects, and noindex pages | Keeps the sitemap clean and accurate |
Monitoring Your Sitemap Health
Submitting a sitemap is not a one-time task. You need to monitor it over time, particularly:
- Submitted vs indexed ratio in Search Console. If you submit 200 URLs but only 140 are indexed, that 30% gap needs investigation. The Page indexing report in Search Console shows why URLs were left out; common reasons include "Crawled - currently not indexed" (often a thin content signal), "Duplicate without user-selected canonical", and "Excluded by 'noindex' tag".
- Sitemap errors in Search Console. If Google cannot fetch or parse your sitemap, it will report an error. These errors appear in the Sitemaps report and should be fixed promptly.
- New content appearing in the sitemap promptly. If you publish a new blog post and it takes more than 48 hours to appear in your sitemap, your CMS or sitemap generation process has a problem.
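The submitted versus indexed gap is simple arithmetic, but it is worth tracking the same way each month. A small sketch, with a hypothetical `indexing_gap` helper:

```python
def indexing_gap(submitted, indexed):
    """Percentage of submitted URLs that are not indexed."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive number of URLs")
    return round(100 * (submitted - indexed) / submitted, 1)


# The example above: 200 URLs submitted, 140 indexed.
print(indexing_gap(200, 140))  # 30.0
```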
RnkRocket monitors your sitemap as part of its ongoing site intelligence — flagging pages that are in the sitemap but returning errors, pages that are indexed but missing from the sitemap, and sitemap format issues. See what is SEO for broader context on how crawlability fits into overall search performance.
Frequently Asked Questions
Does having a sitemap improve my Google rankings?
Not directly. A sitemap improves crawlability and indexing speed, which are prerequisites for ranking — but they do not themselves influence where you rank. Think of a sitemap as ensuring your pages are considered for ranking; the ranking itself depends on content quality, relevance, and authority.
How often should I update my sitemap?
Ideally, your sitemap should update automatically every time you publish or modify content. If your CMS does not do this automatically, regenerate and resubmit your sitemap whenever you make significant content changes — at minimum, monthly if you publish regularly.
Do I need a sitemap if my site is small?
Google says sitemaps are most valuable for sites over a few dozen pages. For a small five-page brochure site with good internal linking, a sitemap is still a good practice but less critical. That said, there is no downside to having one — generate and submit it regardless.
Can I have multiple sitemaps?
Yes — in fact, for larger sites this is recommended. Use a sitemap index file to reference multiple smaller sitemaps organised by content type (pages, posts, products, etc.). Google supports up to 50,000 URLs per sitemap file.
What should I do if Google is not indexing pages in my sitemap?
First, check the Page indexing report (formerly the Coverage report) in Google Search Console to see why those pages are excluded. Common causes include thin content, duplicate content issues, manual actions, or soft-404 errors. Fixing the underlying issue is more important than resubmitting the sitemap.
Should I include images and videos in my sitemap?
Google supports image and video sitemap extensions. Including images in your sitemap can help Google discover images that are loaded via JavaScript or that appear on pages with few inbound links. For image-heavy businesses (photographers, estate agents, e-commerce), this is worth implementing.
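As a rough illustration of the image extension, the sketch below adds `<image:image>` entries under each page URL using Google's image sitemap namespace. The `build_image_sitemap` function and the sample URLs are hypothetical; check Google's image sitemap documentation for the currently supported tags before relying on it.

```python
import xml.etree.ElementTree as ET

SITEMAP_XMLNS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_XMLNS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """`pages` maps each page URL to the list of image URLs shown on it."""
    ET.register_namespace("image", IMAGE_XMLNS)  # serialise with the image: prefix
    urlset = ET.Element("urlset", xmlns=SITEMAP_XMLNS)
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_XMLNS}}}image")
            ET.SubElement(image, f"{{{IMAGE_XMLNS}}}loc").text = image_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Illustrative data only.
pages = {
    "https://example.com/portfolio/": [
        "https://example.com/img/wedding-01.jpg",
        "https://example.com/img/wedding-02.jpg",
    ],
}
print(build_image_sitemap(pages))
```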
Related Reading
- Technical SEO Explained: A Plain-English Guide
- The Complete SEO Audit Checklist for 2026
- The Complete Guide to robots.txt for SEO
- What Is SEO? A Beginner's Guide
Want to see how your sitemap stacks up? RnkRocket audits your sitemap automatically as part of its technical SEO analysis — spotting missing pages, redirect errors, and indexing gaps in minutes. See pricing and get started.


