How to Create an XML Sitemap for SEO

Summary: An XML sitemap is a file that lists every important page on your website so search engines can find and crawl them efficiently. Create one using a generator tool, CMS plugin, or code, then submit it to Google Search Console and Bing Webmaster Tools. A properly submitted sitemap helps new and updated pages get discovered sooner and more completely.

If Google can't find your pages, it can't rank them. It's that simple. An XML sitemap is like handing Google a complete map of your website (every page, every post, every product) so it doesn't have to guess what exists. Without one, you're relying entirely on Google's crawler to discover pages by following links, and for new sites, large sites, or sites with complex navigation, that often means pages get missed.

According to Google's official sitemap documentation, sitemaps are especially important for new websites (few external links pointing to them), large sites (over 500 pages), sites with rich media content, and sites where pages aren't well-linked internally. If any of these apply to you — and they probably do — a sitemap isn't optional.

What Is an XML Sitemap?

An XML sitemap is a structured file (in XML format) that lists the URLs of your website along with optional metadata about each page — when it was last modified, how often it changes, and its relative importance. The file lives at a URL like yoursite.com/sitemap.xml and follows the Sitemaps protocol, an open standard supported by Google, Bing, Yahoo, and other search engines.

Here's what a basic XML sitemap looks like:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-03-10</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2026-02-15</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/blog/seo-guide</loc>
    <lastmod>2026-03-08</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.6</priority>
  </url>
</urlset>

Let's break down the XML elements:

  • <urlset>: The root element that wraps the entire sitemap. The xmlns attribute specifies the protocol version.
  • <url>: Contains information about a single page. Each page gets its own <url> block.
  • <loc>: The full URL of the page. This is the only required element — everything else is optional.
  • <lastmod>: The date the page was last modified, in W3C Datetime format (YYYY-MM-DD is the simplest valid form). Google uses this to prioritise crawling recently updated pages, provided the values are consistently accurate.
  • <changefreq>: How often the page changes (always, hourly, daily, weekly, monthly, yearly, never). Google has confirmed it ignores this value — include it or leave it out, it makes no difference.
  • <priority>: A value from 0.0 to 1.0 indicating relative importance within your site. Google also ignores this — it determines page importance from links and other signals, not from self-reported priority values.
    In practice, the only elements that matter are <loc> (required) and <lastmod> (recommended). The <changefreq> and <priority> tags are legacy: they were useful in the early days of sitemaps but are now effectively deprecated by all major search engines.
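The elements described above can also be assembled in code. The following is a minimal TypeScript sketch rather than a definitive implementation: `SitemapEntry`, `escapeXml`, `toLastmod`, and `buildSitemapXml` are hypothetical names chosen for illustration. It emits only <loc> and <lastmod>, the two elements that matter in practice.

```typescript
// Hypothetical shape for one sitemap entry (names are illustrative).
interface SitemapEntry {
  loc: string;    // absolute URL of the page
  lastmod?: Date; // date the content last changed (optional)
}

// Escape the five characters that are special in XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Format a Date as YYYY-MM-DD (UTC), a valid <lastmod> value.
function toLastmod(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Wrap each entry in a <url> block inside the <urlset> root element.
function buildSitemapXml(entries: SitemapEntry[]): string {
  const urlBlocks = entries.map((entry) => {
    const lastmod = entry.lastmod
      ? `\n    <lastmod>${toLastmod(entry.lastmod)}</lastmod>`
      : '';
    return `  <url>\n    <loc>${escapeXml(entry.loc)}</loc>${lastmod}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urlBlocks,
    '</urlset>',
  ].join('\n');
}

const xml = buildSitemapXml([
  { loc: 'https://example.com/', lastmod: new Date('2026-03-10') },
  { loc: 'https://example.com/about' },
]);
console.log(xml);
```

Escaping matters because a single raw ampersand in a URL makes the whole file invalid XML.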

    Why Your Site Needs an XML Sitemap

  • Faster indexing: When you add new pages, Google discovers them faster through your sitemap than by waiting for its crawler to follow links. For new sites with few backlinks, this can mean the difference between indexing in days versus weeks.
  • Complete coverage: Large sites often have orphan pages (pages with no internal links pointing to them). Google can't discover orphan pages through crawling alone. A sitemap tells Google those pages exist, regardless of internal linking.
  • Crawl budget efficiency: For large sites (1,000+ pages), Google allocates a finite crawl budget. A sitemap helps Google focus that budget on your important pages instead of wasting it on duplicate content, parameter URLs, or irrelevant pages.
  • Communication channel: Your sitemap is how you tell Google which pages matter, when they were updated, and which versions are canonical. It's a direct line of communication with the search engine.
  • Diagnostic tool: After submitting your sitemap to Search Console, Google reports back on how many URLs were submitted, how many were indexed, and why certain pages were excluded. This data is invaluable for diagnosing indexing issues.
    How to Create an XML Sitemap

    There are several ways to create a sitemap, ranging from fully manual to fully automated. Choose the method that matches your tech stack:

    Method 1: Online Sitemap Generators

    The fastest way to create a sitemap for an existing website. Tools like XML-Sitemaps.com crawl your site and generate a ready-to-use sitemap file. Enter your URL, wait for the crawl to complete, download the XML file, and upload it to your site's root directory.

    Pros: No technical knowledge required, works with any website. Cons: Free generators typically limit you to 500 URLs. The sitemap is static — you need to regenerate it whenever you add new pages. Not ideal for sites that update frequently.

    Method 2: WordPress Plugins

    If you're on WordPress, sitemap generation is built in (since WordPress 5.5). WordPress automatically creates a sitemap at yoursite.com/wp-sitemap.xml. However, most SEO plugins offer better sitemaps with more control:

  • Yoast SEO: Automatically generates an XML sitemap at /sitemap_index.xml. Go to Yoast SEO → Settings → Site Features → XML sitemaps to enable/configure. You can exclude specific post types, taxonomies, and individual pages.
  • Rank Math: Go to Rank Math → Sitemap Settings. Offers granular control over which post types, taxonomies, and individual URLs to include/exclude. Also supports image and video sitemaps.
  • All in One SEO: Navigate to AIOSEO → Sitemaps. Supports XML, RSS, video, and news sitemaps. Includes a "Priority Score" calculator (though Google ignores priority values).
    All three plugins automatically update the sitemap whenever you publish, update, or delete content, so no manual regeneration is needed.

    Method 3: Shopify, Wix, and Squarespace

    Shopify: Automatically generates a sitemap at yourstore.com/sitemap.xml. It includes products, collections, blog posts, and pages. You can't customise it directly — Shopify manages it automatically. To exclude pages, you'd need to use noindex meta tags instead.

    Wix: Automatically generates and updates a sitemap at yoursite.com/sitemap.xml. Go to Settings → SEO Tools → Sitemap to view it. Wix excludes pages you've set to "hide from search engines" automatically.

    Squarespace: Auto-generates a sitemap at yoursite.com/sitemap.xml. It includes all published pages, blog posts, and portfolio items. Pages with "Enable page" turned off are excluded automatically.

    Method 4: Manual Creation

    For small static sites, you can write the XML manually. Create a file called sitemap.xml in your site's root directory. Use the format shown above — list every URL you want indexed inside <url> tags with at least a <loc> element. Keep it updated manually whenever you add or remove pages.

    This method works for sites with under 50 pages but becomes unmanageable at scale. For anything larger, use an automated approach.

    Method 5: Dynamic Sitemaps for Next.js and React

    Modern JavaScript frameworks don't generate sitemaps by default — you need to build them programmatically. In Next.js (App Router), create a sitemap.ts file in your app/ directory:

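    // app/sitemap.ts
    import type { MetadataRoute } from 'next';
    
    // Async so the CMS/database fetch below can be awaited once you enable it.
    export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
      const baseUrl = 'https://example.com';
    
      // Static pages. Use the date each page's content last changed (example
      // dates shown here), not new Date(), so lastmod doesn't move on every build.
      const staticPages: MetadataRoute.Sitemap = [
        { url: baseUrl, lastModified: new Date('2026-03-10'), changeFrequency: 'weekly', priority: 1 },
        { url: `${baseUrl}/about`, lastModified: new Date('2026-02-15'), changeFrequency: 'monthly', priority: 0.8 },
        { url: `${baseUrl}/contact`, lastModified: new Date('2026-02-15'), changeFrequency: 'monthly', priority: 0.5 },
      ];
    
      // Dynamic pages (e.g., from CMS or database).
      // getAllPosts() is a placeholder for your own data-fetching function.
      // const posts = await getAllPosts();
      // const dynamicPages = posts.map(post => ({
      //   url: `${baseUrl}/blog/${post.slug}`,
      //   lastModified: post.updatedAt,
      //   changeFrequency: 'monthly' as const,
      //   priority: 0.6,
      // }));
    
      // Once dynamicPages is enabled, return [...staticPages, ...dynamicPages].
      return [...staticPages];
    }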

    Next.js automatically serves this at /sitemap.xml and regenerates it on each build (or on each request if you're using ISR/SSR). For sites with more than 50,000 URLs, use the generateSitemaps() function to create multiple sitemap files automatically.

    For React SPAs (Create React App, Vite), sitemaps need to be generated at build time using packages like react-router-sitemap or a custom build script. SPAs also benefit from server-side rendering or pre-rendering so crawlers can read the content reliably; a sitemap alone won't help if Google can't render your JavaScript pages.
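A custom build script for an SPA usually starts from the route list. The sketch below assumes your router configuration can be reduced to an array of path strings; `routes` and `routesToSitemapUrls` are hypothetical names, not part of any library.

```typescript
// Hypothetical route list, as you might export from your router config.
const routes: string[] = ['/', '/about', '/blog', '/blog/:slug', '*'];

// Turn static route paths into absolute URLs. Patterns such as
// "/blog/:slug" or "*" are skipped because they don't name a concrete page;
// those URLs have to come from your data source instead.
function routesToSitemapUrls(baseUrl: string, paths: string[]): string[] {
  const origin = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return paths
    .filter((path) => !path.includes(':') && !path.includes('*'))
    .map((path) => `${origin}${path}`);
}

const sitemapUrls = routesToSitemapUrls('https://example.com/', routes);
console.log(sitemapUrls);
```

In a real build step you would wrap each URL in a <url> entry and write the result to your build output directory as sitemap.xml.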

    XML Sitemap Best Practices

    Following these best practices ensures your sitemap is effective and doesn't cause issues:

  • 50,000 URL limit: A single sitemap file can contain a maximum of 50,000 URLs. If your site has more, use a sitemap index file (explained below).
  • 50MB uncompressed size limit: Each sitemap file can't exceed 50MB when uncompressed. For most sites this isn't an issue, but very large sites with long URLs may hit this limit.
  • Only include canonical URLs: Every URL in your sitemap should be the canonical version. Don't include URLs with query parameters, non-canonical duplicates, or pages that redirect. Include only pages that return a 200 status code.
  • Don't include noindex pages: If a page has a noindex meta tag, don't put it in the sitemap. Including noindexed pages sends contradictory signals — "here's a page I want you to find, but please don't index it."
  • Use accurate <lastmod> dates: Only update <lastmod> when the content actually changes — not on every page load or build. Google uses this to decide which pages to recrawl. If every page always shows today's date, Google learns to ignore your lastmod values entirely.
  • Use absolute URLs: Always use full URLs including the protocol: https://example.com/page, not /page.
  • Match your canonical URL format: If your site uses https://www.example.com, don't list https://example.com in the sitemap (or vice versa). Consistency matters.
  • Reference your sitemap in robots.txt: Add Sitemap: https://example.com/sitemap.xml to your robots.txt file. This helps search engines discover your sitemap even if you haven't submitted it manually.
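Several of the practices above can be checked automatically before a sitemap is published. This is a rough sketch under stated assumptions: `checkSitemapUrls` is a hypothetical helper, and it only covers the URL count, absolute URLs, a consistent origin, and query parameters. It cannot tell whether a page is noindexed or redirects.

```typescript
// Hypothetical checker: returns a list of human-readable problems.
function checkSitemapUrls(urls: string[], canonicalOrigin: string): string[] {
  const problems: string[] = [];
  if (urls.length > 50000) {
    problems.push(`Too many URLs for one file: ${urls.length} (limit 50,000)`);
  }
  for (const raw of urls) {
    let parsed: URL;
    try {
      parsed = new URL(raw); // throws on relative URLs such as "/page"
    } catch {
      problems.push(`Not an absolute URL: ${raw}`);
      continue;
    }
    // Catches both protocol (http vs https) and www vs non-www mismatches.
    if (parsed.origin !== canonicalOrigin) {
      problems.push(`Origin differs from ${canonicalOrigin}: ${raw}`);
    }
    if (parsed.search !== '') {
      problems.push(`Has query parameters: ${raw}`);
    }
  }
  return problems;
}

const problems = checkSitemapUrls(
  [
    'https://example.com/pricing',
    '/page',
    'http://example.com/old',
    'https://example.com/shop?sort=price',
  ],
  'https://example.com',
);
console.log(problems);
```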
    Sitemap Index Files for Large Sites

    If your site has more than 50,000 URLs (or you want to organise your sitemap by content type), use a sitemap index file. It's a sitemap that points to other sitemaps:

    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap>
        <loc>https://example.com/sitemap-pages.xml</loc>
        <lastmod>2026-03-10</lastmod>
      </sitemap>
      <sitemap>
        <loc>https://example.com/sitemap-posts.xml</loc>
        <lastmod>2026-03-09</lastmod>
      </sitemap>
      <sitemap>
        <loc>https://example.com/sitemap-products.xml</loc>
        <lastmod>2026-03-10</lastmod>
      </sitemap>
    </sitemapindex>

    Each sub-sitemap can contain up to 50,000 URLs, and a sitemap index can reference up to 50,000 sub-sitemaps — giving you a theoretical maximum of 2.5 billion URLs. Most SEO plugins (Yoast, Rank Math) automatically create sitemap indexes, splitting your URLs by post type (pages, posts, products, categories).
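If you generate sitemaps in code, splitting at the 50,000-URL limit is a simple chunking step. The sketch below is illustrative only: `chunkUrls` and `buildSitemapIndex` are hypothetical helpers, and the sitemap-N.xml naming scheme is an assumption made for this example.

```typescript
const MAX_URLS_PER_SITEMAP = 50000; // per-file limit from the Sitemaps protocol

// Split a long URL list into chunks that each fit in one sitemap file.
function chunkUrls(urls: string[], size: number = MAX_URLS_PER_SITEMAP): string[][] {
  const result: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    result.push(urls.slice(i, i + size));
  }
  return result;
}

// Build the index file that points at each sub-sitemap.
function buildSitemapIndex(baseUrl: string, chunkCount: number): string {
  const entries: string[] = [];
  for (let n = 1; n <= chunkCount; n++) {
    entries.push(`  <sitemap>\n    <loc>${baseUrl}/sitemap-${n}.xml</loc>\n  </sitemap>`);
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</sitemapindex>',
  ].join('\n');
}

// Example: 120,000 placeholder URLs need three sitemap files.
const allUrls = Array.from({ length: 120000 }, (_, i) => `https://example.com/p/${i}`);
const chunks = chunkUrls(allUrls);
console.log(chunks.length);
console.log(buildSitemapIndex('https://example.com', chunks.length));
```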

    How to Submit Your Sitemap to Google

    Creating the sitemap is step one. Submitting it tells Google exactly where to find it:

    Step 1: Log in to Google Search Console. If you haven't set up Search Console yet, follow our guide on how to submit your website to Google.

    Step 2: In the left sidebar, click Sitemaps under the Indexing section.

    Step 3: Enter your sitemap URL (e.g., sitemap.xml or sitemap_index.xml) in the "Add a new sitemap" field and click Submit.

    Step 4: Google will show the status as "Success" if it can read the sitemap, along with the number of URLs discovered. If there are errors, it will tell you what's wrong — common issues include incorrect XML formatting, unreachable URLs, or the sitemap itself returning a non-200 status code.

    Step 5: Check back in a few days. Under the Sitemaps report, Google shows how many URLs from your sitemap have been indexed, how many are excluded, and the reasons for exclusion. This is your primary diagnostic tool for indexing problems. The report breaks exclusions into categories: noindex pages, redirect URLs, pages blocked by robots.txt, and others. Each category tells you something different about your site's indexability. As a rule of thumb, a healthy sitemap shows 95% or more of submitted URLs as indexed. If indexing drops suddenly, investigate immediately, because it usually points to a new problem with specific pages.

    How to Submit Your Sitemap to Bing

    Don't forget Bing. While Google dominates search, Bing still powers a significant portion of web searches — especially on Microsoft devices, Cortana, DuckDuckGo (which uses Bing's index), and various AI assistants.

    Step 1: Go to Bing Webmaster Tools and sign in with a Microsoft account.

    Step 2: Add your site if you haven't already. You can import directly from Google Search Console for instant setup.

    Step 3: Go to Sitemaps in the left sidebar and submit your sitemap URL.

    Bing also supports the IndexNow protocol for instant URL submission — push URLs to Bing the moment they're published rather than waiting for the crawler to discover them.
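An IndexNow submission is a JSON document sent by HTTP POST. The sketch below only builds that document; `buildIndexNowPayload` is a hypothetical helper and the key shown is a placeholder. You generate your own key and host it as a text file on your site.

```typescript
// Request body fields defined by the IndexNow protocol.
interface IndexNowPayload {
  host: string;
  key: string;
  keyLocation: string;
  urlList: string[];
}

// Build the JSON body for a bulk submission.
function buildIndexNowPayload(host: string, key: string, urls: string[]): IndexNowPayload {
  return {
    host,
    key,
    keyLocation: `https://${host}/${key}.txt`, // where the key file is hosted
    // Keep only URLs on the submitting host.
    urlList: urls.filter((url) => new URL(url).host === host),
  };
}

const payload = buildIndexNowPayload('example.com', 'your-indexnow-key', [
  'https://example.com/blog/new-post',
  'https://other-site.example/page',
]);
console.log(JSON.stringify(payload, null, 2));
// To submit, POST this JSON to https://api.indexnow.org/indexnow
// with the header "Content-Type: application/json; charset=utf-8".
```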

    Specialised Sitemap Types

    Beyond the standard XML sitemap, there are specialised sitemap types for different content formats:

    Image Sitemaps

    Image sitemaps help Google discover images that might not be found through normal crawling — especially images loaded via JavaScript, CSS background images, or images behind lazy-load implementations. You can extend your existing sitemap by adding image-specific XML namespace tags, or create a separate image sitemap. According to web.dev's SEO audit documentation, image sitemaps are particularly valuable for e-commerce sites where product images drive significant traffic from Google Image Search. Each URL entry can include up to 1,000 image references.
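As a rough illustration of the extension approach, the sketch below builds one <url> entry with image references. `buildImageUrlEntry` is a hypothetical helper, and it assumes the URLs passed in are already XML-escaped.

```typescript
// Namespace for Google's image sitemap extension. The <urlset> element
// must declare it as xmlns:image alongside the standard sitemap namespace.
const IMAGE_NS = 'http://www.google.com/schemas/sitemap-image/1.1';

const imageUrlsetOpenTag =
  `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="${IMAGE_NS}">`;

// Build one <url> entry listing the images that appear on that page.
function buildImageUrlEntry(pageUrl: string, imageUrls: string[]): string {
  if (imageUrls.length > 1000) {
    throw new Error('A single URL entry can reference at most 1,000 images');
  }
  const images = imageUrls.map(
    (src) => `    <image:image>\n      <image:loc>${src}</image:loc>\n    </image:image>`,
  );
  return ['  <url>', `    <loc>${pageUrl}</loc>`, ...images, '  </url>'].join('\n');
}

const imageEntry = buildImageUrlEntry('https://example.com/products/chair', [
  'https://example.com/img/chair-front.jpg',
  'https://example.com/img/chair-side.jpg',
]);
console.log(imageUrlsetOpenTag);
console.log(imageEntry);
```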

    Video Sitemaps

    If your site hosts video content, a video sitemap helps Google index and display video rich results — including thumbnails, duration, and descriptions in search results. Video sitemaps require additional tags like <video:thumbnail_loc>, <video:title>, and <video:description>. This is the primary way to get your videos into Google Video search results and video carousels on the main SERP.
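A video entry follows the same pattern with its own namespace. This sketch is a minimal example, not a complete video sitemap generator: `VideoInfo` and `buildVideoUrlEntry` are hypothetical names, and it uses <video:content_loc> to point at the video file.

```typescript
// The <urlset> element must declare this namespace as xmlns:video.
const VIDEO_NS = 'http://www.google.com/schemas/sitemap-video/1.1';

interface VideoInfo {
  title: string;
  description: string;
  thumbnailLoc: string; // URL of the thumbnail image
  contentLoc: string;   // URL of the video file itself
}

// Escape characters that are special in XML text content.
function escapeText(value: string): string {
  return value.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Build one <url> entry describing a page and the video hosted on it.
function buildVideoUrlEntry(pageUrl: string, video: VideoInfo): string {
  return [
    '  <url>',
    `    <loc>${escapeText(pageUrl)}</loc>`,
    '    <video:video>',
    `      <video:thumbnail_loc>${escapeText(video.thumbnailLoc)}</video:thumbnail_loc>`,
    `      <video:title>${escapeText(video.title)}</video:title>`,
    `      <video:description>${escapeText(video.description)}</video:description>`,
    `      <video:content_loc>${escapeText(video.contentLoc)}</video:content_loc>`,
    '    </video:video>',
    '  </url>',
  ].join('\n');
}

const videoEntry = buildVideoUrlEntry('https://example.com/videos/setup', {
  title: 'Setup & install',
  description: 'A short walkthrough of the setup process.',
  thumbnailLoc: 'https://example.com/thumbs/setup.jpg',
  contentLoc: 'https://example.com/media/setup.mp4',
});
console.log(videoEntry);
```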

    News Sitemaps

    News sitemaps are for publishers who want articles to appear in Google News. Each entry includes the publication name and language, the article title, and the publication date. News sitemaps should only contain articles published within the last 48 hours, because Google News has strict recency requirements. Most news publishers use separate news sitemaps alongside their standard XML sitemaps.
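The 48-hour window is easy to enforce when the news sitemap is generated. The following is a small sketch with hypothetical names (`Article`, `recentArticles`); the current time is passed in as a parameter so the filter can be tested with a fixed clock.

```typescript
interface Article {
  url: string;
  title: string;
  publishedAt: Date;
}

const NEWS_WINDOW_MS = 48 * 60 * 60 * 1000; // 48 hours in milliseconds

// Keep only articles published within the last 48 hours.
function recentArticles(articles: Article[], now: Date = new Date()): Article[] {
  return articles.filter((article) => {
    const age = now.getTime() - article.publishedAt.getTime();
    return age >= 0 && age <= NEWS_WINDOW_MS;
  });
}

const fixedNow = new Date('2026-03-10T12:00:00Z');
const recent = recentArticles(
  [
    { url: 'https://example.com/news/a', title: 'A', publishedAt: new Date('2026-03-10T00:00:00Z') },
    { url: 'https://example.com/news/b', title: 'B', publishedAt: new Date('2026-03-08T13:00:00Z') },
    { url: 'https://example.com/news/c', title: 'C', publishedAt: new Date('2026-03-08T11:00:00Z') },
  ],
  fixedNow,
);
console.log(recent.map((article) => article.url));
```

Articles A and B are 12 and 47 hours old and are kept; article C is 49 hours old and is dropped.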

    Hreflang Sitemaps (International Sites)

    For multilingual or multi-regional websites, you can add xhtml:link elements to your sitemap entries to specify language and regional alternatives. This is often easier to manage than adding hreflang tags to every page's HTML — especially for large sites with dozens of language variants. The sitemap approach keeps all your hreflang declarations in one place, making maintenance and auditing simpler.
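A sketch of how these entries might be generated is shown below. `Alternate`, `buildHreflangEntry`, and `buildHreflangEntries` are hypothetical names, and the URLs are assumed to be XML-safe. Each language version gets its own <url> entry that lists the full set of alternates, including itself.

```typescript
// The <urlset> element must declare this namespace as xmlns:xhtml.
const XHTML_NS = 'http://www.w3.org/1999/xhtml';

interface Alternate {
  hreflang: string; // language or language-region code, e.g. "en" or "de-ch"
  href: string;     // absolute URL of that language version
}

// Build one <url> entry with an xhtml:link element per alternate.
function buildHreflangEntry(loc: string, alternates: Alternate[]): string {
  const links = alternates.map(
    (alt) => `    <xhtml:link rel="alternate" hreflang="${alt.hreflang}" href="${alt.href}"/>`,
  );
  return ['  <url>', `    <loc>${loc}</loc>`, ...links, '  </url>'].join('\n');
}

// One entry per language version, each carrying the same list of alternates.
function buildHreflangEntries(alternates: Alternate[]): string[] {
  return alternates.map((alt) => buildHreflangEntry(alt.href, alternates));
}

const hreflangEntries = buildHreflangEntries([
  { hreflang: 'en', href: 'https://example.com/en/pricing' },
  { hreflang: 'de', href: 'https://example.com/de/preise' },
]);
console.log(hreflangEntries.join('\n'));
```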

    Troubleshooting Sitemap Issues

    If your sitemap submission shows errors or low indexing rates, work through these diagnostics:

    "Sitemap unreachable" Error

    Google can't reach your sitemap. Verify that yoursite.com/sitemap.xml returns a 200 status code and the file is publicly accessible (not behind a login or geofence). Check that your robots.txt isn't blocking /sitemap.xml. Test accessibility using the URL Inspection tool in Search Console — it will tell you if Google can fetch the file.

    Low Indexing Rate

    You submitted 1,000 URLs but only 600 are indexed. Check the exclusion report in Search Console. If URLs are excluded due to noindex or redirects, you have stale URLs in the sitemap. Remove URLs you don't want indexed. If exclusions are due to "Page with redirect", only include the final destination URL. If exclusions are "Blocked by robots.txt", update your robots.txt to allow Googlebot access to those pages.

    XML Parsing Error

    Your sitemap has invalid XML syntax. Common causes: unescaped ampersands (& should be &amp;), unmatched tags, or encoding issues. Use the Schema Validator or any online XML validator to identify the exact line with the error. Most CMS plugins auto-validate before serving, so this error is rare with managed sitemaps.
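Unescaped ampersands are simple to detect with a regular expression. The sketch below is a quick check, not a full XML validator; `findBareAmpersands` is a hypothetical helper that returns the 1-based line numbers where a raw "&" appears.

```typescript
// Matches "&" that is not the start of a named or numeric entity.
const BARE_AMPERSAND = /&(?!(?:amp|lt|gt|quot|apos|#\d+|#x[0-9a-fA-F]+);)/;

// Return the 1-based line numbers that contain an unescaped ampersand.
function findBareAmpersands(xmlText: string): number[] {
  const hits: number[] = [];
  xmlText.split('\n').forEach((line, index) => {
    if (BARE_AMPERSAND.test(line)) {
      hits.push(index + 1);
    }
  });
  return hits;
}

const sample = [
  '<url>',
  '  <loc>https://example.com/list?a=1&b=2</loc>',
  '  <loc>https://example.com/list?a=1&amp;b=2</loc>',
  '</url>',
].join('\n');
console.log(findBareAmpersands(sample));
```

Only line 2 is reported, because the ampersand on line 3 is already escaped.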

    Sitemaps Showing as Stale

    If you manually created your sitemap, you need to regenerate it whenever you add or remove pages. Automated sitemaps (from CMS plugins or code such as the Next.js approach above) update on their own; a file downloaded from an online generator does not. A sitemap that is weeks out of date leaves new pages undiscovered for longer. Switch to an automated approach, or set a reminder to regenerate monthly.

    Common XML Sitemap Mistakes

  • Including blocked URLs: If a URL is blocked by robots.txt, don't put it in the sitemap. This creates a contradiction — your robots.txt says "don't crawl this" while your sitemap says "please crawl this." Google flags this as an error in Search Console.
  • Listing non-canonical URLs: If page A redirects to page B, only include page B in the sitemap. Including redirect URLs wastes crawl budget and confuses Google about which URL to index.
  • Stale sitemaps: A sitemap that hasn't been updated in months tells Google your site isn't active. Use dynamic generation (CMS plugins, programmatic sitemaps) to keep it automatically updated.
  • Wrong lastmod dates: Setting every page's lastmod to today's date is worse than not including lastmod at all. Google has confirmed that it loses trust in lastmod values when they're clearly inaccurate. Only update lastmod when content genuinely changes.
  • Including low-value pages: Tag pages, author archives, paginated pages, and filter/sort URLs often don't need to be in your sitemap. Include only pages that add unique value and that you actually want Google to index.
  • Forgetting to add the sitemap to robots.txt: While submitting through Search Console is the primary method, also add Sitemap: https://yoursite.com/sitemap.xml to your robots.txt file as a backup discovery method.
  • HTTP vs HTTPS mismatch: If your site uses HTTPS (and it should), make sure all URLs in the sitemap use HTTPS too. Mixed protocols send conflicting signals about which URL is canonical and can delay or prevent indexing of the affected pages.
  • Not monitoring after submission: Submitting your sitemap isn't a set-and-forget task. Check the Sitemaps report in Search Console regularly to catch new errors, coverage drops, or indexing issues early.
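One item on this list, the robots.txt reference, is easy to verify in code once you have the file's contents. The sketch below uses a hypothetical helper, `sitemapsInRobotsTxt`, that extracts every Sitemap line from robots.txt text; fetching the file is left out.

```typescript
// Return the URLs declared on "Sitemap:" lines (the field name is
// matched case-insensitively).
function sitemapsInRobotsTxt(robotsTxt: string): string[] {
  return robotsTxt
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => /^sitemap\s*:/i.test(line))
    .map((line) => line.slice(line.indexOf(':') + 1).trim());
}

const robotsSample = [
  'User-agent: *',
  'Disallow: /admin/',
  '',
  'Sitemap: https://example.com/sitemap.xml',
].join('\n');

const declaredSitemaps = sitemapsInRobotsTxt(robotsSample);
console.log(declaredSitemaps);
```

An empty result means the sitemap is not referenced in robots.txt.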
    Check Your Sitemap with Clarity SEO

    Clarity SEO's Report Card checks whether your site has a valid, accessible XML sitemap and flags common issues like missing sitemaps, broken URLs, and formatting errors. It also verifies that your sitemap is referenced in your robots.txt file.

    → Check your sitemap for free

    Sitemaps and Your Overall SEO Strategy

    A sitemap is one piece of the technical SEO puzzle. It works alongside your robots.txt file (which controls what Google can crawl), your internal linking structure (which distributes page authority), and your canonical tags (which prevent duplicate content issues). Together, these elements ensure Google can discover, crawl, and index your content efficiently.

    For the complete picture, check out our comprehensive guide on how to improve your website SEO, which covers all the technical, on-page, and off-page factors that contribute to rankings.

    FAQ

    What is an XML sitemap?

    An XML sitemap is a file that lists every important URL on your website in a structured format that search engines can easily read. It follows the Sitemaps protocol, an open standard supported by all major search engines. Each URL entry can include metadata like the last modification date. The sitemap helps search engines discover and crawl your pages more efficiently, especially new pages, orphaned pages, and pages deep in your site's navigation hierarchy.

    Does every website need a sitemap?

    Technically, no — Google can discover pages through links alone. However, Google recommends sitemaps for new sites, large sites (500+ pages), sites with rich media, and sites where pages aren't well-linked internally. In practice, every website benefits from having a sitemap — even small ones. It takes minutes to set up and removes any ambiguity about which pages exist and should be indexed. There's no downside to having one.

    How often should I update my XML sitemap?

    Your sitemap should update automatically whenever you add, remove, or significantly modify a page. If you're using a CMS like WordPress with an SEO plugin (Yoast, Rank Math), this happens automatically. For custom-built sites, generate the sitemap during your build/deploy process. The key is that <lastmod> dates should only change when content actually changes — not on every build or page load.

    Will a sitemap help my rankings?

    A sitemap doesn't directly improve rankings — it improves crawling and indexing. Google can't rank a page it hasn't found and indexed. So while a sitemap won't move you from position 5 to position 1, it ensures your pages get discovered and indexed as quickly as possible. For new sites or sites with indexing issues (see our guide on why your website isn't showing up on Google), this can make a dramatic difference.

    What's the difference between an XML sitemap and an HTML sitemap?

    An XML sitemap is a machine-readable file designed for search engines. An HTML sitemap is a human-readable page on your website that lists all your pages with clickable links — like a table of contents for visitors. Both are useful but serve different purposes. The XML sitemap is for SEO (submitted to Search Console); the HTML sitemap is for user experience (helps visitors navigate your site). Most sites should have both.

    Can I have multiple sitemaps?

    Yes. You can submit multiple sitemaps to Google Search Console, and you can use a sitemap index file to organise them. Common approaches include separate sitemaps for pages, blog posts, products, and images. This makes it easier to monitor indexing by content type. For example, if your product sitemap shows low indexing, you know where to focus your troubleshooting. Each individual sitemap can contain up to 50,000 URLs.

    Related Guides

    Sitemaps are part of a broader technical SEO foundation. Explore these related guides:

  • How to Create a Robots.txt File — Control what Google can and can't crawl, and reference your sitemap.
  • How to Submit Your Website to Google — Set up Search Console and get your site indexed.
  • Why Is My Website Not Showing Up on Google? — Diagnose indexing issues that a sitemap alone can't fix.
  • How to Improve Your Website SEO — The complete guide to building a solid SEO foundation.
    Summary

    An XML sitemap is one of the simplest and most impactful things you can do for your site's SEO. It takes minutes to create, costs nothing, and ensures Google can discover every important page on your site. Use a CMS plugin for automatic generation, or build a dynamic sitemap in code. Submit it to both Google Search Console and Bing Webmaster Tools. Keep it accurate — only canonical, indexable pages with honest lastmod dates. Reference it in your robots.txt. Monitor the Sitemaps report in Search Console for issues. That's it — simple, foundational, essential.

    Check if your sitemap is set up correctly with a free Clarity SEO audit:

    → Get your free SEO Report Card
