A sitemap is an XML file in which you provide information about the pages, videos, and other files on your website, and the relationships between them. Search engines such as Google, Yahoo, and Bing read this file to crawl your site more intelligently. A sitemap tells Google which pages and files you think are important on your website.
What is an XML sitemap?
You want Google to crawl all of your website’s important pages. But occasionally, pages end up without any internal links pointing to them, which makes them hard to find. An XML sitemap lists a website’s most important pages, making sure Google can find and crawl them all, and helping it understand your website’s structure.
Take The Code Hubs sitemap as an example. It lists several ‘index’ sitemaps: post-sitemap.xml, page-sitemap.xml, etc. This categorization makes the site’s structure clear. If you click on any of the index sitemaps, you’ll see all the URLs in that particular sitemap. For example, if you click on page-sitemap.xml, you’ll see the URLs of all The Code Hubs pages.
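To make the index structure concrete, here is a minimal Python sketch that reads a sitemap index and lists the child sitemaps it points to. The XML is a hypothetical example on example.com, not The Code Hubs’ actual file, and the function name is made up for illustration. The element names and namespace follow the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap index; example.com stands in for a real site.
SITEMAP_INDEX = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/post-sitemap.xml</loc>
    <lastmod>2024-01-15</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/page-sitemap.xml</loc>
    <lastmod>2024-01-10</lastmod>
  </sitemap>
</sitemapindex>"""


def list_child_sitemaps(index_xml: str) -> list:
    """Return the <loc> URL of every sitemap listed in a sitemap index."""
    root = ET.fromstring(index_xml.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", NS)]


for sitemap_url in list_child_sitemaps(SITEMAP_INDEX):
    print(sitemap_url)
```

Each URL printed here is itself a sitemap, which in turn lists the individual page or post URLs.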
You will notice a date at the end of each line. It tells Google when each post was last updated, which helps with SEO because you want Google to crawl your updated content as soon as possible. When a date changes in the XML sitemap, Google knows there is updated content to crawl and index.
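In the sitemap file itself, that date is stored in a lastmod element next to each URL. The following Python sketch shows one way such a file could be generated. The function name and the example.com URLs are assumptions for illustration, and a sitemap plugin or CMS would normally do this for you.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url
        # lastmod is written as an ISO date, e.g. 2024-01-15.
        ET.SubElement(node, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical posts and their last update dates.
posts = [
    ("https://example.com/hello-world/", date(2024, 1, 15)),
    ("https://example.com/second-post/", date(2024, 2, 3)),
]
sitemap_xml = build_sitemap(posts)
print(sitemap_xml)
```

When a post is edited, regenerating the file with the new date is what signals to the crawler that the URL has changed.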
If you have a very large website, it is sometimes necessary to split an index sitemap. A single XML sitemap is limited to 50,000 URLs, so if your website has more than 50,000 posts, for example, you’ll need two separate sitemaps for the post URLs, effectively adding a second index sitemap.
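The splitting rule can be sketched in a few lines of Python. The helper names and the numbered file naming scheme (post-sitemap1.xml, post-sitemap2.xml, ...) are assumptions for illustration, and your sitemap tool may name the files differently.

```python
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit mentioned above


def split_into_sitemaps(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks that each fit in one sitemap file."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


def sitemap_filenames(chunk_count, prefix="post-sitemap"):
    """Hypothetical naming scheme: one numbered file per chunk."""
    return [f"{prefix}{n}.xml" for n in range(1, chunk_count + 1)]


# 120,000 hypothetical post URLs need three sitemap files.
post_urls = [f"https://example.com/post-{i}/" for i in range(120_000)]
chunks = split_into_sitemaps(post_urls)
for name, chunk in zip(sitemap_filenames(len(chunks)), chunks):
    print(name, len(chunk))
```

Each resulting file would then be listed as its own entry in the sitemap index.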
Which websites need an XML sitemap?
As per Google’s documentation, you might need a sitemap if:
- Your site is really large.
- Your site has a large archive of content pages that are isolated or not well linked to each other.
- Your site is new and has few external links to it.
- Your site has a lot of rich media content (video, images) or is shown in Google News.
Which pages should be in your XML sitemap?
How do you decide which pages to include in your XML sitemap? Always start by thinking about the relevance of a URL: when a visitor lands on a particular URL, is it a good result? Do you want visitors to land on that URL? If not, it probably shouldn’t be in the sitemap. However, if you don’t want that URL to show up in the search results at all, you’ll need to add a ‘noindex, follow’ robots meta tag to that page. Leaving a URL out of your XML sitemap doesn’t mean Google won’t index it. If Google can find the URL by following links, Google can index it.
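As a rough illustration of that decision, the Python sketch below sorts pages into two groups: URLs that go into the sitemap, and URLs that should carry the ‘noindex, follow’ robots meta tag instead. The Page class, its wanted_in_search flag, and the example.com URLs are hypothetical. Only the meta tag itself is standard HTML.

```python
from dataclasses import dataclass

# The robots meta tag that goes into the <head> of pages kept out of search results.
ROBOTS_NOINDEX_FOLLOW = '<meta name="robots" content="noindex, follow">'


@dataclass
class Page:
    url: str
    wanted_in_search: bool  # hypothetical flag: is this a good search result?


def plan_indexing(pages):
    """Return (sitemap_urls, noindex_urls) for a list of pages."""
    sitemap_urls = [p.url for p in pages if p.wanted_in_search]
    noindex_urls = [p.url for p in pages if not p.wanted_in_search]
    return sitemap_urls, noindex_urls


pages = [
    Page("https://example.com/first-post/", True),
    Page("https://example.com/about/", True),
    Page("https://example.com/tag/misc/", False),  # thin page, keep out of search
]
sitemap_urls, noindex_urls = plan_indexing(pages)
print("In sitemap:", sitemap_urls)
print("Add", ROBOTS_NOINDEX_FOLLOW, "to:", noindex_urls)
```

The two steps are separate on purpose: leaving a URL out of the sitemap only stops you from recommending it, while the meta tag is what asks search engines not to index it.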
Example: A new blog/news site
Say, for example, you are starting a new blog. You will want Google to find new posts quickly so that your target audience can find your blog in the search results, so it’s a good idea to create an XML sitemap right from the start. You will probably create a handful of first posts, along with categories and some tags for them. But there won’t be enough content yet to fill the tag overview pages, which makes them “thin content” that’s not valuable to visitors yet. In this case, you should leave the tag URLs out of the sitemap for now. Set the tag pages to ‘noindex, follow’, because you don’t want people to find those tag pages in search results.
Example: Media files
A ‘media’ or ‘image’ XML sitemap is also unnecessary for most websites, because your images are probably used within your pages, posts, and news items, so they will already be included in your ‘post’ or ‘page’ sitemap. A separate ‘media’ or ‘image’ sitemap would add little, and we recommend leaving it out. The only exception is if images are your main business. Photographers, for example, will probably want to show a separate ‘media’ or ‘image’ XML sitemap to Google.
How Google finds your sitemap
If you want Google to find your XML sitemap quickly, you’ll need to add it to your Google Search Console account. In the ‘Sitemaps’ section, you’ll see right away whether your XML sitemap has already been added. If it hasn’t, you can add it at the top of the page.
Adding your XML sitemap helps you check whether Google has indexed all the pages in it. If there is a difference between the ‘submitted’ and ‘indexed’ numbers for a particular sitemap, we recommend looking into it further. There could be an error preventing some pages from being indexed. Another possibility is that you need more links pointing to the content that hasn’t been indexed yet.
Conclusion
Now you know why it is important to have an XML sitemap: having one can help your site’s SEO. Google can easily access your most important pages and posts if you add the right URLs to it. Google can also find updated content easily, so it knows when a URL needs to be crawled again. Finally, adding your XML sitemap to Google Search Console helps Google find your sitemap fast and allows you to check for sitemap errors.