Imagine finding ourselves in a complex subway network, with multiple lines intertwining and branching off in different directions, and no signs to guide us: this is the situation search engine crawlers risk when they first visit our site without a proper map. An XML sitemap helps solve this problem. It works like a subway map for search engines, showing the “lines” or paths that crawlers can follow to move between the pages of the site: each “station” is a web page, and the “connections” between stations are the links that lead from one page to another. A well-built sitemap makes it easier to clarify a website’s structure to search engines, and Google offers three simple tips to get the most out of our sitemaps, let Google Search know about all our pages, and “hop on and off” at the right stations!
Google’s 3 tips for making effective sitemaps
A well-organized sitemap can streamline the indexing process and help ensure that all sections and content on your site are discovered and correctly added to Google’s index. In a nutshell, it is simply a list of the pages on a website that are accessible to crawlers: it guides them through the site and indicates which pages are available for indexing.
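To make the idea of “a list of pages” concrete, here is a minimal sketch of how such a file could be generated with Python’s standard library. The function name `build_sitemap` and the example.com URLs are assumptions made for illustration; the `urlset`, `url` and `loc` elements and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical URLs, used only as an example.
sitemap_xml = build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/blog/first-post",
])
print(sitemap_xml)
```

Each `url` entry here carries only the required `loc` element; optional elements such as `lastmod` can be added in the same way.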
Not all sites actually need one, as Martin Splitt reminds us in the new installment of “SEO Made Easy” on YouTube dedicated to this topic: “small” sites (there is no specific threshold, not even in Google’s official documentation) are usually found and indexed completely. Even so, some of the advantages of creating a sitemap can extend to them as well.
To be precise, it is still Splitt who says that a sitemap in XML format can help search engines learn about all the pages on the site, discover new pages faster, crawl the site more efficiently and, last but not least, quickly recognize changes made.
There are three quick tips in particular for getting the most out of sitemaps, namely making a strategic selection of pages to include and prioritizing the URLs that we really care about being indexed, making sure they are canonical, indexable, and pointing to relevant and valid content; using lastmod responsibly to help crawlers know when to crawl a URL again for new content; and splitting sitemaps that have become too large into smaller files.
Best practices for sitemap management
Let’s go into detail about Google’s tips for setting up an effective and truly useful sitemap.
- Strategically choose which URLs to include
The goal of this file, and in general our desire, is to allow search engines to efficiently crawl the site and discover pages. Therefore, we should include in the sitemap only canonical, indexable URLs and, preferably and primarily, those that we consider strategic and therefore really need to “be in the search results.” This content can be articles, videos, images or any kind of web page.
URLs that redirect elsewhere or are marked as noindex are not very useful in the sitemap, Splitt specifies.
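The selection criteria above can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Page` record, its fields and the `belongs_in_sitemap` function are hypothetical, and in practice the data would come from a crawl or from the CMS of our own site.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical record describing one crawled page of our own site."""
    url: str
    status_code: int                     # HTTP status returned for the URL
    noindex: bool                        # True if the page is marked noindex
    canonical_url: Optional[str] = None  # None means the page is self-canonical


def belongs_in_sitemap(page: Page) -> bool:
    """Keep only canonical, indexable URLs that return valid content."""
    if page.status_code != 200:          # redirects (3xx) and errors are skipped
        return False
    if page.noindex:
        return False
    if page.canonical_url is not None and page.canonical_url != page.url:
        return False                     # canonical points to a different URL
    return True


pages = [
    Page("https://www.example.com/", 200, False),
    Page("https://www.example.com/old-page", 301, False),
    Page("https://www.example.com/cart", 200, True),
    Page("https://www.example.com/shoes?sort=price", 200, False,
         canonical_url="https://www.example.com/shoes"),
    Page("https://www.example.com/shoes", 200, False,
         canonical_url="https://www.example.com/shoes"),
]
selected = [p.url for p in pages if belongs_in_sitemap(p)]
print(selected)
```

The filter only covers the technical conditions; deciding which of the remaining URLs are truly strategic is still an editorial choice.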
- Use lastmod responsibly
The next tip is to be careful about when to report a change, responsibly using the <lastmod> element (which indicates the date of the last modification of the page the URL points to) to signal to Google that a URL needs to be crawled again.
According to Splitt, we should proceed with lastmod editing only when there are significant changes, so that when the crawler passes over these URLs it will actually find new and updated content. Otherwise, crawl planning may consider lastmod entries useless and crawlers may ignore them in the future.
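One possible way to apply this tip is to change the lastmod value only when the main content of a page actually changes. The sketch below assumes we store a fingerprint of each page’s main text next to its current lastmod date; the function names and the whitespace-only notion of an “insignificant” change are assumptions for illustration, not a method described by Google.

```python
import hashlib
from datetime import date


def content_fingerprint(main_text: str) -> str:
    """Hash the main text, ignoring differences in whitespace only."""
    normalized = " ".join(main_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def next_lastmod(previous_fingerprint, previous_lastmod, main_text, today):
    """Return (fingerprint, lastmod); lastmod moves only if the text changed."""
    fingerprint = content_fingerprint(main_text)
    if fingerprint == previous_fingerprint:
        return fingerprint, previous_lastmod
    return fingerprint, today.isoformat()   # YYYY-MM-DD is a valid lastmod value


old_fp = content_fingerprint("Sitemaps help crawlers discover pages.")

# Whitespace-only edit: the stored lastmod date is kept.
_, unchanged = next_lastmod(
    old_fp, "2024-01-10",
    "Sitemaps  help crawlers\ndiscover pages.", date(2024, 3, 1))

# Real content edit: lastmod is moved to today's date.
_, changed = next_lastmod(
    old_fp, "2024-01-10",
    "Sitemaps help crawlers discover and recrawl pages.", date(2024, 3, 1))

print(unchanged, changed)
```

What counts as a significant change is a judgment call; a stricter rule, for example ignoring edits below a certain size, could replace the simple hash comparison.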
- Manage many URLs
Last but not least, XML sitemap files have limits: a single file can be at most 50 megabytes (uncompressed) or contain at most 50,000 URLs.
If our site exceeds these limits, says Splitt, the first step is to ask ourselves whether we really need all these URLs in the sitemap and how likely it is that they will all be indexed. But if we verify that they really are all useful, there is still a way to work around the limit: split the sitemap into multiple files, organized by category or content type, to make it more manageable and help search engines process the information more efficiently.
This process is also useful when debugging problems with the sitemap, because the problematic URLs may be contained in a single file, and proceeding with the split can help us narrow down the problem.
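As a rough sketch, the split could look like the code below. It divides the URLs purely by count rather than by category or content type, does not check the 50 MB size limit, and ties the pieces together with a sitemap index file, a mechanism defined by the sitemaps.org protocol rather than mentioned in the video summary above. The file names and function names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit


def chunk(urls, size):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_urlset(urls):
    """Return the XML for one sitemap file."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def build_index(sitemap_urls):
    """Return the XML for a sitemap index pointing at each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")


def split_sitemap(urls, base_url, size=MAX_URLS_PER_FILE):
    """Return ({file name: sitemap XML}, sitemap index XML)."""
    files = {}
    for number, part in enumerate(chunk(urls, size), start=1):
        files[f"sitemap-{number}.xml"] = build_urlset(part)
    index_xml = build_index([f"{base_url}/{name}" for name in files])
    return files, index_xml


# Small demonstration with a limit of 2 URLs per file instead of 50,000.
demo_urls = [f"https://www.example.com/page-{n}" for n in range(5)]
files, index_xml = split_sitemap(demo_urls, "https://www.example.com", size=2)
print(list(files))
```

Grouping by category or content type, as suggested above, would mean calling `build_urlset` once per group instead of slicing by position, which also makes it easier to see which group a problematic URL belongs to.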
Other small tips for sitemaps
Let’s close this quick guide with some other simple tips for creating an XML sitemap.
As mentioned, not all pages are the same: some are critical to understanding the site, while others may be less relevant or even duplicates that should not be indexed at all. This is where a preliminary analysis of the site structure comes in, to identify the most crucial pages, those that need to be visible.
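In addition, customization is key: the declared frequency of page updates should reflect the reality of the content, and the priorities assigned to URLs should be indicative of their importance within the site structure. Keep in mind, however, that Google’s documentation states that it ignores the <changefreq> and <priority> values, so these settings matter mainly for other consumers of the sitemap.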
Sitemap maintenance should not be overlooked either: when we add new pages to the site or remove old ones, the sitemap must be updated to reflect these changes. An outdated sitemap can lead to a poor user experience and inefficiencies in indexing by search engines.
Ultimately, the sitemap is an essential communication channel between the website and search engines. A well set up and maintained file can make all the difference in online visibility, ensuring that content is not only found but also understood and evaluated correctly by search engines, which are guided appropriately to “know where to crawl what,” as Martin Splitt puts it.