Getting the pages of our site into the top positions on Google is, in short, the goal we all pursue. Before ranking, however, there is a decisive step that we must not forget and that we often take for granted: indexing. The new episode of Google Search Console Training is a guide to the Index Coverage report, which shows whether our pages are actually on Google or whether there are errors to fix.
The video, presented as always by Daniel Waisberg, explains how to use Google Search Console to find out whether the pages of our site have been crawled and indexed by Google, and whether any problems came up during the process.
What crawling is: Googlebot’s scan
It all starts with crawling, the process by which Googlebot discovers new or updated pages to add to the Google index. Googlebot processes each page it crawls to compile a huge index of all the words it sees and their location on each page.
When a user submits a query on Google, the search engine's automated systems look up the index to find the matching pages and return the results they consider most relevant and appropriate for that user.
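To make the idea of "an index of words and their locations" more concrete, here is a toy sketch of an inverted index in Python. It is only an illustration of the general technique, not a description of how Google actually builds or queries its index; the function names and the example URLs are hypothetical.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each word to {url: [positions]} for a dict of url -> page text."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for position, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].setdefault(url, []).append(position)
    return index


def search(index, query):
    """Return the sorted URLs that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    matches = set(index.get(words[0], {}))
    for word in words[1:]:
        matches &= set(index.get(word, {}))
    return sorted(matches)


# Hypothetical pages, used only for illustration.
pages = {
    "https://example.com/a": "Index coverage report guide",
    "https://example.com/b": "Guide to crawling and the index",
}
index = build_index(pages)
print(search(index, "index guide"))
print(search(index, "crawling"))
```

A real search engine adds ranking on top of this lookup; the sketch only shows why a page that was never crawled and indexed cannot be returned for any query.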
The Index Coverage report in GSC
To get an overview of all the pages on our site that Google has indexed or tried to index, we can use the Index Coverage report in Search Console. It is worth saying up front that Search Console sends an email when a new indexing problem appears, but sends no notification when an existing error gets worse: the Googler's first piece of advice is therefore to check the report periodically to make sure everything is in order.
When and how to use the Report
This tool is especially useful for large sites; the Search Console guidelines themselves describe it as “unnecessary” if our online project has fewer than 500 pages. In that case it is easier to check whether the site appears on Google with the site: search operator, which we have already encountered.
In theory, as the site grows we should see a gradual increase in the number of valid indexed pages; sudden drops or spikes may point to problems. We should not expect every URL on our site to be indexed, because our goal should be to have the canonical version of each page indexed. Moreover, Google can take a few days to index new content added to the site, but we can shorten the delay by manually requesting indexing.
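As a rough illustration of what "drops or spikes" could look like in practice, the sketch below flags day-to-day changes in the number of valid pages that exceed a chosen percentage. The 20% threshold, the function name and the sample numbers are all assumptions made for this example; they do not come from Search Console.

```python
def flag_sudden_changes(counts, threshold=0.2):
    """Return (position, previous, current) for changes larger than threshold.

    counts is a chronological list of valid indexed page counts;
    threshold is the relative change (0.2 = 20%) considered suspicious.
    """
    flagged = []
    for i in range(1, len(counts)):
        previous, current = counts[i - 1], counts[i]
        if previous == 0:
            continue  # no baseline to compare against
        change = abs(current - previous) / previous
        if change > threshold:
            flagged.append((i, previous, current))
    return flagged


# Hypothetical counts of valid pages, one value per check of the report.
valid_pages = [1000, 1020, 1045, 700, 710, 1200]
for position, previous, current in flag_sudden_changes(valid_pages):
    print(f"check {position}: {previous} -> {current}")
```

Slow, steady growth passes unnoticed, while the sudden fall and the later jump are both reported as worth a closer look.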
What the reported statuses mean
The default view of the tool summarizes the indexing errors on the site, but we can also look directly at each of the four reported statuses: error, valid with warnings, valid and excluded. The issues are grouped and sorted by “status and reason”, so that we can start by correcting the problems that have the greatest impact on our project.
A page can therefore have one of four status values, each with a specific reason:
- Error. A problem prevents the page from being indexed, so it cannot appear in Google Search results, which means lost traffic for the site.
This is the case, for example, for a submitted URL that contains a “noindex” tag, or for pages that return a 404 status code or a server error. Problems with pages submitted through a sitemap are flagged explicitly to make them easier to fix.
- Valid with warnings. The page may or may not appear in Google Search, depending on an issue we need to be aware of.
For instance, pages blocked in the robots.txt file are reported as a warning because Google is not sure that the block is intentional (robots.txt directives are not the right way to keep pages out of the index; other methods should be used instead).
- Valid. The page has been indexed and can appear in the search results: there is nothing to do, except work on SEO for a better ranking!
- Excluded. The page has not been indexed and does not appear on Google, which considers this an intentional or appropriate choice.
For instance, the page contains a noindex directive (intentional choice), it is a duplicate of another page that is already indexed (appropriate choice), or it was not found because the bot ran into a 404 error.
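Two of the reasons mentioned in the list, a robots.txt block and a noindex instruction, can be checked locally before opening the report. The following is a minimal sketch using only the Python standard library; the helper names, the sample robots.txt and the sample HTML are hypothetical, and the standard-library robots.txt parser does not necessarily match Googlebot's own rule matching in every detail.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if tag == "meta" and name in ("robots", "googlebot"):
            for directive in (attrs.get("content") or "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """True if the HTML contains a robots meta tag with a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def is_blocked_by_robots(robots_txt, url, user_agent="Googlebot"):
    """True if the given robots.txt content disallows crawling the URL."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return not rules.can_fetch(user_agent, url)


# Hypothetical inputs, used only for illustration.
robots_txt = "User-agent: *\nDisallow: /private/\n"
html = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

print(is_blocked_by_robots(robots_txt, "https://example.com/private/page.html"))
print(is_blocked_by_robots(robots_txt, "https://example.com/public.html"))
print(has_noindex(html))
```

The two checks are kept separate on purpose: a robots.txt rule stops the crawl, while noindex is read from the page itself, so a crawler that is blocked from a page cannot see a noindex tag placed on it.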
Finding and fixing errors
Pages with errors are the ones we should focus on first. The report table is sorted by the severity of the problem and the number of pages affected; clicking on a row shows how the problem has evolved over time and a list of example URLs to investigate further.
Once we have made the corrections (ourselves or with the help of a developer, to whom we can give limited access to Search Console by sharing a link), we have to validate the changes by clicking the appropriate button and wait for Google to process our work.
A useful tool for a successful strategy
Ultimately, the Index Coverage report is a useful and fundamental tool: it gives us clearer information about Google's crawling and indexing decisions and about how it handles the content of our site, and it also lets us discover technical problems in time, even on a large scale, and step in before they lead to drops in traffic.