Back to basics: it is time to brush up on some fundamental SEO and site management topics! This time we focus on taxonomy, with advice both for those looking for the right way to build a new project and for those who want to make an existing site that is struggling to take off more effective.
What is site taxonomy
The expression website taxonomy (sometimes also called URL taxonomy) refers to the way pages are organized into containers such as categories or tags, following the conceptual organization of the site.
The term taxonomy comes from Greek and means “rule of ordering”: it is a scientific discipline in its own right that studies the criteria and logical order by which elements are classified and ranked (not only in computer science or SEO). In biology, for example, biological taxonomy sets the criteria for the hierarchical classification of living species in order to study their evolution.
What taxonomy is for
The origins of taxonomy and of the process of categorizing things go back to the very origins of language, and therefore to the dawn of thought: originally, people used the same names for more or less similar organisms that shared common traits (think of “flowers” or “trees”, the so-called “common names”).
The use in sciences
“All human societies have a taxonomic system that names species and groups them into higher-order categories”, Wikipedia also tells us, and “practically all concepts, animated and non-animated objects, places and events can be classified according to a taxonomic scheme”.
In biology, however, the need arose to establish “a more universal and rigorous system for defining organisms”, because “each species had to be named, possess a single name and be described in an unambiguous form” to avoid misunderstandings and allow easy interpretation everywhere in the world.
Taxonomy applied to sites
Coming back to the topics that interest us, all of the above applies perfectly to websites: taxonomy helps classify all documents, web pages and their URLs hierarchically, using principles and concepts of communication, and aims to make their interpretation clear and unambiguous for search engines and users.
In short, it helps improve the usability of the site itself: by discovering and browsing the categories, visitors reach content more easily and identify the pages associated with each container more quickly. They can therefore move around the site with greater clarity and less confusion, which can encourage them to stay longer and reduce the abandonment rate.
At the same time, efficient management tells search engine spiders more clearly and directly what the site is about and what the central topics of the project are, with potential benefits for ranking.
Taxonomy and URL structure
Setting up a well-optimized website taxonomy is important for a scalable SEO strategy, and its effectiveness is closely tied to how we manage the URL structure and, in particular, subfolders.
Each time we create a new page, its specific name is called the slug, which is (typically) the final part of the address; on sites that show the category in the URL path, the new page becomes a child of the parent section, which appears as a subfolder in the path.
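To make the relationship between slug and parent subfolders concrete, here is a minimal sketch that splits a URL path into its folders and its slug. The function name `split_url` and the example.com address are assumptions made up for illustration.

```python
from urllib.parse import urlparse

def split_url(url):
    """Return (parent_folders, slug) for the path of a URL."""
    # Drop empty segments produced by leading or trailing slashes.
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        return [], ""
    # Everything before the last segment is a parent folder; the last one is the slug.
    return segments[:-1], segments[-1]

folders, slug = split_url("https://www.example.com/blog/seo/site-taxonomy/")
print(folders)  # ['blog', 'seo']
print(slug)     # site-taxonomy
```

On a site without categories in the path, the same function would simply return an empty list of folders.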
Site taxonomy: is a URL with or without the category better for SEO?
This brings us to an old debate in the SEO community: is it better to design a URL structure with or without taxonomy? In other words, should the category be part of the path, or can we safely leave it out?
In reality there is no single, definitive answer. As often happens in our field, a lot depends on the type of site, the context and (why not?) an analysis of competitors, to see how our direct rivals have organized their URLs (and whether Google appears to reward their strategies). Generally, clean URLs that follow a scalable structure are believed to make the site architecture easier for crawlers to scan.
As Emanuele Vaccari explains on his blog, including the taxonomy string in the URL makes it possible to segment the pages of the site at different times and for different purposes, both when analyzing and when processing data, but we should not overestimate the weight of URLs in SEO.
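As a rough illustration of this kind of segmentation, the sketch below counts pages by their first-level folder. The URL list, the function names and the "(no section)" label are hypothetical; in practice the URLs would come from a crawl or an analytics export.

```python
from collections import Counter
from urllib.parse import urlparse

def section_of(url):
    """Return the first folder of the path, or a placeholder if there is none."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    # A path with fewer than two segments has no category folder.
    return segments[0] if len(segments) > 1 else "(no section)"

def count_by_section(urls):
    """Count how many URLs fall under each first-level folder."""
    return Counter(section_of(u) for u in urls)

urls = [
    "https://www.example.com/blog/site-taxonomy",
    "https://www.example.com/blog/url-structure",
    "https://www.example.com/services/seo-audit",
    "https://www.example.com/contact",
]
print(count_by_section(urls))
```

Without a taxonomy string in the address, the same grouping would need an external mapping between each page and its category.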
The value of URLs in SEO
In fact, it is now established that stuffing a URL with keywords is not a relevant ranking factor. We also know that URL length has little effect on performance and can, at most, cause technical problems on the server and client side, as John Mueller pointed out some time ago.
Much more relevant is the structure of the site, which is one of the central elements for improving the chances of performing well in the SERPs: while URLs are simple reference points used to request resources from the server when needed, the structure of the site is made up of the navigation paths we trace through internal links of every kind.
How to communicate paths to Google
Going back to the Hamlet-like doubt raised earlier, there is another point not to be overlooked: there is a precise way to communicate the structure of the site and its URLs to Google unambiguously, and to indicate the hierarchy we have designed, namely structured data.
Implementing breadcrumb markup lets us send a clear signal of structure and hierarchy in a language the search engine understands, which makes it unnecessary to use URLs with subfolders for this purpose alone (and allows sites without a taxonomy string in their addresses to work smoothly).
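As a sketch of what such markup can look like, the snippet below builds a schema.org BreadcrumbList as JSON-LD. The helper name `breadcrumb_jsonld`, the page names and the example.com URLs are assumptions for illustration; note that the final page URL has no category folder, yet the hierarchy is still declared.

```python
import json

def breadcrumb_jsonld(trail):
    """Build a BreadcrumbList from (name, url) pairs ordered from the home page down."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }

# Hypothetical trail for a page whose URL does not contain the category.
trail = [
    ("Home", "https://www.example.com/"),
    ("SEO", "https://www.example.com/seo/"),
    ("Site taxonomy", "https://www.example.com/site-taxonomy"),
]
print(json.dumps(breadcrumb_jsonld(trail), indent=2))
```

The resulting JSON is typically embedded in the page inside a `<script type="application/ld+json">` element.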
Make a clear choice
In principle, for a project built from scratch it may be preferable to include taxonomies in URLs to simplify the maintenance and analysis of pages: descriptive (“talking”) URLs are also more readable and understandable to humans and can help identify problems that recur systematically in specific sections (for example, canonical errors, pagination that is not rendered, excessively heavy pages, abnormal response times and much more).
It is also easier to manage redirects and code injections without having to look up the body classes that WordPress applies to taxonomy URLs, and it becomes more convenient to manage or generate sitemaps and to filter data in Google Search Console.
In any case, what matters is to make a basic choice and stick to it as much as possible, because changing the structure of URLs is almost never a recommended option and, above all, makes virtually no sense if the only hope is an alleged improvement in ranking. Generally there are more risks than benefits, so it is an option to consider only in extreme and unavoidable situations.
Optimizing the site taxonomy
There are some tips and best practices for managing URL taxonomy effectively and avoiding what is called “hot garbage”, namely those long strings of nodes in URLs that are of little use and only complicate things.
One example of this garbage is a date in the path, which is not considered optimal because it groups content only by time, not by topical affinity within a section of the website.
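To spot this pattern in an existing site, a simple check along the following lines could be run over a list of URLs. The regular expression only covers the common /YYYY/MM/ and /YYYY/MM/DD/ layouts, and both the pattern and the function name are assumptions rather than a complete detector.

```python
import re
from urllib.parse import urlparse

# Matches a /YYYY/MM folder pair, optionally followed by /DD, as whole path segments.
DATE_IN_PATH = re.compile(
    r"/(?:19|20)\d{2}/(?:0[1-9]|1[0-2])(?:/(?:0[1-9]|[12]\d|3[01]))?(?=/|$)"
)

def has_date_in_path(url):
    """Return True if the URL path contains a year/month (or year/month/day) folder."""
    return DATE_IN_PATH.search(urlparse(url).path) is not None

print(has_date_in_path("https://www.example.com/2021/03/site-taxonomy"))  # True
print(has_date_in_path("https://www.example.com/seo/site-taxonomy"))      # False
```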
By contrast, a clean, optimized taxonomy brings together URLs (and content) that belong to the same topic and have a clear relationship with one another. Such a path can help Google in various ways, because it makes the relationships between pieces of content within a category more explicit and easier for the search engine’s systems to understand.
Internal links are also a powerful tool for website taxonomy, as they help search engines and users discover the relationships between topics and judge their relevance. The advice is to place internal links naturally in the text, providing contextual relevance through the supporting text near the link.
Tips to improve URL taxonomy
According to John McAlpin on Search Engine Land, the site taxonomy optimization process should rest on three principles:
- Be scalable.
- Be easy to follow for both users and search engines.
- Target the marketing funnel.
What a scalable system means
In software engineering, and in computer science more generally, a scalable system is one that can increase (or decrease) its scale according to needs and available resources. When we create a URL taxonomy, we must therefore make sure that new pages can be added later and fit easily into the established structure.
Examples of scalable taxonomies
To illustrate the point, the article uses the example of a website for a company with multiple local branches, and in particular with shops spread across several (US) states, several cities within each state and several ZIP codes within those cities. There are various ways to create descriptive URLs that include the category:
- Single category. In the example it is locations, which holds all the location pages:
  - https://www.examplehealthsite.com/locations/north-dallas-office
  - https://www.examplehealthsite.com/locations/south-dallas-office
- Grouping by state. A further subfolder specifying the state and gathering its related pages:
  - https://www.examplehealthsite.com/locations/texas/north-dallas-office
  - https://www.examplehealthsite.com/locations/texas/south-dallas-office
- Grouping by city. A further subfolder specifying the city:
  - https://www.examplehealthsite.com/locations/texas/dallas/north-dallas-office
  - https://www.examplehealthsite.com/locations/texas/dallas/south-dallas-office
- Grouping by ZIP code only:
  - https://www.examplehealthsite.com/locations/75001/north-dallas-office
- Grouping by ZIP code and state:
  - https://www.examplehealthsite.com/locations/texas/75001/north-dallas-office
It is really important to study the structure of URLs carefully, because it is immediately clear how easy it is to end up with long, complex URLs that “get out of hand”. Overly granular paths can be a flaw, not least because they lock the system in and make it less scalable.
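The variants listed above can be generated from the same slug by changing only the subfolders, which also makes it easy to compare how deep each option is. This is a minimal sketch: the helper names `location_url` and `depth` are made up for illustration, while the domain and paths come from the example.

```python
BASE = "https://www.examplehealthsite.com"

def location_url(slug, *folders):
    """Join the fixed 'locations' category, any optional subfolders, and the slug."""
    return "/".join([BASE, "locations", *folders, slug])

def depth(url):
    """Number of path segments after the domain."""
    return len(url[len(BASE):].strip("/").split("/"))

flat = location_url("north-dallas-office")
by_city = location_url("north-dallas-office", "texas", "dallas")

print(flat, depth(flat))        # https://www.examplehealthsite.com/locations/north-dallas-office 2
print(by_city, depth(by_city))  # https://www.examplehealthsite.com/locations/texas/dallas/north-dallas-office 4
```

Each extra subfolder adds one level that every future page in that branch has to fit into, which is the trade-off to weigh when choosing a variant.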
The key to choosing the best option is to know your business and anticipate how it will develop: for example, if you expect growth within individual cities and states it may make sense to set up those subfolders, but a simpler directory may also be enough (while the ZIP code variant is the one to rule out almost every time). What matters is to understand what makes the most sense for your strategy and what is useful to your users, and to make sure the taxonomy reflects it.
Creating user friendly structures
A clean taxonomy of Urls is also important for users, because it can improve the user experience and simplify their journey through the pages of the site in search of what they are interested in.
In a sense, we must anticipate the possible paths of visitors, understand the areas of greatest interest (identified through effective keyword research upstream) and reflect these assessments in the URLs.
Taxonomy and marketing funnel
No less strategic is offering consumers a taxonomy aligned with the way the pages and content of the website are structured, to support the marketing funnel.
It could be useful, for example, to group the sections that deal with different phases of the decision-making process, while keeping the breadcrumb structure of those pages consistent with the URL taxonomy.
Categories and tags: differences and how to manage them
While discussing taxonomy, we can also open a brief parenthesis to define a site’s categories and tags more precisely, and to offer some suggestions for managing them with SEO in mind.
What are site categories
The category is a grouping by topic and falls within what is called vertical taxonomy, because the selection of documents follows a hierarchical development and further subcategories can be created. It is a classification system that brings together relevant resources united by a relationship of specificity, which are generally close to each other and refer to the same central macro-topic of the site.
Categories should be created respecting two particular properties, namely specificity and proximity. Specificity is what distinguishes each category from another, that is the property of having specialized content on a specific theme, different from those of the others; proximity instead is the characteristic of two categories of having contents on similar topics, that can be tied in a logical thread using common terms.
In principle, an article should fall into a single category and a single subcategory, but in reality it is not a strict rule (even if it is good not to insert content in more than two categories in order not to disorient users).
What are the tags
Tags are essentially specific keywords that can bring together multiple pieces of content, not necessarily linked to the same category. They are an alternative classification and sorting rule to subcategories, in which the selection of documents develops horizontally (horizontal taxonomy) and without hierarchy.
Tags aggregate all documents labeled with them regardless of category (in fact, the same tag can be used under different categories), and there are no quantitative limits: you can create countless tags and label content with any number of them. Doing so, however, creates redundancy and exposes the site to risks: each tag creates a new container page, so tags that are not optimized or not meaningful are likely only to bloat the site and waste crawl budget.
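One rough way to find tags that generate little more than an extra container page is to count how many posts carry each tag. In this sketch the post records, the `thin_tags` helper and the threshold of two posts are all assumptions for illustration, not a recommended cutoff.

```python
from collections import Counter

# Hypothetical posts: one category each (vertical), any number of tags (horizontal).
posts = [
    {"slug": "site-taxonomy", "category": "seo", "tags": ["urls", "architecture"]},
    {"slug": "breadcrumbs", "category": "seo", "tags": ["structured-data", "architecture"]},
    {"slug": "redirect-basics", "category": "technical", "tags": ["urls"]},
]

def thin_tags(posts, min_posts=2):
    """Return tags used by fewer than min_posts posts, sorted alphabetically."""
    counts = Counter(tag for post in posts for tag in post["tags"])
    return sorted(tag for tag, n in counts.items() if n < min_posts)

print(thin_tags(posts))  # ['structured-data']
```

Note how the tag "urls" spans two different categories, which is exactly the horizontal grouping described above.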
Goals of the site taxonomy
In conclusion, let’s recap what a well-organized taxonomy is for, so that it is easier to remember:
- Group site URLs by topic.
- Identify the website’s main search topic.
- Communicate the topics covered more accurately to search engine crawlers, to improve content indexing over time (and potentially gain a ranking boost).
- Improve usability and make information easier for visitors to use; they can identify the site as thematic and vertical on a topic, and therefore consider it more reliable than a generalist portal, all other things being equal.