It’s no secret that search engine optimization is a complex subject. You will see this as soon as you dig deeper than keyword usage, meta tag editing, and link building. These techniques are not simple either, but at least they are more or less familiar to most people who have ever created a website and tried to promote it on the web.
But there are SEO issues that most casual users are completely unaware of, even though they can hurt your online visibility and lower your website rankings on search engines. As they say, ignorance is no defense.
One such issue is duplicate content, along with the canonical tags used to manage it. Let’s find out what these terms mean and how you can use the knowledge to enhance your SEO.
What is a canonical tag?
A canonical tag (also known as “rel canonical”) is an HTML element that tells search engines that a given website page is a copy (full or partial) of a specified master page. Simply put, a canonical tag can be viewed as a reference to the original source of content.
The tag is a <link> element with the attribute rel="canonical", and it is placed in the HTML head of a web page. Its href attribute contains the URL of the page that should be treated as authoritative (the page with the original content).
A canonical tag helps search engines tell original content from duplicate content. Duplicates matter because Google and other search engines may filter them out of results or split ranking signals between them, and content that appears to be copied in order to manipulate rankings can be demoted. Search engines treat the canonical link as a strong hint rather than a strict command: duplicate pages that carry it are usually left out of the results, and their ranking signals are consolidated on the master page, which in turn gets better positions on SERPs (search engine results pages).
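To make the idea concrete, here is a minimal Python sketch that reads the canonical URL out of a page's HTML, roughly the way a crawler might. It uses only the standard library. The sample page, its URL, and the names CanonicalFinder and find_canonical are made up for illustration; real crawlers apply many more rules.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens; compare case-insensitively
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonical = attr_map["href"]


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the page, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical duplicate page that points at its master page
sample_page = """
<html>
  <head>
    <title>Yellow T-shirt</title>
    <link rel="canonical" href="https://www.yoursite.com/product/t-shirt">
  </head>
  <body>...</body>
</html>
"""

print(find_canonical(sample_page))
```

A page without the tag simply yields None, which is a quick way to spot pages that declare no canonical version at all.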
What is duplicate content?
Today everybody knows that content moves the internet. If you own a website or blog, you know that content is your most valuable asset. This is not only because it attracts visitors to your site and keeps the audience engaged, but also because content is mainly what search engines analyze when ranking websites. As long as your content is unique and of high quality, your website stands a better chance of reaching higher positions on SERPs and being easier to find.
However, if some of your website pages have the same or very similar content, search engines can deem that suspicious and lower your site ranking. Content that repeats across multiple pages is called duplicate content.
You’re probably thinking: why should I care? None of my pages are duplicated. But you may be surprised to learn that there can actually be quite a few duplicates. You’re just unaware of their existence.
What are the reasons for duplicate content?
The problem with duplicate content lies in the difference between how we humans approach web pages and how search engines see them. We tend to treat a website page just like a page in a book. Each page is supposed to be filled with distinct text, and there are hardly any books out there in which you can find identical pages.
But search engines approach a web page from the perspective of its URL (or URLs). In practice, many website pages can be reached through several different URLs, and a search engine may treat every URL as a separate page. Since all of those URLs lead to the same page, the content behind them is considered duplicate.
Multiple reasons can result in content duplication:
- Transfer protocol/subdomain variations: Your website may be accessible through different URLs, for instance, http://www.yoursite.com, https://www.yoursite.com, http://yoursite.com, etc. Although all of these addresses take you to one and the same homepage, a search engine will consider these as different pages.
- Regional domain prefix: If you have multiple versions of a single website available for different regions (e.g., uk.yoursite.com), you will have to use a canonical tag pointing to master pages in order to avoid issues with duplicate content (unless your content is translated).
- Mobile version of a website: Mobile-optimized versions of websites are often available through distinct URLs (e.g., mobile.yoursite.com). If this is the case, a canonical tag is required to draw a line between the original and duplicate content.
- Product pages: The problem of duplicate content is especially acute for online stores where different variations of the same product may be represented by different URL paths (e.g., yoursite.com/product?size=small&color=yellow).
- Copied content: Sometimes you need more than just one website to represent your content. For example, you may have multiple company branches or syndicate content to several online resources. In these cases, it makes sense to canonicalize the original (or preferred) source of content.
- Flaws of content management systems: Using a CMS can also produce pages with duplicate content on your website. This is because some systems automatically append search parameters to your URLs, apply wrong tags, or allow access to the same page through multiple URLs.
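The URL variations listed above can be sketched in a few lines of Python. The example below collapses protocol, host, and product-parameter variants into one preferred URL and groups the duplicates under it. The host alias table and the set of "variant" query parameters are assumptions chosen for this illustration, not rules that search engines apply on your behalf.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: the bare domain is an alias of the preferred www host
HOST_ALIASES = {"yoursite.com": "www.yoursite.com"}

# Assumption: these query parameters only select a variation of the same product
VARIANT_PARAMS = {"size", "color"}


def preferred_url(url: str) -> str:
    """Map a URL variant to the single URL we would like search engines to index."""
    parts = urlsplit(url)  # expects a full URL including the scheme
    host = parts.netloc.lower()
    host = HOST_ALIASES.get(host, host)
    # Drop parameters that do not change the substance of the page
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in VARIANT_PARAMS]
    path = parts.path or "/"
    return urlunsplit(("https", host, path, urlencode(kept), ""))


variants = [
    "http://www.yoursite.com",
    "https://www.yoursite.com",
    "http://yoursite.com",
    "https://yoursite.com/product?size=small&color=yellow",
    "https://www.yoursite.com/product?size=large&color=red",
]

# Group every variant under its preferred URL
groups = {}
for variant in variants:
    groups.setdefault(preferred_url(variant), []).append(variant)

for canonical, duplicates in groups.items():
    print(canonical, "<-", duplicates)
```

Any group with more than one entry is a set of URLs that a search engine could see as separate pages with the same content.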
How to canonicalize pages
There are a few different ways to use canonical tags. Each has both advantages and disadvantages.
Using Google Search Console
This method is the easiest and relies on Google’s dedicated webmaster tool. Search Console used to offer a setting that let website owners specify the preferred domain, that is, whether the www or the non-www version of the site holds the canonical content. Note that Google retired this setting when it moved to the new Search Console, so it may no longer be available to you.
Even when it was available, this feature was useful mostly for pages with identical content and URL paths but different host names (e.g., yoursite.com/product/t-shirt and www.yoursite.com/product/t-shirt). Also, this method is relevant only for Google but not for other search engines.
Introducing a canonical tag as metadata
Applying canonical tags to specific pages is the most common, though somewhat trickier, way to canonicalize content. The tag is formatted as <link rel="canonical" href="[canonical URL]"> and is added as metadata to the page’s HTML head. Note the straight quotation marks: curly quotes pasted from a word processor are not valid attribute delimiters in HTML.
The main benefit of this method is that it enables you to canonicalize content for any number of pages. However, every tag adds a little to the size of a page, although the effect of a single <link> element on loading speed is negligible in practice.
Furthermore, it can be quite difficult to update canonical tags accurately if your pages’ URLs are changed frequently (although some CMS solutions can automatically update them for you).
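One way to keep the tags accurate when URLs change is to generate them from a single lookup table instead of editing each page by hand. The following Python sketch shows the idea. SITE_ROOT, the paths in CANONICAL_PATHS, and the function name canonical_link_tag are hypothetical; a real CMS or template engine would have its own mechanism.

```python
from html import escape

SITE_ROOT = "https://www.yoursite.com"  # assumed preferred origin

# Hypothetical lookup table: duplicate path -> canonical path
CANONICAL_PATHS = {
    "/product/t-shirt-yellow": "/product/t-shirt",
    "/product/t-shirt-small": "/product/t-shirt",
}


def canonical_link_tag(path: str) -> str:
    """Build the <link rel="canonical"> element for the page at the given path."""
    # Pages without an entry point at themselves (a self-referential tag)
    target = CANONICAL_PATHS.get(path, path)
    # quote=True also escapes double quotes, so the value is safe inside href="..."
    href = escape(SITE_ROOT + target, quote=True)
    return f'<link rel="canonical" href="{href}">'


print(canonical_link_tag("/product/t-shirt-yellow"))
print(canonical_link_tag("/about"))
```

When a URL changes, only the table needs to be updated, and every page template that calls the function picks up the new value.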
Setting up 301 redirects
If you want search engine crawlers to take only one page variant as canonical while ignoring the others, you may consider configuring a 301 (permanent) redirect. It will automatically forward search engines and visitors from the URL they requested, or the one shown in a search result, to a preferred URL.
This is an optimal solution if there is a need to show a search engine that a specific version of your page is the most important. 301 redirects are often used to prioritize a root domain over a subdomain or vice versa (e.g., www.yoursite.com vs. yoursite.com).
However, by using this method, you consciously deprecate one of the page versions and deny access to the non-canonical page for all potential visitors.
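In practice, redirects are usually configured in the web server or CMS, but the logic is simple enough to sketch in Python. The minimal WSGI application below answers every request that does not arrive on the preferred host over HTTPS with a 301 pointing at the preferred URL. PREFERRED_HOST and the function names are assumptions for this example, and the sketch ignores details such as port numbers in the Host header.

```python
from typing import Optional

PREFERRED_HOST = "www.yoursite.com"  # assumed canonical host


def redirect_target(scheme: str, host: str, path: str) -> Optional[str]:
    """Return the URL to redirect to, or None if the request is already canonical."""
    if scheme == "https" and host.lower() == PREFERRED_HOST:
        return None
    return f"https://{PREFERRED_HOST}{path}"


def app(environ, start_response):
    """Minimal WSGI app: 301 for non-canonical requests, 200 otherwise."""
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    if query:
        path = f"{path}?{query}"  # keep the query string across the redirect
    target = redirect_target(
        environ.get("wsgi.url_scheme", "http"),
        environ.get("HTTP_HOST", ""),
        path,
    )
    if target is not None:
        # 301 tells browsers and crawlers that the move is permanent
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"canonical page"]
```

To try it locally, the app can be served with the standard library, for example wsgiref.simple_server.make_server("", 8000, app).serve_forever().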
Using canonical tags wisely can save you from troubles related to duplicate content, such as lost rankings. So if you take good care of your website’s SEO, canonicalization should be taken as seriously as keyword selection and link building. Here are a few final tips to help you manage your canonical tags effectively:
- Use self-referential canonical tags: It is common practice to add a canonical tag to the very page you want to prioritize, pointing at its own URL. For instance, if the same homepage is reachable at yoursite.com, www.yoursite.com, and https://www.yoursite.com, and you choose yoursite.com as canonical, it’s fine for that page to carry a canonical link to yoursite.com as well. This method is often applied to homepages, since they are the pages people most often link back to.
- Avoid chain- or cross-canonicalization: Make sure to canonicalize only one source of original content for multiple pages. Don’t canonicalize page A > page B and then page B > page A; or page A > page B > page C. Otherwise, search engines may opt for a wrong page. The correct canonicalization scheme is: page B > page A, page C > page A, page D > page A, etc. (given that A is canonical).
- Use canonical tags sparingly: Remember that it only makes sense to canonicalize pages if they have identical or very similar counterparts. If two pages differ significantly and you canonicalize one of them to the other, the page carrying the tag is at risk of being excluded from rankings. So make sure to use canonical tags only where they are really needed.
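The chain and cross-canonicalization rule from the tips above is easy to check automatically once you have a map of which page declares which canonical. The Python sketch below is one possible check; the page names A to D follow the notation used in the tips, and the function name is made up for this example.

```python
from typing import Dict, List


def find_canonical_problems(canonical_of: Dict[str, str]) -> List[str]:
    """Report pages whose canonical target itself points somewhere else."""
    problems = []
    for page, target in canonical_of.items():
        if target == page:
            continue  # a self-referential tag is fine
        # Pages missing from the map are treated as canonical for themselves
        next_target = canonical_of.get(target, target)
        if next_target == page:
            problems.append(f"cross: {page} <-> {target}")
        elif next_target != target:
            problems.append(f"chain: {page} > {target} > {next_target}")
    return problems


# Correct scheme: every duplicate points straight at A
good = {"A": "A", "B": "A", "C": "A", "D": "A"}

# Chain: A points at B, which points at C
chained = {"A": "B", "B": "C", "C": "C"}

# Cross-canonicalization: A and B point at each other
crossed = {"A": "B", "B": "A"}

print(find_canonical_problems(good))
print(find_canonical_problems(chained))
print(find_canonical_problems(crossed))
```

An empty list means every duplicate points directly at a page that is canonical for itself, which is the scheme recommended above.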