Canonical tags: The ultimate guide

Canonical tags are crucial for SEO, allowing website owners to control duplicate content issues and specify the preferred version of a webpage. However, implementing canonical tags effectively can be a daunting task, especially for those new to SEO.

In this guide, we will explore the intricacies of canonical tags, covering their definition, usage, and best practices. Whether you’re a seasoned SEO expert or just starting out, this guide will provide you with the knowledge and guidance necessary to master canonical tags.

What are canonical tags?

Canonical tags do one very important job on your site. They show search engines which content is ‘original’ and which is a duplicate. Not using canonical tags to highlight duplicate content can adversely impact your rankings in the search engines.

Unless you copy and paste your content, you may not think your website hosts any duplicate content. Yet many pages are effectively duplicated because they are variations of the same page. Canonical tags are needed even when a page is only mostly identical to another, not an exact copy.

This is especially common on eCommerce websites.

Here is an example to illustrate this point:

Example Webpage 1

Webpage URL: exampleclothingstore.com/product/jumper

Webpage Description: This webpage is the generic page for a specific product on the example website (a jumper). It holds key information such as the materials used and a description of how the jumper looks.

Example Webpage 2

Webpage URL: exampleclothingstore.com/product/jumper?color=navy

Webpage Description: This webpage is identical in almost every way, except that the visitor has chosen to view the product in a specific color.

The images and some of the descriptions may change due to color selection.

Each page has a different URL but near-identical content, so search engines will see these pages as duplicates. Canonical tags are needed in this case.
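The relationship between the two example URLs can be pictured with a small Python sketch that strips the variant parameter to recover the canonical URL. The function name `canonical_url_for` and the set `VARIANT_PARAMS` are hypothetical, the `https://` prefix is added for illustration, and the sketch assumes that `color` only changes how the product is shown; a real store would need its own list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: query parameters that only change how the same product is shown.
VARIANT_PARAMS = {"color"}

def canonical_url_for(url: str) -> str:
    """Return the URL with presentation-only query parameters removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in VARIANT_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(canonical_url_for("https://exampleclothingstore.com/product/jumper?color=navy"))
# prints https://exampleclothingstore.com/product/jumper
```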

 Pro Tip

Canonical tags tell search engines two things:

  1. That the page that the tag is used on is a duplicate
  2. Where the search engine can find the original content

The URL for the ‘original’ content is known as the canonical URL. Google defines this as:

 Definition

“A canonical URL is the URL of the page that Google thinks is most representative from a set of duplicate pages on your site.”

Why are canonical tags important?

Duplicate content is bad for your website’s SEO. Why? Because when there is duplicate content online it can be difficult for search engines to decipher which one is the original.

This makes it difficult for them to know which version of the content they should rank higher. In cases where canonical tags are not used, Google will make the best guess as to which version of the content is canonical.

This can lead to the less valuable pages on your website receiving the most benefit.

 Pro Tip

Google states:
“The canonical URL can be in a different domain than a duplicate URL.”

This means that canonical tags can (and should) also be used where your content is duplicated or nearly duplicated on other websites.

How to implement canonical tags?

Now it is time to discuss the technicalities of actually using canonical tags on your site.

There are five options to choose from to achieve the same goal. Each one differs in its suitability based on the scenario.

These options are the HTML tag, the HTTP header, the sitemap, 301 redirects, and internal links.

1. HTML Tag (rel=canonical)

The rel=canonical tag is by far the most widely accepted method for highlighting your canonical URL on your website. Using the HTML tag is quick to apply and is easy for search engines to read.

To use this tag, you first need to decide which page is canonical (most important).

For example, you may have three URLs that all lead to very similar pages. So, you need to pick which one of the three should be considered the best and/or original.

 Pro Tip
Once you have decided, apply the following to the <head> section of the web pages that contain the duplicate content:

<link rel="canonical" href="[insert canonical URL here]" />

Here is what this code is telling the search engine:

link rel="canonical" = this page is duplicate content, the following link is the canonical URL

href="[insert canonical URL here]" = the URL inside the speech marks is the canonical URL

NOTE: Some website platforms may not allow you to edit the HTML code of your site directly. In most of these cases, they will allow you to add canonical tags in the options settings for each page. 
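If your pages are generated from templates, the tag can be produced in code rather than typed by hand. Here is a minimal Python sketch; the helper name `canonical_link_tag` is our own, not part of any platform.

```python
from html import escape

def canonical_link_tag(canonical_url: str) -> str:
    """Build a <link rel="canonical"> element for the page's <head>."""
    # escape() protects characters such as & and " inside the attribute value.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'

print(canonical_link_tag("https://exampleclothingstore.com/product/jumper"))
# prints <link rel="canonical" href="https://exampleclothingstore.com/product/jumper" />
```

Note that the attribute values use plain straight quotes. Curly quotes pasted from a word processor will not be read as valid HTML.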

2. HTTP Header

An HTTP header is a field of an HTTP request or response that passes additional context and metadata about the request or response.

So, you can also use an HTTP header to pass on information on duplicate content.

This may be necessary when there is no opportunity to include a canonical tag in the <head> section of a web page.

One common example of this is when a PDF is hosted online.

Here is what an HTTP header may look like in this instance:


HTTP/1.1 200 OK
Content-Type: application/pdf
Link: <[insert canonical URL]>; rel="canonical"
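How you attach this header depends on your server, but the value itself is easy to build. The Python sketch below only formats the header; the function name `canonical_link_header` and the PDF URL are hypothetical.

```python
def canonical_link_header(canonical_url: str) -> tuple[str, str]:
    """Return the (name, value) pair for a canonical Link response header."""
    # The URL goes inside angle brackets, followed by the rel parameter.
    return ("Link", f'<{canonical_url}>; rel="canonical"')

name, value = canonical_link_header("https://example.com/downloads/white-paper.pdf")
print(f"{name}: {value}")
# prints Link: <https://example.com/downloads/white-paper.pdf>; rel="canonical"
```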

3. Sitemap

When submitting sitemaps to search engines like Google you can specify your canonical pages. This is a quick and easy method for highlighting duplicate content for larger sites. Yet it is not considered the best method.

 Pro Tip

In guidance published by Google, they highlight that when you specify canonical web pages via sitemaps:

  1. Google must still determine the associated duplicate for any canonicals that you declare in the sitemap.
  2. A sitemap is a less powerful signal to Google than the rel=canonical mapping technique.

Because a sitemap alone does not make it obvious to search engines which URL is canonical, use another method where possible. We recommend the page-by-page declaration using the rel=canonical HTML tag.
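If you do use a sitemap, the idea is to list only your canonical URLs in it. The Python sketch below builds a minimal sitemap with the standard library; `build_sitemap` is a hypothetical helper and the URLs are the example store pages from earlier in this guide, with an assumed `https://` prefix.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(canonical_urls: list[str]) -> str:
    """Return sitemap XML that lists only the given canonical URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in canonical_urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# The ?color=navy variant is deliberately left out of the list.
sitemap_xml = build_sitemap([
    "https://exampleclothingstore.com/",
    "https://exampleclothingstore.com/product/jumper",
])
print(sitemap_xml)
```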

4. 301 Redirect

Using a 301 redirect takes a slightly different approach to dealing with canonical pages. The other methods in this section tell search engines that the page they are viewing isn’t the canonical URL. A 301 is a permanent redirect that sends them to the canonical page itself.

Let’s say you have a website with multiple URL variations that each serve a version of the homepage, such as:


www.example.com
example.com
Example.com
example.com/home

You may choose to use a 301 redirect on all the URLs except the main one.

A 301 redirect takes the visitor (and the search engine) directly to the main URL. This may not always fit your UX/UI requirements, and you may wish to keep all the variations of the page alive.
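In practice, redirects are usually configured on the web server or in your platform settings, but the decision logic can be sketched in Python. Everything named here is an assumption for illustration: we treat `https://www.example.com/` as the main URL, and `HOME_HOSTS`, `HOME_PATHS`, and `redirect_for` are hypothetical.

```python
from urllib.parse import urlsplit

CANONICAL_HOME = "https://www.example.com/"  # assumed main URL

# Hypothetical hosts and paths that all serve a version of the homepage.
HOME_HOSTS = {"www.example.com", "example.com"}
HOME_PATHS = {"", "/", "/home"}

def redirect_for(url: str):
    """Return (301, target) if url is a homepage variant, otherwise None."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # urlsplit exposes the hostname in lowercase
    is_variant = host in HOME_HOSTS and parts.path in HOME_PATHS
    if is_variant and url != CANONICAL_HOME:
        return (301, CANONICAL_HOME)
    return None

print(redirect_for("https://Example.com"))       # (301, 'https://www.example.com/')
print(redirect_for("https://www.example.com/"))  # None
```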

Canonicalization Best Practices

Now we dive a little deeper into canonicalization and provide some insight into best practices.

1. Self-Referential Canonical Tags

Canonical tags should be added to pages that contain duplicated or near-duplicated content.

What do we mean by this?

 Pro Tip

As we explored earlier in the article, a canonical tag typically tells a search engine two things:

  1. That the content on the page is a duplication or near duplication
  2. Where to find the canonical content

A self-referential canonical tag uses the same code, but this time it points search engines toward the page itself, telling them that the page being indexed is the canonical URL.

Here is an example. Let’s say https://mojodojo.io/blog/ is our canonical URL. Other URLs which lead to similar pages would use a canonical tag directing search engines to this URL.

But you can also include a canonical tag on the https://mojodojo.io/blog/ page to point search engines towards itself. This would look like this:

<link rel="canonical" href="https://mojodojo.io/blog/" />

2. Canonicalize Your Home Page

It is best practice to canonicalize your home page, as it is likely to be available via a number of URLs.

This can include using self-referential canonical tags. You will also need to use canonical tags on any version of your homepage that uses a different URL.

This could include www. and non-www. versions as well as HTTP and HTTPS versions.

3. Use Absolute URLs

Both absolute and relative URLs are acceptable when creating canonical tags.

But it is widely accepted as best practice to use absolute URLs.

 Pro Tip

An absolute URL is a URL that contains all information. For example, an absolute URL would include:

  • The protocol (http or https)
  • The (optional) subdomain (www.)
  • The domain (example.com)
  • The path (such as /blog or /products/ebooks)

A canonical tag using an absolute URL would look like this:

<link rel="canonical" href="https://mojodojo.io/blog/" />

A relative URL only includes the path. A canonical tag using a relative URL would look like this:

<link rel="canonical" href="/blog/" />

Using absolute URLs in your canonical tags helps ensure that search engines interpret them correctly.
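To see why relative URLs carry some risk, it helps to look at how they are resolved: a relative href is combined with the address of the page it appears on. Here is a short Python sketch using the standard `urljoin` function; the helper name `absolute_canonical`, the post path, and the staging hostname are hypothetical.

```python
from urllib.parse import urljoin

def absolute_canonical(page_url: str, href: str) -> str:
    """Resolve a canonical href against the URL of the page it appears on."""
    return urljoin(page_url, href)

# On the live site the relative href resolves as intended.
print(absolute_canonical("https://mojodojo.io/blog/some-post/", "/blog/"))
# prints https://mojodojo.io/blog/

# The same tag served from a hypothetical staging host points somewhere else.
print(absolute_canonical("https://staging.mojodojo.io/blog/some-post/", "/blog/"))
# prints https://staging.mojodojo.io/blog/
```

An absolute href gives the same result no matter which host serves the page.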

4. Use Lowercase URLs

You may be forgiven for thinking that search engines treat lowercase and uppercase letters in URLs the same. But Google has clarified that URLs may be treated as different when they differ only in letter case. This means two otherwise identical URLs could be treated as separate pages if they use a different combination of uppercase and lowercase letters.

To save confusion, use lowercase URLs across the board. This is especially important for external links and canonical tags, as most webmasters tend to use lowercase URLs when linking to other sites.
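If you generate URLs in code, a small normalization step keeps the casing consistent. This Python sketch lowercases the host and the path only; the function name `lowercase_url` is hypothetical, and the query string is left untouched on the assumption that parameter values may be case-sensitive.

```python
from urllib.parse import urlsplit, urlunsplit

def lowercase_url(url: str) -> str:
    """Lowercase the host and path of a URL, leaving the query string as-is."""
    parts = urlsplit(url)
    return urlunsplit(
        parts._replace(netloc=parts.netloc.lower(), path=parts.path.lower())
    )

print(lowercase_url("https://MojoDojo.io/Blog/"))
# prints https://mojodojo.io/blog/
```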

5. Use One Canonical Tag Per Page

The canonical URL is treated as the one true source of information. So, it is clear that when multiple canonical tags are used on one page this could confuse search engines.

For example, let’s say there were two canonical tags used on a page <head> section:


<link rel="canonical" href="https://mojodojo.io/blog/" />

<link rel="canonical" href="https://mojodojo.io/news/" />

In this case, Google wouldn’t know whether the canonical URL was https://mojodojo.io/blog/ or https://mojodojo.io/news/.

This means they would treat neither as correct.

So, using many canonical tags is as bad as using none at all.
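Duplicate tags often creep in when a theme and a plugin both add one, so it is worth checking for them. The Python sketch below collects every canonical tag found in a page's HTML using the standard library parser; the class and function names are our own.

```python
from html.parser import HTMLParser

class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels:
            self.hrefs.append(attr.get("href"))

def canonical_hrefs(html: str) -> list:
    collector = CanonicalCollector()
    collector.feed(html)
    return collector.hrefs

page = (
    '<head><link rel="canonical" href="https://mojodojo.io/blog/" />'
    '<link rel="canonical" href="https://mojodojo.io/news/" /></head>'
)
found = canonical_hrefs(page)
print(len(found) > 1)  # True means the page sends conflicting signals
```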

6. Prefer HTTPS Over HTTP For Canonical URLs

HTTPS is the secure version of HTTP. It uses the SSL/TLS protocol for encryption and authentication.

Safe sites provide a better experience for website visitors. As search engines are trying to serve searchers with the best results, it is clear why they prefer HTTPS over their HTTP equivalent.

Google states:

“Google prefers HTTPS pages over equivalent HTTP pages as canonical, except when there are issues or conflicting signals”

Conflicting signals can occur for several reasons, such as when the SSL certificate for the HTTPS page has expired, or when the HTTPS page redirects users to an HTTP version of the page.

 Pro Tip

Google suggests two ways that you can ensure that HTTPS web pages are canonical:

  1. Use a redirect that takes readers from the HTTP version of the page to the HTTPS version
  2. Apply a rel="canonical" tag which links from the HTTP page to the HTTPS page
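The first option can be sketched in Python as a simple rule: any request for an HTTP URL gets a permanent redirect to the same address on HTTPS. The function name `https_redirect` is hypothetical, and the sketch assumes the URL does not carry an explicit port such as `:80`.

```python
from urllib.parse import urlsplit, urlunsplit

def https_redirect(url: str):
    """Return (301, https_url) for an HTTP URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None
    return (301, urlunsplit(parts._replace(scheme="https")))

print(https_redirect("http://mojodojo.io/blog/"))
# prints (301, 'https://mojodojo.io/blog/')
```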

Common Mistakes to Avoid in Canonicalization

As canonicalization is all about sending the right signals to search engines, it is important to avoid mistakes that can undo all your hard work.

1. Incorrect URL Formatting

Earlier in this guide, we discussed how absolute URLs are the best for creating a canonical tag. But you can use both relative and absolute URLs. Whichever you use, you must use the correct format to ensure it is read correctly.

For example, when using an absolute URL, you must include the HTTP or HTTPS prefix.
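A quick format check can catch the most common slip, which is leaving off the prefix. This Python sketch only tests that a URL has an http or https scheme and a host; the function name `is_absolute_http_url` is our own, and passing the check does not prove that the page exists.

```python
from urllib.parse import urlsplit

def is_absolute_http_url(url: str) -> bool:
    """True if the URL has an http/https scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)

print(is_absolute_http_url("https://mojodojo.io/blog/"))  # True
print(is_absolute_http_url("mojodojo.io/blog/"))          # False, prefix missing
```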

2. Blocking the Canonicalized URL via robots.txt

Robots.txt tells a search engine where it should (or should not) go on your website.

So, if you block your canonicalized URL via robots.txt, search engines are prevented from accessing the page. While robots.txt is advisory, it is still very much the first file Google and other search engines crawl when they visit your site.

The same rule goes for any pages on which you have used your canonical tag.

If you decide to block any of these pages via robots.txt, search engines cannot read the canonical tag on them, and the ‘link juice’ they would pass to your canonical URL will be lost.
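You can test your rules against your canonical URLs before publishing them. The Python sketch below uses the standard library's `urllib.robotparser`; the helper `blocked_urls` and the sample rules are hypothetical, and the standard library parser may not match every detail of how each search engine interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

def blocked_urls(robots_txt: str, urls: list[str], user_agent: str = "*") -> list[str]:
    """Return the URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]

# Hypothetical rules that accidentally block every product page.
robots_txt = """User-agent: *
Disallow: /product/
"""

blocked = blocked_urls(robots_txt, [
    "https://exampleclothingstore.com/product/jumper",
    "https://exampleclothingstore.com/blog/",
])
print(blocked)
# prints ['https://exampleclothingstore.com/product/jumper']
```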

3. Using ‘noindex’ on Canonical Pages

The whole point of creating a canonical page is to show search engines which page they should give preference to when indexing your site.

So, telling a search engine that your canonical page is a ‘noindex’ page directly contradicts the canonical tags on other pages that point to it.

Never use ‘noindex’ on a canonical URL.

4. Not Including Canonical Tags in <head>

For your canonical tags to be read by search engines, they need to be included in the <head> section of your code.

If you include your tag in another part of your code, such as the <body> section, it will fail to be read and understood by search engines.

Include the canonical tag early in the <head> section to ensure it is properly read.
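Placement can be checked automatically too. This Python sketch walks through a page's HTML and reports any canonical tag that sits outside the <head> section; the class and function names are hypothetical, and it assumes the page has explicit <head> tags.

```python
from html.parser import HTMLParser

class CanonicalPlacement(HTMLParser):
    """Record canonical tags that appear outside the <head> section."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.outside_head = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and not self.in_head:
            attr = dict(attrs)
            if "canonical" in (attr.get("rel") or "").lower().split():
                self.outside_head.append(attr.get("href"))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False

def misplaced_canonicals(html: str) -> list:
    checker = CanonicalPlacement()
    checker.feed(html)
    return checker.outside_head

bad_page = (
    "<html><head><title>Blog</title></head>"
    '<body><link rel="canonical" href="https://mojodojo.io/blog/" /></body></html>'
)
print(misplaced_canonicals(bad_page))
# prints ['https://mojodojo.io/blog/']
```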

How to Audit and Fix Canonicalization Issues

The correct deployment of canonicalization can have a big, positive impact on your search engine rankings.

However, poor canonicalization can have the opposite effect. So, when you use canonicalization it is important to consistently audit your site. This will help you spot issues and remedy them in a timely manner.

You can audit your site manually. This involves investigating the code for each page on your site separately. But as you can imagine, this can take considerable time. This is especially true for sites with a large number of pages.

You can also spot common errors in Google Search Console.

 Pro Tip

An alternative is to use a tool such as Screaming Frog. These tools can crawl your site and offer you easy-to-understand insight into canonicalization.

For example, they may offer data such as:

  • Whether a page has a canonical URL set
  • Whether the page’s canonical tag is self-referencing
  • Whether the page’s canonical tag is referring to another page
  • Whether there is no canonical tag on the page
  • Whether there are multiple canonical tags on one page

This type of automated audit tool can help you identify potential errors and fix them quickly.
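To show the kind of checks listed above, here is a minimal Python sketch that sorts pages into those categories. It is not how Screaming Frog or any other tool works internally; `classify_canonical` is a hypothetical function, and the crawl results are made-up sample data standing in for the canonical hrefs a crawler would collect from each page.

```python
def classify_canonical(page_url: str, canonical_hrefs: list[str]) -> str:
    """Label a page by the canonical tags found on it."""
    if not canonical_hrefs:
        return "missing"
    if len(canonical_hrefs) > 1:
        return "multiple"
    if canonical_hrefs[0] == page_url:
        return "self-referencing"
    return "points elsewhere"

# Hypothetical crawl results: page URL -> canonical hrefs found on that page.
crawl = {
    "https://mojodojo.io/blog/": ["https://mojodojo.io/blog/"],
    "https://mojodojo.io/blog/?page=2": ["https://mojodojo.io/blog/"],
    "https://mojodojo.io/about/": [],
}

report = {url: classify_canonical(url, hrefs) for url, hrefs in crawl.items()}
for url, status in report.items():
    print(f"{status:18} {url}")
```

Pages labelled "missing" or "multiple" are the ones to fix first.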

Set a reminder to audit your canonicalization. How often you do this will depend on how many pages you are adding to your site. For smaller sites, once a month should be enough.

Canonical Tags: Summary

It is clear that using canonicalization will pay dividends, especially for those looking to improve their search engine rankings. Make it clear to search engines which URLs are canonical. Using canonical tags lets you highlight duplicate content, which makes it easy for search engines to decide which pages should rank on SERPs.