From classic literature to video games to television shows, the “evil twin” trope can be found in almost every genre and medium. Dr. Jekyll and Mr. Hyde, Shadow Mario, evil Spock—who could forget that goatee? The list is seemingly endless.
The world of SEO has its own version of evil twins too: duplicate content. While not exactly evil, it can cause more than its fair share of problems by diluting links’ power, lowering search rankings and creating a subpar user experience. To avoid those problems, familiarize yourself with the dos and don’ts of duplicate content.
What Exactly Is Duplicate Content?
In the publishing world, the first version of a book is called the manuscript. Every version printed after the manuscript’s approval is known as a copy because, well, it’s a copy of the original.
Duplicate content is the internet version of this concept. Any time an article is published across multiple sites or a blurb is repeated on multiple pages, that’s duplicate content in action.
Such duplication isn’t inherently bad, and can even be beneficial—who wouldn’t want to have a stand-out article published on numerous sites, serving to raise brand awareness and cement authority in one fell swoop? One Business Insider article, for instance, appears on several reputable sites:
Nevertheless, problems can arise when search engines get involved. Whereas no reader would mistake their modern copy of Hamlet for an original centuries-old manuscript, search engines often have a hard time discerning which version of a piece of content came first.
As a result, when you want to maximize search rankings and attract visitors, you need a strategy to overcome the challenges duplicate content presents.
Why Learn Duplicate Content SEO Strategies?
Matt Cutts, Google’s former head of search quality, once estimated 25 to 30 percent of all web content to be duplicative. If that’s the case, then what’s the problem?
For most internet users, there is none. Seeing the same product description on multiple websites, for example, is normal, albeit repetitive. For SEO pros, though, duplicate content can get in the way of higher rankings.
Suppose an article is published on a small blog and then syndicated to a larger publication. Since the larger publication’s site gets more traffic and engagement than the small blog, Google can view the syndicated (i.e., duplicate) version as the best one to display in search results. If you’re trying to optimize the blog for search engine rankings, this is a serious stumbling block.
Fortunately, you can dodge the downsides of duplicate content altogether with the right combination of savvy strategies.
Duplicate Content Dos
To avoid losing authority and rankings to duplicate content, put the following SEO tactics to work. Do:
Check for Duplicate Content on a Regular Basis
The first step to taking control is finding out whether and where duplicate content exists.
Copyscape offers one of the most straightforward ways to do so. Enter any page’s URL and you’ll be able to see where its content also appears across the web. A premium account is required to perform more than a couple of searches per month or view more than ten results.
For example, one Forbes article also appears on a number of other sites:
Copyscape’s sister site, Siteliner, can be used to find internal duplicate content.
Looking for a free alternative to Copyscape? Check for duplicate content the old-fashioned way with the help of a handy Google Search tip: copy a portion of a page’s content and paste it into Google Search surrounded by quotation marks. Google limits search queries to 32 words, so you only need to choose a unique sentence or two.
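If you run this check often, the quoted-search tip is easy to script. The sketch below is a minimal illustration rather than any official tool: it trims a passage to the 32-word limit mentioned above, wraps it in quotation marks and builds a Google Search URL. The function names are invented for this example.

```python
from urllib.parse import quote_plus

MAX_QUERY_WORDS = 32  # the query word limit cited in the article


def exact_match_query(passage: str, max_words: int = MAX_QUERY_WORDS) -> str:
    """Return the passage trimmed to max_words and wrapped in double quotes."""
    words = passage.split()[:max_words]
    # Drop embedded double quotes so the phrase stays one quoted unit.
    phrase = " ".join(words).replace('"', "")
    return f'"{phrase}"'


def search_url(passage: str) -> str:
    """Build a Google Search URL for the exact-match query."""
    return "https://www.google.com/search?q=" + quote_plus(exact_match_query(passage))


if __name__ == "__main__":
    print(search_url("Duplicate content is the internet version of this concept."))
```

Paste the printed URL into a browser and any page repeating that exact sentence should show up in the results.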
Or, use the free version of Screaming Frog to crawl a site for duplicate content (if you want to crawl more than 500 URLs, you’ll need to upgrade to the paid version).
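To get a feel for what an internal duplicate check involves, here is a minimal sketch. It is not how Screaming Frog or Siteliner work internally; it simply assumes you already have each page's body text (the `pages` dictionary is hypothetical crawl output) and groups URLs whose text is identical once case and whitespace are ignored.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose body text is identical after normalization."""
    groups: dict[str, list[str]] = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


if __name__ == "__main__":
    pages = {  # hypothetical crawl output: URL -> extracted body text
        "https://www.example.com/blog/post": "Duplicate content basics.",
        "https://www.example.com/archive/post": "Duplicate  content basics.\n",
        "https://www.example.com/about": "About us.",
    }
    print(find_duplicates(pages))
```

Exact hashing only catches word-for-word copies, so treat it as a first pass rather than a full audit.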
Use the Canonical Tag
Whether a piece of content is syndicated across external sites or duplicated internally, the canonical tag can directly tell search engines which page is the canonical (i.e., original) version.
This one HTML element can make a world of difference for the original page’s ranking, yet only occupies a single line of code: <link rel="canonical" href="https://www.example.com/original-page/">
Copy and paste it into the <head> section of a non-canonical page and insert the URL of the canonical page. Presto!
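When many pages need the tag, generating it can be safer than typing it by hand. The helper below is a small sketch with an invented function name and a placeholder example.com URL; it returns a `<link rel="canonical">` element and escapes characters such as `&` that would otherwise break the attribute.

```python
from html import escape


def canonical_tag(canonical_url: str) -> str:
    """Return a <link rel="canonical"> element pointing at canonical_url."""
    # escape() protects characters such as & and " inside the attribute value.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


if __name__ == "__main__":
    print(canonical_tag("https://www.example.com/original-article/"))
    # prints: <link rel="canonical" href="https://www.example.com/original-article/">
```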
The canonical tag works for external websites too. Before publishing a guest post on another site or syndicating content, ask the site’s owner to add the tag to the page’s HTML code.
Learn How Your Content Management System Works
If you’re using a content management system (CMS), it might be duplicating content right under your nose.
This can happen for a number of reasons—perhaps the CMS is automatically copying each article to an archive, or maybe it’s displaying large portions of a post’s content on the blog’s home page.
Whatever the case, it’s worth verifying that proper canonicalization is taking place. Many CMSs include a setting that automatically adds a canonical tag to all duplicate pages; HubSpot’s canonical settings are one example.
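One way to verify what your CMS is actually outputting is to read the canonical tag back out of a page's HTML. The following is a rough sketch using only Python's standard library; the class and function names are invented for illustration, and fetching the HTML is left to you.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html_text: str) -> Optional[str]:
    """Return the canonical URL declared in html_text, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


def points_to(html_text: str, expected_url: str) -> bool:
    """True when the page declares expected_url as its canonical version."""
    return find_canonical(html_text) == expected_url
```

Run `points_to()` against an archive or listing page and the URL of the original post; a `False` result suggests the CMS setting needs another look.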
Once you’re familiar with the duplicate content SEO tools in your CMS, you can adjust their settings accordingly and check that item off your to-do list for good.
Use 301 Redirects like an Expert
The canonical tag isn’t a duplicate content panacea. You can also use 301 redirects to point search engines to the original version of a page. While not applicable to every situation, 301 redirects are ideal when you’re:
- transferring a site from one domain to another;
- modifying a site’s structure;
- deprecating a page; or
- consolidating or remaking content.
301 redirects are also useful for dealing with inconsistent URLs. For example, maybe your site can be accessed from both https://www.example.com and example.com. Redirecting one version to the other leaves search engines with a single URL to index.
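The logic behind that kind of cleanup can be sketched in a few lines. This example assumes every URL passed in belongs to one site and that https://www.example.com is the preferred version; both the constants and the function names are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"  # assumed preferred hostname


def preferred_url(url: str) -> str:
    """Map any scheme/host variant of the site to its single preferred form."""
    if "://" not in url:
        url = "http://" + url  # a bare "example.com" has no scheme to parse
    parts = urlsplit(url)
    path = parts.path or "/"
    # Keep the path and query string; swap in the preferred scheme and host.
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, ""))


def needs_redirect(url: str) -> bool:
    """True when the requested URL differs from its preferred form."""
    return url != preferred_url(url)


if __name__ == "__main__":
    for candidate in ("example.com", "http://www.example.com/blog?page=2"):
        print(candidate, "->", preferred_url(candidate))
```

Any URL for which `needs_redirect()` returns `True` is a candidate for a 301 to its preferred form.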
You can create a 301 redirect in one of several ways. If a site runs on Apache, you’ll need to edit the .htaccess file. If you’re not keen to dig into the nuts and bolts of a site’s server (we don’t blame you), you may want to try a more straightforward method. Simple options include:
- WordPress plugins like SEOPress and Redirection; and
- built-in redirection settings on platforms such as HubSpot, Squarespace, Shopify and Wix.
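If you do take the Apache route mentioned above, the redirects themselves are usually one line each. The sketch below assumes Apache's mod_alias `Redirect` directive is available on your server and that your old paths contain no spaces; the function name and sample paths are hypothetical. It turns a mapping of old paths to new URLs into lines you could paste into an .htaccess file.

```python
def htaccess_redirects(redirects: dict[str, str]) -> str:
    """Render one mod_alias "Redirect 301" line per old-path/new-URL pair."""
    lines = []
    for old_path, new_url in redirects.items():
        if not old_path.startswith("/"):
            raise ValueError(f"old path must start with '/': {old_path!r}")
        lines.append(f"Redirect 301 {old_path} {new_url}")
    return "\n".join(lines)


if __name__ == "__main__":
    moves = {  # hypothetical old path -> new URL pairs
        "/old-page": "https://www.example.com/new-page",
        "/2019-guide": "https://www.example.com/guide",
    }
    print(htaccess_redirects(moves))
```

Back up the existing .htaccess file and test on a staging site first, since a malformed line can take a site offline.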
Duplicate Content Don’ts
To prevent those dastardly doubles from hindering your SEO efforts, don’t:
Create Redirect Chains
Redirects can help eliminate duplicate content, but if used to excess they can create a redirect chain, in which page A doesn’t simply redirect to page B. Instead, page A redirects to page B, page B redirects to page C and so on.
If a redirect chain is long enough, search engine crawlers may simply give up on finding the final page in the chain.
Avoiding this issue is simple enough—just take care not to create unnecessary redirects or add onto existing ones. But how do you find and manage existing redirect chains?
Tools like Screaming Frog and SEMrush’s Site Audit tool make identifying redirect chains a piece of cake.
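To make the idea concrete, chain detection can be sketched against a simple mapping of source to destination. This is an illustration, not how either tool works internally: the `redirects` dictionary is hypothetical, and the 10-hop cutoff is an arbitrary safety limit chosen for the example, not a documented crawler threshold.

```python
def resolve(url: str, redirects: dict[str, str], max_hops: int = 10) -> list[str]:
    """Follow redirects from url and return every URL visited, in order.

    Raises ValueError on a loop or when the chain exceeds max_hops.
    """
    path = [url]
    while path[-1] in redirects:
        nxt = redirects[path[-1]]
        if nxt in path:
            raise ValueError(f"redirect loop: {' -> '.join(path + [nxt])}")
        path.append(nxt)
        if len(path) - 1 > max_hops:
            raise ValueError(f"more than {max_hops} hops from {url}")
    return path


def find_chains(redirects: dict[str, str]) -> dict[str, list[str]]:
    """Return sources whose redirect takes more than one hop to finish."""
    chains = {}
    for source in redirects:
        path = resolve(source, redirects)
        if len(path) > 2:  # source -> middle -> ... -> final
            chains[source] = path
    return chains


def flatten(redirects: dict[str, str]) -> dict[str, str]:
    """Point every source straight at its final destination."""
    return {source: resolve(source, redirects)[-1] for source in redirects}


if __name__ == "__main__":
    redirects = {"/a": "/b", "/b": "/c", "/x": "/y"}  # hypothetical rules
    print(find_chains(redirects))
    print(flatten(redirects))
```

Flattening is the fix in miniature: once page A points directly at page C, the hop through page B disappears.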
In SEMrush, first click on the “issues” tab, open the “select an issue” dropdown menu and click “redirect chains and loops.”
Screaming Frog offers a similar solution. Start by opening the SEO spider and crawling a URL. Then hover over the “reports” tab, hover over “redirects” in the dropdown menu and click “redirect chains.”
Overuse the Noindex Tag
Google supports a number of special meta tags, including those for preventing automatic page translation, indicating mobile-friendliness and verifying ownership.
One of the most common is the noindex tag, which prevents Googlebot from indexing a given page. By adding the noindex tag to a non-canonical page, you’ll ensure Google only indexes the canonical version.
Sounds like a perfect solution, right? Not exactly. Even pages with duplicated content can have link equity, and it’ll swirl down the virtual drain with the addition of the noindex tag.
Save the noindex tag for things like thank-you pages, admin pages and internal search results, and use the canonical tag or a 301 redirect to deal with duplicate content.
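To see where noindex is already in use, you can scan page HTML for the robots meta tag. The sketch below uses only Python's standard library and checks `<meta name="robots">` and `<meta name="googlebot">` elements; the class and function names are invented, the `pages` mapping is assumed input, and the sketch ignores noindex sent through HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from robots/googlebot meta tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if tag == "meta" and name in {"robots", "googlebot"}:
            content = attr.get("content") or ""
            self.directives.update(d.strip().lower() for d in content.split(","))


def has_noindex(html_text: str) -> bool:
    """True when the page carries a noindex directive in a meta tag."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    return "noindex" in finder.directives


def noindexed_pages(pages: dict[str, str]) -> list[str]:
    """Return the URLs (from a URL -> HTML mapping) that use noindex."""
    return sorted(url for url, html_text in pages.items() if has_noindex(html_text))
```

If a duplicate page with inbound links turns up in that list, it is a good candidate for swapping noindex for a canonical tag or a 301 redirect.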
Publish Placeholder Pages
Placeholder pages, or stubs, are used to lay out and test a site’s structure before those pages’ contents are added.
While they can be useful for website administrators looking to work out navigational issues, placeholder pages are nothing but annoying for site visitors. If you clicked on an interesting link only to be presented with a “coming soon” message, you might be irritated too.
Multiple placeholder pages can also negatively affect a site’s rankings. Google views pages with no purpose as low quality, and the same goes for pages containing little to no original or useful content.
To sidestep these problems, follow Google’s recommendation and avoid publishing placeholder pages altogether. If you do choose to use them, add the noindex tag so they don’t affect rankings.
Copy and Paste Lengthy Boilerplate Content
On some websites, contact information, privacy policies and similar standardized pieces of text are often added to each page’s footer. This is called boilerplate content, named after the stamped steel plates once used by newspapers to print repeatable or syndicated copy (typing up a blog post sure seems easy by comparison!).
Although a site’s boilerplate content typically contains important and necessary information, search engines may see it as duplicate content if it’s too lengthy. Keep boilerplate text to a minimum and simply link to any pages containing privacy policies, contact information, etc.
By keeping a site’s boilerplate content short and sweet, you can also maintain a clean, aesthetically pleasing design and avoid showing users a block of legal text on every page.
Overlook Product Information
Sites selling products, whether B2B or B2C, usually include a great number of product descriptions and specifications. This isn’t a problem in and of itself, but it can create performance and ranking issues if the same product information is copied across multiple platforms.
For example, a website might try to leverage Pinterest SEO by posting its products in the form of pins. West Elm’s Pinterest page contains a slew of shoppable product listings, including one for their Remi ottoman:
Hop over to that same ottoman’s product page on West Elm’s own site, though, and you’ll see its product description comprises nearly identical text:
While having the same product descriptions posted on one or two other platforms likely won’t create any issues, doing so too often may result in those descriptions being viewed as duplicate content. Plus, if the third-party page (in this case, the Remi ottoman’s pin) gets more visitors and engagement than the first-party product page, it may end up ranking higher in search results.
To avoid duplicating content and give customers a variety of unique copy, switch up product descriptions across platforms.
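To check whether two descriptions are different enough, a rough similarity score can help. The sketch below uses Jaccard similarity over three-word shingles, one common near-duplicate measure; nothing here reflects how Google actually scores duplication, and the sample descriptions are invented rather than taken from any retailer.

```python
import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    site_copy = "A plush round ottoman with a solid wood base."  # invented
    pin_copy = "A plush round ottoman with a sturdy metal base."  # invented
    print(round(similarity(site_copy, pin_copy), 2))
```

A score near 1.0 means the copy is effectively the same; how low to aim is a judgment call, since no official threshold exists.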
Does a Duplicate Content Penalty Exist?
SEO practitioners have long debated whether Google penalizes duplicate content. As it turns out, Google has been trying to dispel the myth of a duplicate content penalty for years. As Susan Moskwa clarified on the Webmaster Central Blog, “There’s no such thing as a ‘duplicate content penalty.’”
That’s because duplicate content is rarely malicious or specifically designed to manipulate rankings. In the rare instances it is, Google may issue a penalty. But it will not penalize a site for simply duplicating content.
It was, as you’ll recall, Google’s own Matt Cutts who said 25 to 30 percent of all web content is duplicative. If duplicate content were penalized, trillions of pages would plummet on the SERPs and Google’s user experience would suffer as a result.
So forget trying to find some secret decoder ring of duplicate content SEO. Instead of a nonexistent penalty, focus on consolidation to achieve higher rankings.
From Double Trouble to Dream Team
When duplicate content runs amok, it can certainly seem as troublesome as an evil twin. From less-than-ideal rankings to a substandard user experience, too much of a good thing can lead to serious SEO headaches.
Master the core dos and don’ts of duplicate content, and transform identical pieces of content from Jekyll and Hyde to The Parent Trap. Unless you’re Meredith Blake, there’s nothing evil about that.
Screenshots by author / July 2020
HubSpot / July 2020
SEMrush / June 2017