SEO experts and digital marketers have long been contemplating whether or not there’s such a thing as a Google penalty for duplicate content. As it turns out, it’s a myth.
We know that Google doesn’t like to rank multiple pages from the same domain for the same term. So one or the other gets picked: one day Google shows one page from your domain, another day a different one, and you end up losing potential traffic.
We call it the “seesaw effect.” When different pages contain too-similar content, you confuse Googlebot as to which page from your site it should rank for key terms.
1) Internal linking
Cause: “Skewed ‘popularity’ via internal linking, which will cause a particular page to be displayed instead of another regardless of relevancy.”
To clarify this point, let’s say you have a website for selling shoes online, and it also includes a blog. You definitely want your homepage to be as popular as possible in search results pages; however, you end up earning more traffic with your blog.
When reviewing your internal linking, you may find that many internal links point to your blog, which is why it ranks higher. The more internal links you point at a page, the more important Google considers it to be.
Solution: Review your internal link structure, and place what’s most important higher within the hierarchy of your pages. A site audit tool can help here: it can automatically check your domain for duplicate links and tags, and even provide some suggestions for improvement.
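If you already have a list of internal links from a crawl, a quick count shows which pages your own site is voting for. The snippet below is only a rough sketch: the `links` data and the `inbound_link_counts` helper are made up for the shoe-shop example and do not come from any particular audit tool.

```python
from collections import Counter


def inbound_link_counts(links):
    """Count how many internal links point at each target URL.

    `links` is an iterable of (source_url, target_url) pairs.
    """
    counts = Counter()
    for source, target in links:
        if source != target:  # a page linking to itself is not a vote
            counts[target] += 1
    return counts


# Hypothetical crawl data: the blog receives more internal links than the homepage.
links = [
    ("/", "/blog/"),
    ("/shoes/", "/blog/"),
    ("/blog/post-1", "/blog/"),
    ("/blog/post-1", "/"),
    ("/blog/", "/"),
]

counts = inbound_link_counts(links)
for url, n in counts.most_common():
    print(url, n)
```

If a page you do not want to rank sits at the top of this list, that is a hint your internal linking is skewed toward it.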
2) Anchor text inconsistency
Cause: Overuse of keywords in anchors that point to the wrong page, which creates spam within your own site.
Let’s consider our shoe website example once again. Your anchors should always be concise and get right to the point. If your anchor says “buy shoes,” then after clicking “buy shoes” your visitor should be able to do so right away; do not send them elsewhere.
Don’t, for example, put a transitional page on “shoe sizes” or something else in between. Precision lends clarity, and clarity leads to higher rankings in search engines.
Solution: Make sure your keywords are natural, useful and mapped to the right page. That way, your own website sends Google the right hints about which URL belongs to which term.
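One way to check the mapping is to look for anchor texts that lead to more than one URL. This is a minimal sketch under the assumption that you can export (anchor text, target URL) pairs from a crawl; `conflicting_anchors` and the sample data are hypothetical.

```python
from collections import defaultdict


def conflicting_anchors(links):
    """Return anchor texts that point to more than one target URL."""
    targets = defaultdict(set)
    for anchor, target in links:
        # Treat "Buy shoes" and "buy shoes " as the same anchor.
        targets[anchor.strip().lower()].add(target)
    return {a: sorted(t) for a, t in targets.items() if len(t) > 1}


# Hypothetical data: "buy shoes" leads to the shop in one place
# and to a shoe-size page in another.
anchor_links = [
    ("Buy shoes", "/shop/"),
    ("buy shoes", "/shoe-sizes/"),
    ("Shoe care", "/blog/shoe-care/"),
]

conflicts = conflicting_anchors(anchor_links)
print(conflicts)
```

Any anchor that shows up in the result is sending mixed signals and is worth mapping to a single page.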
3) Canonicalization (incorrect or missing canonicalization)
A canonical tag is a link element in a page’s head that tells search engines which URL is the preferred version when several pages carry exactly duplicated or too-similar content.
Cause: When you purposefully use duplicate or too-similar content within your site, make sure you canonicalize it. However, if you canonicalize too much, it doesn’t do you any good either: Google may consider it to be messy and may ignore the page completely.
Solution: Pay attention to the way you canonicalize:
- URLs with and without a trailing slash are different, so use one form consistently in your canonicals
- Don’t forget to switch your canonicals when migrating content to HTTPS
- Self-referencing canonicalization can help if your content gets scraped
- Avoid canonicalization of URLs if the content is too different (if your content actually differs quite substantially, canonicalization will only be detrimental)
- Use a “rel=canonical” element
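Some of the points above can be checked automatically. The sketch below uses only Python’s standard `html.parser` module; the `canonical_issues` helper, its messages and the example.com URLs are assumptions for illustration. It covers just three cases: a missing or repeated canonical, a canonical left on HTTP after an HTTPS migration, and a self-referencing canonical that differs from the page URL only by a trailing slash.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonicals.append(attr["href"])


def canonical_issues(page_url, html):
    """Return a list of human-readable problems found for one page."""
    finder = CanonicalFinder()
    finder.feed(html)
    issues = []
    if not finder.canonicals:
        issues.append("missing canonical")
    elif len(finder.canonicals) > 1:
        issues.append("multiple canonicals")
    for href in finder.canonicals:
        if page_url.startswith("https://") and href.startswith("http://"):
            issues.append("canonical still points to HTTP")
        # Same URL apart from the trailing slash.
        if href != page_url and href.rstrip("/") == page_url.rstrip("/"):
            issues.append("trailing slash mismatch")
    return issues


page = "https://example.com/shoes/"
print(canonical_issues(page, '<link rel="canonical" href="https://example.com/shoes">'))
print(canonical_issues(page, '<link rel="canonical" href="http://example.com/shoes/">'))
print(canonical_issues(page, '<link rel="canonical" href="https://example.com/shoes/">'))
```

It does not judge whether two pages are similar enough to share a canonical; that decision still needs a human.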
4) Page title and other meta differentiation
Cause: Different URLs ranking for the same term in your industry’s seasonal periods, or optimizing several pages for the same terms (even unwittingly).
For instance, in our previous example, our primary term is “shoes.” Thus, we optimize our main page to appear in response to the keyword “shoes.” However, in the summer you decide to target an audience looking for “summer shoes,” so you create a separate page with that name.
Eventually, you end up with two pages that both target “shoes,” and your “summer shoes” page is killing the potential of your primary target page.
Solution: Avoid over-optimization for primary terms on non-targeted pages, and under-optimization for secondary target pages.
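A simple first check is to list every page whose title contains your primary term as a whole word. This is only a rough sketch: `pages_targeting` and the titles are invented for the example, and a real audit would also look at headings and meta descriptions.

```python
def pages_targeting(term, titles):
    """Return the URLs whose title contains `term` as a whole word."""
    term = term.lower()
    return sorted(
        url for url, title in titles.items() if term in title.lower().split()
    )


# Hypothetical page titles for the shoe shop.
titles = {
    "/": "Shoes for every occasion",
    "/summer-shoes/": "Summer shoes for hot days",
    "/blog/shoe-care/": "How to look after your footwear",
}

print(pages_targeting("shoes", titles))
```

If more than one URL comes back for your primary term, decide which page is the real target and tone the term down on the others.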
5) Merging content for concentration
Cause: Stealing power from your own target via variants, stemming, synonyms, or associated keywords.
For instance, you have four pages dedicated to “Shoe Care,” one each for winter, summer, spring and autumn. These four separate pages could actually be tied into one big category, “How to Look After Shoes.”
Solution: Focus on theme level, section and page level, avoid silos. How? Firstly, you can do a content audit, and afterwards, concentrate rather than dilute – try to make one page instead of two.
6) Intent to content mapping
Cause: Lack of logic within your site structure.
For example, you have a blog within your site. If you do not provide some logic and order to the navigation within your blog, Google will treat it as another source of cannibalization.
Thus, you end up being cannibalized by illogical blog categories (rather than five random topics, make one category and divide it into topics afterwards). Do not create competition for your primary target page (see point 5).
Solution: Surround your content with a sectional environment of contextual relevance.
7) If you can’t improve (for now), noindex (temporarily) and then improve.
Just don’t forget to index later!
If you cannot identify the problem, here are some possible symptoms:
- Different pages ranking for the same term
- Hovering on page 2
- Never quite achieving those top spots
- Different URLs ranking for the same terms during your industry’s seasonal periods
- Not getting the CTR (click-through rate) despite ranking reasonably well
- Shared impressions in Google Search Console for different pages for the same terms
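The last symptom is easy to look for once you have query-level data exported. The sketch below assumes each row is a (query, page, impressions) tuple; the row layout, the `shared_queries` helper and the 20% threshold are all assumptions for illustration, not a Search Console API.

```python
from collections import defaultdict


def shared_queries(rows, min_share=0.2):
    """Find queries where two or more pages each take a real share of impressions.

    Returns {query: {page: share_of_impressions}} for the flagged queries.
    """
    by_query = defaultdict(dict)
    for query, page, impressions in rows:
        by_query[query][page] = by_query[query].get(page, 0) + impressions

    flagged = {}
    for query, pages in by_query.items():
        total = sum(pages.values())
        if total == 0:
            continue
        # Keep only pages with at least `min_share` of the query's impressions.
        competing = {p: n / total for p, n in pages.items() if n / total >= min_share}
        if len(competing) > 1:
            flagged[query] = competing
    return flagged


# Hypothetical export for the shoe shop.
rows = [
    ("shoes", "/", 600),
    ("shoes", "/summer-shoes/", 400),
    ("shoe care", "/blog/shoe-care/", 500),
    ("shoe care", "/", 10),
]

flagged = shared_queries(rows)
print(flagged)
```

Here only “shoes” is flagged, because the homepage and the summer page split its impressions; “shoe care” has one clear winner.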
You can always rely on specially designed tools to help you with your website audit; they can point out weaknesses and help you decrease the possibility of internal cannibalization.