For the last *#$%!ng time, there is no 'duplicate content' penalty
The search engines won't ban your site if you post the same text more than once. But that doesn't mean it's the right thing to do.
By Donald Ritchie
Of all the myths and misunderstandings surrounding search engine optimization (SEO), the one that seems to cause the most alarm and confusion is the so-called "duplicate content" penalty. Webmasters and SEO practitioners often get into a state of panic if they notice the slightest hint of duplication in their sites. At any moment, they fear, Google will banish them from its index for good.
Let's clear this up straight away: There is no such thing as a "duplicate content" penalty. Google will not penalise you - in the sense of banning you from its search results - if the same text happens to appear in more than one place. That holds whether the duplication occurs within your own site or across multiple domains.
Prove it for yourself
You can prove this for yourself. Just pick a distinctive phrase from a well-known song lyric, poem, or some similar text. Punch it into the search engine surrounded by double quotes. Then check the results. I just did that for the line "Quoth the raven, nevermore". Google happily showed me 14 separate sites containing the full text of Edgar Allan Poe's most famous poem - and that was just in the first two result pages. Clearly, no duplicate content penalty at work there.
But the fact that the search engines won't ban your site for identical content doesn't mean that you can ignore the issue. As SEO consultant Jill Whalen points out, duplicate content "generally goes hand in hand with other SEO problems … Post-Panda/Penguin, dupe content on websites can often have major repercussions".
What Google does
In general, when Google finds several pages containing very similar text, it will give the highest ranking to the one that it decides most closely meets the searcher's needs. That decision will be based on geo-targeting, link quality, and all the other "signals" that Google employs. Other pages containing the same text will usually still appear, but lower down the results.
This is as it should be. As Google explains in a blog post: "Our users typically want to see a diverse cross-section of unique content when they do searches. They're understandably annoyed when they see substantially the same content within a set of search results."
The ranking effect is especially significant where the pages in question attract high-quality links. If those links are split between many similar pages, the value of each link will be diluted. By contrast, if the content were consolidated into a single, unique page, then that page would enjoy the benefit of all the incoming links, and would almost certainly rank higher than it would otherwise.
To get a better understanding of how to deal with this issue, it helps to look at the different reasons that dupe content might occur:
- E-commerce sites. These sometimes contain multiple URLs pointing to the same product, typically from different categories or departments. For example, a laptop computer might appear under both office equipment and consumer electronics. While this might be convenient for the customer, it is less useful from an SEO point of view. A good way to deal with it is the rel="canonical" link tag, placed in the head of each duplicate page, which essentially points the search engine to the "preferred" version. Importantly, it causes any link popularity that would otherwise be divided between the various pages to be focused on the canonical page.
- Printer-friendly pages. These are very useful for your visitors, and it's right that they should be easy to find from the main pages within the site. But they should not appear in their own right within the search results. It's a good idea to block such pages from the search engines, either by means of robots.txt or the noindex meta tag.
- Boilerplate text. Some sites contain lengthy chunks of repetitive text that appear on every page. I'm thinking of things like copyright notices or contact details. Google might sometimes regard these pages as partly duplicated, and demote them accordingly. It would be better for each page to show just a summary of the relevant information, with a link to a separate page that contains the full details.
- Translations. Don't worry about this one. Google has stated quite clearly that it does not regard the same content written in different languages as duplicate content.
- Syndication. If you allow other sites to re-publish your articles, you risk seeing the dilution of link popularity that I mentioned earlier. You should mitigate that by ensuring the re-published version contains a link to whichever of your own pages you ultimately want to promote. For example, if the article describes the benefits of one of your products, it would make good sense for it to link to that product's page on your own e-commerce site.
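If you want to check that the canonical, noindex, and robots.txt fixes described above are actually in place on your pages, a short script can help. What follows is a minimal sketch in Python, not a definitive tool. It uses only the standard library's html.parser and urllib.robotparser modules. The shop.example URLs, the /print/ path, the sample pages, and the names HeadTagAudit, audit, and blocked_by_robots are all invented for illustration.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class HeadTagAudit(HTMLParser):
    """Records a page's canonical URL and whether it carries a noindex directive."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        attr = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href")
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            directives = [d.strip() for d in attr.get("content", "").lower().split(",")]
            if "noindex" in directives:
                self.noindex = True


def audit(html):
    """Return (canonical_url_or_None, has_noindex) for one HTML document."""
    parser = HeadTagAudit()
    parser.feed(html)
    return parser.canonical, parser.noindex


def blocked_by_robots(robots_txt, url, agent="*"):
    """Return True if the given robots.txt text disallows `url` for `agent`."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return not rules.can_fetch(agent, url)


# Hypothetical sample data: a category page that points at the preferred
# product URL, a printer-friendly page marked noindex, and a robots.txt
# that blocks an assumed /print/ directory.
CATEGORY_PAGE = """<html><head>
<link rel="canonical" href="https://shop.example/products/laptop-x">
</head><body>Laptop X, listed under office equipment</body></html>"""

PRINT_PAGE = """<html><head>
<meta name="robots" content="noindex, follow">
</head><body>Printer-friendly version</body></html>"""

ROBOTS_TXT = """User-agent: *
Disallow: /print/
"""

if __name__ == "__main__":
    print(audit(CATEGORY_PAGE))
    print(audit(PRINT_PAGE))
    print(blocked_by_robots(ROBOTS_TXT, "https://shop.example/print/laptop-x"))
    print(blocked_by_robots(ROBOTS_TXT, "https://shop.example/products/laptop-x"))
```

In practice you would feed audit() the HTML of your own pages and confirm that every duplicate product URL reports the same canonical address, and that printer-friendly pages are either marked noindex or disallowed in robots.txt.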
Clearly, duplicate content can, in many cases, result in lower rankings. But that's a far cry from saying that your site will be penalised. As Jill Whalen points out, "penalties are for spammers". The search engines are not out to punish you just because you happen to publish similar text in two different places. In the post-Panda/Penguin world, webmasters and SEOs have plenty of things to worry about. For most of them, the so-called "duplicate content" penalty is not on the list.