Learn how publishing duplicate content can affect your digital strategy, and practical ways of dealing with the problem.
It is commonplace to hear that duplicate content is harmful to SEO because search engines like Google supposedly punish it.
Some even claim that the punishment for this practice can exclude the entire domain from search results, making the site effectively impossible to find by this means.
Is this really true?
The search engines themselves do not help much, as they tend to be opaque about their rules, sometimes saying one thing and doing another in practice.
In this post we clarify what we know about the impacts that republishing content can have on SEO.
Can duplicate content on my site hurt the SEO of my entire domain?
The short answer is “yes, it may,” but not in the way you are thinking.
Excluding extreme cases, your site is safe as long as the amount of duplicate content is small. But what characterizes an extreme case? Something like what happened with a company that hired a careless public relations (PR) agency: instead of writing a press release, the agency copied the text from the company’s homepage and sent it to several media outlets.
Many of these outlets published the text unchanged, so the same text appeared on many websites in a short time and Google’s algorithms started firing alarms. For Google, this was a sign of spam, and the pages carrying the duplicate content were suppressed in search results.
In short, a few duplicate posts on your blog will not hurt your site’s ranking. Keep in mind that Google is a company with more than 50,000 employees that, among other things, builds cars that do not need drivers and kites that generate wind energy. In other words, there are a lot of smart people working there, and Google is smart enough to know that your site is not malicious just because it has one duplicate post among 50 other original, quality pieces.
But why is duplicate content a problem for SEO?
Google’s Panda algorithm update, first rolled out back in 2011, refined the organic results shown on the search page, giving priority to content that is relevant to the user. Meanwhile, publications that were thin on information, or simply repeated, lost visibility.
The main problem with “non-malicious” duplicate content is that search engines do not know which version of the content to display; showing several identical results would not be useful to the user.
So if you do not tell Google which version is the right one to display, it will choose one itself, possibly the version that was indexed first, that is, the original. And if many external links point to one particular version, the chances of that version being chosen increase even more.
In addition to choosing which version to display in search results, Google also needs to determine which version will receive the authority when other sites link to one of the versions.
Again, if you do not tell Google which version should receive this authority, it may assign it to the wrong version, or even dilute the authority across the various versions, harming the content’s placement in search results. This directly affects your ranking and reduces the number of visitors reaching your page.
You may already have duplicate content on your site without knowing it
Duplicate content is often generated by content management platforms themselves, such as WordPress, without you realizing it.
Here are some examples of what is considered duplicate content for Google:
- Domain with and without www: http://your_domain.com and http://www.your_domain.com are treated as two different sites by Google. Therefore, every page within them that can be accessed both with and without the www counts as duplicate content.
- Same content accessed via different URLs: It is very common for blog posts to be available at their own unique URL and also at other URLs that list the posts in a certain category.
- Print version of a page: Some sites generate a print-specific version of a page. When accessed via a URL other than the original, this type of content also represents duplication for search engines.
How to deal with duplicate content?
There are several ways to “teach” search engines how to handle your duplicate content so that you concentrate authority on the version you want:
Permanent redirects
Also known as 301 redirects, they are configured directly on the server and ensure that users no longer see the page in question; instead, they are automatically redirected to another specified page.
By doing this, search engines understand that all the authority the original page had should be transferred to the redirect’s destination page.
This is a widely used method when a company is changing domains and does not want to lose the authority it has already conquered.
But remember, any redirect can involve some loss of authority. You can minimize the effect by doing it right, and there are also WordPress plugins that make the job easier for those who are not comfortable with code.
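As a sketch, a permanent redirect of the non-www domain to the www version (using the placeholder domain from this post) could be configured on an Apache server in the site’s .htaccess file:

```apache
# Hypothetical .htaccess rules: send every request for the bare domain
# to the www version with a permanent (301) redirect.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^your_domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.your_domain.com/$1 [R=301,L]
```

Other servers, such as Nginx, have their own syntax for the same effect.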
Canonical tags
While permanent redirects are done on the server, canonical tags are tags inserted directly into the page’s HTML code.
Basically, the tag specifies the “canonical” version of the content, that is, the URL of the original. That way, all the authority from incoming links goes to the specified URL.
This option is often used when you want to republish an old post, or when your content is reproduced on another site, as in a guest post.
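For example, if a post were republished at a second URL, the duplicate page’s HTML head could point back to the original with a canonical tag (the URL here is a hypothetical placeholder):

```html
<!-- Placed inside the <head> of the duplicate page -->
<link rel="canonical" href="http://www.your_domain.com/original-post/" />
```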
Consistency of internal links
As we mentioned above, there are many pages that can be accessed by more than one link, for example http://your_domain.com or http://www.your_domain.com.
To avoid confusing Google, do not link to the same page using different URLs on your site.
Tag “noindex, follow”
This tag tells search engines to crawl the page and follow its links (“follow”), but not to include the page itself in search results (“noindex”).
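Like the canonical tag, it goes inside the page’s HTML head:

```html
<!-- Placed inside the <head> of a page that should be crawled but not indexed -->
<meta name="robots" content="noindex, follow" />
```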
Link to the original article
When you republish an article, such as a guest post, you can include a link to the original article.
That way, Google knows that this is the URL with the original content.
Summary with your words and the link to the original article
When you are reproducing something from another site, it helps to rewrite the content in your own words so that your page builds its own authority.
The link to the original content will help Google know that they are related.
Index pages with shortened posts
Never use listing pages where each post appears in full.
Show only the first few words or a summary of each post; displaying posts in their entirety duplicates the content across different URLs, which, as we have already seen, undermines your ranking.
Conclusion
Duplicate content is a problem for any SEO strategy, but there are consolidated techniques to tell search engines what the correct version of the material is.
In addition, with the exception of very extreme cases, the concern that Google or another search engine will punish a domain for some duplicate content is not grounded in practical experience.
Now that you already know the truth about duplicate content and SEO, check out the 10 SEO myths you need to know to position yourself well on Google.
About Author:
Yousuf A. Raza is a professional SEO strategist working at Local SEO Blast, providing Local SEO services across the world. Apart from ranking websites on Google, he contributes informative articles to different journals.