Preventing Duplicate Content in WordPress
There are plenty of blog entries about duplicate content and WordPress, and many accounts from users who believe duplicate content has hurt their search rankings. Theories differ on whether duplicate content within your own site is actually penalized. On balance, though, it can hurt rankings when a search engine such as Google crawls the site and finds multiple pages with the same content. To the search engine it can look as though you are stuffing the same content into multiple pages simply to increase page rankings, and that can cause problems with your position in the search results (SERPs).
Below we detail a few WordPress tweaks that help make sure you are not dinged for duplicate content. This often happens completely innocently, because many WordPress users are not aware of how WP handles pages, content, and URLs behind the scenes.
Remove Redirects to Moved Blog Entries
If you move a blog entry, for instance to another category, WordPress remembers the old address as well as the new one and automatically redirects visitors and search engines from the old URL to the new one. Because the old URL redirects instead of returning a 404 error, it can linger in a search engine's index next to the new URL, which is one way duplicate entries show up. Search engines generally treat a permanent (301) redirect as a move rather than a copy, so this is often harmless, but if you would rather have old addresses return a 404, you can switch off the built-in redirect.
Include the following in your functions.php file:
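A snippet along these lines should do it. It is a minimal sketch that assumes the redirect in question is the old-slug redirect, which WordPress core attaches to the template_redirect hook under the name wp_old_slug_redirect:

```php
// Goes inside the existing PHP block of your theme's functions.php.
// Assumption: the redirect to disable is core's old-slug redirect.
// Core registers it on 'template_redirect' at the default priority (10),
// so removing it at the default priority unhooks it.
remove_action( 'template_redirect', 'wp_old_slug_redirect' );
```

With this in place, a request for an old slug should return a 404 instead of redirecting. The snippet leaves WordPress's canonical redirects (redirect_canonical) alone, so it is worth checking on a test copy of the site that old URLs behave the way you expect.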
Be Careful with Plugins That Pull Content from One Page to Another
If you use plugins that pull content from some of your pages and display it dynamically on another, you can run into duplicate content this way as well. The page that holds the original entry and the page that pulls it in will show the same content, and both can be indexed under different URLs, resulting in duplicate entries and possibly penalties.
Bloggers also often go wild with the number of categories a post is filed under. It is better to limit the categories to the one or ones that absolutely apply. If you file a blog post under 5 categories, its content can appear, and possibly be indexed, under each of those category pages. Be frugal with your choices here.
The Replytocom Problem
When threaded comments are enabled, WordPress adds a ?replytocom= parameter to the reply link of every comment on a blog entry. Each of those links is a separate URL that shows the same post content, and they are presented to a search engine like any other page unless you take measures to prevent it. A plugin called ReplytoCom Redirector takes these requests from search bots and redirects them to the correct URL of the actual blog content.
Alternatively, or in addition, it is a good idea to be on the safe side and add the following line to your robots.txt file so that these URLs aren’t crawled:
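A rule along these lines should cover it. This is a sketch that assumes the crawler honors the * wildcard (Google and Bing do) and that the rule sits under the User-agent group you already use:

```
# Applies to all crawlers; merge into your existing "User-agent: *" group if you have one.
User-agent: *
# Block reply-to-comment URLs, whether replytocom is the first or a later query parameter.
Disallow: /*?replytocom=
Disallow: /*&replytocom=
```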
Tag pages can also show up as duplicate entries, so it is a good idea to keep search engines away from them as well. Add the following to robots.txt:
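For example, assuming WordPress is installed at the root of the domain and still uses the default tag base of "tag" (adjust the path if you changed it under Settings, Permalinks):

```
# Applies to all crawlers; merge into your existing "User-agent: *" group if you have one.
User-agent: *
# Block the tag archive pages, e.g. /tag/wordpress/
Disallow: /tag/
```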
All in all, WordPress is a great blogging and content platform that continues to get better and more powerful with every release. Many of the duplicate content issues we see may well be resolved in future releases. Until then, the tweaks above will help make sure you are not penalized for duplicate content, which can creep in without you even knowing it because of the way search engines crawl pages and the way WordPress works on the backend.