How To Protect Your Blog Content From RSS Feed Scrapers

Posted on May 22, 2013
By Rhonda Hurwitz

Does your RSS feed inadvertently contribute to content theft?

In a prior post, we talked about how peer pressure can work to fight online content piracy, particularly naïve infringement. But peer pressure alone doesn’t always work, because some content is stolen by bots and automated programs that scrape your RSS feed.

Like spammers, these pernicious programs operate automatically, and it takes tougher measures to defeat them.

The Basics: What’s an RSS feed?

In addition to email subscriptions, many blogs offer an RSS feed as a way to subscribe to their newest posts. Many people use an RSS feed reader, rather than their inbox or social media, to regularly access the blogs they follow, at a time of their choosing.

Here’s the problem: Most bloggers are blissfully unaware that scraper bots crawl the Internet grabbing RSS feeds of entire posts that are there for the taking – until their content is scraped, of course.

Ignoring your RSS feed settings can be an invitation to scrapers and a sure way to lose control over where your content ends up.

Don’t make it easy for content thieves: Review your RSS feed

There are several ways to protect your RSS feed and foil scrapers. Configure your RSS feed to include summaries only, and add links back to your blog.

1. Full Text vs. Summary: RSS feeds can be configured to include the full content of a post, or just a summary. The latter is preferable when it comes to foiling scrapers. While some blog subscribers may prefer to read your entire post in their reader, most won’t mind clicking through to read the rest of a post that interests them.

Here’s how to adjust the dashboard settings on a WordPress blog:

Go to Settings > Reading, find the option that controls what is shown for each article in a feed, and choose “Summary” instead of “Full Text”. Easy enough! The process differs from one CMS to another, but the idea is the same. If you use another CMS, look for a similar setting.
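To confirm the change took effect, you can inspect the feed itself. The sketch below is a rough heuristic, not an official WordPress check: it assumes that a full-text feed carries each post body in a `content:encoded` element, while a summary-only feed carries just a `description`. Some feeds put the whole post in `description`, so treat the result as a hint. The function name and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Qualified name of <content:encoded>, the RSS "content" module element
# that (by assumption here) holds the full post body.
CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"


def feed_exposes_full_text(rss_xml: str) -> bool:
    """Return True if any <item> has a non-empty <content:encoded> element."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        body = item.find(CONTENT_ENCODED)
        if body is not None and (body.text or "").strip():
            return True
    return False


# Hypothetical summary-only feed used as sample input.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>Hello</title>
      <link>https://example.com/hello</link>
      <description>Short summary of the post.</description>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(feed_exposes_full_text(SAMPLE_FEED))
```

To try it on a real blog, save the feed XML to a file and pass its contents to `feed_exposes_full_text`.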

2. Adding Links: You can also add links to your blog within your post, so that if your post is scraped, a link to your (original) post is embedded. Many scrapers will strip these out, but if they are left in, you will get a pingback notifying you of where your content has been used, and readers will see the link.

Note: Some SEO plugins (such as Yoast WordPress SEO) make it easy to customize a special link designed to identify you as the original author. They can add a link before or after the post in the feed, identifying the blog URL, author name, etc.
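As a rough illustration of what such an attribution link looks like, here is a minimal sketch that appends a footer to a post body before it goes into the feed. It is not the code of any plugin: the function name, parameters, and footer wording are assumptions chosen for the example.

```python
from html import escape


def add_attribution(post_html: str, post_title: str, post_url: str,
                    blog_name: str, blog_url: str) -> str:
    """Append a footer linking back to the original post and blog.

    All values are HTML-escaped so titles containing characters such as
    '&' or '"' do not break the markup.
    """
    footer = (
        '<p>The post <a href="{post_url}">{post_title}</a> '
        'appeared first on <a href="{blog_url}">{blog_name}</a>.</p>'
    ).format(
        post_url=escape(post_url, quote=True),
        post_title=escape(post_title),
        blog_url=escape(blog_url, quote=True),
        blog_name=escape(blog_name),
    )
    return post_html + "\n" + footer


if __name__ == "__main__":
    # Hypothetical post and blog details.
    print(add_attribution(
        "<p>Feed summary goes here.</p>",
        "Protecting Your Feed",
        "https://example.com/protecting-your-feed",
        "Example Blog",
        "https://example.com/",
    ))
```

If a scraper republishes the item unchanged, the footer travels with it, so readers and search engines both see a link to the original.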

What else can you do to combat blog content theft?

1. Regularly Check for Duplicate Content:

When content thieves scrape your content, it results in duplicate content. This can cause problems with your search ranking: search engines may index the scraper site before or instead of your original content, so the impostor outranks you in search results.

iCopyright’s Discovery infringement detection service continuously monitors the web and locates suspicious duplicates of your content. This is the most effective way to track down offending sites – including scraper sites – and resolve infringements quickly.
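To get a feel for what "duplicate content" means in practice, the sketch below compares two pieces of text by the word sequences they share. It is a simple, generic technique (overlapping word "shingles" scored with Jaccard similarity) and is only an illustration; it is not a description of how any monitoring service works, and the function names and the five-word window are assumptions.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return set()
    if len(words) < size:
        # Text shorter than one window: treat it as a single shingle.
        return {" ".join(words)}
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(original: str, candidate: str, size: int = 5) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(original, size), shingles(candidate, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical texts: your post and a page you suspect copied it.
    mine = "Ignoring your RSS feed settings can be an invitation to scrapers."
    theirs = "Ignoring your RSS feed settings can be an invitation to scrapers."
    print(similarity(mine, theirs))
```

A score near 1.0 suggests the page is a near copy of your post, while unrelated pages score close to 0.0. Where to draw the line in between is a judgment call.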

2. Offer Options to Republish Your Content Legitimately:

Some sites that take your feed and display it aren’t run by scraper bots at all, but by misguided bloggers who don’t realize that what they are doing is stealing. They simply want to republish your content and may think they are doing you a “favor.”

No one may republish your content without permission. That’s copyright infringement. Still, what if someone wants to republish your content legitimately, for all the right reasons?

Consider adding an instant licensing tool like the iCopyright Toolbar to your website, to make it easy to republish your content automatically, legally, and with permission. Downstream publishers who want to display your content get the full feed of the stories, and you get paid! (Full disclosure: instant licensing as a way to monetize your blog is one of the features of the iCopyright Toolbar.)

Takeaway: Time to “Spring Clean” Your RSS Feed

The combination of the right RSS feed settings PLUS an automated duplicate content monitoring tool should deliver the one-two punch in your fight against blog content theft!

Free eBook: Are you looking to increase traffic, readers and revenues?

Download our checklist, "Curating Content: The New Art of Republishing", to learn how to ease your workflow, expand your reach and reinforce your editorial vision through a content strategy that adds curation and republishing to the mix.

Curating Content: The New Art of Republishing (Best Practices Checklist)