Blog scraping

Blog scraping is the automated process of scanning a large number of blogs, usually daily, to find and copy content. Both the software that does this and the individuals who run it are sometimes referred to as blog scrapers.

Scraping means copying content from a blog that the individual initiating the scraping process does not own. If the material is copyrighted, copying it is considered copyright infringement unless a license relaxes the copyright. The scraped content is often used on spam blogs, or splogs.

Issues

A blog scraper who gathers copyrighted material is considered to be in violation of the law. Blog scraping can create problems for the individual or business that owns the blog, and it is particularly worrisome for business owners and business bloggers. Scrapers can copy an entire post from an independent or business blog, in which case the duplicated content includes the author's tag and a link back to the author's site (if that link appears in the author's tag). However, most blog scrapers copy only the portion of the content that is keyword-relevant to their splog topic. This has two effects. First, it increases the keyword relevancy of the scraper's site. Second, because the entire post is not scraped, outbound links are eliminated, so the scraper's search engine ranking is not reduced.
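Because most scrapers copy only part of a post, an exact comparison of whole pages will usually miss the copy. As a rough illustration only, the sketch below estimates how much of a suspect text also appears in an original post by comparing overlapping five-word sequences ("shingles"). The function names `shingles` and `copied_fraction`, the shingle length, and the word-splitting rule are all assumptions made for this example; they are not taken from any tool mentioned in this article.

```python
import re


def shingles(text, n=5):
    """Return the set of n-word sequences (shingles) in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Texts shorter than n words produce an empty set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def copied_fraction(original, suspect, n=5):
    """Fraction of the suspect text's shingles that also occur in the original.

    A value near 1.0 suggests the suspect text is largely an excerpt of the
    original; a value near 0.0 suggests little word-for-word overlap.
    """
    suspect_shingles = shingles(suspect, n)
    if not suspect_shingles:
        return 0.0
    shared = suspect_shingles & shingles(original, n)
    return len(shared) / len(suspect_shingles)
```

A high score only indicates word-for-word overlap. It does not show who copied from whom, or whether the copy was licensed, so it is a starting point for a manual check and not a finding of infringement.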

Additionally, scraped content can appear on any type of splog or RSS-fed spam site. An unsuspecting author could therefore find their creative or copyrighted material copied onto a site promoting pornography or similar content that may be offensive to the author and their audience, which may damage the original author's reputation.

External links

* [http://blog.taragana.com/index.php/archive/wordpress-plugin-to-automatically-add-copyright-message-to-your-rss-atom-feeds/2/ WordPress Feed Copyrighter Plugin]
* [http://www.advancedbusinessblogging.com/businessblog/?p=194 Six Steps to Prevent Content Theft and Combat Copyright Infringement on Your Business Blog]


Wikimedia Foundation. 2010.
