Scraper site

A scraper site is a website that copies all of its content from other websites using web scraping [See: "Scraper sites, spam and Google (tactics/motives)", Googlerankings.com diagnostics, 2007, http://diagnostics.googlerankings.com/scraper-sites.html, accessed 2008-01-02]. No part of a scraper site is original. A search engine is not a scraper site: sites such as Yahoo and Google gather content from other websites and index it so that the index can be searched with keywords. Search engines then display snippets of the original site content in response to a user's search.

In the last few years, driven by the advent of the Google AdSense web advertising program, scraper sites have proliferated rapidly as a means of spamming search engines. Open content sites such as Wikipedia are a common source of material for scraper sites.

Made for AdSense

Some scraper sites are created to earn money through advertising programs such as Google AdSense. In such cases, they are called "Made for AdSense" (MFA) sites. The term is also used derogatorily for websites that have no redeeming value and exist for the sole purpose of getting visitors to click on advertisements.

Made for AdSense sites are considered to be spamming search engines and diluting the search results, leaving surfers with less-than-satisfactory results. The scraped content is considered redundant, since the search engine would have shown the original content anyway had no MFA website appeared in the listings.

Various search engines are eliminating these types of websites from their results, and the sites sometimes show up as "supplemental results" instead of being displayed in the initial search results.

Some sites engage in "AdSense arbitrage": they buy AdWords spots for lower-cost search terms and bring the visitor to a page that consists mostly of AdSense advertisements. The arbitrager then earns the difference between the low-value clicks bought from AdWords and the higher-value clicks generated by this traffic on the MFA sites. In 2007, Google cracked down on this business model by closing the accounts of many arbitragers [See: http://www.jensense.com/2007/05/18/google-adsense-disabling-arbitrage-publisher-accounts-as-of-june-1st/]. Google and Yahoo also combat the proliferation of arbitrage through quality scoring systems. For example, in Google's case, AdWords penalizes "low quality" advertiser pages by charging their campaigns a higher price per click [See for example: http://www.seobook.com/yahoo-kills-ppc-arbitrage]. This effectively evaporates the arbitrager's profit margin.
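
The arithmetic behind arbitrage can be illustrated with a minimal sketch. All figures below are hypothetical and chosen only for illustration. The ad click-through rate is an added assumption, since not every purchased visitor clicks an advertisement.

```python
def arbitrage_margin(cost_per_click, revenue_per_click, ad_click_rate):
    """Expected profit per purchased visitor.

    cost_per_click:    price paid for one incoming click (hypothetical)
    revenue_per_click: amount earned when a visitor clicks an ad on the
                       landing page (hypothetical)
    ad_click_rate:     assumed fraction of visitors who click an ad
    """
    expected_revenue = revenue_per_click * ad_click_rate
    return expected_revenue - cost_per_click


# Cheap incoming clicks: the expected revenue exceeds the cost.
before_penalty = arbitrage_margin(0.05, 0.40, 0.30)

# A quality-score penalty raises the price of incoming clicks
# (here, hypothetically, threefold) and the margin turns negative.
after_penalty = arbitrage_margin(0.15, 0.40, 0.30)

print(f"margin before penalty: {before_penalty:+.2f}")
print(f"margin after penalty:  {after_penalty:+.2f}")
```

Under these assumed numbers, raising the price of incoming clicks is enough to turn a profit into a loss without changing anything else about the site.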

Legality

Scraper sites may violate copyright law. Even taking content from an open content site can be a copyright violation if it is done in a way that does not respect the license. For instance, the GNU Free Documentation License (GFDL) and the Creative Commons Attribution-ShareAlike (CC BY-SA) license require that a republisher inform readers of the license conditions and give credit to the original author.

Techniques

Many scrapers pull snippets and text from websites that rank highly for the keywords they have targeted. In this way they hope to rank highly in the SERPs (search engine results pages) themselves. RSS feeds are particularly vulnerable to scrapers.
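
One likely reason feeds are vulnerable is that they deliver content in a structured, machine-readable form. The following minimal sketch shows how little code is needed to pull every item out of a feed. The sample feed, its URLs, and the function name extract_items are all hypothetical, and only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, inlined so the example needs no network access.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>Full text of the first post.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
      <description>Full text of the second post.</description>
    </item>
  </channel>
</rss>"""


def extract_items(feed_xml):
    """Return the title, link, and description of every <item> in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]


items = extract_items(SAMPLE_FEED)
for entry in items:
    print(entry["title"], "->", entry["link"])
```

Because each item already separates title, link, and body text, no page-layout parsing is required. A publisher who serves only short summaries in the description field exposes correspondingly less text to this kind of extraction.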

Some scraper sites consist of advertisements and paragraphs of words randomly selected from a dictionary. Often a visitor will click on a pay-per-click advertisement because it is the only comprehensible text on the page. Operators of these scraper sites gain financially from these clicks. Ad networks such as Google AdSense claim to be constantly working to remove these sites from their programs, although this is actively disputed, since the networks benefit directly from the clicks generated at these kinds of sites. From the advertiser's point of view, the networks do not seem to be making enough effort to stop the problem.

Scrapers tend to be associated with link farms, and the two are sometimes perceived as the same thing when multiple scrapers link to the same target site. Because links from many scraper sites form an artificial pattern of incoming links, a frequently targeted victim site might itself be accused of link-farm participation.

See also

*Domain parking


Wikimedia Foundation. 2010.
