- Spam blog
Spam blogs, sometimes referred to by the neologism splogs [http://www.wired.com/wired/archive/14.09/splogs.html], are artificially created weblog sites that the author uses to promote affiliated websites or to increase the search engine rankings of associated sites. The purpose of a splog can be to increase the PageRank or backlink portfolio of affiliate websites, to artificially inflate paid ad impressions from visitors, and/or to use the blog as a link outlet to get new sites indexed. Spam blogs are usually a type of scraper site, whose content is often either inauthentic text or merely stolen from other websites (see "blog scraping"). These blogs usually contain a high number of links to sites associated with the splog creator, which are often disreputable or otherwise useless websites.
There is frequent confusion between the terms "splog" and "spam in blogs". Splogs are blogs whose articles are fake and are created only for search engine spamming. Spam in blogs, conversely, is the posting of random comments on the blogs of innocent bystanders: spammers take advantage of a site's ability to let visitors post comments that may include links.
This is often used in conjunction with other spamming techniques, including "spings".
The term splog was popularized around mid-August 2005 when it was used publicly by Mark Cuban, but it appears to have been used a few times before to describe spam blogs, going back to at least 2003. It developed from multiple linkblogs that were trying to influence search indexes and others that were trying to Google bomb every word in the dictionary.
The term may be applied to more recent infections, most noticeably those reported by Webtrends [http://securitylabs.websense.com/content/Blogs/3063.aspx] in April 2008. Leveraging botnets, spammers have infected several thousand pages that display prominent keywords from the Google Trends site by bypassing the CAPTCHA authentication method, which had previously subdued all spam bloggers. A recent sighting puts the top ten hottest Google terms of the day as all being owned by spambots on the Blog Results page. As they have gone mostly unchecked, they have also infected real SERP page-one web results and corrupted any hot search terms more than a month old. Phraseologist [http://www.phraseologist.com/2008/07/over-extended-link-engine.html] reports that the attacks appear on AOL Journal, Blogspot and Spaces Live. Hackers are using a number of methods, including link farming, spamdexing and keyword stuffing, each in a simple, moderated form, to achieve top PageRank results. Most of the sites contain an animated graphic that appears to be a YouTube streaming video. Once it is clicked, users become infected with one of several variants of spyware. This generates revenue for the spambot's owner.
Splogs have become a major problem on free blog hosts, polluting blog search engines and damaging bloggers' community networking (e.g. Blogger's "next blog" link).
Google's search engine uses PageRank, which is susceptible to link flooding, especially from highly weighted bloggers. One splog clearly states: "Google's run by people who can't be bothered to post links on the internet." Splogs could deter people from using, enjoying and finding value in the blogosphere. Splogs sometimes choose a name similar to that of a popular blog in order to benefit from the occasional incoming link from careless bloggers who think they are linking to the popular site.
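The effect of link flooding on a link-based ranking can be illustrated with a small sketch. It uses the textbook power-iteration form of PageRank with a damping factor of 0.85, which is an assumption: the formula and weights Google actually uses are not given here. The page names and the toy web are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy web: three blogs linking in a cycle, plus a target
# site that links out but that nobody links to.
web = {
    "blog_a": ["blog_b"],
    "blog_b": ["blog_c"],
    "blog_c": ["blog_a"],
    "target": ["blog_a"],
}
before = pagerank(web)

# Link flooding: 20 splogs, each carrying a single link to the target.
flooded = dict(web)
for i in range(20):
    flooded[f"splog_{i}"] = ["target"]
after = pagerank(flooded)

print(f"target score before flooding: {before['target']:.4f}")
print(f"target score after flooding:  {after['target']:.4f}")
```

In this toy graph the target's score rises even though the total score is now shared among six times as many pages, because every splog passes its small baseline score on to the same site.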
Splog activity can cause problems for legitimate bloggers if search engines respond to splogs by blocking, or treating as 'suspicious', all web addresses in a particular domain.
Full-content RSS feeds compound the splog problem [http://blog.taragana.com/index.php/archive/are-full-content-rss-feeds-compounding-the-splog-problem/], because RSS makes it easy to copy content from genuine blogs. Splog RSS feeds pollute RSS search engines, and are reproduced and propagated around the Net.
Several splog reporting services have been created so that well-meaning users can report splogs, with plans to offer these splog URLs to search engines so that they can be excluded from search results. [http://www.splogreporter.com/ Splog Reporter] was the first service of this kind. Then came [http://www.splogspot.com/ SplogSpot], which maintains a large database of splogs and makes it available to the public via APIs, and [http://www.a2b.cc/pingblock A2B], which blocks the web server IP addresses that splog URLs resolve to. The [http://blog.taragana.com/index.php/archive/wordpress-plugin-to-automatically-add-copyright-message-to-your-rss-atom-feeds/ Feed Copyrighter plugin (for WordPress)] automatically adds copyright messages to feeds, so that splogs can be easily spotted and reported by visitors or through Google search. There is also TrustRank, which attempts to find them automatically. Blogger has implemented a system that can detect splogs and then force them to take a CAPTCHA 'spell this word' test. Blogger deleted thousands of splogs in September 2005 [http://buzz.blogger.com/deleted-subdomains-2005-10-17.txt] and even more in December.
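The idea of spotting splogs through a copyright message in the feed can be sketched as follows. This is only an illustration of the principle, not the plugin's code: the function name, the marker text and the hostnames are hypothetical, and the list of pages stands in for whatever a search for the marker would return.

```python
from urllib.parse import urlparse


def find_suspected_copies(marker, home_host, pages):
    """Return URLs that carry the feed's copyright marker on a foreign host.

    marker    -- copyright line appended to every feed item (assumed unique)
    home_host -- hostname of the genuine blog
    pages     -- iterable of (url, text) pairs to check
    """
    suspects = []
    for url, text in pages:
        host = (urlparse(url).hostname or "").lower()
        # Pages on the genuine blog legitimately contain the marker.
        if host == home_host.lower():
            continue
        if marker.lower() in text.lower():
            suspects.append(url)
    return suspects


# Hypothetical data for illustration only.
MARKER = "Copyright 2008 example-blog.test. Originally published at example-blog.test."
pages = [
    ("http://example-blog.test/post/1", "Real post. " + MARKER),
    ("http://cheap-pills.test/2008/07/post", "Real post. " + MARKER),
    ("http://other-blog.test/review", "An unrelated review."),
]
suspects = find_suspected_copies(MARKER, "example-blog.test", pages)
print(suspects)
```

The comparison is an exact hostname match, so a real check would also need to treat variants such as a `www.` prefix, or authorised syndication partners, as belonging to the genuine blog.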
On February 24, 2007, Splog Reporter announced on its website that it would no longer be providing a splog reporting service.
Adversarial information retrieval
Spam in blogs
* [http://help.blogger.com/bin/answer.py?answer=42577#whatwedo Blogger: About Spam Blogs]
* [http://ebiquity.umbc.edu/paper/html/id/269/ SVMs for the Blogosphere: Blog Identification and Splog Detection]
* [http://news.com.com/Tempted+by+blogs%2C+spam+becomes+splog/2100-1032_3-5903409.html?tag=nl.caro news.com.com: "Tempted by blogs, spam becomes 'splog'"]
* "The Guardian", 17 November 2005, [http://technology.guardian.co.uk/weekly/story/0,16376,1643774,00.html "Cashing in on fake blogs"]
* [http://ebiquity.umbc.edu/blogger/splog-software-from-hell/ Examples of splog creation software]
* [http://www.wired.com/wired/archive/14.09/splogs.html Wired magazine article on splogs]
* [http://ebiquity.umbc.edu/blogger/2007/02/01/pings-spings-splogs-and-the-splogosphere-2007-updates/ Pings, Spings, Splogs and the Splogosphere: 2007 Updates]
Wikimedia Foundation. 2010.