Surface Web
The surface Web (also known as the visible Web or indexable Web) is that portion of the World Wide Web that is indexed by conventional search engines. The part of the Web that is not reachable this way is called the deep Web.

Search engines construct a database of the Web by using programs called "spiders" or Web crawlers that begin with a list of known Web pages. The spider gets a copy of each page and "indexes" it, storing useful information that will let the page be quickly retrieved again later. Any hyperlinks to new pages are added to the list of pages to be "crawled". Eventually all reachable pages are indexed, unless the spider runs out of time [[http://www.sitepoint.com/article/indexing-limits-where-bots-stop | Search Engine Indexing Limits]] or disk space. The collection of reachable pages defines the "surface Web".
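The crawl-and-index loop described above can be sketched in a few lines of Python using only standard-library modules. This is an illustrative sketch, not how any particular search engine works: the in-memory "index", the page limit standing in for the time and disk-space limits mentioned above, and all function and variable names are assumptions made for the example.

# Minimal sketch of a breadth-first crawl that builds a toy index.
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags on a fetched page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, max_pages=100):
    """Crawl outward from a list of known pages.

    Returns a toy 'index' mapping each fetched URL to its raw HTML; a real
    search engine would tokenize the text and build an inverted index instead.
    """
    queue = deque(seed_urls)              # pages still to be "crawled"
    seen = set(seed_urls)                 # avoid re-queuing the same URL
    index = {}

    while queue and len(index) < max_pages:   # stop when "out of time or disk space"
        url = queue.popleft()
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except OSError:
            continue                      # unreachable page: skip it
        index[url] = html                 # "index" the page (store a copy)

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:         # any hyperlinks to new pages join the list
            absolute = urljoin(url, href)
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return index

The set of URLs that such a loop can ever reach from its seed list, before hitting its limits, corresponds to the "surface Web" as defined above.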
For various reasons (e.g., the Robots Exclusion Standard, links generated by JavaScript and Flash, and password protection), some pages cannot be reached by the spider. These 'invisible' pages are referred to as the deep Web (see the robots.txt sketch below).

A January 2005 study [[http://www.cs.uiowa.edu/~asignori/web-size/ | Univ. of Iowa study (Jan 2005)]] queried the Google, MSN, Yahoo!, and Ask Jeeves search engines with search terms from 75 different languages and determined that there were over 11.5 billion web pages in the publicly indexable Web as of January 2005. As of June 2008, the indexed Web contained at least 63 billion pages. [[http://www.worldwidewebsize.com/ | The size of the World Wide Web]]
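The Robots Exclusion Standard mentioned above is one of the mechanisms that keeps pages out of the surface Web: a well-behaved spider fetches /robots.txt from each host and skips any URL it disallows. The sketch below uses Python's standard urllib.robotparser module; the user-agent string "ExampleSpider" and the choice to treat an unreachable robots.txt as permission to fetch are assumptions made for illustration.

# Minimal sketch of a Robots Exclusion Standard check before fetching a page.
from urllib.parse import urljoin, urlparse
from urllib.robotparser import RobotFileParser


def allowed_to_fetch(url, user_agent="ExampleSpider"):
    """Return True if robots.txt on the target host permits fetching `url`."""
    parts = urlparse(url)
    robots_url = urljoin(f"{parts.scheme}://{parts.netloc}", "/robots.txt")
    parser = RobotFileParser()
    parser.set_url(robots_url)
    try:
        parser.read()                     # download and parse robots.txt
    except OSError:
        return True                       # assumption: no robots.txt reachable, allow
    return parser.can_fetch(user_agent, url)


# Inside the crawl loop above, a disallowed page would simply be skipped:
# if not allowed_to_fetch(url):
#     continue

Pages excluded this way, like those hidden behind scripts or passwords, remain part of the deep Web from that crawler's point of view.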
References
See also
* Lightnet