Sitemaps

The Sitemaps protocol allows a webmaster to inform search engines about URLs on a website that are available for crawling. A Sitemap is an XML file that lists the URLs for a site. It allows webmasters to include additional information about each URL: when it was last updated, how often it changes, and how important it is in relation to other URLs in the site. This allows search engines to crawl the site more intelligently. Sitemaps are a URL inclusion protocol and complement robots.txt, a URL exclusion protocol.

Sitemaps are particularly beneficial on websites:

* where some areas of the website are not available through the browsable interface, or

* where webmasters use rich Ajax or Flash content that is not normally processed by search engines.

The webmaster can generate a Sitemap containing all accessible URLs on the site and submit it to search engines. Because Google, MSN, Yahoo, and Ask all use the same protocol, a single Sitemap gives the largest search engines up-to-date information about a site's pages.

Sitemaps supplement and do not replace the existing crawl-based mechanisms that search engines already use to discover URLs. By submitting Sitemaps to a search engine, a webmaster is only helping that engine's crawlers to do a better job of crawling their site(s). Using this protocol does not guarantee that web pages will be included in search indexes, nor does it influence the way that pages are ranked in search results.

History of Sitemaps

* Google first introduced [http://googleblog.blogspot.com/2005/06/webmaster-friendly.html Sitemaps 0.84] in June 2005 so web developers could publish lists of links from across their sites.

* Google, MSN and Yahoo announced [http://www.google.com/press/pressrel/sitemapsorg.html joint support for the Sitemaps protocol] in November 2006. The schema version was changed to "Sitemap 0.90", but no other changes were made.

* In April 2007, [http://blog.ask.com/2007/04/sitemaps_autodi.html Ask.com and IBM announced support] for Sitemaps. Google, Yahoo and Microsoft also announced auto-discovery of Sitemaps through robots.txt.

* In May 2007, the state governments of [http://www.google.com/publicsector/ Arizona, California, Utah and Virginia] announced they would use Sitemaps on their web sites.

The Sitemaps protocol is based on ideas [M.L. Nelson, J.A. Smith, del Campo, H. Van de Sompel, X. Liu, "Efficient, Automated Web Resource Harvesting", WIDM'06, 2006, http://public.lanl.gov/herbertv/papers/f140-nelson.pdf] from "Crawler-friendly Web Servers" [O. Brandman, J. Cho, Hector Garcia-Molina, and Narayanan Shivakumar, "Crawler-friendly web servers", Proceedings of ACM SIGMETRICS Performance Evaluation Review, Volume 28, Issue 2, 2000, doi:10.1145/362883.362894].

XML Sitemap Format

The Sitemap Protocol format consists of XML tags. The file itself must be UTF-8 encoded. Sitemaps can also be a plain text list of URLs, and they can be compressed in .gz format.

Sample

A sample Sitemap that contains just one URL and uses all optional tags is shown below.

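    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://w3c-at.de</loc>
        <lastmod>2006-11-18</lastmod>
        <changefreq>daily</changefreq>
        <priority>0.8</priority>
      </url>
    </urlset>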
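A Sitemap with the same four tags can also be produced programmatically. The following is a minimal sketch in Python, assuming the tag names of the Sitemap 0.90 schema (urlset, url, loc, lastmod, changefreq, priority). The function name build_sitemap and the dictionary layout of its input are illustrative choices, not part of the protocol.

```python
import xml.etree.ElementTree as ET

# Namespace of the Sitemap 0.90 schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return a Sitemap document as UTF-8 encoded bytes.

    `entries` is a list of dicts (a hypothetical input layout) with a
    required "loc" key and the optional keys "lastmod", "changefreq"
    and "priority".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Write only the tags that are present, always in the same order.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


sample = build_sitemap([
    {"loc": "http://w3c-at.de", "lastmod": "2006-11-18",
     "changefreq": "daily", "priority": 0.8},
])
```

Because the values are written through an XML library, characters such as the ampersand are entity-escaped automatically.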

Search Engine Submission

If Sitemaps are submitted directly to a search engine (pinged), it will return status information and any processing errors. The details involved with submission will vary with the different search engines. The location of the Sitemap can also be included in the robots.txt file by adding the following line to robots.txt:

Sitemap: <sitemap_location>

The <sitemap_location> should be the complete URL to the Sitemap, such as: "http://www.example.org/sitemap.xml". This directive is independent of the user-agent line, so it doesn't matter where it is placed in the file. If the website has several Sitemaps, this URL can simply point to the main Sitemap index file.
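Auto-discovery through robots.txt can be sketched from the crawler's side as follows. This Python function is an illustration only: the name find_sitemaps is hypothetical, it assumes the robots.txt text has already been fetched, and it treats the field name as case-insensitive, which is an assumption rather than something stated above.

```python
def find_sitemaps(robots_txt):
    """Return the Sitemap URLs declared in the text of a robots.txt file."""
    sitemaps = []
    for line in robots_txt.splitlines():
        # Drop comments, then split "field: value" on the first colon only,
        # because the URL itself contains colons.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


example = (
    "User-agent: *\n"
    "Disallow: /private/\n"
    "\n"
    "Sitemap: http://www.example.org/sitemap.xml\n"
)
found = find_sitemaps(example)
```

The function scans every line regardless of the surrounding user-agent group, which mirrors the point that the directive's position in the file does not matter.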

The following table lists the Sitemap submission URLs for several major search engines:

Sitemap limits

Sitemap files have a limit of 50,000 URLs and 10 megabytes per sitemap. Sitemaps can be compressed using gzip, reducing bandwidth consumption. Multiple sitemap files are supported, with a Sitemap index file serving as an entry point for a total of 1000 sitemaps.
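A site with more URLs than fit in one file can split its list and publish a Sitemap index. The sketch below, in Python, assumes the sitemapindex, sitemap and loc tag names of the Sitemap 0.90 schema. The function names and the example file names are hypothetical, and only the URL-count and file-count limits quoted above are checked, not the size limit in megabytes.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000    # URL limit quoted above
MAX_SITEMAPS_PER_INDEX = 1000    # index limit quoted above


def split_urls(urls, chunk_size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks that each fit in one Sitemap file."""
    return [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]


def build_sitemap_index(sitemap_urls):
    """Return a Sitemap index document (UTF-8 bytes) listing the given files."""
    if len(sitemap_urls) > MAX_SITEMAPS_PER_INDEX:
        raise ValueError("too many Sitemap files for one index")
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = url
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# 120,001 placeholder URLs need three Sitemap files.
all_urls = ["http://www.example.org/page%d" % i for i in range(120_001)]
chunks = split_urls(all_urls)
# Hypothetical file names, one per chunk.
files = ["http://www.example.org/sitemap%d.xml.gz" % (n + 1)
         for n in range(len(chunks))]
index_xml = build_sitemap_index(files)
```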

As with all XML files, any data values (including URLs) must use entity escape codes for the following characters: ampersand (&), single quote ('), double quote ("), less than (<) and greater than (>).
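When a Sitemap is assembled by string concatenation rather than through an XML library, the escaping has to be applied by hand. A minimal sketch in Python, using the standard library's xml.sax.saxutils.escape; the helper name escape_url and the example URL are illustrative.

```python
from xml.sax.saxutils import escape

# escape() always handles &, < and >; the two quote characters are
# supplied explicitly so that all five characters listed above are covered.
QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_url(url):
    """Entity-escape a URL for use as a data value in a Sitemap."""
    return escape(url, QUOTE_ENTITIES)


escaped = escape_url("http://www.example.org/view?id=1&name='a'")
# escaped == "http://www.example.org/view?id=1&amp;name=&apos;a&apos;"
```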

Notes

See also

*Biositemap, a protocol for broadcasting and disseminating information about computational biology resources (data, software tools and web-services)
*Metadata
*Resources of a Resource - ROR
*Site map, a graphical representation of the architecture of a web site
*Sitemap index - XML file that lists the multiple XML sitemap files

External links

* [http://www.sitemaps.org Official page] (set up by Google, Yahoo & MSN)
* [http://www.google.com/press/pressrel/sitemapsorg.html Google, Yahoo, MSN joint announcement in Nov'06]
* [http://www.ysearchblog.com/archives/000437.html Google, Yahoo, MSN, Ask announcing auto-discovery in Apr'07]
* [http://www.google.com/webmasters/sitemaps/docs/en/faq.html Google's FAQ]
* [http://sitemaps.blogspot.com/ Official Blog]
* [http://groups.google.com/group/google-sitemaps Google Sitemaps newsgroup (archived)]
* [http://groups.google.com/group/Google_Webmaster_Help-Sitemap Google Sitemaps newsgroup]
* [http://goog-sitemapgen.sourceforge.net/ Sitemap Gen] Python script to generate Sitemaps by Google
* [http://code.google.com/sm_thirdparty.html Third party programs & websites] listed on code.google.com
* [http://maposite.com Maposite] - free online service for generating sitemap.xml


Wikimedia Foundation. 2010.
