Build Your Google Sitemap: Sitemap Generators, XML Files, and Webmaster Advice
Building a Google Sitemap might appear to be a daunting task, particularly to those of us who aren’t computer gurus. However, with the generators and online tutorials available today, help is easy to find. Google Sitemap generators are available, and the requirements for building and submitting a Google Sitemap are outlined on Google’s Webmasters page at www.google.com/webmasters/.
Google, Yahoo, MSN and Ask.com have standardized on a common Sitemap protocol, and all use XML to help their spiders crawl a site faster. Traditional Sitemaps that use HTML can still be submitted, and placing an XML Sitemap next to an HTML Sitemap page should not cause any problems. However, the major search engines and generators encourage you to build your Sitemap using XML. Google, together with Yahoo and Microsoft, set up www.sitemaps.org to publish the XML schema for the Sitemap protocol under a Creative Commons license.
As you know, spiders discover pages by following links. Sitemaps are the web crawler’s assistant: they help the spiders locate your URLs and pass along additional information that you supply. When building a Google Sitemap you can include the date each page was last modified and how often it changes, and you can group several Sitemaps under a Sitemap index file. In April of 2007, Google added support for specifying the location of your Sitemaps in your robots.txt file. Just add the line “Sitemap: http://www.mysite.com/Sitemap.xml” to your robots.txt file to tell Google and Ask.com where to find your Sitemaps.
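To check that a robots.txt file really advertises your Sitemap, you can read it back the way a crawler might. The following is only a minimal sketch using Python's standard `urllib.robotparser` module (its `site_maps()` method needs Python 3.8 or later). The `User-agent` and `Disallow` lines and the `sitemap_urls` helper are made up for illustration; only the `Sitemap:` line comes from the example above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; only the Sitemap line is from the article.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: http://www.mysite.com/Sitemap.xml
"""


def sitemap_urls(robots_text):
    """Return the Sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(sitemap_urls(ROBOTS_TXT))
```

If the list comes back empty, the `Sitemap:` line is missing or misspelled.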
Google provides details about building Sitemaps in the Webmaster Tools Overview and FAQ pages under Sitemap Protocol. Requirements they list for building Sitemaps include:
* The format must include XML tags. (As mentioned above, generators can assist you with submitting or converting the older HTML formats.)
* Begin with an opening <urlset> tag and end with a closing </urlset> tag; use a <url> entry for each URL as a parent XML tag, and a <loc> child entry inside each <url> parent tag.
* The Sitemap must be UTF-8 encoded.
* You are limited to 50,000 URLs per Sitemap file, and each file can be no larger than 10MB uncompressed (the current protocol at sitemaps.org has since raised this to 50MB). Sitemaps can be compressed using gzip.
* You can submit a Sitemap for just the portion of your URLs that is updated frequently.
* URLs must use entity escape codes and follow the RFC-3986 standard for URIs and the RFC-3987 standard for IRIs; lastmod timestamps must use W3C Datetime encoding.
* Your URLs must be completely specified. Make sure the protocol (such as http://) and any slashes are included, and list only one version of each URL in your Sitemap. Framed pages must include both URLs (frameset and frame contents). Multiple versions of a URL will affect crawling, but the position of a URL within the Sitemap has no impact.
* Remove session IDs from URLs.
* The “priority hint” is only relative to URLs in your own site.
* It is “strongly recommended” that you put your Sitemap at the root directory of your web server.
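The requirements above can be sketched in a few lines of Python. This is an illustrative example, not Google's own tooling: it assumes Python 3.8 or later, the namespace URI published at sitemaps.org, and two made-up pages on the example domain. The `build_sitemap` helper and the `MAX_URLS` and `MAX_BYTES` names are hypothetical and simply mirror the limits quoted in the list.

```python
import gzip
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000              # per-file URL limit quoted in the list above
MAX_BYTES = 10 * 1024 * 1024   # per-file size limit quoted in the list above


def build_sitemap(pages):
    """Return UTF-8 Sitemap XML (bytes) for (loc, lastmod, changefreq, priority) tuples."""
    if len(pages) > MAX_URLS:
        raise ValueError("too many URLs; split them across several Sitemap files")
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq, priority in pages:
        url = ET.SubElement(urlset, "url")
        # ElementTree entity-escapes &, < and > in text when serializing.
        ET.SubElement(url, "loc").text = loc
        # date.isoformat() gives YYYY-MM-DD, a valid W3C Datetime value.
        ET.SubElement(url, "lastmod").text = lastmod.isoformat()
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    xml_bytes = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    if len(xml_bytes) > MAX_BYTES:
        raise ValueError("Sitemap too large; split it into smaller files")
    return xml_bytes


# Hypothetical pages on the article's example domain.
pages = [
    ("http://www.mysite.com/", date(2007, 4, 11), "daily", 1.0),
    ("http://www.mysite.com/catalog?item=12&lang=en", date(2007, 3, 2), "weekly", 0.5),
]
sitemap_xml = build_sitemap(pages)
compressed = gzip.compress(sitemap_xml)  # optional gzip compression

print(sitemap_xml.decode("utf-8"))
print(len(sitemap_xml), "bytes uncompressed,", len(compressed), "bytes gzipped")
```

Note that the `&` in the second URL comes out as `&amp;` in the generated file, which is the entity escaping the requirements ask for. A site with more than 50,000 URLs would call `build_sitemap` once per batch and list the resulting files in a Sitemap index.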
Building a Sitemap for Google takes time, but it can give you an edge in being included in search results. Absorb all the information on Google’s Webmasters page, research the available generators critically, and visit some blogs and forums that focus on XML and building Sitemaps. The more you learn about building a Google Sitemap, the less daunting the task will feel. Give it a try. There’s a lot of Sitemap support available and it certainly can’t hurt. Maybe you’ll turn into a computer guru after all!