Search Engine Optimization and Marketing for E-commerce

Auto-Submitting Sitemaps to Google...Necessary?

by Andrew Kagan 1. May 2009 10:13

Google's Webmaster Tools provides webmasters with a way to upload XML sitemaps to improve the accuracy of Google's index. Registering and maintaining an accurate sitemap (Google, Yahoo, and Microsoft all accept sitemap data) is important to proper indexing of your website pages. Google provides two methods for notifying it when the sitemap is updated: manually through the Google website, and "semi-automatically" by sending an HTTP request that signals Google to reload the sitemap.

Ping me when you're ready

The second method can be automated through server-side scripting, so that when content on a website or blog is updated, the sitemap file is updated as well, and the update request is sent to Google at the same time. In theory, this should provide rapid updating of Google's index to include the latest content on your website.
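The ping itself is just an HTTP GET with the sitemap's address passed as a URL-encoded query parameter. Here is a minimal sketch in Python. The endpoint `PING_ENDPOINT` is a placeholder and the function names are hypothetical, so substitute the ping address that the search engine currently documents.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder endpoint (an assumption for illustration, not a real address).
# Replace it with the ping URL documented by the search engine.
PING_ENDPOINT = "https://searchengine.example/ping"


def build_ping_url(sitemap_url, endpoint=PING_ENDPOINT):
    """Return the ping URL with the sitemap address URL-encoded."""
    return endpoint + "?" + urlencode({"sitemap": sitemap_url})


def ping_search_engine(sitemap_url, endpoint=PING_ENDPOINT, timeout=10):
    """Send the GET request and return the HTTP status code."""
    with urlopen(build_ping_url(sitemap_url, endpoint), timeout=timeout) as resp:
        return resp.status
```

A publishing script would call `ping_search_engine("http://www.example.com/sitemap.xml")` right after it rewrites the sitemap file.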

Depending on a number of factors, Google will automatically reload your sitemap file without you specifically requesting it to do so. One factor is the content of the sitemap itself. Besides a list of URLs on your website, the sitemap file can also hold information about the date each URL was last updated and how frequently it is updated. For example, if your homepage content changes every day, you can assign a frequency of "daily" to that URL, telling the search engine it should check that page every day.
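In the sitemaps.org XML format, the last-updated date and the update frequency are the `lastmod` and `changefreq` elements of each `url` entry. The sketch below generates such a file with Python's standard library. The `build_sitemap` helper and the example.com URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (loc, lastmod, changefreq) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod        # date last updated
        ET.SubElement(url, "changefreq").text = changefreq  # e.g. "daily"
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap([
    ("http://www.example.com/", "2009-05-01", "daily"),
    ("http://www.example.com/about", "2009-01-15", "monthly"),
])
print(sitemap_xml)
```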

It should be noted that incorrect use (or "abuse") of a sitemap, such as indicating pages are new when the content hasn't changed, can cause problems if the search engine recrawls the page too many times without seeing any new data. Empirical data have shown that pages may be dropped from the search engine index under this scenario, and new pages added to this "unreliable" sitemap may be ignored or crawled more slowly.
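One way to avoid flagging unchanged pages as new is to bump the last-updated date only when the page content really differs from what was published before. The following is a rough sketch of that idea using a content hash. The `record` dictionary layout and the `refresh_lastmod` name are assumptions for illustration, not part of any sitemap tool.

```python
import hashlib


def refresh_lastmod(record, content, today):
    """Update record['lastmod'] only when the page content changed.

    record:  dict holding 'hash' and 'lastmod' for one URL (hypothetical format)
    content: current page body as bytes
    today:   date string to store if the content changed
    Returns True if lastmod was updated, False otherwise.
    """
    digest = hashlib.sha256(content).hexdigest()
    if digest == record.get("hash"):
        return False  # content unchanged: keep the old lastmod
    record["hash"] = digest
    record["lastmod"] = today
    return True
```

Pages with dynamic elements (timestamps, rotating ads) would need those stripped before hashing, otherwise every fetch would look like a change.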

It's a popularity contest

Another factor in sitemap reloading is link popularity. If a lot of websites are linking to particular pages on your website, search engine spiders will crawl those pages more often, and if the site is large, the sitemap will help prioritize which pages are crawled first.

To submit, or not to submit...

We have seen that once a sitemap is submitted and indexed by search engines, they will regularly come back and reload the sitemap looking for new URLs, whether you re-submit it or not. As your website's PageRank (on Google) and general link popularity grow, the sitemap will be reloaded more frequently without your taking any action. So do you need to submit it manually or automatically?

The answer is "it depends". Google itself warns webmasters not to resubmit sitemaps more than once per hour, probably because that's as fast as it's going to process the changes and redirect Googlebot to the URLs in the sitemap. If you are auto-submitting sitemaps more than once an hour, the "punishment" could range from the search engine ignoring the subsequent re-submits, to something more dire...but no one really knows the consequences. It would probably be safer to resubmit sitemaps on a regular schedule, but we do not have any hard data about this at this time.

When you should re-submit a sitemap

So when should you re-submit a sitemap? The obvious answer is whenever your content changes, but not more than once an hour. Google does not yet provide an API to query when it last loaded your sitemap, although you can see this data in its Webmaster Tools. If you have some very timely news that the search engine really needs to know about, then resubmit the sitemap. It may not increase the crawl rate, but it may impact which URLs are crawled first.
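The "whenever content changes, but not more than once an hour" rule can be enforced with a small throttle in the publishing script. This is a sketch only: the `ResubmitThrottle` class is hypothetical, and it keeps its state in memory, so a real script would need to persist the last submit time between runs.

```python
import time

MIN_INTERVAL_SECONDS = 3600  # at most one resubmit per hour


class ResubmitThrottle:
    """Decide whether a sitemap resubmit is allowed right now."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock        # injectable so the logic can be tested
        self.last_submit = None   # clock reading of the last allowed submit

    def should_submit(self, content_changed):
        """Return True (and record the time) if a resubmit should go out."""
        if not content_changed:
            return False          # nothing new: never resubmit
        now = self.clock()
        if self.last_submit is not None and now - self.last_submit < self.min_interval:
            return False          # too soon since the previous resubmit
        self.last_submit = now
        return True
```

Changes that arrive inside the one-hour window are not lost. They are simply picked up by the next resubmit that is allowed, or by the search engine's own periodic reload of the sitemap.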

The bottom line is that sitemaps are becoming increasingly important to search engines to help them prioritize the content they crawl. So use them, don't abuse them, and help the internet be a better place!
