Forum Moderators: Robert Charlton & goodroi
As we have been on a static system for the last 4 years and have a 3,500+ page commerce site, I am a little concerned about the transition to a dynamic system. So, I am trying to do everything I can to make it as painless as possible.
So at the very least it's a good validation tool for your site.
I have a small (<500 pages) site, all static pages, so I find the overhead of updating the sitemap not really worth the effort. It also doesn't speed up indexing or even crawling. I updated a page about 7 days ago and Google hasn't indexed it, yet MSN and Yahoo have, without the benefit of a sitemap.
Like much of what Google attempts beyond strict searching, I find the whole approach very clunky. For example, it is an XML format; why not just a CSV? Also, if you use multiple sitemaps like I do, there is no indication of when the sub-sitemaps are crawled, which is frustrating.
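For anyone who hasn't looked at the format being complained about: a sitemap is a list of `<url>` entries, of which only `<loc>` is required; the rest are optional hints. (The example.com URL and date below are placeholders, and the schema namespace has varied between Google's original version and the later sitemaps.org one.)

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/page.html</loc>
    <lastmod>2006-03-15</lastmod>
    <changefreq>monthly</changefreq>
  </url>
</urlset>
```

The "sub-sitemaps" are tied together by a separate sitemap index file, whose `<sitemap><loc>` entries point at each child sitemap; the stats appear to be reported against the index file, which may be why crawl times for the individual children are hard to see.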
I've had the sitemap since my site went online, so I can't say if it has made things better. But I'm not sure I'll continue. It is their job to crawl my site and find the pages after all.
So, the only reason this item could have sold was because of the XML sitemap.
but it keeps giving me errors
It may not be your XML. One commonly overlooked step in setup is making sure that your .htaccess file is telling the server to handle XML correctly.
While setting up a few shopping carts for friends I found this to be true. I have also heard of other developers going through this. You pore over your code again and again, and all along it was the .htaccess file.
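For the record, on an Apache host the usual fix is a one-line MIME mapping in .htaccess. This is only a sketch, assuming Apache with AllowOverride enabled; the encoding line matters only if you submit a compressed sitemap:

```apache
# Serve .xml files with an XML content type instead of the server default
AddType application/xml .xml

# Only needed if you submit a compressed sitemap.xml.gz
AddEncoding gzip .gz
```

If the server is instead configured to run .xml through a handler (SSI, a templating module, etc.), the file Googlebot fetches may not match the file you uploaded, which produces exactly this kind of mystery error.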
It hasn't helped traffic, but that's not its purpose.
Yeah, I don't think anyone here is under the impression it would. But since historically SEs have had such tough/unstable results with DB-driven sites, I was really hoping that we would be able to switch safely.
Well, at least one person has had a good experience.
By placing a Sitemap-formatted file on your web server, you enable our crawlers to find out what pages are present and which have recently changed, and to crawl your site accordingly.
This collaborative crawling system will allow our crawlers to optimize the usefulness of Google's index for users by improving its coverage and freshness.
Google uses the Sitemaps it receives to add your pages to our search index.
We don't guarantee that we'll crawl or index all of your URLs. However, we use the data in your Sitemap to learn about your site's structure, which will allow us to improve our crawler schedule and do a better job crawling your site in the future. In most cases, webmasters will benefit from Sitemap submission, and in no case will you be penalized for it.
Site verification will give you statistics about your site and information about URLs we couldn't crawl so that you can make changes, if necessary.
After reading all this and maintaining my Sitemap XML file for a couple of months, I am not able to get my most important pages into Google's index. There are no errors shown in the Sitemap statistics, yet I regularly see Googlebot spider those pages now and then.
Anyway, you can try your luck!
Yes, it is "their" job, but as webmasters, it is our job to make sure they can if we need to be in the SERPs. After all, it is an interdependent relationship.
I agree, it is a two-way thing. But you have to determine whether the overhead of keeping the sitemap up to date is worth the benefit. If a site is well linked internally it shouldn't really need a Google sitemap. The point I was making is that I'm not sure it really confers much benefit if your site is already easy to crawl. Given the delays I see with indexing, it doesn't seem worth the effort.
As for the interdependent bit, that's all well and good. But it's hard to be enthusiastic if you feel they are either ignoring the sitemap or taking too long to act upon it.
If your site is easy for Google to understand and it has no errors, you probably don't need the sitemap, but you can never be absolutely sure.
So there is no harm in having the sitemap just in case, unless you are trying to hide something and would rather not fix the errors, hoping that Google will never notice them.
Also, you can implement only the features that you believe will help Google.
For example, I have a simple list of URLs (a txt sitemap) because I have no strict schedule and cannot tell in advance how often a particular page will be updated.
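For reference, the plain-text variant really is just one fully qualified URL per line in a UTF-8 file, nothing else (the example.com URLs here are placeholders):

```
http://www.example.com/
http://www.example.com/articles/first-topic.html
http://www.example.com/articles/second-topic.html
```

Since there are no lastmod or changefreq fields, this format suits exactly the situation described: you cannot predict update frequency, so you just list what exists.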
On the other hand, when I add a new page, I add it to the sitemap, and Google seems to find it more quickly. That does not mean Google will show it in the index immediately, because that also depends on the page content, links, etc., but the sitemap does seem to help.
Also, my site gradually evolved from one topic (theme) to another. I did not include the old-topic pages in the sitemap, and again, it seems that Google now understands my site better.
The bottom line is that the sitemap might help; it is easy to implement (if you omit the features you don't need) and it does no harm (if, before adding the sitemap, you double-check your site for errors).
Vadim.
On those sites where changes and modifications are frequent (new pages created, old pages deleted), it was a total mess in the past few months. Google (and Yahoo) got totally “confused”. I suspect it might be a problem with the server; maybe it is too slow for Google's crawl. I am not familiar with the server and cannot modify the server files myself. Some of my clients host sites on cheap servers, and I am not even sure we have access to the server files.
So I removed the sitemaps at the beginning of March, and I see improvement in site indexing and in fixing supplemental results.
I have not noticed that having an XML sitemap helps a site to be crawled more often, or new pages to be indexed faster.
I find the statistics quite interesting, e.g. the info on broken links or the search phrases.
First of all, the sitemap helps get the site crawled, and yes: it always takes only a few days until a new page is in the index.
I launched a huge 70 MB PDF file, and the stats told me that the spider had recently swallowed it most of the time, but had some timeouts in the beginning. How could you know that without the sitemap?
ahmedtheking, there is/was this 2500 k-bug; your errors don't necessarily come from your XML code. Just do some research in the groups. I got very helpful answers there and have the suspicion that they came from someone inside Google.
Googlebot is highly regular and predictable at my site over the long term. When I add a new page, I can pretty much predict when Googlebot will find it. Most new pages have little PageRank and are only picked up on a complete crawl, which happens approximately twice per month.
So, there's little guesswork here or opinion or "seems like" for my very narrow goal. My logs tell me precisely when the sitemap was picked up, and how many times Googlebot came by without fetching brand new pages that were mentioned in a sitemap that was picked up earlier.
I haven't let the current experiment run an entire month, but I can say that so far the use of a sitemap did not stimulate Googlebot to pick up new/changed pages any sooner than normal.
This is one of those areas where I think Google's traditional unwillingness to share any technical details hurts them. Clearly, it's in everybody's best interests for sitemaps to get widely used -- we can't have every search engine trying to crawl every page on the Internet every day, and sitemaps could be a way to both reduce that load and increase the completeness of indexing.
But without some clues about what benefits we can predictably expect (even if they just say "this is how it works now; we might change it next month"), it's really hard to get motivated to participate. Of course, many folks will participate whether there's any benefit or not, because they take no measurements and rely on superstition/intuition for their feedback. Well, a lot of us keep pushing that elevator button too, just because it "seems" to make the car arrive sooner. Maybe that's good enough.
Sadly, Slurp has picked up all the new pages I've posted this week before Googlebot has, even though I handed Googlebot a sitemap to tell it precisely what's new.
I sure wish the Google folks would think about supplying some details about whether, or under what conditions, a sitemap should provide any particular benefit.
Anyway, I downloaded some software to make sitemaps, made mine, and uploaded them.
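If you'd rather not trust downloaded software, a sitemap for a static site is easy to roll yourself. Here is a minimal sketch in Python; the directory layout, base URL, and the .html-only filter are all assumptions, and a real site with rewritten URLs would need its own mapping from files to public URLs:

```python
# Hypothetical sketch: build a simple XML sitemap from a folder of static
# HTML files. The base URL and folder layout are made-up examples.
import os
from datetime import datetime, timezone
from xml.sax.saxutils import escape


def build_sitemap(root_dir, base_url):
    """Walk root_dir and emit one <url> entry per .html file found."""
    entries = []
    for dirpath, _dirs, files in os.walk(root_dir):
        for name in sorted(files):
            if not name.endswith(".html"):
                continue  # skip images, CSS, and anything else
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root_dir).replace(os.sep, "/")
            # Use the file's modification time as a rough <lastmod> hint
            mtime = datetime.fromtimestamp(os.path.getmtime(path), timezone.utc)
            entries.append(
                "  <url>\n"
                f"    <loc>{escape(base_url + '/' + rel)}</loc>\n"
                f"    <lastmod>{mtime.date().isoformat()}</lastmod>\n"
                "  </url>"
            )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(entries)
        + "\n</urlset>\n"
    )


if __name__ == "__main__":
    # Example: write the sitemap for a local document root
    print(build_sitemap("/var/www/html", "http://www.example.com"))
```

Rerunning it whenever pages are added or removed (e.g. from a cron job or the publish step) keeps the sitemap from going stale, which is the failure mode several posters above describe.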
That was November last year. Since then my monthly traffic has moved from about 10 gig to 28 gig. As my content has only minimally improved, I put it down to the massive spidering and indexing of the pages.
Both my HTML and PHP pages are indexed; Google hits on my PHP pages are now heaps more than in the past.
The Google Sitemaps feature also shows you the keywords you are indexed for and the placement of your pages in that index. I have numerous pages now in the top 10 hits for their keywords.
My income has also doubled (to modest levels). I have nothing but positive comments for this system, except that I had to move to a bigger package to cover the bandwidth usage; this was covered by the increase in income.
I haven't let the current experiment run an entire month, but I can say that so far the use of a sitemap did not stimulate Googlebot to pick up new/changed pages any sooner than normal. This is one of those areas where I think Google's traditional unwillingness to share any technical details hurts them.
Sadly, Slurp has picked up all the new pages I've posted this week before Googlebot has, even though I handed Googlebot a sitemap to tell it precisely what's new.
I sure wish the Google folks would think about supplying some details about whether, or under what conditions, a sitemap should provide any particular benefit.
I think this is it in a nutshell. Google are poor communicators and, increasingly, are hiding behind 'Beta' software to push out technology quickly without really bothering to think it through.
When I first found out about sitemaps I thought it was a great idea, even though my site was crawled well without it. Naturally I expected it to confer some benefit, mainly in terms of speed. This has not materialized and, like others, I'm finding MSN and Yahoo picking up new material in a few days without the overhead of a sitemap.
However, there seems to be a real reluctance to criticize any aspect of Google's operation. I don't know why this is, since they do so little to instill loyalty.
Since then my monthly traffic has moved from about 10 gig to 28 gig.
Sounds like you were having trouble getting crawled correctly, perhaps due to failing to follow the simple guidelines Google supplies for getting crawled correctly.
I have numerous pages now in the top 10 hits for their keywords.
If half your content had never been crawled correctly, and then suddenly got crawled and indexed, it would sure be surprising if that hurt your rankings.
My income has also doubled (to modest levels).
I bought a new pair of jeans and then my IRS refund came. I don't know why people don't buy new jeans more often.
Superstition ain't the way.