I had made mine the old-fashioned way by using notepad and pasting my URLs into it. I still had all the problems that I've seen others report here.
Google should provide a generator for those who want one.
When I first looked at it, I could not understand why they were plugging something and then telling you to go off and download software from God knows who!
Given their resources it should simply be 'click here to create your site map'
Half a job in my opinion!
|Given their resources it should simply be 'click here to create your site map' |
The reason for using a sitemap is to tell Google what to crawl and where to find it. If you don't think Google can crawl your site adequately without help, why would you trust Google to crawl your site for a sitemap?
[edited by: europeforvisitors at 5:50 pm (utc) on Aug. 6, 2006]
The vast majority of my site is dynamically generated, so I wrote my own sitemap generator for all the dynamic pages and then hand-coded a sitemap for the few static pages. I followed Google's examples of how sitemaps should look and have had ZERO problems.
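For anyone wanting to do the same, here is a minimal sketch of what such a hand-rolled generator can look like. This is Python, the page list and URLs are made up, and in practice the list would come from the same database that drives the dynamic pages; the XML namespace should match whichever protocol version you target:

```python
# Minimal sketch of a hand-rolled sitemap generator.
# The pages passed in are hypothetical examples.
from xml.sax.saxutils import escape

def build_sitemap(pages):
    """pages: iterable of (url, lastmod) tuples -> sitemap XML string."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
    for url, lastmod in pages:
        lines.append('  <url>')
        lines.append('    <loc>%s</loc>' % escape(url))      # escape &, <, >
        lines.append('    <lastmod>%s</lastmod>' % lastmod)
        lines.append('  </url>')
    lines.append('</urlset>')
    return '\n'.join(lines)

xml = build_sitemap([('http://www.example.com/item?id=1', '2006-08-01'),
                     ('http://www.example.com/about.html', '2006-07-01')])
```

The static pages can simply be appended to the same list, which keeps everything in one file.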
So you might be onto something...
Same for me as for jay5r.
Stating the obvious sounds absurd in the first place, but it has often enough caused revolutions in the long run.
One thing I noted: if your HTML coding is crappy or "cowboy code", these automated programs will miss links.
I wouldn't blame them for that - but my sites are plain vanilla HTML with positively beautiful site navigation ;)
|One thing I noted: if your HTML coding is crappy or "cowboy code", these automated programs will miss links. |
The same will happen with regular spiders like Googlebot. If a sitemap generator is not capable of reading the HTML soup on your site, you shouldn't expect Googlebot to index it well either. This is where the basic problem lies with sitemap generators. Sites with a clean HTML structure and clean link paths won't have as many problems with indexing.
My experience is that a good link checker capable of generating sitemaps is far better than a generator that can only generate sitemaps. A link checker is therefore the starting point for generating a sitemap.
First, run a link checker over your site until no internal link errors are found and no orphan pages exist. Orphan pages can be put in a sitemap file for Google to index, but chances are slim that they will rank, since they have no incoming links.
Second, use the link checker to check the link depth of the pages from the main source of PageRank (usually the homepage). Lack of PageRank is often the reason Googlebot doesn't index specific pages or page trees. PageRank dilutes when pages are many steps away from the PageRank source, so decreasing the number of steps from the homepage to the pages you want indexed may help. Again, you can add these pages to a sitemap file and Google might index them, but they probably won't rank.
Third, check the output of the link checker for duplicate content. Do all links display under the same type of URL, or do you see URL types you didn't know were there? Next and previous links in some forum software can generate these strange URLs, as can printer-friendly outputs. Remove these URLs, or at least make them harmless with a dynamically generated robots meta tag. I ran a link checker on a dynamically generated site with a commonly used CMS where I thought I had all URLs rewritten in the .htaccess; there were still several hundred "strange" URLs popping up from all kinds of deep pages. Rewrite these URLs until you are sure every piece of content can be accessed via only one unique URL and is only referenced in your site with that specific URL.
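One possible shape for that dynamically generated robots meta tag, sketched in Python; the duplicate-URL markers here are purely hypothetical and would need to match your own software's URL patterns:

```python
# Hypothetical sketch: pick a robots meta value based on the requested URL,
# so printer-friendly and other duplicate views stay out of the index.
def robots_meta_for(url):
    duplicate_markers = ('print=1', 'printable', 'sort=', 'sessionid=')
    if any(m in url.lower() for m in duplicate_markers):
        return '<meta name="robots" content="noindex,follow">'
    return '<meta name="robots" content="index,follow">'
```

The template layer would then echo the returned tag into the page head, so the duplicate views stay reachable for visitors but invisible to the index.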
Step four is the generation of the sitemap. But for many sites, after you have completed the first three steps, you don't need the sitemap file anymore, because Googlebot can find its way through the site on its own.
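The depth and orphan checks in steps one and two can be sketched as a breadth-first walk over the link graph. This is only a toy illustration with made-up URLs and an in-memory graph, where a real link checker would crawl over HTTP:

```python
from collections import deque

# Toy link graph: page -> pages it links to. A real link checker builds
# this by crawling; the URLs here are made up for illustration.
links = {
    '/': ['/products', '/about'],
    '/products': ['/products/widget', '/'],
    '/products/widget': ['/products'],
    '/about': ['/'],
}
all_pages = set(links) | {'/orphan.html'}   # known files, incl. one unlinked

def crawl_depths(start, links):
    """BFS from the homepage: minimal click depth of each reachable page."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, []):
            if nxt not in depth:
                depth[nxt] = depth[page] + 1
                queue.append(nxt)
    return depth

depths = crawl_depths('/', links)
orphans = all_pages - set(depths)              # step 1: unreachable pages
deep = [p for p, d in depths.items() if d > 2] # step 2: too far from home
```

Anything in `orphans` needs an incoming link (or a sitemap entry), and anything in `deep` is a candidate for shortening the path from the homepage.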
Xenu's Link Sleuth is a very big help
|Xenu's Link Sleuth is a very big help |
Yes, but it doesn't accurately count the path depth of a specific page from the homepage. The level it shows is the level at which it first encountered the page, not necessarily the lowest possible level. If you run the program several times, you may see different level values for a specific URL on each run. Other link checkers are better at level counting but worse in other areas.
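A toy illustration of that difference, under the assumption that the tool records the level of the first encounter while a breadth-first crawl records the minimal level (all URLs are made up):

```python
from collections import deque

# Toy graph: '/deal' sits 3 clicks away via the archive path but also
# 1 click straight from the homepage.
links = {
    '/': ['/archive', '/deal'],
    '/archive': ['/2005'],
    '/2005': ['/deal'],
}

def first_seen_levels(page, depth=0, links=links, level=None):
    """Depth-first crawl recording the level of the FIRST encounter."""
    if level is None:
        level = {}
    if page in level:          # already seen: keep the first level found
        return level
    level[page] = depth
    for nxt in links.get(page, []):
        first_seen_levels(nxt, depth + 1, links, level)
    return level

def minimal_levels(start, links=links):
    """Breadth-first crawl: guaranteed to find the lowest possible level."""
    level, queue = {start: 0}, deque([start])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, []):
            if nxt not in level:
                level[nxt] = level[page] + 1
                queue.append(nxt)
    return level
```

Here the depth-first crawl reports '/deal' at level 3 because it descends into the archive first, while the breadth-first crawl correctly reports level 1, which is why first-encounter levels can vary between runs.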
But back to the subject of the thread: it does generate a list of URLs, which can be the basis of a Google sitemap.
It's great that you guys seem to be getting great returns from the G sitemaps, but for me it's just not that great.
I launched on July 1. I have 33 backlinks in MSN, 167 in Y, and 0 in G; MSN has indexed just over 300 pages and Y 52, yet G only 1.
That's despite Googlebot coming every week right on the money - then on Aug 5 it posted that it had crawled on Aug 2!
Then it leaves the same info, nothing changed in the meantime, saying 1 URL can't be found when it's there, and a bag of other nonsense.
I have a massive site, and it took me only a few days to get listed in DMOZ. And yet, to no avail - G will only index my home page.
What's the deal? I've heard people say that they dumped their sitemap account and then G indexed their pages - how weird is that? Plus I heard G doesn't count an IBL if it's only a few weeks old.
Is this for real? That means when the next update comes, I am... out of luck then.
I BLAME GOOGLE. Why put out a system if it is not working properly and can do more harm than good? And yet guys here cuss out Microsoft. There might be a cost difference, but good hype drives up stock value, and keeping in the news with new innovative things drives up the hype. So what is the real purpose of G Sitemaps?
If your site is that new, sitemaps are not the issue; Google just does not fully list new sites. Period.
Do a search for 'sandbox', and you'll get the general idea - it really is not a sitemaps issue.
|If you don't think Google can crawl your site adequately without help, why would you trust Google to crawl your site for a sitemap? |
I would trust Google's technical know-how a great deal more than I would some unknown programmer, possibly sitting in a bedroom somewhere!
My main point is they are promoting a service that, when visited, cannot be easily used by many.
If the service requires a sitemap generator to use, then they should supply one of their own.
You don't need the brains of a Google programmer to see that!
I have helped a couple of people with their sites using automated generators. Besides the bad-code issue, there is at times an issue with session IDs as well, which is another point to bring forward.
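One way to tackle the session-ID problem is to normalize URLs before they go into a sitemap, stripping the session parameters so each page collapses to one canonical URL. A rough sketch; the parameter names here are only common examples and vary per platform:

```python
# Hypothetical sketch: strip session-id query parameters so each page
# collapses to one canonical URL before it enters the sitemap.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

SESSION_PARAMS = {'sid', 'sessionid', 'phpsessid', 'jsessionid'}

def strip_session_ids(url):
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query)
             if k.lower() not in SESSION_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Run every crawled URL through this before deduplicating the list, and the same page seen under ten session IDs becomes one sitemap entry.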
|I would trust Google's technical know-how a great deal more than I would some unknown programmer, possibly sitting in a bedroom somewhere! |
In that case, why not trust Googlebot to crawl your site without a sitemap?
Google sitemaps are simply an option for people who don't trust Googlebot to get the job done without help from an outside source.
|I launched on July 1. I have 33 backlinks in MSN, 167 in Y, and 0 in G; MSN has indexed just over 300 pages and Y 52, yet G only 1. |
Google's back link search only shows a random sample of the backlinks that Google has actually found and is using to rank the site. In other words, there is no way of knowing how many backlinks Google knows about.
|If the service requires a sitemap generator to use, then they should supply one of their own. |
The entire point of the Sitemaps project is for site owners to build and check a complete list of their site's pages and assign crawl priorities. Google providing the tool wouldn't make a difference if site owners aren't doing the checking part with the current tools.
|Google sitemaps are simply an option for people who don't trust Googlebot to get the job done without help from an outside source. |
No, it's more than that... It's a way to set page priorities (including, I would think, the implication that if a page isn't in the list, it's not important), and it's a single place for Google to check to see what's new and what's changed.
All in all it's a pretty great tool for those of us with dynamic web sites - and that's the only context I use it in. If the site rarely changes (e.g. a brochureware site), I don't know that it would be all that valuable.
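To make the priority point concrete, a single entry in such a sitemap might look like the fragment below. The URL is made up; per the sitemap protocol, priority runs from 0.0 to 1.0 (default 0.5), and lastmod/changefreq are what signal "what's new and what's changed":

```xml
<url>
  <loc>http://www.example.com/products/new-widget</loc>
  <lastmod>2006-08-05</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.8</priority>
</url>
```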
>>>>Sites with a clean HTML structure and clean link paths won't have as many problems with indexing.<<<<<
You would think that would be true, wouldn't you? Especially on a small 32 page site. But alas, it is not.
I have one particular page that I renamed over two months ago, updating all links pointing to it as well as the URL in the sitemap index file. Still, the Google index carries the old version of the page, even though I manually checked that all references to it use the new URL. Maybe it takes longer than expected for such changes to propagate into the index.
Also, the site, which is three months old, was fully indexed, and all pages were out of the supplemental index two weeks ago. But just three days ago half the pages were dropped and the vast majority went back into the supplemental index. I wonder whether this is because I use Sitemaps, or whether there is something else I am doing wrong.
I can't see how an automated sitemap creator could possibly be of much use. They work by crawling your site to find links, which is exactly what Google does; if Google can't find a page, then the sitemap creator's crawler surely won't be able to either.
The sitemap generator should be built into your website's content management system. It shouldn't be a separate piece of software that crawls your site.
Google has done a good job of creating a simple yet effective way of submitting sitemaps to Google. It's not complicated to program a sitemap creator to complement your existing website's back-end software. I can't see how an off-the-shelf package could be of any real use; it's not a one-size-fits-all problem.
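As a sketch of how simple the submission side is: after regenerating its sitemap, a CMS can notify Google via the ping endpoint described in the Sitemaps documentation. The sitemap URL below is made up; the actual notification is then a single HTTP GET of the constructed URL:

```python
# Build the Google Sitemaps ping URL for a freshly regenerated sitemap.
from urllib.parse import quote

def ping_url(sitemap_url):
    """Return the ping URL that notifies Google about the sitemap."""
    return ('http://www.google.com/webmasters/sitemaps/ping?sitemap='
            + quote(sitemap_url, safe=''))

# A CMS hook would fetch this URL (e.g. with urllib) after each rebuild:
url = ping_url('http://www.example.com/sitemap.xml')
```

Wiring this into the CMS's "publish" action means the sitemap and the notification both happen automatically whenever content changes.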