
Google SEO News and Discussion Forum

    
Google Sitemaps Problems? May come from the generator used.
Maybe we're too quick to blame Google ...
Quadrille
msg:3036273 - 3:25 pm on Aug 6, 2006 (gmt 0)

There have been a lot of threads here lately discussing demonic possession and other problems associated with sitemaps, but one issue that I have not seen discussed lately is the quality of the sitemaps themselves ...

Many of us non-techie folk rely totally on 'off the shelf' sitemap generators. This may be where the problem lies.

One I've used only includes pages down to level two - rather missing the point that sitemaps are meant to help get those deep pages Googled!

Another does a beautiful job on a well-designed, professional-looking site - but often (not always) omits the last few lines of the map. Not helpful.

A third does a great job, usually. But it seems to have a minor allergy and repeatedly claimed errors that simply did not exist - very frustrating, until I used two other programs; I've had no problems since.

Clues to errors:

1. It finishes too fast - it's probably missing chunks.

2. It fails to report the number of pages you know you have (you do know, don't you?) - see the quick check sketched below.

There's no substitute for a visual check, however.

All three of these programs are listed on Google's website.
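To make clue 2 concrete, here is a minimal sketch (Python; the file name and expected page count are placeholders you would supply) that counts the <loc> entries in a generated sitemap and warns if the total doesn't match what you expect:

import sys
import xml.etree.ElementTree as ET

# Sanity check for a generated sitemap: count the <loc> entries and
# compare against the number of pages you believe the site has.
# Usage: python check_sitemap.py sitemap.xml 120
def count_urls(path):
    root = ET.parse(path).getroot()
    # Match <loc> whatever namespace the generator used.
    return sum(1 for el in root.iter()
               if el.tag == 'loc' or el.tag.endswith('}loc'))

if __name__ == '__main__':
    path, expected = sys.argv[1], int(sys.argv[2])
    found = count_urls(path)
    print('%d URLs in %s (expected %d)' % (found, path, expected))
    if found != expected:
        print('WARNING: counts differ - the generator may have missed or truncated pages')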

 

icedowl
msg:3036329 - 4:59 pm on Aug 6, 2006 (gmt 0)

I made mine the old-fashioned way, by pasting my URLs into Notepad. I still had all the problems that I've seen others report here.
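For anyone hand-rolling a file the same way, a minimal sitemap needs only a few elements. Something like the sketch below is enough (the example.com URLs are placeholders; the namespace shown is the sitemaps.org 0.9 schema - Google's original protocol used its own schema URL, so check which one your account expects):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2006-08-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.example.com/deep/page.html</loc>
    <priority>0.5</priority>
  </url>
</urlset>

Only <loc> is required; lastmod, changefreq and priority are optional hints.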

phantombookman
msg:3036347 - 5:26 pm on Aug 6, 2006 (gmt 0)

Google should provide a generator for those who want one.

When I first looked at it, I could not understand why they were plugging something and then telling you to go off and download software from God knows who!

Given their resources, it should simply be 'click here to create your site map'.
Half a job, in my opinion!

europeforvisitors
msg:3036364 - 5:49 pm on Aug 6, 2006 (gmt 0)

Given their resources, it should simply be 'click here to create your site map'.

The reason for using a sitemap is to tell Google what to crawl and where to find it. If you don't think Google can crawl your site adequately without help, why would you trust Google to crawl your site for a sitemap?

[edited by: europeforvisitors at 5:50 pm (utc) on Aug. 6, 2006]

jay5r
msg:3036365 - 5:49 pm on Aug 6, 2006 (gmt 0)

The vast majority of my site is dynamically generated, so I wrote my own sitemap generator for all the dynamic pages and then hand-coded a sitemap for the few static pages. I followed Google's examples of how sitemaps should look and have had ZERO problems.

So you might be onto something...
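A hedged sketch of that kind of home-grown generator, in Python, assuming the dynamic URLs come from a database query (the table, column and path names, and example.com, are invented for illustration):

import sqlite3

BASE = 'http://www.example.com'
STATIC_PAGES = ['/', '/about.html', '/contact.html']   # the hand-kept list

def url_entry(loc, lastmod=None, priority=None):
    # Build one <url> element, adding the optional fields only when given.
    lines = ['  <url>', '    <loc>%s</loc>' % loc]
    if lastmod:
        lines.append('    <lastmod>%s</lastmod>' % lastmod)
    if priority is not None:
        lines.append('    <priority>%.1f</priority>' % priority)
    lines.append('  </url>')
    return '\n'.join(lines)

def build_sitemap(db_path):
    entries = [url_entry(BASE + path, priority=0.8) for path in STATIC_PAGES]
    conn = sqlite3.connect(db_path)
    # 'articles', 'slug' and 'updated' are hypothetical names for illustration.
    for slug, updated in conn.execute('SELECT slug, updated FROM articles'):
        entries.append(url_entry('%s/articles/%s' % (BASE, slug),
                                 lastmod=updated, priority=0.5))
    conn.close()
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            + '\n'.join(entries) + '\n</urlset>\n')

if __name__ == '__main__':
    with open('sitemap.xml', 'w') as f:
        f.write(build_sitemap('site.db'))   # database path is an assumption

Because the map is built from the same data that builds the pages, it stays in step with the site without any crawling.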

Oliver Henniges
msg:3036378 - 6:00 pm on Aug 6, 2006 (gmt 0)

Same here as for jay5r.
Stating the obvious sounds absurd at first, but it has often enough caused revolutions in the long run.

trinorthlighting
msg:3036509 - 8:29 pm on Aug 6, 2006 (gmt 0)

One thing I've noted: if your HTML coding is crappy or "cowboy code", these automated programs will miss links.

Quadrille
msg:3036550 - 9:42 pm on Aug 6, 2006 (gmt 0)

I wouldn't blame them for that - but my sites are plain vanilla HTML with positively beautiful site navigation ;)

lammert
msg:3036653 - 11:38 pm on Aug 6, 2006 (gmt 0)

One thing I've noted: if your HTML coding is crappy or "cowboy code", these automated programs will miss links.

The same will happen with regular spiders like Googlebot. If a sitemap generator is not capable of reading the HTML soup on your site, you shouldn't expect Googlebot to index it well either. This is where the basic problem with sitemap generators lies. Sites with a clean HTML structure and clean link paths won't have that many problems with indexing.

My experience is that a good link checker capable of generating sitemaps is far better than a generator that can only generate sitemaps. A link checker is therefore the starting point for building a sitemap.

First, run a link checker over your site until no internal link errors are found and no orphan pages exist. Orphan pages can be put in a sitemap file for Google to index, but the chances that they will rank are slim because they have no incoming links.

Second, use the link checker to check the link depth of pages from the main source of PageRank (usually the homepage). Lack of PageRank is often the reason Googlebot doesn't index specific pages or page trees. PageRank dilutes when pages are many steps away from the PageRank source, so decreasing the number of steps from the homepage to the pages you want indexed may help. Again, you can add these pages to a sitemap file and Google might index them, but they probably won't rank.

Third, check the output of the link checker for duplicate content. Do all links appear under the same type of URL, or do you see URL types you didn't know were there? Next and previous links in some forum software can generate these strange URLs, as can printer-friendly versions. Remove these URLs, or at least make them harmless with a dynamically generated robots meta tag. I ran a link checker on a dynamically generated site with a commonly used CMS where I thought I had all URLs rewritten in the .htaccess; there were still several hundred "strange" URLs popping up from all kinds of deep pages. Rewrite these URLs until you are sure every piece of content can only be accessed via one unique URL and is only referenced in your site with that specific URL.

Step four is generating the sitemap itself. But for many sites, once you have completed the first three steps, you don't need the sitemap file anymore, because Googlebot can find its way through the site on its own.
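A rough, hedged sketch of that workflow in Python: crawl from the homepage, record each page's minimal click depth, note broken internal links, then write the reachable URLs to a sitemap. The start URL and the depth threshold are placeholders, and a real link checker handles redirects, robots.txt, frames and media types far more thoroughly than this does:

from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from xml.sax.saxutils import escape

START = 'http://www.example.com/'   # placeholder: your homepage

class LinkParser(HTMLParser):
    # Collect the href of every <a> tag on a page.
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href')
            if href:
                self.links.append(href)

def crawl(start):
    host = urlparse(start).netloc
    depth = {start: 0}      # minimal number of clicks from the homepage (BFS)
    broken = []
    queue = deque([start])
    while queue:
        url = queue.popleft()
        try:
            html = urlopen(url, timeout=10).read().decode('utf-8', 'replace')
        except (HTTPError, URLError) as err:
            broken.append((url, str(err)))
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urldefrag(urljoin(url, href))[0]
            if urlparse(absolute).netloc == host and absolute not in depth:
                depth[absolute] = depth[url] + 1
                queue.append(absolute)
    return depth, broken

def write_sitemap(depth, path='sitemap.xml'):
    with open(path, 'w') as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for url in sorted(depth):
            f.write('  <url><loc>%s</loc></url>\n' % escape(url))
        f.write('</urlset>\n')

if __name__ == '__main__':
    depth, broken = crawl(START)
    for url, err in broken:
        print('BROKEN:', url, err)
    deep = [u for u, d in depth.items() if d > 3]   # arbitrary threshold
    print('%d pages are more than 3 clicks from the homepage' % len(deep))
    write_sitemap(depth)

Because it is a breadth-first crawl, the recorded depth is the shortest path from the homepage, which is exactly the number step two cares about.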

trinorthlighting
msg:3036673 - 12:15 am on Aug 7, 2006 (gmt 0)

Xenu Link Sleuth is a very big help.

lammert
msg:3036688 - 12:32 am on Aug 7, 2006 (gmt 0)

Xenu Link Sleuth is a very big help.

Yes, but it doesn't accurately count the path depth of a specific page from the homepage. The level it shows is the level at which it first encountered a reference to the page, not necessarily the lowest possible level. If you run the program several times, you may see different level values for a specific URL with each run. Other link checkers are better at level counting but worse in other areas.

But back to the subject of the thread: it does generate a list of URLs which can be the basis of a Google sitemap.

JamaicanFood
msg:3036703 - 1:01 am on Aug 7, 2006 (gmt 0)

It's great that you guys seem to be getting great returns from the G sitemaps, but for me it's just not that great.

I launched on July 1. I have 33 backlinks in MSN, 167 in Y and 0 in G, plus MSN has indexed just over 300 pages and Y 52. Yet G only 1.

That's despite Googlebot coming every week right on the money, then reporting on Aug 5 that it had crawled on Aug 2!
Then it leaves the same info, nothing changed in the meantime, saying 1 URL can't be found when it's there, and one bag of nonsense.

I have a massive site and it took me only a few days to get listed in DMOZ. And yet, to no avail, G will only index my home page.

What's the deal? I've heard people say that they have dumped their sitemap account and then G indexes their pages - how weird is that? Plus I heard G does not count an IBL if it's only a few weeks old.

Is this for real? That means when the next update comes I am... out of luck then.

I BLAME GOOGLE. Why put out a system if it is not working properly and can invariably do more harm than good? And guys here cuss out Microsoft! There might be a cost difference, but good hype drives up stock value, and keeping in the news with new innovative things drives up the hype. So what is the real purpose of G sitemaps?

My thoughts....

Quadrille
msg:3036725 - 1:24 am on Aug 7, 2006 (gmt 0)

If your site is that new, sitemaps are not the issue; Google just does not fully list new sites. Period.

Do a search for 'sandbox', and you'll get the general idea - it really is not a sitemaps issue.

phantombookman
msg:3036937 - 7:16 am on Aug 7, 2006 (gmt 0)

If you don't think Google can crawl your site adequately without help, why would you trust Google to crawl your site for a sitemap?

I would trust Google's technical know-how a great deal more than I would some unknown programmer, possibly sitting in a bedroom somewhere!

My main point is that they are promoting a service that, when visited, cannot easily be used by many.
If it requires a sitemap generator to use, then they should supply one of their own.

You don't need the brains of a Google programmer to see that!

trinorthlighting
msg:3037137 - 12:41 pm on Aug 7, 2006 (gmt 0)

I have helped a couple of people with their sites using automated generators. Besides the bad-code issue, there is at times an issue with session IDs as well, which is another point to bring forward.
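One hedged way to guard against that is to strip known session parameters from URLs before they go into the sitemap. The parameter names below are common defaults; which ones your own software actually emits is an assumption you would need to verify:

from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode

# Common session-ID parameter names; adjust to whatever your software emits.
SESSION_PARAMS = {'phpsessid', 'sid', 'sessionid', 'jsessionid', 'oscsid'}

def strip_session_ids(url):
    # Drop session-ID query parameters so each page maps to one clean URL.
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in SESSION_PARAMS]
    return urlunparse(parts._replace(query=urlencode(kept)))

print(strip_session_ids('http://www.example.com/page.php?id=7&PHPSESSID=abc123'))
# -> http://www.example.com/page.php?id=7

Better still is to make sure the generator never sees session IDs at all, for example by configuring the CMS not to append them to URLs in the first place.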

europeforvisitors
msg:3037271 - 2:33 pm on Aug 7, 2006 (gmt 0)

I would trust Google's technical know-how a great deal more than I would some unknown programmer, possibly sitting in a bedroom somewhere!

In that case, why not trust Googlebot to crawl your site without a sitemap?

Google sitemaps are simply an option for people who don't trust Googlebot to get the job done without help from an outside source.

whoisgregg
msg:3037285 - 2:38 pm on Aug 7, 2006 (gmt 0)

I launched on July 1. I have 33 backlinks in MSN, 167 in Y and 0 in G, plus MSN has indexed just over 300 pages and Y 52. Yet G only 1.

Google's backlink search only shows a random sample of the backlinks that Google has actually found and is using to rank the site. In other words, there is no way of knowing how many backlinks Google knows about.

If it requires a sitemap generator to use, then they should supply one of their own.

The entire point of the sitemaps project is for site owners to build and check a complete list of their site's pages and assign crawl priorities. It wouldn't make a difference if Google provided the tool if site owners aren't doing the checking part with the current tools.

jay5r
msg:3037293 - 2:45 pm on Aug 7, 2006 (gmt 0)

Google sitemaps are simply an option for people who don't trust Googlebot to get the job done without help from an outside source.

No, it's more than that... It's a way to set page priorities (including, I would think, the implication that if it's not in the list, it's not important), and it's a single place for Google to check to see what's new and what's changed.

All in all it's a pretty great tool for those of us with dynamic web sites - and that's the only context I use it in. If the site rarely changes (e.g. a brochureware site), I don't know that it would be all that valuable.

texasville
msg:3037430 - 4:17 pm on Aug 7, 2006 (gmt 0)

>>>>Sites with a clean HTML structure and clean link paths won't have that many problems with indexing.<<<<<

You would think that would be true, wouldn't you? Especially on a small 32-page site. But alas, it is not.

apprentice
msg:3041004 - 10:00 am on Aug 10, 2006 (gmt 0)

I have one particular page that I renamed over 2 months ago; I updated all links pointing to it, as well as the URL in the sitemap index file. Still, the Google index carries the old version of the page, even though I manually checked that all references to it use the new URL. Maybe it takes longer than expected for such changes to work through the index.

Also, the site, which is 3 months old, was fully indexed and all pages were out of the supplemental index 2 weeks ago. But just 3 days ago, half the pages were dropped and the vast majority went back to the supplemental index. I wonder whether this is because I use Sitemaps, or whether there is something else I am doing wrong.

mrMister
msg:3041086 - 11:54 am on Aug 10, 2006 (gmt 0)

I can't see how an automated sitemap creator could possibly be of much use. They work by crawling your site to find links. This is what Google does; if Google can't find a page, then the sitemap creator's crawler surely won't be able to find it either.

It's my understanding that the Google Sitemaps service is for pages that are hidden from crawlers due to links embedded in Flash, Javascript or other client-side code.

The sitemap generator should be built into your web site's content management system. It shouldn't be a separate piece of software that crawls your site.

Google have done a good job of creating a simple yet effective way of submitting sitemaps. It's not complicated to program a sitemap creator to complement your existing web site's back-end software. I can't see how an off-the-shelf package could be of any real use; it's not a one-size-fits-all problem.
