Home / Forums Index / Google / Google SEO News and Discussion
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Strange Googlebot Spidering Activity
Why is Googlebot getting a 404
Uber_SEO
msg:765110
2:16 pm on May 6, 2005 (gmt 0)

Recently one of my sites suddenly started to perform very badly in Google. I decided to have a look at the server logs, and I'm seeing some very strange behaviour.

Googlebot turns up and requests my homepage several times during the course of a day. However, when it requests the page, an HTTP code of 404 is frequently returned. It comes back several hours later, requests the homepage again, and this time an HTTP code of 200 is returned.

There seems to be a loose correlation between the time of day and the HTTP code returned - 404s seem to be returned in the early morning (between 6 and 11), whereas the rest of the time a 200 is returned.

I don't understand why this is happening. As far as I'm aware there has been no server downtime, and even if the server had been down, it couldn't have recorded the request in the log.

I've had a look at the homepage through an HTTP viewer and a 200 is always returned. I used Firefox to view the page as Googlebot, and the page is fine.

What's even weirder is that Slurp appears to have no problems with the site. It requests the homepage during the same time periods as Googlebot, and a 200 is always returned.

Has anyone ever seen anything like this?

 

DanG
msg:765111
10:28 am on May 9, 2005 (gmt 0)

Hi

I've never seen that myself.

Very interesting though.

Dan

ncgimaker
msg:765112
10:53 am on May 9, 2005 (gmt 0)

What type of web server are you using?

Uber_SEO
msg:765113
11:25 am on May 9, 2005 (gmt 0)

IIS

g1smd
msg:765114
11:27 am on May 9, 2005 (gmt 0)

Are you sure that the requested page has an absolutely identical URL when you get a 404 served?

ncgimaker
msg:765115
11:37 am on May 9, 2005 (gmt 0)

IIS? If you can't find a rational reason for it, you might consider switching servers away from Microsoft. Recall the Opera thing they did? I wouldn't put it past MS to throw Google a few 404's.

Uber_SEO
msg:765116
12:36 pm on May 9, 2005 (gmt 0)

Yes, it definitely gets a 404 on the homepage - index.asp - which, if I copy and paste it into my browser, gives me the homepage.

wanderingmind
msg:765117
12:48 pm on May 9, 2005 (gmt 0)

Try the Poodle Predictor once and see what happens.

Switching servers, obviously, is not exactly practical.

trillianjedi
msg:765118
1:02 pm on May 9, 2005 (gmt 0)

This can be caused by the www subdomain not being set up properly, which may well be what's happening here.

You need a 301 redirect to point example.com to www.example.com (or the other way around).

One single inbound link without the "www" subdomain in the URL would create what you're seeing in the logs.

Type http://yourdomain.com in your browser and see what you get.

TJ
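[Editor's note: the canonicalization rule described above can be sketched in a few lines. This is only an illustration of the mapping the 301 redirect should perform - the host names are placeholders, and `www.example.com` is assumed to be the chosen canonical form.]

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # assumed canonical form - could equally be the bare domain
ALIASES = {"example.com"}           # hosts that should 301 to the canonical one

def canonical_redirect(url):
    """Return the 301 Location for a non-canonical URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.netloc in ALIASES:
        # Keep scheme, path, and query intact; swap only the host.
        return urlunsplit((parts.scheme, CANONICAL_HOST,
                           parts.path or "/", parts.query, parts.fragment))
    return None
```

On IIS this mapping would be configured in the server itself (or in a script on the non-www site); the snippet only shows the decision the server should make for each request.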

Uber_SEO
msg:765119
2:03 pm on May 9, 2005 (gmt 0)

Poodle Predictor worked fine on the site.

trillianjedi - I think you're spot on - browsing the site without the www gave me a 404 error. I'll sort that out and see what happens.

Thanks for your help.

trillianjedi
msg:765120
2:11 pm on May 9, 2005 (gmt 0)

browsing the site without the www gave me a 404 error

Yes, that'll be the problem then. Someone has linked to you without the "www" and googlebot is following the link.

Can't help you with a 301 in IIS I'm afraid - I suggest you head over to the Website Technology Issues forum to see if you can find out how it's done.

TJ

bcolflesh
msg:765121
2:13 pm on May 9, 2005 (gmt 0)

[webmasterworld.com...]

trillianjedi
msg:765122
2:25 pm on May 9, 2005 (gmt 0)

There you go - well hunted bcolflesh ;-)

I should just add that this is probably not the reason your site is performing badly in G, although it certainly won't do any harm to fix it and you definitely should - many of your repeat visitors will type in the URL without the WWW and think you've vanished.

Check your own internal link structure while you're about it. Correct any internal links you have to point to the "main" domain (either with or without the WWW sub - whatever you decide to do).

TJ

g1smd
msg:765123
2:47 pm on May 9, 2005 (gmt 0)

If you serve both non-www and www then you have duplicate content. Google will attach some pages to non-www and others to www each time, and these will vary on a random basis. The other version may appear as a URL-only listing or may not appear at all.

PR passed from www.domain.com/page1.html to www.domain.com/page2.html will be "lost" if for page 2 it is only domain.com/page2.html that is listed in the SERPs.

Use a 301 redirect to fix this. It will help a lot.

Additionally, when you link to an index file inside a folder, make sure that you use only the folder name followed by a trailing / on the URL. Do not include the actual filename of the index file in the link.
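[Editor's note: g1smd's advice about index files can be expressed as a link-normalization rule. A minimal sketch - the set of index filenames is an assumption and should match your server's configured defaults:]

```python
# Assumed default-document names; adjust to match the server's configuration.
INDEX_FILES = {"index.asp", "index.html", "index.htm", "default.asp"}

def normalize_internal_link(path):
    """Rewrite links that name the index file so they point at the folder instead."""
    segments = path.split("/")
    if segments and segments[-1].lower() in INDEX_FILES:
        segments[-1] = ""  # drop the filename, keep the trailing slash
    return "/".join(segments)
```

Run over a site's internal links, this turns /folder/index.asp into /folder/, so every page is linked under exactly one URL.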


© Webmaster World 1996-2014 all rights reserved