

Strange Googlebot Spidering Activity

Why is Googlebot getting a 404


Uber_SEO

2:16 pm on May 6, 2005 (gmt 0)

10+ Year Member



Recently one of my sites suddenly started to perform very badly in Google. I decided to have a look at the server logs, and I'm seeing some very strange behaviour.

Googlebot turns up and requests my homepage several times during the course of a day. However, when it requests the page, an HTTP code of 404 is frequently returned. It comes back several hours later, requests the homepage again, and this time an HTTP code of 200 is returned.

There seems to be a loose correlation between the time of day and the HTTP code returned - 404s seem to come back in the early morning (between 6 and 11), whereas the rest of the time a 200 is returned.
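
This sort of hourly pattern can be pulled straight out of the raw logs with a quick script - something like the Python sketch below. The sample lines are made up, and the field positions (time, client IP, method, URI, status) are an assumption about the log format, so adjust the split for your own IIS W3C log fields:

```python
from collections import Counter

# Made-up sample entries in a simplified W3C extended log style:
# time  client-IP  method  URI  status
sample_log = """\
06:14:02 66.249.64.47 GET /index.asp 404
08:51:10 66.249.64.47 GET /index.asp 404
14:22:39 66.249.64.47 GET /index.asp 200
21:05:11 66.249.64.47 GET /index.asp 200
"""

def status_by_hour(log_text):
    """Count (hour, status) pairs so time-of-day patterns stand out."""
    counts = Counter()
    for line in log_text.splitlines():
        time_field, _ip, _method, _uri, status = line.split()
        hour = int(time_field.split(":")[0])
        counts[(hour, status)] += 1
    return counts

for (hour, status), n in sorted(status_by_hour(sample_log).items()):
    print(f"{hour:02d}:00  {status}  x{n}")
```

With real logs, a clean split - 404s clustered in one window, 200s in another - is much easier to see this way than by eyeballing the raw file.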

I don't understand why this is happening. As far as I'm aware there has been no server downtime - and even if the server had been down, it couldn't have written the 404 to the log in the first place.

I've had a look at the homepage through an HTTP viewer and a 200 is always returned. I used Firefox to view the page as Googlebot, and the page is fine.

What's even weirder, is that Slurp appears to have no problems with the site. It requests the homepage during the same time periods as Googlebot, and a 200 is always returned.

Has anyone ever seen anything like this?

DanG

10:28 am on May 9, 2005 (gmt 0)

10+ Year Member



Hi

I've never seen that myself.

Very interesting though.

Dan

ncgimaker

10:53 am on May 9, 2005 (gmt 0)

10+ Year Member



What type of web server are you using?

Uber_SEO

11:25 am on May 9, 2005 (gmt 0)

10+ Year Member



IIS

g1smd

11:27 am on May 9, 2005 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Are you sure that the requested page has an absolutely identical URL when you get a 404 served?

ncgimaker

11:37 am on May 9, 2005 (gmt 0)

10+ Year Member



IIS? If you can't find a rational reason for it, you might consider switching servers away from Microsoft. Recall the Opera thing they did? I wouldn't put it past MS to throw Google a few 404's.

Uber_SEO

12:36 pm on May 9, 2005 (gmt 0)

10+ Year Member



Yes, it definitely gets a 404 on the homepage - index.asp - which, if I copy and paste the URL into my browser, gives me the homepage.

wanderingmind

12:48 pm on May 9, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Try the Poodle Predictor once and see what happens.

Switching servers, obviously, is not exactly practical.

trillianjedi

1:02 pm on May 9, 2005 (gmt 0)

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member



This can be caused by the www subdomain not being set up properly, and that may be what's happening here.

You need a 301 redirect to point example.com to www.example.com (or the other way around).

One single inbound link without the "www" subdomain in the URL would create what you're seeing in the logs.

Type http://yourdomain.com (without the www) in your browser and see what you get.
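
The redirect rule itself is simple enough to sketch. Here it is as a plain Python function - just an illustration of the logic, since the site in this thread is classic ASP on IIS, and `www.example.com` stands in for the real canonical host:

```python
def canonical_redirect(host, path, canonical="www.example.com"):
    """Return (status, headers) for a request, enforcing one canonical host.

    Requests for any other host (e.g. the bare domain) get a 301 pointing
    at the canonical version; requests already on the canonical host pass
    through untouched.
    """
    if host.lower() != canonical:
        return "301 Moved Permanently", {"Location": f"http://{canonical}{path}"}
    return "200 OK", {}

print(canonical_redirect("example.com", "/index.asp"))
print(canonical_redirect("www.example.com", "/index.asp"))
```

The key point is the 301 (permanent) status rather than a 302: a permanent redirect tells Googlebot to fold the non-www URLs into the www ones instead of treating them as separate pages.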

TJ

Uber_SEO

2:03 pm on May 9, 2005 (gmt 0)

10+ Year Member



Poodle Predictor worked fine on the site.

trillianjedi - I think you're spot on - browsing the site without the www gave me a 404 error. I'll sort that out and see what happens.

Thanks for your help.

trillianjedi

2:11 pm on May 9, 2005 (gmt 0)

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member



browsing the site without the www gave me a 404 error

Yes, that'll be the problem then. Someone has linked to you without the "www" and googlebot is following the link.

Can't help you with a 301 in IIS I'm afraid - I suggest you head over to the Website Technology Issues forum to see if you can find out how it's done.

TJ

bcolflesh

2:13 pm on May 9, 2005 (gmt 0)

trillianjedi

2:25 pm on May 9, 2005 (gmt 0)

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member



There you go - well hunted bcolflesh ;-)

I should just add that this is probably not the reason your site is performing badly in G, although it certainly won't do any harm to fix it and you definitely should - many of your repeat visitors will type in the URL without the WWW and think you've vanished.

Check your own internal link structure while you're about it. Correct any internal links you have to point to the "main" domain (either with or without the WWW sub - whatever you decide to do).

TJ

g1smd

2:47 pm on May 9, 2005 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



If you serve the same content on both non-www and www then you have duplicate content. Google will attach some pages to the non-www version and others to the www version, and which pages land where varies on a seemingly random basis. The other version of each page may appear as a URL-only listing or may not appear at all.

PR passed from www.domain.com/page1.html to www.domain.com/page2.html will be "lost" if for page 2 it is only domain.com/page2.html that is listed in the SERPs.

Use a 301 redirect to fix this. It will help a lot.

Additionally, when you link to an index file inside a folder, make sure that you use only the folder name followed by a trailing / on the URL. Do not include the actual filename of the index file in the link.

 
