
Forum Moderators: Robert Charlton & aakk9999 & andy langton & goodroi


Googlebot/2.1 and Pagination

     
7:43 pm on Oct 24, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Jan 19, 2004
posts:562
votes: 0


Going nuts with this one. I have a MySQL database-driven site whose pages list products, with pagination links like this at the bottom:
¦1-30¦31-60¦61-90¦Next¦

The server log shows Googlebot requesting URLs like /category/30/30/widget.html or /category/1621/1650/widgets.html. Google is requesting lots of invalid URLs, and the log records a 200 for each one.

The problem is that I can't replicate this in a browser, or even using Lynx on my SUSE machine. It's a fairly new site, and Yahoo and MSN aren't spidering it, so I can't compare against them.

3:19 am on Oct 25, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


"I can't replicate this in the browser"

Do you mean the server is returning a 404 to your browser, but the logs say it returns 200 to googlebot?

2:16 pm on Oct 28, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Jan 19, 2004
posts:562
votes: 0


tedster: Sorry, I didn't see your response. I don't get email notifications even though I have that option checked.

What I am seeing is /category/30/30/widget.html in the server log for Googlebot with a 200 code. This URL does not exist in my code, nor can I make the site produce it.

I may have solved the problem with a better-informed approach to pagination after digging deeper into MySQL. My links are now like 1¦2¦3¦, showing page numbers the way Google does rather than result ranges as before. Larry Ullman's book clarified some things for me. Now to wait and see how G likes it.
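
For anyone following along, here is a minimal sketch of the page-number approach. It is written in Python purely for illustration (the site's actual language is not stated in this thread), and the 30-per-page size is an assumption taken from the 1-30 / 31-60 ranges above. The function names are hypothetical. The point is that an out-of-range page number is rejected rather than quietly rendered.

```python
import math

PER_PAGE = 30  # assumed page size, taken from the 1-30 / 31-60 ranges above


def last_page(total_rows, per_page=PER_PAGE):
    """Number of the last page that actually contains rows (at least 1)."""
    return max(1, math.ceil(total_rows / per_page))


def page_to_limit(page, total_rows, per_page=PER_PAGE):
    """Map a 1-based page number to (offset, row_count) for a MySQL
    `LIMIT offset, row_count` clause. Returns None when the page is out
    of range, so the caller can answer with a 404 instead of a 200."""
    if page < 1 or page > last_page(total_rows, per_page):
        return None
    return (page - 1) * per_page, per_page


def page_links(total_rows, per_page=PER_PAGE):
    """Page numbers to show at the bottom of a listing: 1, 2, 3, ..."""
    return list(range(1, last_page(total_rows, per_page) + 1))
```

With 95 products this gives four pages, and a request for page 5 returns None rather than an empty listing.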

3:43 pm on Oct 28, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 19, 2003
posts:804
votes: 0


"What I am seeing is /category/30/30/widget.html in the server log for the Googlebot with a 200 code. This URL does not exist in my code or can I make it produce it."

You probably have another issue as well. If those pages are returning a 200, then your server is probably serving a _custom error page_ for pages that don't exist without sending the proper 404 status in the response header.

That _could_ be (and likely is) a major problem for your site.
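
To illustrate the difference, here is a small sketch of a handler that still shows a friendly custom error page but sends a 404 status with it. It uses Python's WSGI calling convention only as an example; KNOWN_PATHS, app, and status_for are hypothetical names, and your own setup (CMS or server configuration) will look different.

```python
# Hypothetical set of URLs that really exist on the site.
KNOWN_PATHS = {"/category/1/widgets.html"}


def app(environ, start_response):
    """Tiny WSGI app: real pages get 200, everything else gets a custom
    error page that is sent together with a 404 status line."""
    path = environ.get("PATH_INFO", "")
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if path in KNOWN_PATHS:
        start_response("200 OK", headers)
        return [b"<h1>Widgets</h1>"]
    # The body can be as friendly as you like; the status is what matters
    # to a crawler.
    start_response("404 Not Found", headers)
    return [b"<h1>Sorry, that page does not exist</h1>"]


def status_for(path):
    """Call the app directly and return (status line, body) for a path."""
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], body
```

Calling status_for("/category/30/30/widget.html") here yields a "404 Not Found" status along with the error page body, which is what a crawler needs to see for a URL that does not exist.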

4:04 pm on Oct 28, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Jan 19, 2004
posts:562
votes: 0


"GET /russ/cat2/30/30/gzhel.html HTTP/1.1" 200 38657 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

It looks like the bot loaded 38657 bytes. Do you think that could be the custom error page?
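
One rough way to check is to pull the path, status, and byte count out of each Googlebot log line and group the 200 responses by size. The sketch below assumes log lines shaped like the one quoted above; parse_request and paths_by_size are hypothetical helper names. Many different URLs sharing one byte count is only a hint of a shared error page, not proof.

```python
import re
from collections import defaultdict

# The log line quoted above, used as sample input.
SAMPLE = ('"GET /russ/cat2/30/30/gzhel.html HTTP/1.1" 200 38657 "-" '
          '"Mozilla/5.0 (compatible; Googlebot/2.1; '
          '+http://www.google.com/bot.html)"')

# Matches: "METHOD path protocol" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_request(line):
    """Return a dict of the request fields, or None if the line does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = None if record["size"] == "-" else int(record["size"])
    return record


def paths_by_size(lines, agent_substring="Googlebot"):
    """Group the paths of 200 responses served to a given agent by byte count."""
    groups = defaultdict(set)
    for line in lines:
        record = parse_request(line)
        if record and record["status"] == 200 and agent_substring in record["agent"]:
            groups[record["size"]].add(record["path"])
    return groups
```

Feeding the whole log through paths_by_size and looking for a size that many unrelated URLs share would show quickly whether 38657 bytes is one page being served over and over.
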
4:25 pm on Oct 28, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 19, 2003
posts:804
votes: 0


It sure looks like it could be one. Now the question is where it comes from: your content management system or the server configuration.