Weird Googlebot Entries

What does this mean?

         

webdude

3:49 pm on Jan 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



One of my sites got hit pretty hard by Googlebot last night and I have some very strange entries in my log.

The site is mostly dynamic, so it passes values in search arguments. The main argument is called maincategory, which is passed to the database to return products.

Googlebot hit all my product pages, but it appears to have appended extra search arguments to the requests.

Example
Normal searches would look like this...

GET /category.xml maincategory=20
GET /category.xml maincategory=19
GET /category.xml maincategory=18
GET /category.xml maincategory=17

But the Googlebot hits look like this...

GET /category.xml maincategory=20&function=post
GET /category.xml maincategory=19&function=register
GET /category.xml maincategory=18&function=login
GET /category.xml maincategory=17&function=createpoll
GET /category.xml maincategory=16&function=reminder
GET /category.xml maincategory=20&function=recent
GET /category.xml maincategory=19&function=mlall
GET /category.xml maincategory=18&function=login
GET /category.xml maincategory=17&function=createpoll
GET /category.xml maincategory=16&function=reminder
GET /category.xml maincategory=20&function=post
GET /category.xml maincategory=19&function=register
GET /category.xml maincategory=18&function=login
GET /category.xml maincategory=17&function=createpoll
GET /category.xml maincategory=16&function=reminder
GET /category.xml maincategory=20&function=recent
GET /category.xml maincategory=19&function=mlall
GET /category.xml maincategory=18&function=login
GET /category.xml maincategory=17&function=createpoll
GET /category.xml maincategory=16&function=reminder
GET /category.xml maincategory=16&function=reminder
GET /category.xml maincategory=20&function=post
GET /category.xml maincategory=19&function=register
GET /category.xml maincategory=18&function=login
GET /category.xml maincategory=17&function=createpoll
GET /category.xml maincategory=16&function=reminder
GET /category.xml maincategory=20&function=recent
GET /category.xml maincategory=19&function=mlall
GET /category.xml maincategory=18&function=login
GET /category.xml maincategory=17&function=createpoll
GET /category.xml maincategory=16&function=reminder

And it goes on for quite a while, repeating the same maincategories but appending the function= part.

I did global searches on my site for some of the arguments being passed, and most of these terms are not on the site anywhere.

Does anyone know what this is all about?

Thanks

webdude

7:06 pm on Jan 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



After reading the posts on blogging, I am getting worried that someone is targeting this site. Googlebot is hitting all these pages 3 to 4 times a day, repeatedly requesting the same pages with these different arguments. This started happening on Saturday and has happened twice so far today. None of these arguments appear on the site anywhere as links or data calls.

Any ideas?

Thanks

Kwix

10:10 pm on Jan 20, 2004 (gmt 0)

10+ Year Member



Are you 100% positive that it is truly Googlebot hitting the site? Do the IP addresses resolve back to Google? The only reason I ask is that I have recently been hit by some pretty clever spambots (email harvesters, I think). These usually grab an Agent_Referrer and change it every minute or so.

If you are sure it is Googlebot, try doing a link:http://mysite.com/category.xml maincategory=20&function=post search in both Google and Alltheweb to see if anybody has indeed blogged you as such.

webdude

10:40 pm on Jan 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



All the IPs hitting the site resolve to googlebot.com. I did a search for the link and came up with no results. Each time Googlebot hits, it is from a different IP, but they all resolve to googlebot.com.

Got to go for today - I will pick this thread up tomorrow after checking my logs --- Very Strange

amoore

11:31 pm on Jan 20, 2004 (gmt 0)

10+ Year Member



I don't think I've ever seen Googlebot coming from an address with reverse DNS that points to "googlebot.com". I suspect this is not actually Google. If you'd like to make sure, you can check the addresses with ARIN to see that they are actually assigned to Google. Anyone who controls the reverse DNS for their IP addresses can make them resolve back to any hostname they like.
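Since a reverse lookup on its own proves little, one common extra step is a forward-confirmed check: resolve the IP to a hostname, then resolve that hostname back and see whether the original IP is among the answers. The sketch below is only an illustration and not something anyone in this thread ran. The function name, the injectable resolver parameters and the allowed suffixes (.googlebot.com and .google.com) are assumptions. It complements the ARIN lookup rather than replacing it.

```python
import socket
from typing import Callable, List, Sequence


def default_reverse(ip: str) -> str:
    # PTR lookup: IP address -> primary hostname
    return socket.gethostbyaddr(ip)[0]


def default_forward(host: str) -> List[str]:
    # A lookup: hostname -> list of IPv4 addresses
    return socket.gethostbyname_ex(host)[2]


def is_forward_confirmed(
    ip: str,
    allowed_suffixes: Sequence[str] = (".googlebot.com", ".google.com"),  # assumed list
    reverse: Callable[[str], str] = default_reverse,
    forward: Callable[[str], List[str]] = default_forward,
) -> bool:
    """Return True only if ip -> hostname -> ip round-trips under an allowed suffix."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        # No PTR record or lookup failure: treat as unverified
        return False
    if not host.endswith(tuple(allowed_suffixes)):
        return False
    try:
        # The claimed hostname must resolve back to the same IP
        return ip in forward(host)
    except OSError:
        return False
```

Calling is_forward_confirmed("64.68.84.51") would use live DNS. Passing your own reverse and forward callables lets you try the logic without touching the network.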

webdude

1:51 pm on Jan 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I did an ARIN lookup and the IPs are from Google Inc. Looked at the logs again today and so far everything looks normal to me. The only difference is that it appears that googlebot is deep crawling at least once a day, sometimes twice, for the past week. I have never seen so much activity from the bot before.

I will keep you posted if the problem arises again.

webdude

1:59 pm on Jan 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



By the way...

Just checked my ranking for this site and it has now popped up to #12 for my main money phrase. This is after totally disappearing after Florida, coming back to #20, then disappearing for the past month after intentional overoptimization (as an experiment). I backed off on the money phrase and added synonyms for related phrases. That seems to have helped, either that or all this latest Googlebot activity has pushed the site upwards ... don't know :-)

webdude

4:53 pm on Jan 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Okay,

It is starting to happen again. I wish I knew what this means. Here is a snip of my logs. I changed the IPs. These hits from googlebot are repeating over and over again like it is in some sort of loop. Some of the arguments passed include....
function=faq
function=groupcp
function=viewonline
function=index
function=groupcp
function=dailyss
function=skinmenu
function=poll
function=forumlatest
function=users_online
function=account
function=register
function=mlall
function=reminder

This usually happens right after a deep crawl, which has been happening more than ever before (2 to 3 times daily). My index page has dropped from #10 to over 1000, but this could be due to Austin.

None of these arguments are related to the site. No arguments like these are passed in any of my code. It happens about 2 to 3 times per day. I have searched google to see if these particular urls are coming from somewhere, but I can't find them. Any insight would be helpful.

2004-01-25 20:32:30 64.68.84.51 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=11&function=search 200 0 8930 218 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:31 64.68.84.42 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=12&function=groupcp 200 0 147 219 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:32 64.68.84.16 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=19&function=faq 200 0 147 215 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:34 64.68.84.43 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=19&function=groupcp 200 0 147 219 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:38 64.68.84.76 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=20&function=search 200 0 12988 218 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:38 64.68.84.149 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=11&function=groupcp 200 0 147 219 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:39 64.68.84.134 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=11&function=index 200 0 147 217 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:41 64.68.84.46 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=13&function=search 200 0 830 218 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:43 64.68.84.76 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=14&function=faq 200 0 147 215 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:44 64.68.84.134 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=14&function=viewonline 200 0 147 222 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:44 64.68.84.131 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=16&function=index 200 0 147 217 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:46 64.68.84.51 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=17&function=faq 200 0 147 215 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:47 64.68.84.42 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=18&function=viewonline 200 0 147 222 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:47 64.68.84.149 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=20&function=viewonline 200 0 147 222 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:49 64.68.84.134 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=20&function=groupcp 200 0 147 219 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:50 64.68.84.42 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=19&function=search 200 0 830 218 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:50 64.68.84.134 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=18&function=index 200 0 147 217 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:52 64.68.84.42 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=16&function=viewonline 200 0 147 222 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:52 64.68.84.46 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=12&function=index 200 0 147 217 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:54 64.68.84.144 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=14&function=groupcp 200 0 147 219 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:54 64.68.84.46 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=15&function=faq 200 0 147 215 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:55 64.68.84.144 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=15&function=index 200 0 147 217 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:55 64.68.84.46 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=15&function=viewonline 200 0 147 222 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:57 64.68.84.46 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=16&function=groupcp 200 0 147 219 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:57 64.68.84.46 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=18&function=faq 200 0 147 215 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:59 64.68.84.147 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=17&function=groupcp 200 0 147 219 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:32:59 64.68.84.134 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=14&function=index 200 0 147 217 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:00 64.68.84.39 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=18&function=search 200 0 6254 218 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:00 64.68.84.160 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=12&function=search 200 0 14373 218 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:02 64.68.84.137 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=12&function=viewonline 200 0 147 222 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:02 64.68.84.134 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=13&function=index 200 0 147 217 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:04 64.68.84.149 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=13&function=viewonline 200 0 147 222 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:04 64.68.84.42 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=20&function=index 200 0 147 217 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:05 64.68.84.153 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=17&function=search 200 0 10280 218 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:05 64.68.84.144 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=17&function=index 200 0 147 217 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:07 64.68.84.39 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=16&function=faq 200 0 147 215 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:07 64.68.84.132 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=11&function=viewonline 200 0 147 222 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:08 64.68.84.137 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=11&function=faq 200 0 147 215 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:08 64.68.84.134 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=12&function=faq 200 0 147 215 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:10 64.68.84.39 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=15&function=search 200 0 15710 218 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:10 64.68.84.15 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=16&function=search 200 0 8917 218 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:11 64.68.84.46 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=18&function=groupcp 200 0 147 219 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:13 64.68.84.15 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=19&function=index 200 0 147 217 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:14 64.68.84.16 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=14&function=search 200 0 830 218 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:15 64.68.84.51 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=13&function=groupcp 200 0 147 219 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:17 64.68.84.149 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=13&function=faq 200 0 147 215 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:18 64.68.84.39 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=20&function=faq 200 0 147 215 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:20 64.68.84.51 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=19&function=viewonline 200 0 147 222 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
2004-01-25 20:33:21 64.68.84.49 - W3SVC29 111.111.111.111 80 GET /category.taf maincategory=17&function=viewonline 200 0 147 222 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
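To get a feel for a snippet like the one above without reading it line by line, a small script can group the hits by their function= value and collect the response sizes seen for each. This is a rough sketch only. The field positions (method in the 8th column, query string in the 10th, response size in the 13th) are assumptions read off the pasted lines, not a confirmed log format for this server, and the function name is made up for illustration.

```python
from collections import defaultdict
from urllib.parse import parse_qs


def sizes_by_function(lines):
    """Map each function= value to the set of response sizes logged for it."""
    sizes = defaultdict(set)
    for line in lines:
        fields = line.split()
        # Skip anything that does not look like a GET entry in the assumed layout
        if len(fields) < 14 or fields[7] != "GET":
            continue
        if not fields[12].isdigit():
            continue
        query = parse_qs(fields[9])
        # Requests without a function= argument are grouped under "(none)"
        func = query.get("function", ["(none)"])[0]
        sizes[func].add(int(fields[12]))
    return dict(sizes)
```

In the lines pasted above, the function=search hits come back with varying sizes while faq, groupcp, index and viewonline all log 147, so a grouping like this makes that split easy to spot. What the split means is a separate question.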

gerwin

6:19 pm on Jan 26, 2004 (gmt 0)

10+ Year Member



I don't understand what the problem is. Google is just indexing your site, it seems. Google can handle dynamic links quite well....

webdude

6:40 pm on Jan 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



None of the function=(arguments) are on my site. The maincategory=(number) is on my site.

As far as I can tell, Googlebot is either making up these arguments as it crawls and then getting stuck in a loop, or the actual links are on another site that I have not been able to find.

All the args that Google is trying to pass to my site appear to be args associated with forum software. I found similar args on other forums, but not all on one forum.

Example....

The link on your site is

mysite.com/search.php?category=lemons

But googlebot is trying to hit

mysite.com/search.php?category=lemons&function=login

So where in the heck is &function=login coming from?

It makes no sense!
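To make the comparison in that example concrete, here is a small sketch that lists the query arguments present in a requested URL but absent from the URL the site actually links to. The helper name is hypothetical and the URLs are just the made-up ones from the example.

```python
from urllib.parse import parse_qs, urlsplit


def extra_params(linked_url, requested_url):
    """Return query arguments found in requested_url but not in linked_url."""
    linked = parse_qs(urlsplit(linked_url).query, keep_blank_values=True)
    requested = parse_qs(urlsplit(requested_url).query, keep_blank_values=True)
    # Keep only the argument names the site's own link does not carry
    return {name: values for name, values in requested.items() if name not in linked}


unexpected = extra_params(
    "mysite.com/search.php?category=lemons",
    "mysite.com/search.php?category=lemons&function=login",
)
print(unexpected)
```

Run against real log entries, a check like this would single out exactly the function= arguments that do not come from the site's own links.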