Adsense bot not implementing If-Modified-Since

The other G-Bots can do it, but not the Media bot

         

AlexK

3:31 pm on Sep 12, 2005 (gmt 0)

GoogleGuy advised webmasters to implement If-Modified-Since [webmasterworld.com] back in October 2002. I finally got the point of this at the beginning of September this year (I am obviously a bit slow) and got it going on the PHP pages on my site.

[For those in my former position: the idea is that you send a Last-Modified response header with the page, and the bot sends an If-Modified-Since request header on a subsequent hit. That allows your server to answer with a body-less 304 Not Modified response if the page hasn't changed, saving vast amounts of bandwidth. Do it today!]
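For anyone who wants to see the exchange spelled out, here is a rough sketch of the server-side decision. My own pages are PHP; this is Python purely for illustration, and the function name `conditional_response` is made up for the example. It assumes you already know when the page last changed.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(last_modified, if_modified_since=None):
    """Return (status, headers) for a page last changed at `last_modified`.

    `last_modified` is a timezone-aware datetime; `if_modified_since` is the
    raw request header value, or None if the client did not send one.
    """
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: ignore it and send the full page
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers  # unchanged: send headers only, no body

    return 200, headers  # first visit, or the page changed: send the full page
```

A bot that echoes back the Last-Modified value it was given gets a 304; a bot that never sends the header (the complaint in this thread) always gets the full 200.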

It has worked very well.

The G_Bot (and M_Bot) is not the only one to implement it; so do Inktomi Slurp and even the Baiduspider, but not msnbot, ia_archiver, BecomeBot, NutchCVS, SurveyBot, etc. (boo!). It is very surprising, then, to see the G-Media-Bot lumped in with these third-rank spiders.

Also, this wretched bot has now taken 9,575 pages in September (humans have taken only 76,677 pages so far), which is roughly 1 bot page for every 8 pages requested by humans.

Here is a handful of examples from the most recent page of my access-log:

66.249.66.48 - - [12/Sep/2005:14:44:52 +0100] "GET /mfcs.php?mid=116&nid=7490 HTTP/1.1" 200 7084 "-" "Mediapartners-Google/2.1" In:28306 Out:7084:25pct.
66.249.66.48 - - [12/Sep/2005:14:44:56 +0100] "GET /mfcs.php?mid=116&nid=7490 HTTP/1.1" 200 7084 "-" "Mediapartners-Google/2.1" In:28306 Out:7084:25pct.
66.249.66.48 - - [12/Sep/2005:14:49:19 +0100] "GET /mfcs.php?mid=276 HTTP/1.1" 200 7849 "-" "Mediapartners-Google/2.1" In:36353 Out:7849:22pct.
66.249.66.48 - - [12/Sep/2005:14:49:22 +0100] "GET /mfcs.php?mid=276 HTTP/1.1" 200 7849 "-" "Mediapartners-Google/2.1" In:36353 Out:7849:22pct.
66.249.66.48 - - [12/Sep/2005:15:02:13 +0100] "GET /search.php?id=PCI%5CAZT_4002 HTTP/1.1" 200 7224 "-" "Mediapartners-Google/2.1" In:28373 Out:7224:25pct.
66.249.66.48 - - [12/Sep/2005:15:02:17 +0100] "GET /search.php?id=PCI%5CAZT_4002 HTTP/1.1" 200 7224 "-" "Mediapartners-Google/2.1" In:28373 Out:7224:25pct.
66.249.66.48 - - [12/Sep/2005:15:06:48 +0100] "GET /mfcs.php?mid=6&nid=14451 HTTP/1.1" 200 7348 "-" "Mediapartners-Google/2.1" In:29577 Out:7348:25pct.
66.249.66.48 - - [12/Sep/2005:15:06:52 +0100] "GET /mfcs.php?mid=6&nid=14451 HTTP/1.1" 200 7348 "-" "Mediapartners-Google/2.1" In:29577 Out:7348:25pct.
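If you want to check your own logs for the same pattern, something along these lines should do it. It is only a sketch: it assumes the combined log format shown above (the trailing In:/Out: compression fields are ignored), and the names `LOG_RE`, `SAMPLE` and `summarise` are invented for the example. The sample data is the first four lines from my log above.

```python
import re
from collections import Counter
from datetime import datetime

# Combined log format; anything after the user-agent field is ignored.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

SAMPLE = [
    '66.249.66.48 - - [12/Sep/2005:14:44:52 +0100] "GET /mfcs.php?mid=116&nid=7490 HTTP/1.1" 200 7084 "-" "Mediapartners-Google/2.1" In:28306 Out:7084:25pct.',
    '66.249.66.48 - - [12/Sep/2005:14:44:56 +0100] "GET /mfcs.php?mid=116&nid=7490 HTTP/1.1" 200 7084 "-" "Mediapartners-Google/2.1" In:28306 Out:7084:25pct.',
    '66.249.66.48 - - [12/Sep/2005:14:49:19 +0100] "GET /mfcs.php?mid=276 HTTP/1.1" 200 7849 "-" "Mediapartners-Google/2.1" In:36353 Out:7849:22pct.',
    '66.249.66.48 - - [12/Sep/2005:14:49:22 +0100] "GET /mfcs.php?mid=276 HTTP/1.1" 200 7849 "-" "Mediapartners-Google/2.1" In:36353 Out:7849:22pct.',
]


def summarise(lines, window_seconds=10):
    """Count responses per (agent, status) and quick repeat fetches of a URL."""
    statuses = Counter()
    repeats = 0
    last_seen = {}  # (agent, url) -> time of the previous request
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        key = (m["agent"], m["url"])
        statuses[(m["agent"], int(m["status"]))] += 1
        prev = last_seen.get(key)
        if prev is not None and (when - prev).total_seconds() <= window_seconds:
            repeats += 1  # same agent fetched the same URL again within the window
        last_seen[key] = when
    return statuses, repeats


statuses, repeats = summarise(SAMPLE)
```

On the four sample lines this counts four 200 responses for Mediapartners-Google/2.1, no 304s, and two repeat fetches within a few seconds of the first.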

zCat

6:03 pm on Sep 12, 2005 (gmt 0)

I'm happy to see the Mediabot again and again; I just wish it would take each page once instead of up to six times...

FWIW the "Mozillabot" (Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)) doesn't seem to implement If-Modified-Since either.

AlexK

6:23 pm on Sep 13, 2005 (gmt 0)

zCat:
FWIW the "Mozillabot" ... doesn't seem to implement If-Modified-Since either

Good Lord! Just checked, and there are no 304s from this bot on my site, either (a month's worth of rotated logs). What on earth is G doing?

The M_Bot is a particular bugbear of mine. It was hitting my site at up to 3 times/sec [webmasterworld.com] and, when I asked them to slow it down, they virtually switched it off.

DamonHD

7:42 pm on Sep 13, 2005 (gmt 0)

Hi,

For me the MediaBot seems to come and respider a page right after each visitor to that page, mainly because the detailed ad layout changes each time, IMHO.

Thus the MediaBot probably does not and cannot care about IMS (If-Modified-Since). It cares only whether it believes the page content or layout has changed and needs to be respidered, and your answer to an IMS request could be wrong (or fibbing) for cloaking or other black-hat reasons.

All my guesses of course.

For the record I treat the MediaBot just like every other bot to limit bandwidth automatically, and it seems to do me no particular harm.

Rgds

Damon