What gives me cause for slight concern though is that it fetches the same page multiple times within the same 1-10 second time window, even though it gets exactly the same page back every time. The pages generally have three code blocks, but are fetched up to four times.
Can I do anything to stop this? Surely one hit per page would be enough - it would save me and Google bandwidth and processing power.
Offhand I can't tell you how to do it though, and I apologize for the non-answer.
Anyway, I'll get you bumped back up where a good coder might see it...
You're better off spending your time on content and traffic development...
Eric
I don't think there's any way to regulate the number of hits to a certain page, at least using those methods; it's either all or nothing, and I certainly don't want to risk blocking that particular bot.
Methinks I should take a look at MediaPartner's incoming headers; maybe it's sending some kind of If-Modified-Since request and expecting a response my site isn't giving it.
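For anyone wanting to try this: the standard HTTP request header for "has this changed?" is If-Modified-Since, and the matching answer is a 304 Not Modified with no body. Below is a rough sketch of the check, assuming the bot sends that header at all (nobody here has confirmed it does). The function names `conditional_status` and `last_modified_header` are made up for illustration and are not part of any server API.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def last_modified_header(last_modified):
    """Format a UTC datetime as an HTTP date for the Last-Modified header."""
    return format_datetime(last_modified.replace(microsecond=0), usegmt=True)


def conditional_status(if_modified_since, last_modified):
    """Return 304 if the client's copy is still current, else 200.

    if_modified_since: raw request header value, or None if absent.
    last_modified: timezone-aware datetime of the page's last change.
    """
    if not if_modified_since:
        return 200  # no validator sent, so serve the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: play safe and serve the page
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates only have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200
```

The site has to send a Last-Modified header with its normal 200 responses first, otherwise a client has nothing to echo back in If-Modified-Since.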
Has anyone else experienced this kind of behavior?
66.249.00.000 - - [26/Jul/2005:03:32:59 +0200] "GET /some-page-or-other.html HTTP/1.1" 200 19649 "-" "Mediapartners-Google/2.1"
66.249.00.000 - - [26/Jul/2005:03:32:59 +0200] "GET /some-page-or-other.html HTTP/1.1" 200 19649 "-" "Mediapartners-Google/2.1"
66.249.00.000 - - [26/Jul/2005:03:32:59 +0200] "GET /some-page-or-other.html HTTP/1.1" 200 19649 "-" "Mediapartners-Google/2.1"
66.249.00.000 - - [26/Jul/2005:03:32:59 +0200] "GET /some-page-or-other.html HTTP/1.1" 200 19649 "-" "Mediapartners-Google/2.1"
(some data modified to protect innocent parties).
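To see how widespread the repeats are, a small script can count them. This is only a sketch: it assumes the log is in the combined format shown above, and the function name `repeated_fetches` and its defaults (the user-agent substring and the 10-second window) are my own choices, not anything Google documents.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches Apache "combined" log lines like the ones quoted above.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def repeated_fetches(lines, agent_substring="Mediapartners-Google",
                     window_seconds=10):
    """Map each path to the most hits seen inside one time window.

    Only paths fetched more than once within a window are returned.
    """
    hits = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or agent_substring not in match.group("agent"):
            continue  # skip malformed lines and other user agents
        stamp = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        hits[match.group("path")].append(stamp)

    result = {}
    for path, stamps in hits.items():
        stamps.sort()
        best, start = 1, 0
        # Sliding window over the sorted timestamps.
        for end in range(len(stamps)):
            while (stamps[end] - stamps[start]).total_seconds() > window_seconds:
                start += 1
            best = max(best, end - start + 1)
        if best > 1:
            result[path] = best
    return result
```

Feeding it an open log file, e.g. `repeated_fetches(open("access.log"))`, should list every page the bot hit more than once in quick succession (the file name is just a placeholder).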
I'm not losing sleep over it, but from an admin point of view I like to minimize bandwidth usage and server load.
I've not yet enough data to say that the mediapartners bot definitely makes use of the information (especially the lastmod timestamp) harvested from XML sitemaps. However, I'm speculating that it will do so after a probably short period of testing.
So if you don't provide a dynamic XML sitemap yet, it's possibly a good idea to refresh your programming skills;)
Got a nice person-written email back from Google's 'contact us' saying that my email (and the copy of the log) was being forwarded to their engineers for 'review'.
Haven't heard back from them (nor at this point do I expect to) but the bot is still hitting up the same url multiple times after it's been called.
I've gotten used to ignoring it.
That happens on the first page view(s). Once 'the bot' has made its decision about the new page's content, it puts the page on a to-crawl-every-now-and-then list. If you change the content, you rely on a regular re-crawl by Googlebot and/or the AdSense bot to adjust topic targeting of your ads. Sometimes you wait very long for this re-crawl.
That's the point where Google Sitemaps could drastically improve things. The sitemaps protocol defines a 'last modified' element (lastmod), populated with a timestamp by the site's underlying script on every content change in the database (or you do it manually on static sites).
Googlebot (and hopefully the mediapartners bot in the future) downloads the XML sitemap twice a day, harvests fresh (new or modified) content from its page list and re-crawls the updated pages.
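A dynamic sitemap along those lines doesn't take much code. The sketch below is one way to do it, under a few assumptions: the function name `build_sitemap` and the example URL are hypothetical, the page list would really come from your own database, and the namespace is the sitemaps.org 0.9 one, which may differ from the schema version Google asks for at any given time.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace of the sitemaps.org 0.9 protocol (an assumption, see above).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified) tuples.

    last_modified must be a timezone-aware datetime; it is written
    to <lastmod> in UTC using the W3C datetime format.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, modified in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        stamp = modified.astimezone(timezone.utc)
        ET.SubElement(url, "lastmod").text = stamp.strftime(
            "%Y-%m-%dT%H:%M:%S+00:00"
        )
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical page list; a real site would query its content table.
    print(build_sitemap([
        ("http://www.example.com/some-page-or-other.html",
         datetime(2005, 7, 26, 1, 32, 59, tzinfo=timezone.utc)),
    ]))
```

The important part is that lastmod only changes when the content really changes; stamping every page with the current time on each request would tell the crawler nothing.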
If that's not just a mistake in assigning the user agent name on a few occasions (from my tiny set of data it seemingly isn't, but I do not have enough data to back anything up), I speculate that the AdSense bot making use of submitted content changes will improve ad targeting to a great degree. That's not a big issue with static content, but all dynamic pages would profit.
<speculation>I guess possible AdSense improvements were part of Google's sitemap master plan from the very beginning. Perhaps AdSense issues were even the project's 'detonator'.</speculation>