| 10:51 am on Jul 30, 2007 (gmt 0)|
You might want to do a search for Google Web Accelerator.
| 12:39 pm on Jul 30, 2007 (gmt 0)|
Is the agent really Googlebot?
That is a Google IP, and the Web Accelerator prefetches hidden links. Because of that it basically acts like a bot, which is why your defense system flagged it.
Prefetch can be turned off at the server end; it takes just a few lines in .htaccess. You should be able to find it if you do a search.
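For reference, here's a minimal sketch of the kind of .htaccess rules people use for this, assuming the prefetch requests carry the "X-moz: prefetch" request header (which the Web Accelerator is reported to send, following the Mozilla prefetch convention):

```apache
# Flag any request carrying the "X-moz: prefetch" header
SetEnvIfNoCase X-moz prefetch is_prefetch

# Deny flagged requests; everything else is served normally
Order Allow,Deny
Allow from all
Deny from env=is_prefetch
```

Prefetch requests then get a 403 instead of the page, so hidden trap links are never pulled.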
| 12:57 pm on Jul 30, 2007 (gmt 0)|
Not Googlebot user agent.
That's what I thought: it's some kind of Google proxy or gateway.
At first I assumed those were human reviewers working for Google, but since they seemed to hit the trap URLs all too often, I wasn't sure.
Do you know if it's possible at all to use that accelerator as a proxy? In other words, can a scraper use it somehow to copy content?
If not, then I'll just add the whole range to the whitelist and be done with it. If so, it gets trickier.
| 1:09 pm on Jul 30, 2007 (gmt 0)|
That's the latest user agent I'm seeing:
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; (R1 1.5); .NET CLR 1.1.4322)"
So blocking by UA won't work.
Also, it seems like all hits come through those IPs as full fetches, not just preloading. I'm not sure; maybe that's how it's supposed to work.
| 1:12 pm on Jul 30, 2007 (gmt 0)|
That is exactly how prefetch works.
Search for it on WebmasterWorld; there are ways to turn it off at the server end.
Here ya go:
You may want to search a bit further; depending on what you want to do with prefetch, there are other ways to handle it.
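For example, instead of the SetEnvIf approach, the same thing can be done with mod_rewrite, which makes it easier to combine with other conditions later. A sketch, again assuming the "X-moz: prefetch" header marks the prefetch requests:

```apache
RewriteEngine On
# Match only requests marked as prefetch
RewriteCond %{HTTP:X-moz} =prefetch [NC]
# Return 403 Forbidden and stop processing
RewriteRule .* - [F]
```

Either variant only blocks the prefetch requests themselves; normal page views from the same IPs still go through.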
But the information is out there.
[edited by: theBear at 1:20 pm (utc) on July 30, 2007]
| 1:27 pm on Jul 30, 2007 (gmt 0)|
Do you know if it can be used by scrapers in some way?
Like faking requests to the accelerator while pretending to be the toolbar, so that Google does the fetching?