Host: ***.***.***.***

/oldlocation/page
Http Code: 301 Date: Nov 18 - Http Version: HTTP/1.1 Size in Bytes: 5
Referer: valid ref where I know I have a link
Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.2) Gecko/20040804 Netscape/7.2 (ax)

/newlocation/page
Http Code: 200 Date: Nov 18 - Http Version: HTTP/1.1 Size in Bytes: xxxx
Referer: -
Agent: Mozilla/3.01 (compatible;)

/style.css
Http Code: 200 Date: Nov 18 - Http Version: HTTP/1.1 Size in Bytes: xxxx
Referer: [mysite...]
Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.2) Gecko/20040804 Netscape/7.2 (ax)

/favicon.ico
Http Code: 200 Date: Nov 18 - Http Version: HTTP/1.1 Size in Bytes: xxx
Referer: -
Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.2) Gecko/20040804 Netscape/7.2 (ax)
I'm getting this behaviour from many different IPs every day, but never for more than one page per IP.
Seems the 301 confuses it.
I have seen the "Mozilla/3.01 (compatible;)" part be either version 3.01 or 4.0. Why would a browser claim to be older than it is?
The main UA isn't Netscape most of the time either; it's some random version of IE.
Is this a smart scraper running a botnet, who has discovered he can't trust any links on my pages and so just jumps in, grabs one page and heads off?
It hasn't fallen into the bot trap yet.
Paranoia?
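For anyone wanting to check their own logs for the same pattern, here is a rough Python sketch. It assumes the hits have already been parsed into records in log order. The Hit record, its field names, the find_agent_switches function, and the sample IPs and referers are all made up for illustration and do not come from any log tool. It flags an IP when the request right after a 301 arrives with a different, bare "Mozilla/x.xx (compatible;)" agent and no referer.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One parsed log entry (hypothetical layout, not a real log format)."""
    ip: str
    path: str
    status: int
    referer: str
    agent: str


# Bare agents such as "Mozilla/3.01 (compatible;)" or "Mozilla/4.0 (compatible;)"
BARE_AGENT = re.compile(r"^Mozilla/\d+\.\d+ \(compatible;\)$")


def find_agent_switches(hits):
    """Return IPs whose hit right after a 301 uses a bare agent and no referer."""
    by_ip = defaultdict(list)
    for hit in hits:
        by_ip[hit.ip].append(hit)  # keeps log order per IP

    suspects = []
    for ip, seq in by_ip.items():
        # Compare each hit with the one that follows it from the same IP.
        for prev, cur in zip(seq, seq[1:]):
            if (prev.status == 301
                    and cur.agent != prev.agent
                    and BARE_AGENT.match(cur.agent)
                    and cur.referer == "-"):
                suspects.append(ip)
                break
    return suspects


# Sample data modelled on the log above; IPs and referers are placeholders.
NETSCAPE = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.2) "
            "Gecko/20040804 Netscape/7.2 (ax)")
sample = [
    Hit("192.0.2.10", "/oldlocation/page", 301, "http://example.com/links", NETSCAPE),
    Hit("192.0.2.10", "/newlocation/page", 200, "-", "Mozilla/3.01 (compatible;)"),
    Hit("192.0.2.10", "/style.css", 200, "http://example.org/newlocation/page", NETSCAPE),
    Hit("192.0.2.10", "/favicon.ico", 200, "-", NETSCAPE),
    Hit("192.0.2.20", "/newlocation/page", 200, "-", NETSCAPE),
]
print(find_agent_switches(sample))  # prints ['192.0.2.10']
```

This only says which IPs show the agent switch; it doesn't tell you whether the second request comes from a scraper or from some proxy or filter sitting in front of an ordinary browser.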
This has been my experience in the past in the corporate world.