Frank_Rizzo - 4:54 pm on Jun 14, 2010 (gmt 0)
For the past few months Slurp has been generating a lot of 404s. There are three types:
* Genuine 404s from pages which were deleted a while ago.
* 404s from what seems to be badly configured software.
* 404s from what seem to be exploit attempts.
The following 404s are for paths that look like Yahoo Sports pages, such as blogs and video sections:
404 GET /nhl/blog/YYYYY/teams/Nashville+Predators/nhl.t.27
404 GET /nhl/players/2848/gallery/im:urn:newsml:sports.yahoo,getty:YYYYYY:nhl,photo,YYYYYYYYYYYY_nashville_pre:1
404 GET /nhl/teams/was
404 GET /nhl/teams/cob
My sector is sports, but it has nothing to do with hockey or US sports of any kind.
If I look at the referring pages, there is no link to my site, so is this badly configured software?
The following seem to be some kind of exploit:
404 /myHigherEdJobs/Login/
404 /company/contact.cfm
404 /question/index?qid=20100223114447AAUSrnf
myHigherEdJobs is, I believe, a job site app with an admin login panel. Like company/contact.cfm and question/index, it does not exist on my site, so these requests look like trawling for exploits.
The IP address does look genuine:
>nslookup 67.195.115.156
Server: 217.112.88.90
Address: 217.112.88.90#53
Non-authoritative answer:
156.115.195.67.in-addr.arpa name = b3090812.crawl.yahoo.net.
Authoritative answers can be found from:
115.195.67.in-addr.arpa nameserver = ns2.yahoo.com.
115.195.67.in-addr.arpa nameserver = ns3.yahoo.com.
115.195.67.in-addr.arpa nameserver = ns4.yahoo.com.
115.195.67.in-addr.arpa nameserver = ns5.yahoo.com.
115.195.67.in-addr.arpa nameserver = ns1.yahoo.com.
ns1.yahoo.com internet address = 68.180.131.16
ns2.yahoo.com internet address = 68.142.255.16
ns3.yahoo.com internet address = 121.101.152.99
ns4.yahoo.com internet address = 68.142.196.63
ns5.yahoo.com internet address = 119.160.247.124
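The lookup above is only the reverse (PTR) half of the check. A PTR record is controlled by whoever owns the IP block, so the usual way to rule out spoofing is to resolve the returned hostname forward again and confirm it maps back to the same IP. Below is a minimal Python sketch of that forward-confirmed check. The function name `is_genuine_crawler` and the injectable `reverse`/`forward` resolvers are hypothetical, and the suffix list is an assumption based solely on the `crawl.yahoo.net` name in the output above.

```python
import socket

# Assumption: hostnames of the real crawler end in this suffix. Only
# crawl.yahoo.net appears in the nslookup output; extend the tuple if needed.
CRAWLER_SUFFIXES = (".crawl.yahoo.net",)


def is_genuine_crawler(ip, reverse=None, forward=None, suffixes=CRAWLER_SUFFIXES):
    """Forward-confirmed reverse DNS check for a claimed crawler IP.

    reverse: callable ip -> hostname (defaults to a real PTR lookup)
    forward: callable hostname -> list of IPs (defaults to a real A lookup)
    Both are injectable so the logic can be exercised without network access.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])

    # Step 1: reverse lookup, IP -> hostname.
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False

    # Step 2: the hostname must sit under an expected crawler domain.
    if not host.endswith(suffixes):
        return False

    # Step 3: forward lookup, hostname -> IPs, must include the original IP.
    try:
        return ip in forward(host)
    except OSError:
        return False
```

Calling `is_genuine_crawler("67.195.115.156")` with the default resolvers performs live DNS lookups. If it returns True, the requests most likely do come from Yahoo's own crawler rather than from a spoofed source, which would point toward bad crawl data rather than an impostor.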
So what the heck is going on here? Is this some kind of spoofing, intended to get past my current bad-bot blocking in order to crawl the site and/or trawl for exploits?
As I said in another thread here, Slurp is crawling the site excessively. I am wondering whether some kind of spoofing is going on and whether I should block the IP entirely.