Strange 404s from Yahoo Slurp


Frank_Rizzo - 4:54 pm on Jun 14, 2010 (gmt 0)


For the past few months Slurp has been generating a lot of 404s. There are three types:

* Genuine 404s from pages which were deleted a while ago.
* 404s from what seems to be badly configured software.
* 404s from what seem to be attempts at exploits.

The following are 404s for URLs that belong to Yahoo Sports pages, such as blogs and video sections:

404 GET /nhl/blog/YYYYY/teams/Nashville+Predators/nhl.t.27
404 GET /nhl/players/2848/gallery/im:urn:newsml:sports.yahoo,getty:YYYYYY:nhl,photo,YYYYYYYYYYYY_nashville_pre:1
404 GET /nhl/teams/was
404 GET /nhl/teams/cob

My sector is sports, but it has nothing to do with hockey or US sports of any kind.

If I look at the referring pages, there is no link to my site, so is this badly configured software?

The following seem to be some kind of exploit:

404 /myHigherEdJobs/Login/
404 /company/contact.cfm
404 /question/index?qid=20100223114447AAUSrnf

myHigherEdJobs is, I believe, a job-site app which uses a login admin panel. Like /company/contact.cfm and /question/index, it does not exist on my site, and the requests look as if they are trawling for exploits.
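To see at a glance which URL families the crawler keeps asking for, the 404 lines can be grouped by their first path segment. This is only a rough sketch: it assumes the two line shapes quoted above ("404 GET /path" and "404 /path"), and the helper names first_segment and tally_404s are made up for illustration, not taken from any log tool.

```python
from collections import Counter


def first_segment(path: str) -> str:
    """Return the first path segment, lowercased ('/nhl/teams/was' -> 'nhl')."""
    return path.lstrip("/").split("/", 1)[0].lower()


def tally_404s(lines):
    """Count 404 entries by first path segment.

    Assumes each entry looks like '404 GET /path' or '404 /path',
    i.e. the status comes first and the path is the last field.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0] != "404":
            continue  # skip anything that is not a 404 entry in this format
        counts[first_segment(fields[-1])] += 1
    return counts


# Sample entries copied from the post above.
sample = [
    "404 GET /nhl/teams/was",
    "404 GET /nhl/teams/cob",
    "404 /myHigherEdJobs/Login/",
    "404 /company/contact.cfm",
    "404 /question/index?qid=20100223114447AAUSrnf",
]

for segment, count in tally_404s(sample).most_common():
    print(segment, count)
```

Segments that never existed on the site (here nhl, myhigheredjobs, company, question) are the ones worth separating from the genuine 404s for deleted pages.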

The IP address does look genuine:

>nslookup 67.195.115.156

Server: 217.112.88.90
Address: 217.112.88.90#53

Non-authoritative answer:
156.115.195.67.in-addr.arpa name = b3090812.crawl.yahoo.net.

Authoritative answers can be found from:
115.195.67.in-addr.arpa nameserver = ns2.yahoo.com.
115.195.67.in-addr.arpa nameserver = ns3.yahoo.com.
115.195.67.in-addr.arpa nameserver = ns4.yahoo.com.
115.195.67.in-addr.arpa nameserver = ns5.yahoo.com.
115.195.67.in-addr.arpa nameserver = ns1.yahoo.com.
ns1.yahoo.com internet address = 68.180.131.16
ns2.yahoo.com internet address = 68.142.255.16
ns3.yahoo.com internet address = 121.101.152.99
ns4.yahoo.com internet address = 68.142.196.63
ns5.yahoo.com internet address = 119.160.247.124

So what the heck is going on here? Is this some kind of spoofing to get past my current bad-bot blocking, in order to crawl my site and/or trawl for exploits?

As I said on another thread here, Slurp is crawling the site excessively. I am wondering whether some kind of spoofing is going on and whether I should block the IP entirely.
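On the spoofing question, a reverse lookup on its own only shows what the PTR record says. A common extra check is forward-confirmed reverse DNS: take the hostname from the reverse lookup, resolve it forward, and confirm it maps back to the same IP. Below is a minimal sketch of that check, not a definitive implementation. The function name is_verified_crawler and the ".crawl.yahoo.net" suffix (taken from the nslookup output above) are assumptions, and the resolvers are passed in as parameters only so the logic can be tried without live DNS.

```python
import socket


def is_verified_crawler(ip, allowed_suffixes=(".crawl.yahoo.net",),
                        reverse_lookup=socket.gethostbyaddr,
                        forward_lookup=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for an IPv4 address.

    Returns True only if the PTR hostname ends with an allowed suffix
    AND that hostname resolves back to the original IP.
    """
    try:
        hostname = reverse_lookup(ip)[0]  # (hostname, aliases, addresses)
    except OSError:
        return False  # no PTR record or lookup failure

    host = hostname.rstrip(".").lower()
    if not host.endswith(tuple(allowed_suffixes)):
        return False  # PTR points somewhere other than the crawler domain

    try:
        addresses = forward_lookup(host)[2]  # list of IPv4 addresses
    except OSError:
        return False  # hostname does not resolve forward

    return ip in addresses
```

Calling is_verified_crawler("67.195.115.156") with the default resolvers performs real DNS queries, so the result depends on what DNS returns at the time. If the check passes, the requests are coming from the crawler's own network, and the odd URLs are more likely a crawler-side problem than someone impersonating Slurp.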


Thread source: http://www.webmasterworld.com/yahoo_search/4152420.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com