Frank_Rizzo - 9:34 am on Jun 23, 2010 (gmt 0)
Yes, there are two main issues here. Just to recap:
Badly Configured Links? / Corrupt Serps?
The first is the sports blog / Q&A crawling, which seems badly configured. Pages are being requested on my site which actually exist on the Yahoo blogs / Q&A. For some reason Yahoo is asking my site for pages that belong to its own blog site:
22.214.171.124 - - [15/Jun/2010:22:07:21 +0100] "GET /nhl/blog/puck_daddy?author=Greg+Wyshynski HTTP/1.0" 404 5110 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; [help.yahoo.com...] 0 mysite com "-" "-"
I can not see why this should be an exploit attempt. It is not a log file spam attempt either, as there is only a local uri in the GET request and no full url in the referrer.
Yahoo is asking for one of its own pages on my site, and this seems like some kind of corruption.
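For anyone wanting to pull these hits out of their own logs, here is a rough Python sketch. It assumes a combined-style log line like the one above, with extra vhost fields after the user agent; the function name slurp_404s is made up for illustration, and the regex may need adjusting for other log formats.

```python
import re

# Matches the leading, combined-log-style part of each entry. Everything after
# the referrer (user agent plus any extra vhost fields) is kept as one string,
# so a truncated or unusual user agent does not break the match.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referrer>[^"]*)" (?P<rest>.*)$'
)


def slurp_404s(lines):
    """Yield (ip, time, path) for 404 hits whose user agent claims to be Slurp."""
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        if m.group("status") == "404" and "Yahoo! Slurp" in m.group("rest"):
            yield m.group("ip"), m.group("time"), m.group("path")
```

Feeding it an open log file, e.g. slurp_404s(open("access.log")) with your own file name, lists every missing page that a self-declared Slurp asked for. Note that it only trusts the user agent string, which anyone can fake.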
Fishing for Exploits
The second is the hits for exploit pages. These are pages which do not exist on my site, and have never existed on my site, but which a hacker will look for in order to determine whether a site runs a particular app and so has the potential to be hacked.
I am not certain of this, but there could be an exploit for the app those pages belong to, and that is why the pages are being crawled.
Usually potential victim sites are found via serps. e.g. assume that an older version of an app has a vulnerability. A hacker would attempt to find victims by searching for the app file name, or a version no., or any file in a subdirectory which could indicate the version installed. This can be done with a script to check many sites, or manually via a search engine:
search: widgetforum v2.0
If those are searched for then the serps will return sites which have those pages and a hacker can browse the list in order to choose marks and test the full exploits.
If a hacker wanted to test this against one particular site he wishes to target, he would request those pages directly on the target site.
e.g. mysite com/widgetforum/adminpanel.php
If that did not produce a 404 then he knows I have that file and will then proceed to try to exploit it.
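Seen from the site owner's side, the same idea can be turned around to spot this kind of fishing in the logs. A minimal sketch, assuming you already have (path, status) pairs from your log; find_probe_requests and the marker list are hypothetical, with the markers taken only from the examples in this thread:

```python
def find_probe_requests(hits, known_paths, probe_markers):
    """Return 404 paths that contain a marker from an app this site never ran.

    hits          -- iterable of (path, status) pairs taken from the access log
    known_paths   -- paths that exist, or once existed, on this site
    probe_markers -- substrings typical of third-party app files (assumed list)
    """
    flagged = []
    for path, status in hits:
        if status != 404 or path in known_paths:
            continue  # page exists or existed here, so not a blind probe
        lowered = path.lower()
        if any(marker.lower() in lowered for marker in probe_markers):
            flagged.append(path)
    return flagged


# Illustrative markers only; a real list depends on which apps are being fished for.
MARKERS = ["adminpanel.php", "advanced.cfm", "/myhigheredjobs/login/"]
```

Anything it returns is a request for an app page that has never been on the site, which is the pattern described above. It says nothing about who sent the request or why.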
Why does Yahoo ask for the pages?
This is the bit I can not understand. Yahoo is DIRECTLY asking for the exploit pages. It is as if the slurp bot has been configured in some way to search out for key pages.
As I said earlier, hackers can use SEs to find potential sites but the serps will only list sites with the vulnerabilities - they have the pages on the site.
But this is not what is happening here:
126.96.36.199 - - [10/Jun/2010:07:00:12 +0100] "GET /myHigherEdJobs/Login/ HTTP/1.0" 404 5066 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; [help.yahoo.com...] 0 mysite com "-" "-"
In this instance it is the Yahoo crawler itself that is hunting for the page, and not some client IP address that found my site via the serps.
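Before blaming Yahoo it is worth confirming that the IP really is Slurp, since the user agent can be faked. The usual check is a reverse DNS lookup followed by a forward lookup. A sketch of that check, assuming genuine Slurp hosts resolve to names ending in crawl.yahoo.net (that suffix is my assumption here and should be checked against Yahoo's own help pages); is_genuine_slurp is a made-up name:

```python
import socket


def is_genuine_slurp(ip,
                     reverse=socket.gethostbyaddr,
                     forward=socket.gethostbyname_ex,
                     suffixes=(".crawl.yahoo.net",)):
    """Reverse-then-forward DNS check for an IP that claims to be Slurp.

    The resolver functions are parameters so the check can be tried out
    without network access. The default suffix is an assumption.
    """
    try:
        hostname = reverse(ip)[0]       # PTR lookup: IP -> host name
    except OSError:
        return False                    # no reverse record at all
    if not hostname.lower().endswith(suffixes):
        return False                    # host name is not in the crawler domain
    try:
        addresses = forward(hostname)[2]  # A lookup: host name -> IP list
    except OSError:
        return False
    return ip in addresses              # must map back to the same IP
```

If this returns False for the IPs in the log lines above, the requests are from something pretending to be Slurp, and the question of why Yahoo wants those pages goes away.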
Message Board Posting / Website Page with link to my site
Maybe there is a webpage, or a message board posting, out there which has a link to this page on my site.
But I can't find any on yahoo or google. I have searched for mysite and the other potential exploit pages (advanced.cfm, alpha zero victor ..) and none are returned.
Besides, what is the point in a link like that being crawled? It would have no advantage for the hacker, as he would never know whether the slurp crawl returned a 404 or not.
1. Rogue posts on message board:
'Hi .... link: mysite com/myHigherEdJobs/Login/ ... Bye'
2. Any human reading that may decide to click the link and they would get a 404. The rogue hosting site would not know this.
3. Yahoo crawls the rogue post, follows the link and receives the 404. The rogue hosting site would not know this.
This could have been the intention of the alpha zero victor exploit. But again it is not configured correctly. If the intention is to fill log files up with links to rogue sites then surely the referring url should be targeted and not the url in the GET request.
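To make the distinction concrete: log file spam puts a full off-site url in the referrer field, whereas the hits above have "-" there. A small sketch of that test, where referrer_points_offsite and the host names used with it are purely illustrative:

```python
from urllib.parse import urlsplit


def referrer_points_offsite(referrer, own_hosts):
    """True if the referrer field holds a full URL for a host that is not ours."""
    if not referrer or referrer == "-":
        return False                    # empty referrer, as in the hits above
    host = urlsplit(referrer).hostname  # None for relative or malformed values
    return host is not None and host not in own_hosts
```

For the two log lines quoted in this thread it returns False, which is why they do not look like referrer spam.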
I will let this run for a few days but soon I will have to ban the slurp IP totally. It is constantly setting off the security app alarms.