i read something at webmasterworld last month which suggested putting a
<base href="http://www.example.com" /> tag onto every page, containing the full url of the page. because i use a template system on my site, i included it in the header with a piece of php code...
<?php
echo'<base href="http://'.$_SERVER['HTTP_HOST'].''.$_SERVER['REQUEST_URI'].'" />';
?>
when i was going through my logs, i came across a funny url that doesn't exist on my site. it was something like this...
http://www.example.com/example.html/blah.html
you can see that the actual url should have been www.example.com/example.html, and the extra blah.html on the end probably just came from a wrongly typed link. $_SERVER['REQUEST_URI'] included the extra blah.html, when i looked at the page source, the blah.html had naturally been included in the <base href="http://www.example.com/example.html/blah.html" /> tag -- and it was hard-coded onto the page. nothing bad happened though.
but now i am wondering... presumably a lot of people write the urls into their <base> tags in a similar way. do you think it could be a security issue? can someone add something onto the end of your url that would harm you if it was hard-coded onto your page?
If you cache this page and send it to your other users, they could potentially enter data (such as passwords or possibly PayPal/credit card info), then have their data sent--not to your server, but to the target of the base href + relative href. In this case, that's whatever the attacker injected into the HTTP_HOST value.
So, say an attacker notices that your website uses the base tag and gets a bright idea. He notices that the base tag changes occasionally (possibly from non-www. to www., depending on a factor such as type-in traffic), meaning it's being cached. He then notes at about what time of day it changes. It's a long shot, but from there a skilled attacker could time a request with a forged Host header so that his version of the page is the one that gets cached, and then steal user data. Not only that, but your site would also appear to users to be "broken" for the day.
It might be better just to hardcode your website address, rather than using HTTP_HOST. :) If you're making something redistributable, request the website address once, then store it in a config file--anything to avoid the HTTP_HOST header.
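Here's a rough sketch of what that could look like, in case it helps. It's only one way to do it: SITE_BASE_URL and base_tag() are names I made up for the example, and the address is assumed to come from your own config rather than from the request. The path still comes from REQUEST_URI, so it gets escaped with htmlspecialchars() before it goes into the attribute.

```php
<?php
// Hypothetical config value: set once (e.g. at install time), never read from the request.
const SITE_BASE_URL = 'http://www.example.com';

/**
 * Build a <base> tag from the configured address plus the request path.
 * $requestUri is untrusted input (e.g. $_SERVER['REQUEST_URI']).
 */
function base_tag(string $requestUri): string
{
    // Keep only the path part; the query string is dropped.
    $path = parse_url($requestUri, PHP_URL_PATH);

    // Fall back to the site root if the path is missing or malformed.
    if (!is_string($path) || $path === '' || $path[0] !== '/') {
        $path = '/';
    }

    // Escape quotes and angle brackets so the value cannot break out of the attribute.
    $href = htmlspecialchars(SITE_BASE_URL . $path, ENT_QUOTES, 'UTF-8');

    return '<base href="' . $href . '" />';
}

// In the template header:
echo base_tag($_SERVER['REQUEST_URI'] ?? '/');
```

A mistyped url like /example.html/blah.html will still end up in the tag, same as before, but the host part can no longer be changed by whoever sends the request.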