Forum Moderators: phranque

RewriteCond %{REQUEST_URI} not working if URL parameter contains URL

         

Marconi

10:13 am on Aug 7, 2010 (gmt 0)

10+ Year Member


Hello,

I'm having a problem with this rewrite rule here:

RewriteCond %{REQUEST_URI} ^/test.php [NC]
RewriteRule ^(.*)$ http://www.example.com [R=301,L]

The problem:
As soon as I pass a parameter that contains a URL (starting with 'http://'), the RewriteCond no longer matches.

Example:
<a href="test.php?r=http://www.example.com">click</a>

The above condition doesn't match anymore. However, after removing the 'http://', the condition DOES match again. I've been trying a lot of things, but I just can't get it to work. Is this a bug, or what am I doing wrong? Also, I CAN'T MODIFY this parameter (omitting the http:// prefix), as it comes from an SSI script running on an SSI-ONLY server (no PHP/Perl/ASP). Could someone help, please?

Thank you !

(Apache 2.2.16 on CentOS)

g1smd

10:58 am on Aug 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You should not include protocol and slashes within the attached parameter value.

Don't fight the HTTP specs.

Since you know that this parameter is for a URL, then simply internally re-include the http:// when the receiving script processes the request.

The originator of the script needs to modify it.

phranque

12:34 pm on Aug 7, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



welcome to WebmasterWorld, Marconi!

the "/" is a reserved character for urls so perhaps it needs to be encoded as a "%2F".

Example:
<a href="test.php?r=http:%2F%2Fwww.example.com">click</a>

try that and see if it helps.

Marconi

2:03 pm on Aug 7, 2010 (gmt 0)

10+ Year Member


Thanks for your replies.

As I said, I AM NOT ABLE TO CHANGE THAT PARAMETER!
I can ONLY use SSI on the server of the source script!
OK, let's be more specific. What I'm trying to do is something like this:

<img src="counter.gif?r=<!--#echo encoding="url" var="HTTP_REFERER" -->">

I'm rewriting/redirecting this file on THIS server to another domain on my OWN server and trying to process the HTTP_REFERER with PHP. The URL encoding in SSI doesn't seem to work in this code, even when explicitly stated.

Again, I ONLY HAVE SSI & mod_rewrite on this server! NO PHP, NO PERL, NO ASP, nothing :(

Otherwise it wouldn't be a problem at all... That's why I came here in the hope someone might have a solution to my problem.

[edited by: Marconi at 2:13 pm (utc) on Aug 7, 2010]

g1smd

2:12 pm on Aug 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What you are trying to do is use characters that are NOT ALLOWED in that part of a URL.

The only answer to the question "how to do this?" is DO NOT DO IT THIS WAY.

You MUST change the script.

Marconi

2:22 pm on Aug 7, 2010 (gmt 0)

10+ Year Member



@g1smd: Just checking... The funny thing is that you are using the very same server as me :-) Qsl is very restricted, but at least it allows SSI. Any idea how else to get the referrer from there transferred to my server? And no, I don't like JavaScript.

73

jdMorgan

5:03 pm on Aug 9, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



<img src="counter.gif?r=<!--#echo encoding="url" var="HTTP_REFERER" -->">

It looks to me like you're using the wrong server variable here. The query string won't be "visible" to the RewriteRule. It will be present in either %{QUERY_STRING} or %{THE_REQUEST}. It may be decoded in the former and encoded in the latter. Test to find out, using whichever of the following two RewriteConds seems to work, or the shorter one if both work:

RewriteCond %{QUERY_STRING} ^r=.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /counter\.gif\?r=.
RewriteRule ^counter\.gif$ http://www.example.com/ [R=301,L]

This rule should not affect the query string; it should be appended as-is to the target URL and "pass through" this rule unaffected. If you wish to remove it, end the substitution URL with a "?".
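
As a minimal sketch of the "?" variant mentioned above, reusing the same example.com target (which is only a placeholder from this thread), the redirect below would drop the r=... query string rather than pass it through:

```apache
# Sketch only: same redirect as above, but the trailing "?" on the
# substitution URL discards the original query string (r=...).
RewriteCond %{QUERY_STRING} ^r=.
RewriteRule ^counter\.gif$ http://www.example.com/? [R=301,L]
```

Without the trailing "?", the original query string is carried over to the target URL unchanged.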

Jim

Marconi

12:55 pm on Aug 13, 2010 (gmt 0)

10+ Year Member



Thanks for that, Jim, appreciate it! But in fact things are even more complex; I just tried to simplify and abstract here, and the thread went in the wrong direction. Sorry for being misleading! What I wanted is a working RewriteCond that I could turn into an EXCLUDE rule.

OK, let me explain my setup in depth. The actual rewriting/redirecting to the PHP script on the other domain on my server (including the URL parameter) was working from the very beginning. I am working with two .htaccess files here: one on the server that gets the traffic (the one with the <img src="counter.gif?r=http://www.referrer.com"> on it) and the other on my own (shared) server. On the latter I am using a GLOBAL .htaccess file for all my domains, and the PHP counter script (counter.php) has to be on a domain with WordPress installed on it, where the URL permalinks are rewritten by the global .htaccess file. The counter.php file is in the root of that domain; the blog is inside a /subfolder, and requests are redirected from www.domain.com to www.domain.com/blog.

What I need to do now, in MY server's .htaccess file, is to exclude counter.php from the rewriting (permalinks) and the redirecting (to the /blog subfolder). That's basically the whole story. As I said, it's pretty tricky, and I couldn't get it to work with the code above, i.e.

RewriteCond %{QUERY_STRING} !^r=.
or
RewriteCond %{THE_REQUEST} !^[A-Z]+\ /counter\.gif\?r=.

...isn't working (tried together and separately) when there is a URL passed inside the parameter. The condition just won't match anymore if 'http://' is included. However, when the 'http://' part is omitted, it works as expected: the request gets excluded and is NOT rewritten/redirected. This is my only goal and the reason for this thread!
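
Since the goal is to keep counter.php out of the permalink rewrite and the /blog redirect, one option is to test the URL-path instead of the query string. What follows is a minimal sketch, not the actual rules from the global .htaccess: it assumes the script is requested as /counter.php and that the permalink rule resembles the stock WordPress one (the /blog/index.php target is an assumption).

```apache
RewriteEngine On

# Skip the permalink rewrite for the counter script.
# %{REQUEST_URI} holds only the URL-path, never the query string,
# so whatever is passed in r= cannot affect this test.
RewriteCond %{REQUEST_URI} !^/counter\.php$
# Usual WordPress-style checks: leave existing files and directories alone.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Hypothetical permalink target; adjust to the real blog location.
RewriteRule . /blog/index.php [L]
```

A RewriteCond applies only to the single RewriteRule that follows it, so the same exclusion line would have to be repeated in front of the redirect to the /blog subfolder as well.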

Never mind, I'll keep trying. To be honest, I'm still not very familiar with RegEx even though my global .htaccess file is already full of it; I get easily confused, and it's a real pain to work with. There should be a 'logging feature' for debugging rules, invokable from inside the .htaccess file itself (not only via the server config, which is inaccessible on shared hosting). I'd also like to measure the processing time to see the impact of having many rules, but this is not possible either.

Furthermore, why has nobody at the ASF (Apache Software Foundation) considered an 'inheritance' or 'include' option for .htaccess? This would be sooo convenient! Imagine a global .htaccess file with, say, IP deny rules, plus separate, domain-specific .htaccess files for each domain (inside their own root directories). In all (web) programming languages there is an option to include code from external sources, so why not in .htaccess?

Why must working with .htaccess be so messy and such a pain anyway? Every programming language I know is way easier to understand and more logical by comparison. Spending hours or even days to get some non-trivial rules to work... Maybe it's just me, and I need to look more into RegEx. But we live in 2010; it's such a shame :(

So what I did now is file a bug report about the non-working SSI echo encoding in mod_include in the Apache.org Bugzilla bug tracker (issues.apache.org). I bet the JavaScript option is the only one I have left now. I wanted to avoid it because 1) bots usually don't execute JS, and I want to track them as well, and 2) JS can be disabled in the browser or filtered by software. So I thought the <img> version would work better to track all traffic, including robots. I might be wrong.

But thanks again !

jdMorgan

7:01 pm on Aug 13, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There is an inheritance rule for mod_rewrite -- but only for mod_rewrite. See "RewriteOptions Inherit".

However, lower-directory-level .htaccess files can override settings made by higher-level .htaccess files *if* the lower-level .htaccess is still in the 'current' directory path after the higher-level .htaccess file executes.

(When relying on this, be sure that all external redirects are done in the higher-level .htaccess files before any subsequent internal rewrites can be invoked. Otherwise, previously-internally-rewritten filepaths will be exposed to Web clients as URLs by subsequent external redirects, making a mess of your listings in search results.)
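
For reference, a minimal sketch of what using "RewriteOptions Inherit" might look like in a domain-specific .htaccess file placed below a directory that has its own global .htaccess (the directory layout is assumed here, not taken from the actual setup):

```apache
# Sketch of a lower-level (domain-specific) .htaccess file.
RewriteEngine On

# Pull in the mod_rewrite conditions and rules from the parent
# directory's .htaccess, which would otherwise be ignored once
# this file defines rewrite rules of its own.
RewriteOptions Inherit

# Domain-specific rules would go here. As far as the documentation
# describes it, the inherited parent rules are applied after these.
```

This covers only mod_rewrite directives; directives from other modules follow the normal .htaccess merging behavior.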

There's nothing visibly wrong with what you're trying to do here. And believe me, I've seen tons of broken code... subtly broken, and catastrophically broken. Yours is neither.

Consider whether you may have other settings, directives, and/or modules which may be interfering with your current rules. I would particularly investigate whether any 'security' modules are installed on this server, which may make passing a full URL impossible. Such a filter would defeat quite a few malicious exploits. If so, you may need to ask your host to disable that particular filter for your account.

Jim