I recently did a massive mod_rewrite that affects just about every link on my site, and have hit a snag- I escape the URL, and some of the links have a "/" in them. Escaped, this turns into a "%27".
Just noticed the occasional error, and traced it to the links with the "%27" in them- specifically, the "%" character. Here is the initial rewrite:
RewriteRule ^/path/([0-9]+)-.*$ /new_path/redirect.cgi?id=$1 [PT]
The idea is to pull the ID number, followed by a dash, then the page title. Some page titles, though, are "Term1/Term2" which results in a url like:
/path/12547-Term1%27Term2
It's that darn "%" killing it- if I use:
/path/12547-Term127Term2
the rewrite works.
Because I escape the URL, I cannot really remove the "%27" (and because there are spaces and such in Terms 1 & 2, it must be escaped), so I thought I would rework the rewrite. I tried:
RewriteRule ^/path/([0-9]+)-([a-z0-9]).*\%(\w).*$ /new_path/redirect.cgi?id=$1 [PT]
and various versions of that... and no go.
Any ideas?
Thanks!
Dave Koch
By the time RewriteRule "sees" a URL, it has been localized and unescaped, so you should be seeing a "/" there. If that doesn't help, then you can test the original request line, which has not been unescaped (the whole line the browser sent, such as "GET /path/12547-Term1%27Term2 HTTP/1.1"), by using %{THE_REQUEST} in a RewriteCond. You then have the option of "manually" rewriting the %27 to a slash, or of proceeding with a RewriteRule that uses ".*" or some other ambiguous pattern where you expect the %27 to be.
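As a rough, untested sketch of the RewriteCond approach, something along these lines would capture the ID from the raw request line. It assumes the rules live in the server or virtual-host config, as the leading "/" in the original pattern suggests. The /path/ and /new_path/redirect.cgi names come from the rule posted above; the condition pattern itself is only an illustration.

```apache
RewriteEngine On

# THE_REQUEST holds the request line exactly as the browser sent it, e.g.
#   GET /path/12547-Some_Title HTTP/1.1
# so any percent-escapes in the title are still visible at this point.
# Capture the digits before the dash; "[^\ ]*" swallows the rest of the
# requested URL up to the space before "HTTP/".
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /path/([0-9]+)-[^\ ]*\ HTTP/

# %1 is the group captured by the RewriteCond above (not by this rule).
RewriteRule ^/path/ /new_path/redirect.cgi?id=%1 [PT]
```

Because the ID is taken from the RewriteCond, the RewriteRule pattern no longer has to match anything after /path/, whatever the title looks like once it has been unescaped.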
Ref: [webmasterworld.com...]
Jim
Thanks, funny that I escape it only for it to be ultimately unescaped. I tried a couple of expressions, using %27 and /, and still could not get it working. Getting back to the initial rewrite, I am not sure why that is not working to begin with.
RewriteRule ^/path/([0-9]+)-.*$ /newpath/rewrite.cgi?id=$1 [PT]
Since ALL I am concerned with are those digits before the dash... why doesn't the ".*" just match everything else? I do not care about the dash or %2F or anything else, it's just there for a "friendly" url. Is there something simple I can just put there that will gobble up the problem code?
Dave
Funny, it works perfect on 99% of the pages... but any page that has a "/" in the title it fails on...
I am going to grep the access log...
OK WEIRD!
Access log has this:
(My IP) - - [26/Jul/2005:17:42:24 -0600] "GET /path/1048276-widget_%2F_MoreWidget HTTP/1.1" 200 15178 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
Sorry- had to mow the lawn, make dinner, other things....
OK, there is NOTHING wrong with the snip from the logs- it shows a 200 code- and yet on the site, I get the 404 page- that is what I thought was weird.
%2F, as from the log snip- IS exactly what is escaped. The symbol escaped is "/", but the program doing the escaping is part of a larger program, so it is probably not escaping correctly or something.
But, as evidenced by the log snip, there is something else going wrong... do not yet know what!
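One way to find out what mod_rewrite is actually doing with these requests is to turn on the rewrite log for a while. This assumes an Apache 1.3, 2.0 or 2.2 server, where the RewriteLog directives are available. The log file path below is only a placeholder.

```apache
# These directives belong in the server or virtual-host config,
# not in .htaccess.
# Placeholder path; point it somewhere the server is able to write.
RewriteLog /var/log/apache/rewrite.log

# 0 turns rewrite logging off, 9 is the most verbose.
# A low-to-middle value is usually enough to follow a single request.
RewriteLogLevel 3
```

Requesting one of the failing URLs and then reading that log should show the URL as RewriteRule sees it and whether the pattern matched. RewriteLogLevel should go back to 0 afterwards, since rewrite logging slows the server down.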
dave