A "%" screwing up my Rewrite

how to get around it?

         

carfac

4:08 pm on Jul 26, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Jim, Everyone!

I recently did a massive mod_rewrite that affects just about every link on my site, and have hit a snag: I escape the URL, and some of the links have a "/" in them. Escaped, this turns into a "%27".

Just noticed the occasional error, and traced it to the links with the "%27" in them- specifically, the "%" character. Here is the initial rewrite:

RewriteRule ^/path/([0-9]+)-.*$ /new_path/redirect.cgi?id=$1 [PT]

The idea is to pull the ID number, followed by a dash, then the page title. Some page titles, though, are "Term1/Term2" which results in a url like:

/path/12547-Term1%27Term2

It's that darn "%" killing it- if I use:

/path/12547-Term127Term2

the rewrite works.

Because I escape the URL, I cannot really remove the "%27" (and because there are spaces and such in Terms 1 & 2, it must be escaped), so I thought I would rework the rewrite. I tried:

RewriteRule ^/path/[0-9]+)-([a-z0-9]).*\%(\w).*$ /new_path/redirect.cgi?id=$1 [PT]

and various versions of that... and no go.

Any ideas?

Thanks!

Dave Koch

jdMorgan

9:40 pm on Jul 26, 2005 (gmt 0)

Hi Dave!

By the time RewriteRule "sees" a URL, it has been localized and unescaped, so you should be seeing a "/" there. If that doesn't help, then you can test the original, still-escaped request line (the method, the URL exactly as the client sent it, and the protocol) by using %{THE_REQUEST} in a RewriteCond. You then have the option of "manually" rewriting the %27 to a slash, or of proceeding with a RewriteRule that uses ".*" or some other ambiguous pattern where you expect the %27 to be.
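
A minimal sketch of the RewriteCond approach, assuming the rules live in the server config (as the leading "/" in the patterns suggests) and reusing the /path/ and /new_path/redirect.cgi names from the first post. THE_REQUEST holds the raw request line, for example "GET /path/12547-Term1%27Term2 HTTP/1.1", so the ID can be captured without depending on how the escape is decoded. This is untested against Dave's setup:

```apache
# Capture the ID from the raw, still-encoded request line.
# The "\ " matches the literal space after the HTTP method.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /path/([0-9]+)-
# %1 is the first group captured by the RewriteCond above
# (as opposed to $1, which would refer to the RewriteRule pattern).
RewriteRule ^/path/ /new_path/redirect.cgi?id=%1 [PT]
```

The RewriteRule pattern only has to recognize the /path/ prefix here; everything after the dash is ignored.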

Ref: [webmasterworld.com...]

Jim

carfac

11:25 pm on Jul 26, 2005 (gmt 0)

Hi Jim!

Thanks, funny that I escape it only for it to be ultimately unescaped. I tried a couple of expressions, using %27 and /, and still could not get it working. Getting back to the initial rewrite, I am not sure why that is not working to begin with.

RewriteRule ^/path/([0-9]+)-.*$ /newpath/rewrite.cgi?id=$1 [PT]

Since ALL I am concerned with are those digits before the dash... why doesn't the ".*" just match everything else? I do not care about the dash or %2F or anything else, it's just there for a "friendly" url. Is there something simple I can just put there that will gobble up the problem code?

Dave

jdMorgan

11:27 pm on Jul 26, 2005 (gmt 0)

Sure, just toss the whole lot out:

RewriteRule ^/path/([0-9]+)- /newpath/rewrite.cgi?id=$1 [PT]

(But your ".*" should have worked anyway, so this is odd. Anything in the log files?)

Jim

carfac

11:46 pm on Jul 26, 2005 (gmt 0)

No, NOTHING in the log files- maybe I will enable mod_rewrite logs... because I still get a 404...
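
For reference, a minimal sketch of enabling the mod_rewrite log on the Apache 1.3/2.0/2.2 series that was current at the time. The log path is a placeholder, not something from this thread:

```apache
# Allowed in server config or a virtual host only, not in .htaccess.
# Placeholder path: point this at a file the server can write to.
RewriteLog /var/log/apache/rewrite.log
# 0 turns logging off; higher values (up to 9) are more verbose.
RewriteLogLevel 3
```

The log shows each pattern being applied to the URL, which should reveal whether the rule ever sees the request with the escaped character in it.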

Funny, it works perfect on 99% of the pages... but any page that has a "/" in the title it fails on...

I am going to grep the access log...

OK WEIRD!

Access log has this:

(My IP) - - [26/Jul/2005:17:42:24 -0600] "GET /path/1048276-widget_%2F_MoreWidget HTTP/1.1" 200 15178 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"

jdMorgan

12:03 am on Jul 27, 2005 (gmt 0)

OK, %27 is a single quote, and %2F is a slash. So is your 'escaping' routine encoding widget/MoreWidget to widget_%2F_MoreWidget, or what? (I can't tell what's wrong with the logged URL you posted, because I don't know what you expected it to look like...)

Jim

carfac

4:03 am on Jul 27, 2005 (gmt 0)

Hi Jim-

Sorry- had to mow the lawn, make dinner, other things....

OK, there is NOTHING wrong with the snip from the logs: it shows a 200 code, and yet on the site I get the 404 page. That is what I thought was weird.

%2F, as from the log snip- IS exactly what is escaped. The symbol escaped is "/", but the program doing the escaping is part of a larger program, so it is probably not escaping correctly or something.

But, as evidenced by the log snip, there is something else going wrong... do not yet know what!

dave