passing & / + etc through mod rewrite

using php urlencode to pass a variable

         

proper_bo

10:58 pm on Dec 21, 2005 (gmt 0)

10+ Year Member



My problem is this:
I cannot seem to get the full string to pass to the page using mod rewrite and urlencode.

the site rewrites from (for example) /search-business-Portable%20Bar%20%26amp%3B%20Catering.html

to

search.php?type=business&infoid=Portable%20Bar%20%26amp%3B%20Catering

the rewrite line is:

RewriteRule ^search-business-(.+)\.html$ /search.php?bustype=$1 [L]

I use urlencode to build the links so all special characters are turned into %26 etc.

BUT it doesn't work when there is anything other than + or %20.
It still passes the variable, but only up to the special character. In the example above it would pass 'Portable Bar ' (including the trailing space, but not the &).

does anyone know what I am doing wrong here?

[edited by: jdMorgan at 1:39 am (utc) on Dec. 22, 2005]
[edit reason] Disable smileys in code. [/edit]

jdMorgan

1:39 am on Dec 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



See the [NE] flag of RewriteRule.

Also, I suggest reviewing RFC 2396 [faqs.org] - Uniform Resource Identifiers (URI): Generic Syntax

Jim

proper_bo

9:37 am on Dec 22, 2005 (gmt 0)

10+ Year Member



jdMorgan,

thank you for your suggestions.

I eventually found the solution (or what I think is the solution) at [totalchoicehosting.com...]

It basically says that there is a problem with rewriting ampersands (&) and that the solution is to double-encode them.
I first tried your [NE] suggestion, but it failed, so I then put
$str = preg_replace("/&/","%26",$str);
into my code. It means that URLs will then have %2526 for &, but that's not the end of the world.

Any further comments on this? Do I have it all wrong?

proper_bo

3:02 pm on Dec 22, 2005 (gmt 0)

10+ Year Member



I would just like to add to this.

I am now doing the following:
urlencode(urlencode($url))
to all the URLs that may contain special characters.

This gets around what appears to be a fault with mod_rewrite, where it decodes the URL query string instead of just passing it through.

It's a workaround that appears to work, but it is not really ideal.
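
The double-encoding workaround can be traced step by step. The following is only a rough simulation in plain PHP, assuming the server decodes the requested path once before RewriteRule sees it and PHP then decodes the query string once more when filling $_GET. The variable names and the sample term are made up for illustration, and rawurldecode() merely stands in for the server's own decoding.

```php
<?php
// Hypothetical search term, used only for illustration.
$term = 'Portable Bar & Catering';

// First pass: what a single urlencode() puts into the link.
$once = urlencode($term);    // space becomes "+", "&" becomes "%26"

// Second pass: the "%" and "+" from the first pass are encoded again.
$twice = urlencode($once);   // "%26" becomes "%2526", "+" becomes "%2B"

// Stand-in for the server decoding the path once (assumption: one
// percent-decoding pass, which leaves a literal "+" untouched).
$afterServer = rawurldecode($twice);

// Stand-in for PHP decoding the query string value into $_GET.
$afterPhp = urldecode($afterServer);

echo $once, "\n", $twice, "\n", $afterServer, "\n", $afterPhp, "\n";
```

With a single urlencode(), the "&" would already be a bare ampersand by the time it reaches the query string, which is why the value was cut off at that point.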

has anyone else had experience with this problem?

jdMorgan

3:40 pm on Dec 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's not a fault with mod_rewrite. mod_rewrite's RewriteRule works on *URLs*, and the very semantics of the phrase "encoded URL" gives a clue as to what's going on: Since it is a server's job to handle URLs (and not encoded URLs), the server is decoding your URL as it should.

You can access the un-decoded *URI* by using RewriteCond and the variables %{REQUEST_URI} and/or %{THE_REQUEST}, and then use it as a back-reference in RewriteRule with [NE] if required.
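
A related option on the PHP side, which is not the RewriteCond approach described above but a swapped-in alternative, is to read the still-encoded request from $_SERVER['REQUEST_URI'] and pull the term out there. This is a minimal sketch: the function name is hypothetical, the pattern assumes the /search-business-….html URL shape from the first post, and it assumes the links were built with a single urlencode().

```php
<?php
/**
 * Hypothetical helper: extract the search term from a raw (still encoded)
 * request URI such as $_SERVER['REQUEST_URI'].
 * Returns null when the URI does not match the assumed URL shape.
 */
function termFromRawRequestUri(string $requestUri): ?string
{
    // parse_url() separates the path from any query string without decoding it.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }

    // Assumed URL shape: /search-business-<encoded term>.html
    if (preg_match('#^/search-business-(.+)\.html$#', $path, $m) !== 1) {
        return null;
    }

    // urldecode() is the inverse of urlencode(): it also turns "+" into a space.
    return urldecode($m[1]);
}

// In a real request this would be: termFromRawRequestUri($_SERVER['REQUEST_URI'])
$example = termFromRawRequestUri('/search-business-Portable%20Bar%20%26%20Catering.html');
echo $example, "\n";
```

Because the term is decoded exactly once, and only after it has been cut out of the path, an encoded ampersand can no longer be mistaken for a parameter separator.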

Long-term, I would suggest you eliminate all reserved characters (as outlined in the RFC I cited) from your URLs in the interest of the 'appearance' of your URLs in search results, efficiency, and site usability, especially from a type-in URL standpoint.

Jim

proper_bo

4:48 pm on Dec 22, 2005 (gmt 0)

10+ Year Member



Ok,

I can now get it to echo the two things you mentioned, but neither is 'just' the part I am trying to turn into a value.

But never mind that. I may as well do the work now rather than later, so here is my question.

If I want to do as you suggested above and remove all special characters from the URL, what do you suggest?

In this case it is passing a term to search. What I could do is use a new table in the db to list all unique search options. The url could then be a number which corresponds to the id of the search term. Good idea?

jdMorgan

5:50 pm on Dec 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would avoid id numbers as well; I assume that your purpose in putting the search terms in the URL is to help with search engine ranking by including keyword-in-URL.

If that's the case, then before creating a link to display on your pages, run each search string through a PHP preg_replace and strip out everything except characters that are allowed (unencoded) in a URL. Replace spaces with "+" or "-" and discard pretty much everything else except alphanumeric characters. You may in fact already do much of this on the input side of your on-site search script. Take a look at the search strings passed to your site in referrals from Google, Yahoo, and MSN for examples.
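
The clean-up step described above could look roughly like this. It is a sketch under assumptions rather than a finished routine: the function name is hypothetical, it lowercases the term, it uses "-" as the separator, and it decodes HTML entities first in case the stored text contains something like "&amp;".

```php
<?php
/**
 * Hypothetical helper: reduce a free-text search term to lowercase
 * alphanumerics separated by single hyphens.
 */
function cleanSearchTerm(string $term): string
{
    // Turn entities such as "&amp;" back into plain characters first,
    // so that "amp" does not end up in the URL.
    $term = html_entity_decode($term, ENT_QUOTES, 'UTF-8');

    $term = strtolower($term);

    // Replace every run of characters that are not a-z or 0-9 with one hyphen.
    $term = (string) preg_replace('/[^a-z0-9]+/', '-', $term);

    // Drop hyphens left over at either end.
    return trim($term, '-');
}

$slug = cleanSearchTerm('Portable Bar & Catering');
echo '/search-business-' . $slug . '.html', "\n";
```

Collapsing whole runs of unwanted characters into one hyphen avoids doubled separators where a space, an ampersand, and another space sit next to each other.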

If you need to look up content based on this URL being requested from your server, then it may be necessary to add a new field to your database --a 'cleaned-up' description as used in the URL-- and to use that description instead of the original free-text version as the select lookup key.
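
A lookup keyed on such a cleaned-up field might be sketched as follows. The table and column names (business, name, bustype, bustype_slug) and the sample rows are invented for illustration, and an in-memory SQLite database via PDO stands in for whatever database the site really uses.

```php
<?php
/**
 * Hypothetical lookup: find businesses by the cleaned-up description
 * taken from the URL instead of the original free-text description.
 */
function findBusinessesBySlug(PDO $db, string $slug): array
{
    // A prepared statement keeps the URL-supplied value out of the SQL text.
    $stmt = $db->prepare(
        'SELECT name, bustype FROM business WHERE bustype_slug = :slug ORDER BY name'
    );
    $stmt->execute([':slug' => $slug]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Demo data in an in-memory SQLite database (assumed schema).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE business (name TEXT, bustype TEXT, bustype_slug TEXT)');

$insert = $db->prepare('INSERT INTO business (name, bustype, bustype_slug) VALUES (?, ?, ?)');
$insert->execute(['Example Bar Hire', 'Portable Bar & Catering', 'portable-bar-catering']);
$insert->execute(['Example Plumbing', 'Plumbers', 'plumbers']);

// The slug would come from the rewritten URL, e.g. $_GET['bustype'].
$rows = findBusinessesBySlug($db, 'portable-bar-catering');
foreach ($rows as $row) {
    echo $row['name'], ' (', $row['bustype'], ")\n";
}
```

The original free-text description stays available for display, while only the cleaned-up value ever appears in the URL or the WHERE clause.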

The end result is that you'll have URLs like /search-business-portable-bar-catering.html which is easier for people to remember and type in, easier for search engines to parse, and easier for your server-side code to handle.

That's about as specific as I can get, since I don't 'know all about' your site.

Jim

proper_bo

6:19 pm on Dec 22, 2005 (gmt 0)

10+ Year Member



Genius. A cleaned up column in the business database.
I'm speechless. Why didn't I think of this!

Cheers.

Funny how one discussion can end up somewhere completely different.