the site rewrites from (for example) /search-business-Portable%20Bar%20%26amp%3B%20Catering.html
to
search.php?type=business&infoid=Portable%20Bar%20%26amp%3B%20Catering
the rewrite line is:
RewriteRule ^search-business-(.+).html /search.php?bustype=$1 [L]
I use urlencode to build the links, so all special characters are turned into %26 etc.
BUT it doesn't work when the term contains anything other than + or %20.
It still passes the variable, but only up to the special character. In the example above it would pass 'Portable Bar ' (including the trailing space, but not the &).
Does anyone know what I am doing wrong here?
[edited by: jdMorgan at 1:39 am (utc) on Dec. 22, 2005]
[edit reason] Disable smileys in code. [/edit]
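The truncation described above can be reproduced outside Apache. This is only a sketch, and it assumes that by the time search.php runs, the %26 in the path has already been decoded and the back-reference was pasted into the query string unescaped. The query string used here is made up for illustration.

```php
<?php
// Assumption: the rewritten query string already holds a literal "&",
// because the path was decoded before the pattern was matched.
$queryAfterRewrite = 'bustype=Portable Bar & Catering';

// parse_str() splits on "&" the same way PHP does when it fills $_GET.
parse_str($queryAfterRewrite, $params);

// The value stops at the "&"; the rest is treated as a second parameter.
var_dump($params['bustype']);
```

If that assumption holds, the "&" is read as a parameter separator, which would explain why only 'Portable Bar ' arrives.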
Also, I suggest reviewing RFC 2396 [faqs.org] - Uniform Resource Identifiers (URI): Generic Syntax
Jim
Thank you for your suggestions.
I eventually found the solution (or what I think is the solution) at [totalchoicehosting.com...]
It basically says that there is a problem with the rewrite of & and that the solution is to double-encode it.
I first tried your [NE] suggestion, but it failed, so I then put
$str = preg_replace("/&/","%26",$str);
into my code. It means that URLs will then have %2526 for &, but that's not the end of the world.
Any further comments on this? Do I have it all wrong?
I am now applying the following:
urlencode(urlencode($url))
to all the URLs that may contain special characters.
This gets around what appears to be a fault in mod_rewrite, where it decodes the URL-encoded value instead of just passing it through.
It's a workaround that appears to work, but it is not really ideal.
Has anyone else had experience with this problem?
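For anyone following along, here is a rough round trip of the double-encoding workaround. It is a sketch, not a description of what Apache does internally: rawurldecode() stands in for the single decoding pass the server is assumed to apply to the path, and the rewrite pattern is applied by hand.

```php
<?php
$term = 'Portable Bar & Catering';

// Build the link with the term encoded twice, as in the workaround above.
$link = '/search-business-' . urlencode(urlencode($term)) . '.html';

// Stand-in for one decoding pass over the path (an assumption for this
// sketch); rawurldecode() leaves "+" alone, so only the outer layer goes.
$pathAfterServer = rawurldecode($link);

// Apply the rewrite pattern by hand, then let PHP parse the query string.
preg_match('#^/search-business-(.+)\.html$#', $pathAfterServer, $m);
parse_str('bustype=' . $m[1], $params);

// True when the original term survives the round trip.
var_dump($params['bustype'] === $term);
```

The second layer of encoding is what survives the first decode, so the "&" is still %26 when PHP parses the query string.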
You can access the un-decoded *URI* by using RewriteCond and the variables %{REQUEST_URI} and/or %{THE_REQUEST}, and then use it as a back-reference in RewriteRule with [NE] if required.
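A PHP-side alternative to the RewriteCond approach, sketched under a few assumptions: PHP under Apache normally reports the URI as originally requested in $_SERVER['REQUEST_URI'], so the script can pull out the still-encoded term and decode it exactly once. The helper name searchTermFromRequestUri is made up for this example. (On the Apache side, the documentation describes THE_REQUEST as the value that has not been unescaped.)

```php
<?php
// Hypothetical helper: extract the still-encoded search term from the raw
// request URI and decode it exactly once.
function searchTermFromRequestUri(string $requestUri): ?string
{
    // Drop any query string; parse_url() does not decode the path.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    // Same URL shape as the rewrite rule, with the dot escaped.
    if (!preg_match('#^/search-business-(.+)\.html$#', $path, $m)) {
        return null;
    }
    // rawurldecode() keeps a literal "+"; use urldecode() instead if the
    // links encode spaces as "+".
    return rawurldecode($m[1]);
}

// In search.php this would typically be called with $_SERVER['REQUEST_URI'].
var_dump(searchTermFromRequestUri('/search-business-Portable%20Bar%20%26%20Catering.html'));
```

Done this way, the rewrite rule only needs to route the request to search.php and the term never passes through the query string.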
Long-term, I would suggest you eliminate all reserved characters (as outlined in the RFC I cited) from your URLs in the interest of the 'appearance' of your URLs in search results, efficiency, and site usability, especially from a type-in URL standpoint.
Jim
I can now get it to echo the two things you mentioned, but neither is 'just' the part I am trying to turn into a value.
But never mind that. I may as well do the work now rather than later, so here is my question.
If I want to do as you suggested above and remove all special characters from the URL, what do you suggest?
In this case it is passing a term to search. What I could do is use a new table in the DB to list all unique search options. The URL could then be a number which corresponds to the ID of the search term. Good idea?
If that's the case, then before creating a link to display on your pages, run each search string through a PHP preg_replace and strip out everything except characters that are allowed (unencoded) in a URL. Replace spaces with "+" or "-" and discard pretty much everything else except alphanumeric characters. You may in fact already do much of this on the input side of your on-site search script. Take a look at the search strings passed to your site in referrals from Google, Yahoo, and MSN for examples.
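A minimal sketch of that clean-up step follows. The function name searchSlug is made up, and decoding HTML entities first is an assumption based on the "&amp;" visible in the example URL at the top of the thread.

```php
<?php
// Hypothetical helper: turn a free-text search term into a URL-safe slug.
function searchSlug(string $term): string
{
    // Assumption: stored terms may contain HTML entities such as "&amp;".
    $term = html_entity_decode($term, ENT_QUOTES, 'UTF-8');
    $term = strtolower($term);
    // Collapse every run of non-alphanumeric characters into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $term);
    // Remove hyphens left over at either end.
    return trim($slug, '-');
}

echo '/search-business-' . searchSlug('Portable Bar & Catering') . '.html', "\n";
// prints /search-business-portable-bar-catering.html
```

Two different terms can collapse to the same slug (for example "Bar & Grill" and "Bar Grill"), so the slug is better stored and checked for uniqueness than recomputed on every request.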
If you need to look up content based on this URL being requested from your server, then it may be necessary to add a new field to your database (a 'cleaned-up' description as used in the URL) and to use that description instead of the original free-text version as the select lookup key.
The end result is that you'll have URLs like /search-business-portable-bar-catering.html, which are easier for people to remember and type in, easier for search engines to parse, and easier for your server-side code to handle.
That's about as specific as I can get, since I don't 'know all about' your site.
Jim