|.htaccess rewrite rule to shorten url|
I'm not a big .htaccess friend and have been using it just for simple redirects.
Sorry, I couldn't find anything pertaining to my case at Introduction to mod_rewrite [webmasterworld.com], so I'm turning to the big Apache heads for help.
Script writes the following unfriendly URL when a request to the home page is made:
All internal URLs are then interpreted like this: cgi-bin/scriptdir1/scriptdir2/somepage.htm?SomeScriptVariable.
I just want them to be without the cgi-bin/scriptdir1/scriptdir2 part.
I've compiled the following set of rewrite rules which I don't want to test without Somebody Smart's resolution:
RewriteRule ^cgi-bin/wspr/planet/index.htm$ index.htm [L] - just for this particular URL -
RewriteRule ^cgi-bin/wspr/planet/(.*)$1?(.*)$2 $1?$2 [L] - the mask for all other cases -
I'm not sure as to the proper syntax of wildcards here so I would appreciate any assistance.
That code won't work, so I'd suggest you don't try it... :(
The bigger question here is, what are you trying to accomplish?
If you want your URLs to appear to be shorter and friendlier to users and to search engines, then the general approach is this:
1. Modify your scripts to output friendly URLs.
2. Clients and search engines will see and use those URLs.
3. When clients request a friendly URL, it is intercepted by mod_rewrite, and translated back into the long form needed to properly access your existing directory structure, and to call your script.
Mod_rewrite works in the time after a request is received by your server, but before any content is served and before any scripts are called. Therefore, it cannot be used to change URLs after your script has output them in a response to a client browser or search engine.
I think this may change your question, but if not, we can discuss your code. The main problem with it is that your back-reference syntax is incorrect; "$1" and "$2" should appear only in the substitution string in the rule, not in the pattern. See the Apache mod_rewrite documentation [httpd.apache.org], specifically the discussion of back-references in the RewriteRule section.
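To illustrate the back-reference point, here is a minimal sketch of the general form. The directory names old-dir and new-dir are placeholders I made up for illustration, not paths from your site, so treat this as a syntax example rather than a rule to deploy.

```apache
# Syntax illustration only; "old-dir" and "new-dir" are hypothetical names.
RewriteEngine On

# Parentheses in the pattern (first argument) capture text.
# $1 in the substitution (second argument) re-inserts what was captured.
# The pattern is matched against the URL path only, never the query string,
# so a "?" followed by a group in the pattern cannot capture script variables.
# With no "?" in the substitution, the original query string is passed
# through unchanged. To test the query string itself, a RewriteCond on
# %{QUERY_STRING} would be needed.
RewriteRule ^old-dir/(.*)$ /new-dir/$1 [L]
```

A request for old-dir/page.htm?x=1 would, under these assumptions, be rewritten internally to /new-dir/page.htm with ?x=1 still attached.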
And sorry for my lame code:(
|When clients request a friendly URL, it is intercepted by mod_rewrite, and translated back into the long form needed to properly access your existing directory structure, and to call your script. |
If I get this right I will then need to re-request the long form from a newly created URL probably like this:
RewriteRule ^cgi-bin/wspr/planet/([^/]+)\.htm$ $1.htm [R,L]
RewriteRule ^([^/]+)\.htm$ cgi-bin/wspr/planet/$1.htm [L]
Will it keep somename.htm in the URL but refer to the file under cgi-bin/wspr/planet/ to call the script that produces the content?
Thanks a lot for help
No, you won't need two rules. Just follow the three-item list I posted above. There's no (easy) way I know of to avoid modifying the script, because it is the "agent" that outputs page content (and therefore the on-page links) sent back to the client browser or search engine. So, it must produce the short URLs that "you want the public to see." Then you use mod_rewrite to convert those incoming short URLs (when they are requested by clients) back into whatever long form you need to work with your site's directory setup and scripts.
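As a rough sketch of that single rule, assuming the .htaccess file sits in your document root, that the script is reachable under /cgi-bin/wspr/planet/, and that the script has already been changed to output short links such as somepage.htm?SomeScriptVariable:

```apache
# Sketch only: assumes this .htaccess is in the document root and that
# /cgi-bin/wspr/planet/ is the correct URL-path to the script's pages.
RewriteEngine On

# Internal rewrite (no R flag), so the client keeps seeing the short URL.
# [^/]+ matches a name with no slashes, so the rewritten long path
# does not match this rule again.
# Any query string on the request is carried over unchanged.
RewriteRule ^([^/]+)\.htm$ /cgi-bin/wspr/planet/$1.htm [L]
```

Because there is no [R] flag, no redirect is sent to the client; the address bar stays at somepage.htm while the server fetches the long path behind the scenes. Test on a non-production copy first, since a typo in .htaccess can produce a 500 error.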
This might help -- Here's a simplified sequence of events in an HTTP GET transaction:
1. Client browser or search engine sends a page request (URL) to your server.
2. Mod_rewrite on your server modifies the requested URL and converts it to a local filesystem path.
3. Server reads that file from disk and either serves (transmits) the file, or runs it as a script (depending on its MIME-type & server setup).
4. Client browser displays the file content sent by the server, or search engine spider analyzes the content.
5. User clicks, or search engine requests, a link on that page.
As you can see, because of its place in the sequence of events, mod_rewrite can change a requested URL to a filename other than what was requested, but it cannot change an on-page-link URL that is being sent back to the client - the script must do this, or if you are serving a static document, the link URLs must be changed on the requested page itself.
My somewhat-terse warning about the code was meant only to prevent you from taking your site down with a 500-Server Error - which is never a good thing. :o