But I really want it to be
http://www.example.com/mushroom/spiders.html
But the CMS support tells me the ".html" will cause problems.
Now I'm turning to server-side solutions. Could I simply use mod_rewrite to "ignore" or "remove" the .html part before sending the request to the CMS? I don't want this to be a redirect, though.
I know very little about mod_rewrite. If anyone could help, that would be cool.
Thanks much.
[edited by: jdMorgan at 7:29 pm (utc) on Sep. 4, 2005]
[edit reason] Example.com [/edit]
If you aren't worried about duplicate content and have control over the links, you could just add (\.html)? to the end of the pattern (the left side) of each rule in the ruleset, and the same content would be served both with and without .html. Then you just have to change your links. Beyond that it gets more complex.
Assuming *all* pages are from the CMS, you could also use a single rule at the beginning of the CMS ruleset:
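As a rough sketch of the optional-extension idea: the actual CMS ruleset hasn't been posted, so the rule below is purely hypothetical. It assumes a .htaccess file in the site root and a made-up script /cms.php that takes the page name in a "page" parameter.

```apache
# Hypothetical original CMS rule, matching only the extensionless URL:
# RewriteRule ^news/([^/.]+)$ /cms.php?page=$1 [L]

# Same rule with (\.html)? added to the end of the pattern, so that
# /news/excellent and /news/excellent.html are both served the same content.
# $1 is still the page name; the optional extension is captured as $2 and ignored.
RewriteRule ^news/([^/.]+)(\.html)?$ /cms.php?page=$1 [L]
```

Because both URLs now return the same page, this is where the duplicate-content concern comes from.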
RewriteRule ([^.]+)\.html /$1
Please note: The [L] flag is left off intentionally. This will treat any page ending in .html as its extensionless equivalent, which is then evaluated against all the current rules that follow, and content is served accordingly.
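For anyone reusing that rule, a slightly more defensive variant might look like the sketch below. The anchors and the optional leading slash are my additions, not part of the rule above, and this has not been tested against any particular CMS.

```apache
RewriteEngine On

# Anchor the pattern so only URLs that actually end in .html are touched.
# ^/? lets the same pattern match in httpd.conf (leading slash present)
# and in .htaccess (leading slash stripped before matching).
# No [L] flag, as above: the extensionless URL is handed on to the CMS rules.
RewriteRule ^/?([^.]+)\.html$ /$1
```

With this, a request for /news/excellent.html is rewritten internally to /news/excellent, and no redirect is sent to the browser.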
Again duplicate content is going to be the tough part to get around...
Would you post a portion of the ruleset your CMS is using to serve the content to the pages, with the domain changed to example.com?
Justin
Your mod_rewrite works PERFECTLY!
RewriteRule ([^.]+)\.html /$1
THANKS much for your help. I really appreciate it.
I have translated from (1) to (2)
(1) http://www.example.com/news/excellent.html
(2) http://www.example.com/index.php/default_admin/news/excellent.html
Well, (1) and (2) both work now. That's what I want in the first place.
I'm using Ezpublish so if you are familiar with this CMS, you will know it adds weird index.php/{access} to every URL as well as ignoring .html altogether.
Having converted the dynamic ASP URLs to static ones, which works great, I have also changed all the links on my site to point to the new static pages.
My question is this: the search engines have cached the earlier ASP query-string pages and list these in the index. These pages still work but obviously have no links to them from the site.
Will the search engine just replace the pages as it goes, or will it think it's duplicate content and start deleting them?
Anyone have experience of this?
Here is an example:
www.yoursite.com/yourfile/1/yourpage.html
is the same as
www.yoursite.com/yourasppage.asp?id=1&page=yourpage.
This would be the main rewrite (an internal rewrite, not a redirect):
RewriteRule ^yourfile/([^/]+)/([^.]+)\.html$ /yourasppage.asp?id=$1&page=$2 [L]
The converse is:
RewriteCond %{THE_REQUEST} yourasppage\.asp\?id=([0-9]+)&page=([a-z]+)
RewriteRule ^yourasppage\.asp$ /yourfile/%1/%2.html? [R=301,L]
The trailing ? on the substitution clears the original query string, so it is not appended to the redirected URL.
You will need to do this for all possible variables in your query strings. Usually you can group a few, but most of the time it will require quite a few individual rules.
THE_REQUEST holds the original request line sent by the client and is not changed by internal rewrites, so the condition matches only original requests, not internally rewritten ones. Browsers and SEs requesting the .asp page are redirected, but the same page requested internally by the first rule will still be served. This is the usual way to do this type of rewrite without an infinite loop.
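Putting the two directions together, a sketch of the complete pair might look like the following. It uses the placeholder names from the example above and assumes the rules live in a .htaccess file in the site root, that id is always numeric, and that page contains only lowercase letters. Adjust the character classes to whatever your real values contain.

```apache
RewriteEngine On

# 1) External 301: a client asked for the old dynamic URL directly.
#    THE_REQUEST looks like: GET /yourasppage.asp?id=1&page=yourpage HTTP/1.1
#    Anchoring on the method and on " HTTP/" avoids partial matches.
#    The trailing ? on the target drops the old query string.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /yourasppage\.asp\?id=([0-9]+)&page=([a-z]+)\ HTTP/
RewriteRule ^yourasppage\.asp$ /yourfile/%1/%2.html? [R=301,L]

# 2) Internal rewrite: serve the static-looking URL from the ASP script.
#    $1 and $2 carry the id and page name into the query string.
RewriteRule ^yourfile/([0-9]+)/([a-z]+)\.html$ /yourasppage.asp?id=$1&page=$2 [L]
```

The two rules match different URLs, so they do not interfere with each other. The internal request created by rule 2 does not trigger rule 1, because THE_REQUEST still contains the .html URL the client originally asked for.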
For efficiency I would recommend you put the new rules and conditions at the end of your current file.
Hope this helps.
Justin