akameng

msg:3480735 | 11:39 am on Oct 18, 2007 (gmt 0) |
Hey, try this:
RewriteEngine on
RewriteBase /info/
# exclude /info/a-webpage/ by !^/info/.+/.*
RewriteCond %{REQUEST_URI} !^/info/.+/.*
# include only ^/info/(.+)\.html$
RewriteRule ^/info/([^/]+)\.html$ /$1/ [R=301,L]
|
jdMorgan

msg:3480772 | 12:35 pm on Oct 18, 2007 (gmt 0) |
This approach is sort of backwards: a redirect does not 'create' an extensionless URL. In fact, a redirect does not 'create' a URL at all; URLs are defined by the links on your pages. Only filenames are defined on or by your server.
To deploy extensionless URLs:
1) Edit your pages (or your page-generation script) to link to extensionless URLs.
2) Add mod_rewrite code to internally rewrite those URLs, when requested from your server, to the correct-extension file.
3) Optional: Detect client requests for URLs with extensions, and externally redirect those to the extensionless URL. The purpose of this is to 'recover' old backlinks and user bookmarks, and to speed up the switchover to your extensionless URLs in search engine results.
So basically, you're trying to do step 3 here without doing the other two steps. This will result in your visitors having to go through the added delay of an external redirect for every extensionless page request, and will complicate the search engines' job of indexing those pages.
Also, I question your use of RewriteBase; I don't think you need it here. And further, extensionless URLs should not end with a slash; a URL ending with a slash indicates a directory, not a file, and this will likely also cause you problems/complications with linked objects on your extensionless pages.
Here are examples of the two rules you might use to implement extensionless URLs for /info .html files, assuming you have changed the links on your pages to remove the .html extensions for URL-paths resolving to the /info directory:
RewriteEngine on
RewriteBase /
#
## Internally rewrite extensionless /info URLs to existing .html files
# If no filetype extension on requested URL
RewriteCond %{REQUEST_URI} !\.[a-z0-9]+$
# If URL plus extension exists as a file
RewriteCond %{REQUEST_FILENAME}.html -f
# Internally rewrite to file with extension
RewriteRule ^info/(.*)$ /info/$1.html [L]
#
## Externally redirect old .html-extension /info URLs to new extensionless URLs
# If direct client request for .html files
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^.]+\.html\ HTTP/
# Externally redirect to URL without extension
RewriteRule ^info/([^.]+)\.html$ http://www.example.com/info/$1 [R=301,L]
These are freshly-typed and untested. A known limitation is that the simple regex patterns shown here do not support URLs with periods in the directory pathnames.
The check for 'file exists with .html extension' is not strictly required for your simple application. However, I show it here in case you might like to add another file extension later. For example, if the requested URL-path does not resolve to an existing .html file, you could add another rule to check to see if it exists as a .htm or .shtml file. If you only ever plan to support one filetype, you can comment out or delete the RewriteCond for .html file-exists checking for improved performance.
Jim
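The decision logic of the two rule sets above can be sketched in Python as a rough model. This is only an illustration, not how Apache evaluates rules: Python's `re` module stands in for Apache's regex engine, the `EXISTING_FILES` set is a hypothetical stand-in for the `-f` file-exists check, and the `.htm`/`.shtml` entries show the optional fallback described above rather than anything in the posted code.

```python
import re

# Hypothetical files under the document root (illustration only).
EXISTING_FILES = {"/info/widgets.html", "/info/legacy.htm"}

# .html is the case from the thread; .htm and .shtml model the optional
# "check another extension" fallback mentioned in the post.
EXTENSIONS = (".html", ".htm", ".shtml")


def handle(request_line):
    """Return (action, target) for a raw request line such as
    'GET /info/widgets HTTP/1.1'."""
    _method, path, _version = request_line.split(" ")

    # Rule set 1: extensionless /info URL -> internal rewrite to the
    # first candidate file that exists.
    if path.startswith("/info/") and not re.search(r"\.[a-z0-9]+$", path):
        for ext in EXTENSIONS:
            if path + ext in EXISTING_FILES:
                return ("rewrite", path + ext)

    # Rule set 2: the client itself asked for an /info .html URL -> 301.
    asked_for_html = re.search(r"^[A-Z]{3,9}\ /[^.]+\.html\ HTTP/", request_line)
    m = re.search(r"^/info/([^.]+)\.html$", path)
    if asked_for_html and m:
        return ("redirect", "http://www.example.com/info/" + m.group(1))

    return ("pass", path)
```

In real mod_rewrite the THE_REQUEST test matters because that variable keeps the client's original request line even after an internal rewrite, which stops the redirect rule from firing on the rewritten .html path. This model only ever sees the client's request line, so it does not demonstrate that loop protection.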
|
akameng

msg:3481210 | 7:43 pm on Oct 18, 2007 (gmt 0) |
I am sure that jdMorgan has the best solution, but I would suggest modifying only this line:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^.]+\.html\ HTTP/
to:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\s/[^.]+\.html\sHTTP/\d\.\d$
The only changes are the escaped space (\ ) becoming \s for easier reading, and HTTP/ becoming HTTP/\d\.\d$ so that it matches HTTP/1.1 or any other version. This is because THE_REQUEST contains the full HTTP request line sent by the browser to the server (e.g., "GET /index.html HTTP/1.1"). It does not include any additional headers sent by the browser. The method will be one of OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, CONNECT.
|
jdMorgan

msg:3481335 | 9:52 pm on Oct 18, 2007 (gmt 0) |
Because the pattern ending in HTTP is not end-anchored, there is no need to specify anything past the end of "HTTP/". The difference between "\ " and "\s" is largely a matter of style. Jim
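As a quick check of the point about anchoring, the two conditions can be compared on ordinary request lines. This is a sketch that uses Python's `re` module as a stand-in for Apache's regex engine (an assumption; the engines are not identical), and the sample request lines are made up for illustration.

```python
import re

# Pattern from the original rule: nothing is required after "HTTP/".
ORIGINAL = re.compile(r"^[A-Z]{3,9}\ /[^.]+\.html\ HTTP/")

# Suggested variant: \s instead of "\ ", plus an explicit version and end anchor.
SUGGESTED = re.compile(r"^[A-Z]{3,9}\s/[^.]+\.html\sHTTP/\d\.\d$")


def verdicts(request_line):
    """Return (original_matches, suggested_matches) for one request line."""
    return (bool(ORIGINAL.search(request_line)),
            bool(SUGGESTED.search(request_line)))
```

For well-formed request lines such as `GET /info/page.html HTTP/1.1` both patterns give the same verdict, which is why the extra version check adds nothing in practice.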
|
g1smd

msg:3481417 | 11:44 pm on Oct 18, 2007 (gmt 0) |
When I use "extensionless URLs" they are actually index files each in their own folder. The URL ends with a trailing / every time.
|
MrBlack

msg:3481432 | 11:57 pm on Oct 18, 2007 (gmt 0) |
Thanks for the help, guys. g1smd, that's a great idea too and will probably be the best solution for me. But do you need to place an .htaccess file in every directory to remove the index.html, or have you done this with the .htaccess in the root? [edited by: MrBlack at 12:00 am (utc) on Oct. 19, 2007]
|
g1smd

msg:3481447 | 12:09 am on Oct 19, 2007 (gmt 0) |
I use the .htaccess file in the root to control everything on the whole site. All requests for (default¦index)\.(php(4¦5)?¦html?¦cfm¦aspx?) are stripped back to the preceding "/". All requests with parameters on the end have those stripped too.
|
MrBlack

msg:3481480 | 12:53 am on Oct 19, 2007 (gmt 0) |
OK, this is what I have come up with to remove index.html from the URLs in the root and all subdirectories:
RewriteEngine on
RewriteCond %{THE_REQUEST} ^GET\ /.*/index\.html\ HTTP/
RewriteRule (.*)index\.html$ /$1 [R=301,L]
Can you see any problems with it? Really appreciate your help, guys!
|
jdMorgan

msg:3481509 | 1:31 am on Oct 19, 2007 (gmt 0) |
Unless you use the generic pattern I show in the code I posted above, you'll also want to include the HEAD method, as well as GET. Jim
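To illustrate the difference between a GET-only condition and the generic method pattern, here is a small sketch. Python's `re` is used as a stand-in for Apache's regex engine, and the request lines are hypothetical examples.

```python
import re

# Condition as posted: only GET requests are recognised.
GET_ONLY = re.compile(r"^GET\ /.*/index\.html\ HTTP/")

# Generic method pattern: any 3 to 9 uppercase letters (GET, HEAD, POST, ...).
ANY_METHOD = re.compile(r"^[A-Z]{3,9}\ /.*/index\.html\ HTTP/")


def is_redirect_candidate(pattern, request_line):
    """True if the condition pattern would accept this request line."""
    return pattern.search(request_line) is not None
```

A HEAD request for `/docs/index.html` slips past the GET-only condition, so it would not be redirected, while the generic pattern catches it.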
|
MrBlack

msg:3486578 | 8:33 pm on Oct 24, 2007 (gmt 0) |
OK, I have now come up with the following code, which removes the index.html from the URL for the subdirectory and sub-subdirectory:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*/index\.html\ HTTP/
RewriteRule (.*)index\.html$ /$1 [R=301,L]
However, I cannot make it work for the root index.html as well. Any ideas where I am going wrong?
|
jdMorgan

msg:3486636 | 9:25 pm on Oct 24, 2007 (gmt 0) |
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html$ http://www.example.com/$1 [R=301,L]
Jim
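The reason the earlier condition misses the root page is that `/.*/index\.html` requires a second slash before `index.html`, whereas `/([^/]+/)*index\.html` allows zero or more directory segments. The sketch below checks this with Python's `re` as a stand-in for Apache's regex engine; the helper name `redirect_target` is hypothetical, and it assumes the rule sees the URL-path without a leading slash, as in a root .htaccess file.

```python
import re

OLD_COND = re.compile(r"^[A-Z]{3,9}\ /.*/index\.html\ HTTP/")
NEW_COND = re.compile(r"^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/")

# Rule pattern from the reply: group 1 captures the directory part, if any.
NEW_RULE = re.compile(r"^(([^/]+/)*)index\.html$")


def redirect_target(url_path):
    """Return the redirect URL for a per-directory URL-path, or None."""
    m = NEW_RULE.search(url_path)
    return "http://www.example.com/" + m.group(1) if m else None
```

With these patterns, `index.html` in the root redirects to `http://www.example.com/` and `a/b/index.html` to `http://www.example.com/a/b/`.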
|
MrBlack

msg:3486658 | 9:44 pm on Oct 24, 2007 (gmt 0) |
Thanks very much
|
g1smd

msg:3486668 | 9:59 pm on Oct 24, 2007 (gmt 0) |
So that I can slot the same code on to every website, I don't just test for index.html requests. I test for (default¦index)\.(php(4¦5)?¦html?¦cfm¦aspx?) and all of those redirect. It also partly hides which technology the site is actually using.
|
MrBlack

msg:3488044 | 4:44 am on Oct 26, 2007 (gmt 0) |
"So that I can slot the same code on to every website, I don't just test for index.html requests. I test for (default¦index)\.(php(4¦5)?¦html?¦cfm¦aspx?) and all of those redirect. It also partly hides which technology the site is actually using."
Sounds good. How would you slot this into the code that jdMorgan provided? I am particularly interested in checking for index.php as well, as I am currently converting a site running on PHP to straight HTML. The PHP site had the URLs rewritten to .html extensions, apart from the root index.php page. Sorry, I am a newbie when it comes to this :)
|
g1smd

msg:3488252 | 12:52 pm on Oct 26, 2007 (gmt 0) |
Replace each index\.html in the code with (default¦index)\.(php(4¦5)?¦html?¦cfm¦aspx?) instead. Remember to replace the forum pipe symbols with the correct pipe symbols when you edit this code.
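To see which filenames the alternation covers once the forum's broken-bar characters are replaced with real pipe symbols, here is a small sketch. It uses Python's `re` as a stand-in for Apache's regex engine, and `COMBINED_RULE` is an assumed result of applying the substitution to the rule pattern posted earlier in the thread, not code taken from the thread.

```python
import re

# The alternation written with real pipe symbols.
INDEX_NAME = r"(default|index)\.(php(4|5)?|html?|cfm|aspx?)"

# Assumed combination with the earlier rule pattern (index\.html replaced).
COMBINED_RULE = re.compile(r"^(([^/]+/)*)" + INDEX_NAME + r"$")


def is_index_file(filename):
    """True if the bare filename is one of the index-style names."""
    return re.fullmatch(INDEX_NAME, filename) is not None


def directory_part(url_path):
    """Return the directory prefix the rule would keep, or None."""
    m = COMBINED_RULE.search(url_path)
    return m.group(1) if m else None
```

Names such as `index.htm`, `default.aspx` and `index.php5` are accepted, while `home.html` and `index.php3` are not.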
|
hailg03

msg:3614001 | 7:29 am on Mar 29, 2008 (gmt 0) |
How do I redirect URLs with periods? This is what I am using right now, and it works for URLs without periods. For example, how do I get all URLs similar to www.example.com/you.are.html redirected to www.example.com/you.are?
RewriteEngine on
RewriteBase /
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^.]+\.html\ HTTP/
RewriteRule ^([^.]+)\.html$ /$1 [R=301,L]
Thanks
|
jdMorgan

msg:3614220 | 4:12 pm on Mar 29, 2008 (gmt 0) |
Change the rule pattern:
RewriteRule ^(.+)\.html$ /$1 [R=301,L]
The original pattern was written for pattern-matching efficiency, but specifically excludes periods anywhere in the URL-path, except preceding the filetype. See the regular-expressions tutorial cited in our forum charter for more info. Jim
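The effect of swapping `[^.]+` for `.+` in the rule pattern can be checked on the example URL-path from the question. This is a sketch using Python's `re` as a stand-in for Apache's regex engine; the helper name `captured` is hypothetical.

```python
import re

ORIGINAL_RULE = re.compile(r"^([^.]+)\.html$")  # no periods before .html
REVISED_RULE = re.compile(r"^(.+)\.html$")      # periods allowed anywhere


def captured(pattern, url_path):
    """Return what $1 would hold, or None if the pattern does not match."""
    m = pattern.search(url_path)
    return m.group(1) if m else None
```

The original pattern rejects `you.are.html` outright, while the revised one captures `you.are`; both behave the same on a path without extra periods such as `page.html`.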
|
hailg03

msg:3614266 | 5:25 pm on Mar 29, 2008 (gmt 0) |
Hey Jim, I did what you said and also changed
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^.]+\.html\ HTTP/
to
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(.+)\.html\ HTTP/
which got the result I wanted. Let me know if this change is fine. I really don't know much about mod_rewrite.
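The condition needs the same loosening as the rule because it carries the same `[^.]+` restriction. A minimal check, again with Python's `re` standing in for Apache's regex engine and a made-up request line:

```python
import re

ORIGINAL_COND = re.compile(r"^[A-Z]{3,9}\ /[^.]+\.html\ HTTP/")
REVISED_COND = re.compile(r"^[A-Z]{3,9}\ /(.+)\.html\ HTTP/")

# Hypothetical request for the URL from the question above.
REQUEST = "GET /you.are.html HTTP/1.1"

original_accepts = ORIGINAL_COND.search(REQUEST) is not None
revised_accepts = REVISED_COND.search(REQUEST) is not None
```

Here `original_accepts` is false and `revised_accepts` is true, which is consistent with the redirect only working once both lines were changed.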
|