URL Rewrite, removing extensions

need both trailing slash and no trailing slash

         

StoutFiles

4:08 pm on Sep 26, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My website currently rewrites extensions as such:

RewriteRule ^(.*)\/$ $1.php [NC]

example.com/toy.php
becomes
example.com/toy/

This is fine except for the fact that example.com/toy (without trailing slash) returns a 404. Firefox will bookmark the website with the trailing slash, IE will not. Is there any way I can rewrite the .php URL to work for both example.com/toy and example.com/toy/ ?

[edited by: StoutFiles at 4:08 pm (utc) on Sep. 26, 2008]

jdMorgan

4:15 pm on Sep 26, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I suggest that you *never* use the trailing slash for linking files or on-page objects; trailing slashes indicate a directory or directory index (See HTTP URL specification).

Rewriting both slash- and non-slash URLs directly to the script will create duplicate content.

Add an external redirect to force the client to add or remove the slash, as desired. This will prevent the dupe-content problem.

There is no need to escape slashes in mod_rewrite regular expressions.

Jim

StoutFiles

4:24 pm on Sep 26, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've set up all my links to use the trailing slash... I guess what I'm asking is how I should get example.com/toy rewritten as example.com/toy/ in .htaccess.

It would be similar to having example.com rewritten as www.example.com when typed into the url bar.

g1smd

4:30 pm on Sep 26, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



*** Is there any way I can rewrite the .php URL to work for both example.com/toy and example.com/toy/ ? ***

If both work, and both return content with a "200 OK" status, then your site is serving Duplicate Content.

One can return content, and the other must return either a 404 error, or a 301 redirect to the canonical form.

StoutFiles

4:54 pm on Sep 26, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I guess I could 301 toy to toy/. Would search engines have an easier time with toy rather than toy/ though?

g1smd

4:56 pm on Sep 26, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Technically,
/toy is a file, and,
/toy/ is a folder, or an index page within that folder.

jdMorgan

5:33 pm on Sep 26, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The following code is correct to answer your question. However, I recommend that you *do not* add a trailing slash to filenames. This is a mistake which may cost you in the future. I recommend that you follow the HTTP specification, remove the trailing slashes from your links, and then modify your code to 'correct' the old slashed links. You will likely have to deal with problems and complications related to these trailing slashes for years. Please don't say I didn't warn you...

# Externally redirect extensionless URLs to add a trailing slash (this is non-HTTP-compliant and not recommended)
RewriteCond $1 ^([^/]+/)*[^./]+$
RewriteRule ^(.*[^/])$ http://www.example.com/$1/ [R=301,L]
#
# Internally rewrite extensionless trailing-slashed URLs to php scripts
RewriteRule ^([^.]*)/$ $1.php [L]

The RewriteCond prevents URLs with file extensions, such as robots.txt or your $1.php files themselves from being redirected.

Good luck...

Jim

StoutFiles

6:35 pm on Sep 26, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, they aren't actually filenames.

example.com/toy/ is actually a homepage for toys. I rewrite ?toyvariable=ball so that the address is now example.com/toy/ball without the trailing slash.

Yes, toy.php is a file name and not a folder but it might as well be a folder with the way I'm using it. Instead of toy.php and toy.php?toyvariable=ball you get toy/ and toy/ball.

Thank you for the rewrite code. I will look into fixing the slashes in the future... the way the site is put together, fixing them now would be just as complicated as fixing them down the line.

[edited by: StoutFiles at 6:37 pm (utc) on Sep. 26, 2008]
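
For readers following along, the mapping described in this post (toy/ball served by toy.php?toyvariable=ball) could be produced by a rule along these lines. This is only a sketch: the rule actually in use was not posted, and it assumes the .htaccess file sits in the document root with the rewrite engine already enabled.

```apache
# Hypothetical rule, not taken from the thread:
# internally rewrite /toy/ball to /toy.php?toyvariable=ball.
# The pattern is anchored at both ends and excludes "/" and ".",
# so /toy/ itself and /toy/ball.php are not matched by it.
RewriteRule ^toy/([^/.]+)$ toy.php?toyvariable=$1 [L]
```

Note that a substitution containing its own query string replaces any query string on the original request unless the QSA flag is added.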

jdMorgan

6:43 pm on Sep 26, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Well, they aren't actually filenames.

Doesn't matter.

The fact that you had trouble with IE should be telling you something...

Jim

StoutFiles

7:24 am on Sep 27, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is there any way to 301 or rewrite the URLs for just a few of the pages instead of all of them? I want a trailing slash on only a couple of pages, not on most of them.

Example:
example.com/toy will be rewritten or redirected to example.com/toy/
example.com/phone will not be altered.

So far I can only find examples that affect all URLs, and when I try to 301 just example.com/toy to example.com/toy/ in .htaccess I get example.com/toy////////////////////// etc.

g1smd

8:50 am on Sep 27, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That's usually because you created a loop, either by not end-anchoring your pattern (which means it matches anything that "begins with" the pattern), or by not making it specific enough, so that after redirecting, the new URL still matches the pattern and gets redirected again.
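
As an illustration of that point, here is a sketch of how the runaway slashes can arise and how an end-anchored pattern avoids them. It assumes the looping redirect was written with the mod_alias Redirect directive, which this thread does not confirm, and that the rules live in the document-root .htaccess with the rewrite engine already enabled. The page names "games" and "puzzles" are made up.

```apache
# Assumed cause of the loop (not confirmed in this thread):
# mod_alias "Redirect" matches by URL-path prefix and appends whatever
# follows the prefix to the target, so /toy -> /toy/, then
# /toy/ -> /toy//, then /toy// -> /toy///, and so on.
#Redirect 301 /toy http://www.example.com/toy/

# End-anchored alternative: only the exact path "toy" matches, so the
# redirected URL "toy/" does not match and is not redirected again.
RewriteRule ^toy$ http://www.example.com/toy/ [R=301,L]

# The same idea for more than one page ("games" and "puzzles" are
# hypothetical names). Anything not listed, such as "phone", is untouched.
RewriteRule ^(games|puzzles)$ http://www.example.com/$1/ [R=301,L]
```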

StoutFiles

4:38 am on Sep 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, I've worked the trailing slash out of all my files. I originally planned it with the slash after I read multiple articles saying search engines preferred the trailing slash as the page appears more static.

However, you've been at this much longer than I have jd, so I'll take your word for it.

jdMorgan

3:48 pm on Sep 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK, then something like this should reverse the previous functions:

# Externally redirect direct client (browser or robot) requests
# for URLs ending in ".php" to extensionless URLs
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*[^.]+\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*[^.]+)\.php$ http://www.example.com/$1 [R=301,L]
#
# Externally redirect extensionless URLs which will resolve to existing php files
# to remove the trailing slash, except for URLs starting with "this" or "that"
RewriteCond $1 !^(this|that)
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([^.]+)/$ http://www.example.com/$1 [R=301,L]
#
# Internally rewrite extensionless URLs to php scripts, except for URLs starting with
# "admin", "stats", or "phone", or which resolve to existing directory index pages
RewriteCond $1 !^(admin|stats|phone)
RewriteCond %{REQUEST_FILENAME}/$1 !-d
RewriteRule ^(([^/]+/)*[^./]+)$ $1.php [L]

The RewriteConds containing "this|that" and "admin|stats|phone" are examples of how to exclude URLs from redirection or rewriting; modify or delete them as needed.

If you already have a rule to redirect direct client requests for "index.php" to "/", then it should go before the first rule here. And if you already have a rule to redirect non-canonical hostname requests, then it should go after the second redirect posted here. In general, put all external redirects first, in order from most-specific to least-specific, followed by all internal rewrites, again from most-specific to least-specific.

"Most-specific" rules will have a very specific pattern, and affect one or only a very few requests. Least-specific rules will have very general patterns, and so will have the potential to affect many or almost all requests -- for example, the domain canonicalization redirect rule will redirect *any* request for *any* URL, if the requested hostname is incorrect.

Jim
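
As a rough illustration of the ordering described in the last post, the skeleton below shows where each rule would sit. The index.php and hostname rules are assumptions added for illustration and were not posted in this thread. The numbered placeholders stand for the three rules in the post above. Adjust the hostname and patterns to suit.

```apache
# --- External redirects, most specific first ---

# 1. Direct client requests for index.php -> "/" (illustrative, assumed)
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1 [R=301,L]

# 2. Direct client requests for other ".php" URLs -> extensionless URLs
#    (the first rule from the post above goes here)

# 3. Trailing-slash removal
#    (the second rule from the post above goes here)

# 4. Non-canonical hostname -> canonical hostname (illustrative, assumed).
#    Least specific of the redirects: it can match a request for any URL.
#    The optional group lets requests with an empty Host header pass.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# --- Internal rewrites, most specific first ---

# 5. Extensionless URLs -> php scripts
#    (the third rule from the post above goes here)
```

Putting every external redirect ahead of the internal rewrites keeps an internally rewritten path such as toy.php from being exposed to the client by a later redirect.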