Rewrite trailing slash to .html extension

         

madmatt69

5:53 pm on Mar 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi all,

I'm trying to do some mod rewrite magic and not having a lot of success.

I have a blog where the links currently look like "site.com/category/topic-title/"

I'm trying to change it to be:
"site.com/category/topic-title.html"

Any suggestions on how to do this? I was thinking either a 'RedirectMatch' or a 'RewriteRule', but I just can't seem to get either one working.

RedirectMatch 301 ^/category/([^/]+/)*(.*)$ [mysite.com...]

Is that totally incorrect?

Thanks for any help!

Swanny007

6:36 pm on Mar 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Why do you want the .html extension? Just a few days ago I used mod_rewrite to drop the extension altogether :-) Is it for SEO?

I haven't tested this but something like this should work:
RewriteRule ^category/([A-Za-z0-9]+)/$ http://www.example.com/category/$1.html [R=301,NC]

I use a similar rule to force /category/ to /category (no trailing slash, no extension).

jdMorgan

10:34 pm on Mar 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you want to change the URL (as listed in search engines) then you must modify the blog script to put the new URLs in the links on your pages. You can't use mod_rewrite to change a URL, you can only use it to change the filepath associated with a URL, when that URL is requested from your server, or to redirect an old URL to a different URL once the links to that old URL are changed to the new URL. The URLs in the links on your pages define the URL seen by the world.

And I also agree that adding the .html extension is counter-productive in several ways: It makes your URLs longer, it dilutes any keyword-in-URL ranking advantage by doing so, and it guarantees that you'll have to change your URLs again in the future if, say, you switch to .php.

Most Webmasters who can do so are currently removing file extensions, not adding them. The only change I'd recommend for a site like the one you describe is to remove the trailing slashes, since they too are a waste of bandwidth.

Jim

madmatt69

12:09 am on Mar 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Heya,

What I wanted to do was make sure the old posts are redirected to the new ones.

I've heard the debate about the file extension and it seems for everyone that says drop them, there are others who say add them.

Have you read anything somewhat 'definitive'? I was just going to do it for consistency.

jdMorgan

12:42 am on Mar 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Definitive? Maybe I should rephrase that: Most knowledgeable Webmasters who can do so are currently removing file extensions, not adding them.

You may take that as definitive if you like. I do.

Jim

jdMorgan

12:56 am on Mar 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Given your initial example, you can simplify your directive:

RedirectMatch 301 ^/category/([^/]+)/$ http://www.example.com/category/$1.html

The equivalent mod_rewrite rule posted above can also be simplified:

RewriteRule ^category/([^/]+)/$ http://www.example.com/category/$1.html [R=301,L]

Jim
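To see why the character class matters in the two rules above, here is a minimal Python sketch. It is not Apache code: the helper redirect_target is hypothetical, Python's re module stands in for Apache's regex engine (an assumption that holds for patterns this simple), and the earlier [A-Za-z0-9]+ rule is given a leading slash so both patterns can be compared on the same URL-path.

```python
import re

def redirect_target(pattern, template, url_path):
    """Hypothetical helper: apply a RedirectMatch-style pattern to a URL-path.

    Returns the target with $1 filled in, or None when the pattern does not match.
    """
    match = re.search(pattern, url_path)
    if match is None:
        return None
    return template.replace("$1", match.group(1))

# Pattern from the simplified directive: one path segment, then a trailing slash.
SIMPLIFIED = r"^/category/([^/]+)/$"
# The earlier suggestion, adapted with a leading slash for comparison.
ALNUM_ONLY = r"^/category/([A-Za-z0-9]+)/$"
TEMPLATE = "http://www.example.com/category/$1.html"

# [^/]+ accepts hyphens, so a slug like "topic-title" is redirected.
print(redirect_target(SIMPLIFIED, TEMPLATE, "/category/topic-title/"))
# [A-Za-z0-9]+ has no hyphen in its class, so the same URL is left alone.
print(redirect_target(ALNUM_ONLY, TEMPLATE, "/category/topic-title/"))
# The new URL has no trailing slash, so it is not redirected again (no loop).
print(redirect_target(SIMPLIFIED, TEMPLATE, "/category/topic-title.html"))
```

Only the first call produces a target; the other two print None.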

madmatt69

1:01 am on Mar 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hmm. It'd be interesting to see a case study or a before & after.

One argument I've heard is to at least remove the trailing slash, to make the post one level closer to the root.

Thanks for that updated code! I'll give it a shot...but now I'm not sure I want to do it :)

madmatt69

1:03 am on Mar 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Now I just read this on a blog, apparently from a presentation by Matt Cutts:

1. Google doesn’t care about link depth (i.e. the number of slashes in your permalink won’t matter)

2. The file extension doesn’t matter - you could call files as php, html or even mattcutts.. this is not taken into consideration while calculating the rankings but don’t use the .exe extension.

madmatt69

8:24 pm on Mar 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So this is the rewrite code I've been trying, and I still can't get it to work.

Basically I've gone with JD's suggestion and am gonna drop the file extensions, but keep the category as a directory.

So - I'd like to go from: www.example.com/category/thepage/
to:
www.example.com/category/thepage

I've tried this code (and variations thereof), but can't seem to get it to work: either the redirection doesn't work at all, or it drops the 'category' part, or it changes nothing:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /category-name/[^.]+\/
RewriteRule ^category-name/([^.]+)\/$ http://www.example.com/$1 [R=301,L]

Can anyone help me refine that? I'd appreciate it.

Thanks!

[edited by: jdMorgan at 11:25 pm (utc) on Mar. 12, 2008]
[edit reason] example.com [/edit]
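Two of the symptoms described in this post can be reproduced with a small Python sketch of the RewriteRule alone (the RewriteCond is left out). This is an illustration, not Apache code: apply_rule is a hypothetical helper, and it assumes that a rule in httpd.conf sees the URL-path with its leading slash while a rule in a document-root .htaccess sees it without.

```python
import re

def apply_rule(pattern, substitution, path):
    """Hypothetical stand-in for one RewriteRule: redirect target, or None if no match."""
    match = re.search(pattern, path)
    if match is None:
        return None
    return substitution.replace("$1", match.group(1))

# Pattern and substitution exactly as posted above.
POSTED_PATTERN = r"^category-name/([^.]+)\/$"
POSTED_TARGET = "http://www.example.com/$1"

# Server context: the path starts with "/", so ^category-name never matches
# and nothing is redirected.
print(apply_rule(POSTED_PATTERN, POSTED_TARGET, "/category-name/thepage/"))

# Per-directory context: the rule matches, but $1 holds only "thepage",
# and the substitution does not put "category-name/" back.
print(apply_rule(POSTED_PATTERN, POSTED_TARGET, "category-name/thepage/"))
```

The first call prints None ("doesn't work at all"); the second prints http://www.example.com/thepage (the 'category' part is dropped).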

jdMorgan

11:23 pm on Mar 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK, let's back up a bit and re-group to get all this info in a fresh post:

Where did the RewriteCond come from -- And why did you add it?
What is an example URL you are requesting?
What URL do you want that URL to redirect to?
Where is the content for the new URL to be served from, that is, what is the filepath?
Do you have a rule to do a rewrite to implement this previous step?
And lastly, what is the URL-path to this .htaccess file?

Please don't confuse URLs with filepaths in your answers to the questions above; that'll just delay the resolution of the problem.

I suspect that either the code is not where it should be, or perhaps you want to back-reference more of the requested URL, but can't be sure.

Jim

madmatt69

1:19 am on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ha, ok sorry for all the confusion.

I was trying to use a similar solution as found in this thread:
[webmasterworld.com...]

Example requested URL:
www.example.com/category/thepage/
The url I want it to go to is the same, just drop the trailing slash:
www.example.com/category/thepage

File path should just be 'home/examplesite/category/thepage'

As far as I know there are no other rules affecting this particular rewrite rule. However, I'm using WordPress, so it's possible that there's something hidden, but I doubt it.

I'm putting the rules in my httpd.conf (as it's a dedicated server).

I hope that helps explain things a little bit!

jdMorgan

1:31 am on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Any of the following five rules should work, depending on what you're trying to do and where...

.htaccess code:
RewriteRule ^category-name/([^.]+)/$ http://www.example.com/category-name/$1 [R=301,L]
RewriteRule ^(category-name/[^.]+)/$ http://www.example.com/$1 [R=301,L]


httpd.conf code:
RewriteRule ^/category-name/([^.]+)/$ http://www.example.com/category-name/$1 [R=301,L]
RewriteRule ^/(category-name/[^.]+)/$ http://www.example.com/$1 [R=301,L]
RewriteRule ^(/category-name/[^.]+)/$ http://www.example.com$1 [R=301,L]

Jim
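The difference between the two groups of rules above is the leading slash. The Python sketch below checks that all five rules yield the same redirect target when each is tested against the path its context would hand it. It is a rough model only: path_seen_by_rule and apply_rule are hypothetical helpers, and the .htaccess file is assumed to sit in the document root, so the only prefix stripped is the leading "/".

```python
import re

def path_seen_by_rule(url_path, context):
    """Hypothetical model of the path a RewriteRule pattern is matched against."""
    if context == "htaccess" and url_path.startswith("/"):
        return url_path[1:]  # assumption: .htaccess in the document root
    return url_path          # httpd.conf: full URL-path, leading slash kept

def apply_rule(pattern, substitution, path):
    """Return the redirect target with $1 filled in, or None if no match."""
    match = re.search(pattern, path)
    return None if match is None else substitution.replace("$1", match.group(1))

# (pattern, substitution) pairs copied from the post above, grouped by context.
RULES = {
    "htaccess": [
        (r"^category-name/([^.]+)/$", "http://www.example.com/category-name/$1"),
        (r"^(category-name/[^.]+)/$", "http://www.example.com/$1"),
    ],
    "httpd.conf": [
        (r"^/category-name/([^.]+)/$", "http://www.example.com/category-name/$1"),
        (r"^/(category-name/[^.]+)/$", "http://www.example.com/$1"),
        (r"^(/category-name/[^.]+)/$", "http://www.example.com$1"),
    ],
}

REQUEST = "/category-name/thepage/"
targets = [
    apply_rule(pattern, substitution, path_seen_by_rule(REQUEST, context))
    for context, rules in RULES.items()
    for pattern, substitution in rules
]
print(targets)

# An .htaccess-style pattern tested against the httpd.conf path does not match.
mismatch = apply_rule(*RULES["htaccess"][0], path_seen_by_rule(REQUEST, "httpd.conf"))
print(mismatch)
```

Under these assumptions every entry in targets is http://www.example.com/category-name/thepage, and the mismatched case prints None, which is why the rules need to match the file they are placed in.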

madmatt69

1:55 am on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks!

Seems to work - for the most part. The final trailing slash is removed; however, if I want to go to just /category-name/, the trailing slash on category-name is also removed, so it looks like '/category-name'

Is there something we can add to that rewrite rule to remedy that, or is it perhaps WordPress messing with things?

jdMorgan

2:06 am on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The patterns in the code posted here all require that the path-part following "category-name/" contain at least one character *and* that this additional path-part be followed by a (second) trailing slash. Therefore, you'll have to look elsewhere to find the cause of this new problem.

Jim
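That claim about the patterns can be checked with a short Python sketch. It uses Python's re module as a stand-in for Apache's regex engine (an assumption, though these patterns use nothing engine-specific) and tests the three httpd.conf patterns posted earlier against a few URL-paths; the helper any_rule_matches is hypothetical.

```python
import re

# The three httpd.conf patterns from the earlier post.
HTTPD_CONF_PATTERNS = [
    r"^/category-name/([^.]+)/$",
    r"^/(category-name/[^.]+)/$",
    r"^(/category-name/[^.]+)/$",
]

def any_rule_matches(url_path):
    """True if at least one of the posted patterns would fire for this URL-path."""
    return any(re.search(pattern, url_path) for pattern in HTTPD_CONF_PATTERNS)

# The bare category URL: nothing follows "category-name/", so no rule fires.
print(any_rule_matches("/category-name/"))
# A post URL with a trailing slash: at least one more character, then "/".
print(any_rule_matches("/category-name/thepage/"))
# Already slash-free: no rule fires, so there is no redirect loop.
print(any_rule_matches("/category-name/thepage"))
```

This prints False, True, False. Since none of the posted rules can fire for /category-name/, the slash removal seen on that URL has to come from something other than these rules.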