Redirecting URLs not working

mod_rewrite, htaccess

         

Rabash

8:01 am on Aug 1, 2012 (gmt 0)

10+ Year Member



Hello,

For the last few years I've been using this URL format on my website:
example.com/news/title-of-article/

But now I want to use this new one:
example.com/category-name/title-of-article/

Which rule do I have to add to the .htaccess to make it work? I've tried a lot of combinations without success. Here are a couple of my failed attempts:
RewriteRule ^news/$1/$ ([a-z-]+)/([a-z-]+)/ [R=301,L]
RewriteRule ^news/([a-z-]+)/$ ([a-z-]+)/([a-z-]+)/ [R=301,L]

phranque

8:22 am on Aug 1, 2012 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



welcome to WebmasterWorld, Rabash!

how do you decide which category-name to use or is it one category-name for all urls?

Rabash

8:34 am on Aug 1, 2012 (gmt 0)

10+ Year Member



Thanks Phranque,

There are 25 different categories. Every time that I write an article, I select one of them.

The problem is that I don't know how to link the generic term "news" to each category name.

g1smd

9:11 am on Aug 1, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In a redirect coded using a RewriteRule, the RegEx pattern to match the requested URL goes on the left. The pattern may contain parentheses to "capture" stuff for later re-use.

On the right should be a literal URL with protocol and hostname. This part may contain $n backreferences reusing stuff captured with parentheses in the RegEx pattern.
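
As a minimal sketch of that left/right layout (the path segments old-section and new-section are hypothetical, chosen only for illustration), a redirect that captures one path segment and reuses it might look like this:

```apache
RewriteEngine On

# Left: the RegEx pattern. It matches old-section/<name>/ and the parentheses
# capture <name> as $1. In .htaccess the pattern has no leading slash.
# Right: a full URL with protocol and hostname that reuses $1.
RewriteRule ^old-section/([a-z0-9-]+)/$ http://www.example.com/new-section/$1/ [R=301,L]
```

With this rule, a request for /old-section/some-page/ would be answered with a 301 pointing to http://www.example.com/new-section/some-page/.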

You'll need to change the links on your pages to point to the new URLs that you want people to "see" and "use". URLs are defined in links.

Mod_rewrite cannot and does not "change" URLs. It works only after the link has been clicked.

For people still requesting the old URLs, you'll need redirects to the new URLs.

To make the new URLs work, you'll need rewrites to connect those external URL requests to the actual script inside the server that will deliver the content.

However, I'd question adding the category name to the URLs. This is often a bad idea as it can lead to a variety of duplicate content issues.

lucy24

9:56 am on Aug 1, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Aaaack! g1, could you JUST ONCE not type faster than me?

I was going to say:

Quoting Rabash: "The problem is that I don't know how to link the generic term 'news' to each category name."

You can't, unless the information is already stored somewhere. Do the old URLs include any extra bits like query strings that contain the category info?

If not, you'll have to do some business with databases and php scripts. Step one is a simple rewrite:

RewriteRule ^news/(nameofarticle)/ /fixup.php?id=$1 [L]

The "fixup.php" page will take the name of the article, look up its category in your database, pop that into the URL, and finally issue a redirect to

www.example.com/category/nameofarticle

(please! no trailing slash!)

But I hope you speak php because I don't ;)
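
Step one could be written with a real pattern in place of the "(nameofarticle)" placeholder. This is only a sketch: it assumes article names consist of lowercase letters, digits and hyphens, and fixup.php is the hypothetical lookup script described above, which is not shown here.

```apache
RewriteEngine On

# Internal rewrite (no R flag): hand old /news/<name>/ requests to the lookup
# script, passing the captured article name as the "id" query parameter.
# The script itself has to look up the category and send the 301 redirect.
RewriteRule ^news/([a-z0-9-]+)/?$ /fixup.php?id=$1 [L]
```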

Rabash

10:38 am on Aug 1, 2012 (gmt 0)

10+ Year Member



Thanks for your answers,

Following your advice, I've changed the URL format generated by the system to a new one without any mention of categories:
http://www.example.com/title-of-article/

Then I've added this rule to the .htaccess:
RewriteRule ^([a-z0-9]+)/$ news/$1/ [R=301,L]

Now, if I type a URL like http://www.example.com/news/title-of-article the system shows me the correct content, but the permalink that appears in the address bar is not the new one I was expecting (http://www.example.com/title-of-article/) but http://www.example.com/news/title-of-article/

Am I doing something wrong? Will Google recognize the permanent redirect that I've added with the RewriteRule?

g1smd

10:43 am on Aug 1, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



All links on the page must point to the URLs that you want users to see and use. URLs are defined in links.

It is very bad form to click on a navigation link and be redirected to a different URL.

However, you've now got a bigger problem on your hands. Without the /news/ bit in the URL, how do you stop requests for /robots.txt and other such files being rewritten?
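
One common way to handle this, sketched here under assumptions, is to skip the rewrite whenever the request maps to a real file or directory. The script name index.php and its title parameter are hypothetical; the thread does not say what the CMS uses internally.

```apache
RewriteEngine On

# Only rewrite when the request does not match an existing file...
RewriteCond %{REQUEST_FILENAME} !-f
# ...or an existing directory, so /robots.txt, images and so on are served as usual.
RewriteCond %{REQUEST_FILENAME} !-d
# Internal rewrite of /title-of-article/ to the (assumed) CMS script.
RewriteRule ^([a-z0-9-]+)/$ /index.php?title=$1 [L]
```

The pattern itself also helps: it allows no dot and requires a trailing slash, so a request for robots.txt never matches it.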

Rabash

10:46 am on Aug 1, 2012 (gmt 0)

10+ Year Member



After re-reading my messages, I think that maybe I haven't explained properly that the system generates the new permalinks by itself.

What I want to do with the RewriteRule is:
a) Tell Google that I changed the URL format permanently to a new one
b) I also want to redirect the users that come to my website from old links added to other blogs, forums or other sites.

g1smd

11:09 am on Aug 1, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



a) You tell Google about the new URLs by altering the URL shown in the permalink. Mod_rewrite cannot "change" a URL. URLs are defined in links.
b) The redirect from old to new URL informs users and search engines to use the new URL whenever they request the old URL.

Rabash

4:10 pm on Aug 1, 2012 (gmt 0)

10+ Year Member



Thanks for your answers, but I still have a couple of questions. Let me recap my situation:

For the last few years I've been using this kind of permalink:
http://www.example.com/news/title-of-article/

Now, the new version of my CMS offers me the possibility to use this permalink:
http://www.example.com/title-of-article/

I think the latter URL is better than the previous one, but before applying these changes to my website I need 301 redirects to send Google and users from the old pages to the new ones. Although I'm trying to learn how mod_rewrite works by reading a lot about the subject, my knowledge of this module is still very limited, which is why I'm here asking experts like you ;-)

The URL Rewriting Guide [httpd.apache.org] published on apache.org explains:

Content Handling
From Old to New (intern)
Description:
Assume we have recently renamed the page foo.html to bar.html and now want to provide the old URL for backward compatibility. Actually we want that users of the old URL even not recognize that the pages was renamed.
Solution:
We rewrite the old URL to the new one internally via the following rule:
RewriteEngine on
RewriteBase /~quux/
RewriteRule ^foo\.html$ bar.html


The old URL is on the left and the new one on the right. So why can I use either of these two expressions interchangeably? In both cases, when I type a permalink like example.com/news/title-of-article into the address bar, the system shows me the contents of its "sibling" example.com/title-of-article without any error or warning.

RewriteRule ^news/([a-z0-9]+)$ $1/ [R=301,L]
RewriteRule ^([a-z0-9]+)$ news/$1 [R=301,L]


Supposedly only the first one is correct, isn't it?

My second question is about Google. Do these rules (well, I suppose only one of them), as I wrote them, properly notify the search engines about the change and the link between the old and the new URLs? In other words, will Google penalize me if I use the RewriteRule in this way?

Thank you.

g1smd

4:22 pm on Aug 1, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Their example RewriteRule is for an internal rewrite.

Your example RewriteRule is for an external redirect.

Those are two separate things even if the code is only slightly different.


But there are other things to discuss first. Their example has the old URL on the left and the new internal filename on the right. However, that example specifically creates a duplicate content problem because the new file on the server can now be accessed using the old URL or the new URL. It's a bad example.

You need something different.

Link to the new URLs from the pages of your site. URLs are defined in links.

Redirect requests for old URLs to the new URL. On the left will be the requested URL. On the right will be the rule target and for a redirect this should include the protocol and hostname.

URLs are used "out there" on the web. Filepaths are used "here" inside the server. RewriteRule can be used to redirect a URL request to a different URL, or it can be used to rewrite a URL request to fetch content from a non-default location inside the server without revealing what that location is.

A redirect is a URL to URL translation.

A rewrite is a URL to filepath translation.

A RewriteRule can be configured for either of these actions.
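
To make the contrast concrete, here is a minimal sketch reusing the foo.html and bar.html names from the Apache example. The internal path /content/bar-page.html is hypothetical.

```apache
RewriteEngine On

# Redirect (URL to URL): full URL as target plus the R flag. The client
# receives a 301 and makes a new request for bar.html.
RewriteRule ^foo\.html$ http://www.example.com/bar.html [R=301,L]

# Rewrite (URL to filepath): internal target, no R flag. The address bar
# keeps showing bar.html while the content comes from elsewhere.
RewriteRule ^bar\.html$ /content/bar-page.html [L]
```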

Rabash

8:01 pm on Aug 1, 2012 (gmt 0)

10+ Year Member



Ok, so if I understood you correctly, first of all I have to add this rule to the .htaccess:

RewriteRule ^news/([a-z0-9]+)$ http://www.example.com/$1/ [R=301,L]


Then I have to avoid duplicate content by replacing the old-style permalinks in my articles with the new format. I can download a copy of my database and easily change those permalinks, so I don't think there will be any problem.

But obviously, there's nothing I can do to change the URLs on other websites that link to my content using the old permalinks.

Given that all my internal links will use the new URL format and that the RewriteRule will notify the search engines about the permanent redirect, will those old permalinks on third-party websites cause me a duplicate content problem?

g1smd

10:24 pm on Aug 1, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



No. They won't cause a problem.

Your RewriteRule coded as a redirect will tell visitors requesting the old URL to make a new request for the new URL. Likewise it tells Google to update what it lists in the SERPs. Google also needs to see that the pages of your site link to the new URL and no longer link to the old URL.

Your RewriteRule coded as a rewrite will associate requests for the new URL to the internal filepath location that will deliver the content, while keeping the location details secret from the outside world.

URLs are used "out there" on the web. Server filepaths are used "here" inside the server. They are not at all the same thing, and are merely associated by the specific server configuration in use.
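
Putting both pieces together for this thread's URLs might look like the sketch below. Two things in it are assumptions rather than anything stated above: the internal script /index.php with a title parameter is hypothetical, and the character class includes a hyphen because [a-z0-9]+ alone would not match a hyphenated name such as title-of-article.

```apache
RewriteEngine On

# 1) External redirect first: old /news/<title>/ URLs (with or without the
#    trailing slash) get a 301 to the new URL.
RewriteRule ^news/([a-z0-9-]+)/?$ http://www.example.com/$1/ [R=301,L]

# 2) Internal rewrite second: new /<title>/ URLs are mapped to the script
#    that delivers the content; real files and directories are left alone.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z0-9-]+)/$ /index.php?title=$1 [L]
```

Listing the redirect before the rewrite means an old URL is redirected before any internal mapping takes place.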

Rabash

7:18 am on Aug 2, 2012 (gmt 0)

10+ Year Member



Thank you very much for your help :-)