Redirect Problem

         

AlexBaduca

8:32 am on Jan 15, 2010 (gmt 0)




Hello,

I have a rather strange problem. I'm using an .htaccess file to rewrite some SEO-friendly URLs to the scripts I normally use. On the development server it all worked fine, but as soon as I moved it to production it stopped working. The rules look like this:


RewriteRule ^portofoliu/(\d+)-(.+).html index.php?op=projects&proj_id=$1 [NC,L] <- This does not work
RewriteRule ^portofoliu.html index.php?op=projects [NC,L] <- This works

If anyone could point me in the right direction as to which settings might affect this, I would be grateful.

Thank you,
Alex

jdMorgan

5:15 pm on Jan 15, 2010 (gmt 0)




Are you sure your production server supports PCRE (Perl-compatible regular expressions)?

If not, replace "\d" with "[0-9]", which is the POSIX regex equivalent.

Also, get rid of unnecessary parentheses, escape any character that has a special meaning as a regex token when you want to match it literally, and fully anchor your patterns whenever possible. For example:


RewriteRule ^portofoliu/([0-9]+)-.+\.html$ index.php?op=projects&proj_id=$1 [NC,L]

You might get a slight performance gain on a busy server with an additional tweak:

RewriteRule ^portofoliu/([0-9]+)-([^.]*\.)+html$ index.php?op=projects&proj_id=$1 [NC,L]
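To see what escaping the dot and anchoring the pattern actually change, here is a minimal sketch that runs the three patterns discussed above against two sample paths. It assumes Python's `re` module as a stand-in for the server's regex engine, so edge cases may behave differently under Apache. The sample paths and the helper name `project_id` are made up for illustration.

```python
import re

# Patterns taken from the rules in this thread. Python's `re` is only a
# stand-in for Apache's regex engine; treat the results as illustrative.
ORIGINAL = re.compile(r"^portofoliu/(\d+)-(.+).html")            # unescaped dot, no end anchor
ANCHORED = re.compile(r"^portofoliu/([0-9]+)-.+\.html$")         # escaped dot, fully anchored
TWEAKED = re.compile(r"^portofoliu/([0-9]+)-([^.]*\.)+html$")    # negated class instead of .+


def project_id(pattern, path):
    """Return the captured project id, or None if the path does not match."""
    match = pattern.search(path)
    return match.group(1) if match else None


# Hypothetical request paths, without the leading slash (as seen in .htaccess).
good = "portofoliu/12-my-project.html"
sloppy = "portofoliu/12-my-projectXhtml-and-more"

for name, pattern in [("original", ORIGINAL), ("anchored", ANCHORED), ("tweaked", TWEAKED)]:
    print(name, project_id(pattern, good), project_id(pattern, sloppy))
```

All three patterns capture "12" from the well-formed path. Only the original pattern also accepts the sloppy path, because its unescaped dot matches the "X" and nothing anchors the end of the string.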

Another recommendation is that you should not use [NC] on rules like this one, as it allows the same content to be accessed by more than one URL (due to the case-insensitivity of the rewrite). As URLs which differ in case are not considered to be 'the same' thing, this creates 'duplicate-content' problems, and allows two or more URLs to compete with each other for ranking (due, for example, to people making casing errors in links on their sites or even on your own).
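As a rough illustration of the [NC] point, the sketch below compares the same pattern with and without case-insensitive matching. It assumes Python's `re.IGNORECASE` behaves like [NC] for these simple ASCII paths, and the sample URLs are invented.

```python
import re

# The same pattern with and without the equivalent of the [NC] flag.
PATTERN = r"^portofoliu/([0-9]+)-.+\.html$"
WITH_NC = re.compile(PATTERN, re.IGNORECASE)
WITHOUT_NC = re.compile(PATTERN)

# Hypothetical casing variants of one and the same page.
variants = [
    "portofoliu/12-my-project.html",
    "Portofoliu/12-my-project.html",
    "PORTOFOLIU/12-MY-PROJECT.HTML",
]

matched_with_nc = [v for v in variants if WITH_NC.match(v)]
matched_without_nc = [v for v in variants if WITHOUT_NC.match(v)]

print(len(matched_with_nc), "URLs reach the script with [NC]")
print(len(matched_without_nc), "URL reaches the script without it")
```

With case-insensitive matching, all three variants are rewritten to the same script call. Without it, only the all-lowercase path matches.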

Unfortunately, the same problem exists on an even larger scale, and includes an even more sinister threat. Because the rule passes only the numeric ID to the script, everything after the hyphen is ignored. Putting on my evil-competitor hat, I can link to your site as example.com/portofoliu/666-uncredited-plagiarized-and-stolen-compositions.html, and that link will return exactly the same content as the URL "example.com/portofoliu/666-proper-title-here.html". Thus, not only can I create a duplicate-content problem for you, I can also create a negative keyword association for all of those duplicate pages, and thus impugn your reputation.

It is fairly critical that your script accept the 'title string' as a second parameter and go check it against the page-generation database. If it matches, serve the content. If it doesn't match, the script should generate a 301-redirect to the correct and canonical URL.

That would make your original rule look like this:


RewriteRule ^portofoliu/([0-9]+)-(.+)\.html$ index.php?op=projects&proj_id=$1&check-title=$2 [NC,L]

Best practice is to select one URL as the canonical URL for any given 'piece of content' and 301-redirect any and all variants to that canonical URL.
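The title check described above could look roughly like the sketch below. It is written in Python purely to show the logic; the real check would live in index.php. The `CANONICAL_SLUGS` table, the `resolve` function, and the sample IDs are all hypothetical stand-ins for the page-generation database and the script's own code.

```python
# Hypothetical stand-in for the page-generation database:
# project id -> canonical title string used in the URL.
CANONICAL_SLUGS = {
    "12": "my-project",
    "666": "proper-title-here",
}


def resolve(proj_id, check_title):
    """Decide how to answer a rewritten request.

    Returns (status, location):
      200 -> title matches, serve the content
      301 -> title differs, redirect to the canonical URL
      404 -> unknown project id
    """
    slug = CANONICAL_SLUGS.get(proj_id)
    if slug is None:
        return 404, None
    if check_title == slug:
        return 200, None
    # Exact, case-sensitive comparison: every variant gets redirected.
    return 301, f"/portofoliu/{proj_id}-{slug}.html"


print(resolve("666", "proper-title-here"))
print(resolve("666", "uncredited-plagiarized-and-stolen-compositions"))
```

The first call is served normally. The second returns a 301 pointing at /portofoliu/666-proper-title-here.html, so the bogus title never becomes a second URL for the same content.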

Jim

AlexBaduca

7:04 am on Jan 18, 2010 (gmt 0)




Thank you Jim.

The problem was the configuration of the web server, which didn't allow for PCRE support.

About the canonical URLs: I'm surely going to use them. The version you saw is for deployment only; I still have to smooth out the rough edges before launching it on the internet.

Again thank you for your time and the help provided :)