htaccess redirect issue

         

Simsi

3:59 pm on Oct 27, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Wondering if anyone could help me here. Doing my nut over this lol.

I have news items that get rewritten in htaccess to a script that pulls from a DB. One part of the URL is the date in the format yyyymmdd. Trouble is, some of the old inbound links from other sites give the date as yyyy-mm-dd (i.e. with hyphens), which leads to dupe content alongside the articles referenced without the hyphens.

No matter what I try, the hyphenated versions keep the hyphens in the URL.

I have this in my htaccess


Options +FollowSymLinks
RewriteEngine on

DirectoryIndex home.php index.php
RedirectPermanent ^20(.*)-(.*)-(.*)/(.*)/ http://www.widgets.com/news/20$1$2$3/$4/
RewriteRule ^20(.*)/(.*)/ /news/retrieve_newsitem.php?date=20$1&slug=$2


...to redirect the hyphenated version to a non-hyphenated version before doing the rewrite. But the URL still retains the hyphens after the rewrite.

Any help would be very much appreciated.

Cheers

Simsi

g1smd

4:22 pm on Oct 27, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Never use (.*) multiple times like that. Each (.*) is greedy and initially grabs the rest of the URL, so the regex engine has to back off and retry many times before it finds a match.

This code, including "news/" at the beginning of the pattern, should fix it:

RewriteRule ^news/20([^-]+)-([^-]+)-([^/]+)/([^/]+)/ http://www.widgets.com/news/20$1$2$3/$4/ [R=301,L]


Actually, that code is still too lenient, as it would also try to redirect requests such as example.com/news/20GT-ER-PM/whatever/.

I'd tighten it up a bit more, so that it only matches exactly two digits each time:

RewriteRule ^news/20([0-9]{2})-([0-9]{2})-([0-9]{2})/([^/]+)/ http://www.widgets.com/news/20$1$2$3/$4/ [R=301,L]


After this code, include your standard non-www to www domain canonicalisation rules.
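
For illustration only, a minimal sketch of such a canonicalisation rule. It assumes the canonical hostname is www.widgets.com (taken from the redirect target above) and that the rule sits in the root .htaccess; in a subdirectory file such as /news/.htaccess the target would also need the "news/" prefix. Because the date redirect comes first and already points at the www hostname, a hyphenated request on the non-www host is fixed in a single redirect.

```apache
# Assumption: www.widgets.com is the canonical hostname and this file
# is the .htaccess in the document root.
# Redirect requests for the bare domain to the www hostname.
RewriteCond %{HTTP_HOST} ^widgets\.com$ [NC]
RewriteRule ^(.*)$ http://www.widgets.com/$1 [R=301,L]
```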

jdMorgan

5:08 pm on Oct 27, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If the code is in /news/.htaccess (as implied by the fact that your existing rewrite to the script works), then "news/" won't be needed at the beginning of the new redirect rule's pattern.

However that existing rule itself could do with some optimisation as well.

So you'd end up with:

RewriteRule ^20([0-9]{2})-([0-9]{2})-([0-9]{2})/([^/]+)/ http://www.widgets.com/news/20$1$2$3/$4/ [R=301,L]
RewriteRule ^20([0-9]{6})/([^/]+)/$ /news/retrieve_newsitem.php?date=20$1&slug=$2 [L]
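
As a rough sketch, this is how the two rules might sit in the complete /news/.htaccess, with the other directives carried over from the original post. The slug "some-slug" and the date in the comments are made-up values used only to trace a request through the rules.

```apache
# /news/.htaccess (sketch; patterns are matched against the path
# relative to /news/, so no "news/" prefix is needed)
Options +FollowSymLinks
RewriteEngine on

DirectoryIndex home.php index.php

# 1. External 301 redirect: hyphenated date -> plain yyyymmdd date.
#    Hypothetical request: /news/2010-10-27/some-slug/
#    Redirects to:         /news/20101027/some-slug/
RewriteRule ^20([0-9]{2})-([0-9]{2})-([0-9]{2})/([^/]+)/ http://www.widgets.com/news/20$1$2$3/$4/ [R=301,L]

# 2. Internal rewrite: canonical URL -> script.
#    /news/20101027/some-slug/ is served by
#    /news/retrieve_newsitem.php?date=20101027&slug=some-slug
RewriteRule ^20([0-9]{6})/([^/]+)/$ /news/retrieve_newsitem.php?date=20$1&slug=$2 [L]
```

The redirect has to come before the internal rewrite, so that a hyphenated request is sent to the canonical URL first and only the canonical form ever reaches the script.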

Jim

[edited by: jdMorgan at 5:30 pm (utc) on Oct 27, 2010]

Simsi

5:12 pm on Oct 27, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks guys. I'll give that a whirl. Yes, the htaccess is in the news directory, fwiw. Thanks also for the tip on (.*), g1smd.

Edit: Worked like a dream :) Thanks