

Apache Web Server Forum

    
URL rewrite with 301
troyid




msg:4082480
 8:00 pm on Feb 17, 2010 (gmt 0)

Here is my situation.

I run a news script which produces pages like http://www.example.com/cgi-bin/news.cgi?a=article&ID=1265144816

I have set up an .htaccess rule that produces an HTML page:

RewriteRule news/([0-9]+)\.html$ /cgi-bin/news.cgi?a=article&ID=$1 [L]

The only problem is that it does not 301 the /cgi-bin/news.cgi?a=article&ID=$1 version to /news/1265144816.html, so I might get penalized for duplicate content.

Any help would be appreciated.

[edited by: jdMorgan at 11:05 pm (utc) on Feb 17, 2010]
[edit reason] example.com [/edit]

 

jdMorgan




msg:4082618
 11:24 pm on Feb 17, 2010 (gmt 0)

There is no duplicate-content "penalty" -- That's a very pernicious Webmaster Myth. While there may indeed be "penalties" for massive quantities of intentional duplicate content, there is no penalty for minor, accidental duplicate content.

The "penalty" is that you have two (or more) URLs competing with each other for incoming links and PageRank/Link-popularity, and this dilutes the ranking of each of them.

Your RewriteRule "creates" nothing. All it does is tell the server to serve the content from the internal filepath /cgi-bin/news.cgi?a=article&ID=1234 when an HTTP request for the URL /news/1234.html is received from a Web client.
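To make the distinction concrete, here is a minimal sketch contrasting an internal rewrite with an external redirect. The first rule is the one from the original post, with a leading ^ anchor added as an assumption; the second rule uses hypothetical paths (old-page.html, new-page.html) that do not appear anywhere in this thread and are there only for contrast.

```apache
RewriteEngine On

# Internal rewrite: the client asked for the URL /news/1234.html and never
# learns that the server fetched the content from the script filepath.
# No new HTTP request is made and the address bar does not change.
RewriteRule ^news/([0-9]+)\.html$ /cgi-bin/news.cgi?a=article&ID=$1 [L]

# External redirect (hypothetical paths, for contrast only): the R flag makes
# the server answer with a 301 status and a Location header, and the client
# then issues a second request for the new URL.
RewriteRule ^old-page\.html$ http://www.example.com/new-page.html [R=301,L]
```

Only a rule of the second kind changes the URL that clients and search engines see.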

Keeping URLs and filepaths as separate and distinct concepts, associated *only* by the action of a server, will help a lot when thinking about rewrites and redirects.

"The only problem is that it does not 301 the /cgi-bin/news.cgi?a=article&ID=$1 version to /news/1265144816.html so I might get penalized for duplicate content."

Were you expecting it to? That's not what your rule does... So you need a second (and complementary) rule to implement that function:

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /cgi-bin/news\.cgi\?a=article&ID=([0-9]+)(&[^\ ]*)?\ HTTP/
RewriteRule ^cgi-bin/news\.cgi$ http://www.example.com/news/%1.html [R=301,L]

This new rule should precede your rewrite code posted above, and it should precede your domain canonicalization redirect and any other less-specific redirects. These redirects should then be followed by your existing internal rewrites, again in order from most-specific to least-specific.

The complex RewriteCond is required to differentiate the /cgi-bin/news.cgi?a=article&ID=1234 script-path being directly requested by a client as a URL, as opposed to being internally-requested as the result of your existing internal rewrite rule. Without this test, the two rules would unconditionally countermand each other, resulting in an 'infinite' loop.
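As a rough annotated sketch of why that condition works, the comments below break the pattern into its parts. The sample request lines are illustrative (the HTTP/1.1 version is an assumption), and the RewriteRule is shown with a trailing question mark on the target, which keeps the old query string from being re-appended to the redirect URL.

```apache
# THE_REQUEST holds the request line exactly as the client sent it, e.g.
#   GET /cgi-bin/news.cgi?a=article&ID=1265144816 HTTP/1.1
# Internal rewrites do not alter it. When a client asks for the friendly URL,
# THE_REQUEST stays
#   GET /news/1265144816.html HTTP/1.1
# even after the path has been rewritten to the script, so the condition
# fails on that pass and the redirect cannot loop.
#
# Pattern, piece by piece:
#   ^[A-Z]+\                           request method, then a space
#   /cgi-bin/news\.cgi\?a=article&ID=  literal script URL and leading parameters
#   ([0-9]+)                           article ID, captured as %1
#   (&[^\ ]*)?                         optional further query parameters
#   \ HTTP/                            space before the protocol version
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /cgi-bin/news\.cgi\?a=article&ID=([0-9]+)(&[^\ ]*)?\ HTTP/
RewriteRule ^cgi-bin/news\.cgi$ http://www.example.com/news/%1.html? [R=301,L]
```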

Jim

troyid




msg:4082628
 11:58 pm on Feb 17, 2010 (gmt 0)

Hi Jim,

I love your explanation regarding duplicate penalties.

I implemented your rule and it almost worked. When I visit http://www.example.com/cgi-bin/news.cgi?a=article&ID=1265144816 it 301's to http://www.example.com/news/1265144816.html?a=article&ID=1265144816

I just need it to 301 to http://www.example.com/news/1265144816.html

jdMorgan




msg:4082644
 12:20 am on Feb 18, 2010 (gmt 0)

Yeah, I forget that almost every time I re-type this code...

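It should be (note the trailing question mark on the substitution URL, which clears the original query string so that it is not re-appended to the redirect target):

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /cgi-bin/news\.cgi\?a=article&ID=([0-9]+)(&[^\ ]*)?\ HTTP/
RewriteRule ^cgi-bin/news\.cgi$ http://www.example.com/news/%1.html? [R=301,L]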
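Putting the corrected redirect together with the ordering advice given earlier in the thread, a complete .htaccess might look roughly like the sketch below. The domain canonicalization block is an assumed, commonly used form, since the thread mentions such a redirect but never shows it, and the leading ^ on the internal rewrite is likewise an addition.

```apache
RewriteEngine On

# 1. Most-specific external redirect: direct client requests for the script
#    URL are 301-redirected to the friendly URL. The trailing "?" clears the
#    query string.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /cgi-bin/news\.cgi\?a=article&ID=([0-9]+)(&[^\ ]*)?\ HTTP/
RewriteRule ^cgi-bin/news\.cgi$ http://www.example.com/news/%1.html? [R=301,L]

# 2. Less-specific external redirect: domain canonicalization
#    (assumed form, not taken from the thread).
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# 3. Internal rewrites last: serve the friendly URL from the script.
RewriteRule ^news/([0-9]+)\.html$ /cgi-bin/news.cgi?a=article&ID=$1 [L]
```

On Apache 2.4 and later, the QSD flag (for example [R=301,L,QSD]) is an alternative to the trailing question mark; it is not available on the 2.2 servers that were current when this thread was written.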

Jim

troyid




msg:4082652
 12:32 am on Feb 18, 2010 (gmt 0)

Beautiful! Works a treat. Thanks Jim


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved