Remove duplicate content 301 redirect

         

ureche

6:43 pm on Mar 19, 2009 (gmt 0)

10+ Year Member



Hello, I am sorry if a solution to this kind of problem has already been posted, but it seems I cannot find it.

I want to remove duplicate content:

/matematica-clasa-p-2211-c-640.html
/matematica-clasa-p-2211-c-505.html

These are the same page. The new link looks like:

/matematica-clasa-p-2211.html

Both the new and the old links are the result of mod_rewrite. The original link looks like this:

product.php?product_id=2211

Now the question: how should I write the 301 redirect so that the duplicate links are removed from Google?

matematica-clasa-p-SSS-c-YYY.html to matematica-clasa-p-SSS.html [L,R=301]

or

matematica-clasa-p-SSS-c-YYY.html to product.php?product_id=SSS [L,R=301]

I'm afraid that with the second method, all my links in Google will look like product.php?product_id=SSS and I will lose the SEO URLs.

Thanks, and sorry for my bad English.

jdMorgan

7:46 pm on Mar 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To remove "-c-<numbers>" from your URL, you could use:

RewriteRule ^matematica-clasa-p-([0-9]+)-c-[0-9]+\.html$ http://www.example.com/matematica-clasa-p-$1.html [R=301,L]

This assumes that you already have other working rules in the file. If not, add

Options +FollowSymLinks
RewriteEngine on
#

above that rule.
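
Put together, a minimal file along those lines might look like the sketch below. It assumes that no other rewrite rules exist yet, that the .htaccess file sits in the document root, and that www.example.com stands in for the real hostname.

```apache
# Minimal sketch, assuming this is the only rewrite code in the .htaccess
# file and that the file lives in the document root of www.example.com.
Options +FollowSymLinks
RewriteEngine on
#
# Strip the "-c-<numbers>" suffix with a permanent (301) redirect.
RewriteRule ^matematica-clasa-p-([0-9]+)-c-[0-9]+\.html$ http://www.example.com/matematica-clasa-p-$1.html [R=301,L]
```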

Also, to get the best results, make sure that links to matematica-clasa-p-2211-c-640.html no longer appear on your Web site.

Jim

g1smd

7:51 pm on Mar 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You need two redirects and one rewrite here.

- One redirect from dynamic URL format to new URL format.

- One redirect from old hyphenated URL to new URL format.

- One rewrite from URL format to old internal dynamic filepath.

Additionally, links on your pages should link to the new URL format.

The redirects should fire off only for direct client accesses requesting those URLs.

nealrodriguez

8:03 pm on Mar 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



here's the syntax as per [httpd.apache.org...]

RedirectPermanent URL-path URL

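In your case (though note that mod_alias matches on the URL-path only, so a query string such as ?product_id=2211 in the first argument would not be matched):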

RedirectPermanent /product.php?product_id=2211 [domain.com...]

ureche

8:05 pm on Mar 19, 2009 (gmt 0)

10+ Year Member



Hi, thank you for your help. That link no longer appears in my sitemap. From your example I came up with this:

RewriteRule ^(.*)-p-([0-9]+)-c-[0-9]+\.html$ $1-p-$2.html [R=301,L]

and it's working; I get redirected to matematica-clasa-p-2211.html

You are correct, I have other rules in .htaccess, and this leads to another question. My .htaccess looks like:

Options +FollowSymLinks
RewriteEngine On
RewriteBase /

RewriteRule ^(.*)-p-(.*).html$ product.php?product_id=$2
RewriteRule ^(.*)-p-([0-9]+)-c-[0-9]+\.html$ $1-p-$2.html [R=301,L]

I also tried placing the redirect above the rewrite, like this:

RewriteRule ^(.*)-p-([0-9]+)-c-[0-9]+\.html$ $1-p-$2.html [R=301,L]
RewriteRule ^(.*)-p-(.*).html$ product.php?product_id=$2

But the redirection is not working. Can you explain why?

Thanks.

ureche

8:45 pm on Mar 19, 2009 (gmt 0)

10+ Year Member



Thank you all for your answers. I think RedirectPermanent won't work for me because I have about 300 links like that.

jdMorgan

9:14 pm on Mar 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Do not use the ambiguous pattern ".*" when a more-specific pattern can be used. I suspect that it is the cause of your problem, because ".*" matches everything, anything, or nothing. It means "match any number (including zero) of any characters."

Escape literal periods in patterns as shown in my code. Requested URL-path "x.html" is matched by the pattern "^x\.html$"
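
As a rough illustration of both points, the patterns from the question could be tightened as sketched here. The character class [a-z0-9-]+ for the product name is an assumption based on the sample URLs in this thread, and www.example.com is a placeholder host.

```apache
# Redirect: old "-c-<numbers>" URLs to the new form.
# ([a-z0-9-]+) replaces the first (.*), ([0-9]+) captures the product id,
# and \. matches a literal period instead of "any character".
RewriteRule ^([a-z0-9-]+)-p-([0-9]+)-c-[0-9]+\.html$ http://www.example.com/$1-p-$2.html [R=301,L]
#
# Rewrite: the new URL maps internally to the script.
# Because the id must be digits only, "2211-c-640" can no longer be swallowed
# by this rule the way it is by ^(.*)-p-(.*).html$
RewriteRule ^([a-z0-9-]+)-p-([0-9]+)\.html$ product.php?product_id=$2 [L]
```

With the loose pattern, a request for matematica-clasa-p-2211-c-640.html matches the rewrite with $2 set to "2211-c-640", which is why the order and the precision of the two rules both matter.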

Jim

g1smd

10:35 pm on Mar 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ensure that the redirects contain the full domain name in the target URL.

Ensure that the redirects only activate for direct client requests. That is, add a RewriteCond that looks at THE_REQUEST.

List the redirects first and the rewrite last.

Add [L] to the end of the rewrite.
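
Applied to the two rules posted earlier, that checklist might come out roughly as follows. The host www.example.com is a placeholder, the id part of the rewrite is narrowed to digits, and the THE_REQUEST condition is omitted from this sketch to keep it short.

```apache
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
#
# 1. Redirect first, with the full domain name in the target, so that a
#    request arriving on another hostname lands on the canonical one directly.
RewriteRule ^(.*)-p-([0-9]+)-c-[0-9]+\.html$ http://www.example.com/$1-p-$2.html [R=301,L]
#
# 2. Internal rewrite last, ended with [L].
RewriteRule ^(.*)-p-([0-9]+)\.html$ product.php?product_id=$2 [L]
```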

ureche

10:52 pm on Mar 19, 2009 (gmt 0)

10+ Year Member



Hello, I managed to list the redirects first and the rewrite last. But I have a question about:

"Ensure that the redirects only activate for direct client requests."

Shouldn't the redirect activate in all cases? The primary goal is to remove the duplicate content from Google.

Thanks.

g1smd

11:30 pm on Mar 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In full: ensure that the redirects only activate for direct client requests, and do not activate as a result of the internal pointers simply being updated during a rewrite. This avoids an infinite rewrite-redirect loop.
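
A sketch of such a condition, for the redirect away from the dynamic URL. THE_REQUEST holds the request line exactly as the client sent it (for example GET /product.php?product_id=2211 HTTP/1.1) and is not altered by internal rewrites. Because the product name cannot be derived from the id inside .htaccess, this sketch handles only the single product from the question; covering the whole catalogue would presumably need the script itself, or some other lookup, to supply the name. The host www.example.com and the character class [a-z0-9-]+ are assumptions.

```apache
# Assumes RewriteEngine On has already been set above.
#
# Fires only when the client itself asked for the dynamic URL.
# After the internal rewrite to product.php, THE_REQUEST still shows the
# friendly URL, so this rule does not match again and no loop occurs.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /product\.php\?product_id=2211\ HTTP/
# The trailing "?" drops the old query string from the redirect target.
RewriteRule ^product\.php$ http://www.example.com/matematica-clasa-p-2211.html? [R=301,L]
#
# Internal rewrite, listed after the redirects.
RewriteRule ^([a-z0-9-]+)-p-([0-9]+)\.html$ product.php?product_id=$2 [L]
```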

Let's also see your full code for the two redirects and one rewrite.

g1smd

6:00 pm on Mar 22, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



How are you getting on with this?

ureche

12:13 am on Mar 23, 2009 (gmt 0)

10+ Year Member



Hello, sorry for not getting back. Here is how my htaccess looks:

Options +FollowSymLinks
RewriteEngine On
RewriteBase /

# this is the rule I have added for my 301 redirect
RewriteRule ^(.*)-p-(.*)-c-(.*).html$ $1-p-$2.html [R=301,L]

# this is from a seo url contribution i installed
RewriteCond %{QUERY_STRING} ^options\=(.*)$
RewriteRule ^(.*)-p-(.*).html$ product.php?products_id=$2%1
RewriteRule ^(.*)-p-(.*).html$ product.php?products_id=$2&%{QUERY_STRING}

The above works. I haven't had time to study regular expressions yet, so I have not removed the ambiguous pattern. I have tested the headers with [tools.seobook.com ] and I get "301 Moved" for all my links:

macromedia-flash-professional-p-1108-c-173.html
matematica-clasa-tccd-p-2211-c-505.html
psihoterapie-cognitivcomportamentala-psihanaliza-p-2604-c-518.html

and so on. All of them redirect to links like:

macromedia-flash-professional-p-1108.html
matematica-clasa-tccd-p-2211.html
psihoterapie-cognitivcomportamentala-psihanaliza-p-2604.html

Thank you for your help. I will look at the "RewriteCond that looks at THE_REQUEST" you suggested.

g1smd

12:32 am on Mar 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ensure that the redirects contain the full domain name in the target URL.

Test your existing rules with both non-www and with www requests to see why you need that.

Add [L] to the end of the rewrites.

There is one redirect missing. Check the top of the thread where I listed all the steps that you will need.

.

You have two rewrites with the same input pattern. Is that right?

The first rewrite only fires if the "options" parameter is present, and only if it is the ONLY parameter present. This is because the pattern is anchored at both the start and the end.

The second rewrite is not restricted by that condition at all: it fires when the "options" parameter is absent, when it is present alongside other parameters, and even when it is the only parameter, should the first rule's pattern ever fail to match. Is that what you want?

Be aware that a RewriteCond can only affect the single RewriteRule immediately following it.
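
Annotated against the rules posted above, the scope of the condition looks like this. The rules themselves are unchanged from the post; only the comments are added.

```apache
# This condition belongs to the next RewriteRule only.
RewriteCond %{QUERY_STRING} ^options\=(.*)$
RewriteRule ^(.*)-p-(.*).html$ product.php?products_id=$2%1
#
# No RewriteCond sits directly above this rule, so it is unconditional:
# it is tested against every request, whatever the query string.
# To make it conditional too, a RewriteCond would have to be repeated
# directly above it.
RewriteRule ^(.*)-p-(.*).html$ product.php?products_id=$2&%{QUERY_STRING}
```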

ureche

11:15 am on Mar 23, 2009 (gmt 0)

10+ Year Member



Hello. I believe you are referring to "One redirect from old hyphenated URL to new URL format." If I'm not mistaken, this should look like the example jdMorgan gave:

RewriteRule ^matematica-clasa-p-([0-9]+)-c-[0-9]+\.html$ http://www.example.com/matematica-clasa-p-$1.html [R=301,L]

But why should I add a separate redirect from the old hyphenated URL when I already have this rule?

RewriteRule ^(.*)-p-([0-9]+)-c-[0-9]+\.html$ http://www.example.com/$1-p-$2.html [R=301,L]

The problem is that I have a few hundred links like:

text-text-text-p-([0-9]+)-c-[0-9]+\.html$

As for the two rules under the RewriteCond, I didn't write them; they came with a contribution that is working all right.

Thanks.

g1smd

12:08 pm on Mar 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



As far as I can see those old requests now pass through a double redirect. That redirection chain is to be avoided.

The other redirect that I am on about is when anyone directly requests your 'real' script URLs, that is URLs with parameters in.

ureche

7:08 pm on Mar 23, 2009 (gmt 0)

10+ Year Member



Hello, yes, but how can I avoid the double redirect chain? I know it is bad, but I don't know if writing 700 links like that in .htaccess is a good idea.

I have looked into my scripts and the contribution, and the "options" parameter is not used anywhere. So the rules now look like this, without any RewriteCond:

RewriteRule ^(.*)-p-(.*).html$ product.php?products_id=$2&%{QUERY_STRING} [L]
RewriteRule ^(.*)-c-(.*).html$ index.php?cPath=$2&%{QUERY_STRING} [L]
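
One possible tidy-up of these two rules, offered as a sketch rather than a tested replacement: the periods are escaped, the product id is restricted to digits (an assumption based on the sample URLs), and the [QSA] flag appends the original query string in place of the hand-written &%{QUERY_STRING}.

```apache
# Assumes the 301 redirect for "-p-<id>-c-<id>.html" URLs stays above these
# rules; otherwise such URLs would fall through to the category rule.
#
# Product pages: [QSA] appends any original query string to products_id=...
RewriteRule ^(.*)-p-([0-9]+)\.html$ product.php?products_id=$2 [QSA,L]
#
# Category pages: cPath is left as (.*) because its exact format is not
# shown in this thread.
RewriteRule ^(.*)-c-(.*)\.html$ index.php?cPath=$2 [QSA,L]
```

Unlike the hand-written form, [QSA] does not leave a trailing "&" when the request carries no query string.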