http:// = http://www. =>duplicate content

         

davidpbrown

6:51 pm on Oct 10, 2003 (gmt 0)

10+ Year Member



I've just noticed that about one quarter of my listings in Google are of the non-www type [widget.com...] rather than the www type [widget.com...] they should be. Many of them, although not all, are duplicates.

My host appears to redirect http:// requests to [www....] with a 302, but some URLs slip past and Google lists them.

It's not clear to me whether the duplicates are a result of this, or whether Google is picking up the second version before releasing the first as the site updates.

I'm wondering whether Google understands that they are duplicates, how common this is, and whether I'm currently being penalized (though I've not noticed it) or whether my reduced number of page listings would hurt any ranking. I hope not. Aside from my not wanting bloated listings, the 302 appears to be dragging out Google's removal of old pages, because the server header is seen as a 302 rather than any other status I might have set.

Can someone suggest the .htaccess directives to send http:// to [www....] with a 301, overriding the server config's 302? Or is there a better way?

Thanks
davidpbrown

[edited by: davidpbrown at 7:16 pm (utc) on Oct. 10, 2003]

Yidaki

7:01 pm on Oct 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Can someone suggest the htaccess script to send http:// 301 [www....]

Redirect 301 - how is this really done?
redirecting widget.com to www.widget.com [webmasterworld.com]

plasma

7:06 pm on Oct 10, 2003 (gmt 0)

10+ Year Member



ALWAYS use 301 (permanently moved)
-------------------8<----------------------------------------------------
RewriteEngine on
RewriteBase /

RewriteRule ^$ [widget.mycom...] [R=301,L]
RewriteRule ^(.+) [widget.mycom...] [R=301,L]
-------------------8<----------------------------------------------------

1. mod_rewrite needs to be enabled for this
2. In this example with_www and without_www MUST be in different directories. It could be done smarter, but this unconditional example is easier

claus

7:10 pm on Oct 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This snippet should do it, at least it works nicely for me:

----------------------------

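RewriteEngine on
# Only act when the request's Host header starts with example.com (no www)
RewriteCond %{HTTP_HOST} ^example\.com [NC]
# Permanently (301) redirect the same path to the www host
RewriteRule (.*) http://www.example.com/$1 [R=301,L]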

----------------------------

You are perhaps missing one thing in your considerations: links. Google's use of www/non-www reflects the usage of the people who link to you. You might even have different Toolbar PR on the pages, depending on whether the www is there or not.

Personally, I examined this and found that people tended to link to me more often without the www, and as a consequence my index page had a higher Toolbar PR when viewed outside the www subdomain. So I now use the rule in reverse, redirecting requests for my www subdomain to my domain (without www).

If you decide to use the code above in your .htaccess, it will take around three weeks before Google shows the change in the SERPs.
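
The reverse setup mentioned above might look like the following. This is only a sketch built from the same snippet with the host names swapped: example.com remains a placeholder, and it assumes mod_rewrite is enabled and that the rule lives in the .htaccess file at the site root.

```apache
RewriteEngine on
# Only act when the Host header starts with www.example.com (case-insensitive)
RewriteCond %{HTTP_HOST} ^www\.example\.com [NC]
# Permanently (301) redirect the same path to the bare domain
RewriteRule (.*) http://example.com/$1 [R=301,L]
```

Use one direction or the other, not both, otherwise the two rules would bounce requests back and forth.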

/claus

Yidaki

7:25 pm on Oct 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>RewriteRule ^$ [widget.mycom...] [R=301,L]
>RewriteRule ^(.+) [widget.mycom...] [R=301,L]
Ohps, query_string, ts ts ts ... ;)

davidpbrown

7:40 pm on Oct 10, 2003 (gmt 0)

10+ Year Member



Thanks for the quick replies!

claus, I did think of linking as a cause, but having recently changed the URL structure, there hasn't been time for many direct links to build up, and the redirects give the www. addresses.

Regards
davidpbrown

rainborick

7:58 pm on Oct 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've done enough programming and playing with .htaccess to get the gist of most of the snippets posted here, for which I am certainly grateful. However comma... I would also be very grateful if someone could guide me to a similar solution for a Microsoft IIS server which is entirely foreign to me. Thanks!

claus

8:20 pm on Oct 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> IIS

It's true that we tend to focus on Apache here at WebmasterWorld. For IIS there is a thing called ISAPI filters that might be useful for some of the htaccess tasks. Otherwise, there is a "Management Console" in which you can set up redirects and handle other htaccess-type tasks. This is about all I know about it.

/claus

plasma

9:23 pm on Oct 10, 2003 (gmt 0)

10+ Year Member



>Ohps, query_string, ts ts ts ... ;)

Am I missing something?

Yidaki

7:21 am on Oct 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Am I missing something?

Lol, no, but me ... i totally misread your post. Sorry. ;)

claus

11:33 am on Oct 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I might be wrong, as there is always more than one way to do things, but I don't really understand the example including the query string. This is what it seems to do, as I read it:

RewriteRule ^(.+) http://www.example.com/$1?%{QUERY_STRING} [R=301,L]

If you use this line only (as the matches and the [L] flag in the first line imply), the query string will be included in "$1", because the backreference "^(.+)" captures one or more characters. It seems to me that (1) will be rewritten to (2):

(1) example.com/something.php?bla&bla=bla
(2) www.example.com/something.php?bla&bla=bla?bla&bla=bla

That is, it will insert an extra "?" and duplicate the query string. Further, the question mark after $1 means that it will be inserted even when there is no query string, i.e. (a) redirects to (b):

(a) example.com/something.php
(b) www.example.com/something.php?

Plus, it will redirect all requests, not just those without "www." but also those with "www." - causing some extra work for the server. In essence, it seems to me that it will do the job, and then some more. I'm not sure this "something more" is a good idea in all cases.
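
A hedged footnote to the reading above: as far as the mod_rewrite documentation goes, a RewriteRule pattern is matched against the URL path only, and when the substitution contains no "?" the original query string is passed through unchanged. On that basis the simplest approach may be to leave the query string out of the rule entirely and to limit the redirect with a RewriteCond. A minimal sketch, assuming the site answers only on example.com and www.example.com (both placeholder names carried over from the earlier snippet) and that the rule sits in the .htaccess file at the site root:

```apache
RewriteEngine on
# Only act when the Host header is not already the www form,
# so requests for www.example.com are left alone.
RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
# Skip requests that carry no Host header at all (old HTTP/1.0 clients).
RewriteCond %{HTTP_HOST} !^$
# No "?" in the substitution: mod_rewrite then carries the original
# query string over to the redirect target unchanged.
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

With this in place, a request for example.com/something.php?bla&bla=bla should be redirected to www.example.com/something.php?bla&bla=bla, while requests that already use www are not touched.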

/claus