links to domain.com vs. www.domain.com

Aftermath of htaccess 301 redirection

         

khuntley

2:38 am on Jun 26, 2003 (gmt 0)

Background: I put up a new site in mid-February and spent two months on SEO specifically for Google. From experience, I expected it to rank around fourth or fifth after the next update. Then Dominic hit, and the site landed at around 200 for a moderately competitive search term.

Having been around for a while and knowing that Dominic was a little different, I knew to play it cool and wait.

I then saw the start of Esmeralda, with only minor fluctuation in ranking between 180 and 210, and at that point I knew something must be wrong. I then found that DMOZ linked to me at http://domain.com, and that all of my higher-level internal pages also linked to the index without www, because the hotkey I had assigned to the homepage URL text left it out. I added www to the internal links and added the following to htaccess:

RewriteEngine on
RewriteCond %{HTTP_HOST}!^www\.domain\.com
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

Two days later, on Thursday of last week during the update, the site skyrocketed to number 4, for a moderately competitive search term on some of the datacenters (due to freshie finding the redirection?). I'm sorry I can't remember whether or not it was the datacenters with or without the new update.

Four days after that, on Monday, the site dropped to around 200 again on all datacenters.

Some additional info...several hundred internal pages rank very well for their terms; it's just the index that I can't get up there for the main search term.

So the question is: did the site drop back because freshie found the redirect and only temporarily produced the #4 ranking, and will everything be fine after something like another traditional update? Are there other steps I should take? Given the background above, was the problem in fact due to the missing www in internal and external links?

I found dozens of posts lately about what to do to fix the missing www problem (I knew how to do that) but nothing about the aftermath. And I know Brett has some experience with this and WW...any thoughts Brett, or others?

Thanks in advance,
Kevin

SebastianX

8:50 am on Jun 26, 2003 (gmt 0)

It seems DeepFreshBot just deletes the page from the index and ignores 'Location: newURL'. I have no evidence that it really ignores the new URL, but at least the new URL has not been crawled since DeepFreshBot appeared. Maybe it's a bug, or the new URLs got scheduled for a crawl in the next year or so.

Brett_Tabke

10:11 am on Jun 26, 2003 (gmt 0)

WebmasterWorld Administrator

There is something definitely different this time. I say that simply because of the volume of the posts on this topic.

I put in the redirect to remove thousands of pages of apparent duplicate content.

khuntley

2:57 pm on Jun 26, 2003 (gmt 0)

Thanks Brett and SebastianX,

After still more research I am somewhat confident that:

1. Freshie treats even a new index by 301 redirect as a new page that is temporary in the G index until an update.

2. Having internal and external links to index without www will get you a penalty of no links to index counted.

3. The penalty is for the index alone.

4. Internal pages rank normally for their own search terms.

5. The penalty is just that, a penalty and not a ban, as the index ranks on its own merit without the links.

6. Internal links to index without www must be present for this to occur.

Kevin

khuntley

3:16 pm on Jun 26, 2003 (gmt 0)

Wow,
the site is just now back to fourth position on three datacenters. I'm posting this only because I never, ever would have thought that the 301 fix would be permanent without an update. I think this may lend more weight to the rolling update theory.
Kevin

Kratzy

3:17 am on Jun 27, 2003 (gmt 0)

I've recently done the same thing to one of my domain names. I did it just after the Esmeralda update, so I'm going to have to wait until the next update to see what happens.

I have about 10 domain names (3 of which were popular) that all point to the same site. Wanting to consolidate them, I did the same thing as you with the mod_rewrite redirection.

What would be really nice is if Google were smart enough to work out that www.olddomain.tld is being permanently redirected to www.domain.tld, delete www.olddomain.tld from the index, and count any links that point to it as PR for www.domain.tld.

khuntley

5:45 am on Jun 28, 2003 (gmt 0)

Kratzy,
Well it is clear to me that google is not smart enough to do that yet. I don't mean to be negative, but it is clear that you need to have your domain and www eggs in the right basket.

An update...site is now solidly #4 in all data centers from htaccess 301 permanent redirect. Traffic way up.

Lesson -- always have the 301 redirect to www in your htaccess in case you miss internal or external links to [domain.com...] without the www

Lesson #2...this in the past would always have taken a traditional update to fix...Rolling update?

Kevin

mmr82

7:17 am on Jun 28, 2003 (gmt 0)

My host support told me

"this is not possible, this would create a loop using a .htaccess file"

Is there any other way to redirect [mydomain.com...] to [mydomain.com?...]

jdMorgan

7:45 am on Jun 28, 2003 (gmt 0)

WebmasterWorld Senior Member

mmr82,

Er, umm... Your tech support is wrong. It works just fine in .htaccess. The trick is that you must use mod_rewrite to do a conditional redirect - that is, to redirect from HTTP_HOST=non-www.domain to www.domain, but not the other way round.

Your host support is partially correct in that you cannot use the various Redirect directives to do this without creating a loop, because they do not examine the HTTP_HOST in the request header, and therefore cannot act conditionally based upon it.

If you don't have access to mod_rewrite in .htaccess, you could do it with a script, server-side. Client-side scripts can do a 302 redirect, but I'm not sure that they can do a 301.
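
As a rough illustration of the server-side script option, here is a minimal sketch written as a Python WSGI application. The host name www.domain.com, the function name and the plain-text response are assumptions for the example only, not anything a particular host provides. It redirects only when the Host header is not already the canonical one, which is why it does not loop:

```python
CANONICAL_HOST = "www.domain.com"  # assumed canonical host name


def app(environ, start_response):
    """WSGI sketch: 301 any non-canonical Host to the canonical one."""
    host = environ.get("HTTP_HOST", "").lower()
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")

    if host != CANONICAL_HOST:
        # Conditional redirect: only fires for the "wrong" host, so the
        # follow-up request to www.domain.com is served normally.
        location = "http://" + CANONICAL_HOST + path
        if query:
            location += "?" + query
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]

    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"canonical host, no redirect needed\n"]
```

The sketch ignores ports in the Host header and HTTPS, so treat it as a starting point rather than a drop-in replacement for the mod_rewrite rule.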

Ref: Introduction to mod_rewrite [webmasterworld.com]

HTH,
Jim

khuntley

2:23 pm on Jun 28, 2003 (gmt 0)

Jim,
You need to get credit for the original fine tuning of the htaccess language; I read a few of your posts with different suggestions for dealing with this -- thank you.

I wonder how many people are floating around in never-never land in the serps because of this?

Kevin

James_Dale

3:39 pm on Jun 28, 2003 (gmt 0)

Kevin, not sure I understand your example:

RewriteEngine on
RewriteCond %{HTTP_HOST}!^www\.domain\.com
RewriteRule ^(.*)$ [domain.com...] [R=301,L]

Wouldn't the above just be a redirect from www.domain.com to www.domain.com (in other words, the exact same thing?)

Shouldn't it be more like this:

RewriteEngine on
RewriteCond %{HTTP_HOST}!^domain\.com
RewriteRule ^(.*)$ [domain.com...] [R=301,L]

so that [domain.com...] goes straight to [domain.com?...] Maybe I'm being thick here? Completely new to mod_rewrite...

SebastianX

3:53 pm on Jun 28, 2003 (gmt 0)

Read:
If host not (!) www.domain.com redirect permanent to www.domain.com. Redirects domain.com, www2.domain.com, foo.domain.com ... to www.domain.com.
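
To make that reading concrete, here is a small Python model of the negated condition. It is only a sketch of the logic, not how Apache itself is implemented, and the function name redirect_target is made up for the example:

```python
import re

# Pattern from the RewriteCond in this thread, without the leading "!".
CANONICAL = re.compile(r"^www\.domain\.com")


def redirect_target(host, path):
    """Return the 301 target, or None when the host already matches."""
    if CANONICAL.search(host):
        # The "!" negates the test: a match means "leave this request alone".
        return None
    # In .htaccess the matched path has no leading slash, hence lstrip.
    return "http://www.domain.com/" + path.lstrip("/")


for host in ("domain.com", "www2.domain.com", "foo.domain.com", "www.domain.com"):
    print(host, "->", redirect_target(host, "index.htm"))
```

Every host except www.domain.com gets a target; www.domain.com gets None, so there is no redirect from www.domain.com to itself.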

James_Dale

3:57 pm on Jun 28, 2003 (gmt 0)

ah! of course...thanks a lot!

James_Dale

5:26 pm on Jun 28, 2003 (gmt 0)

If anyone wants to copy and paste this code, I'd like to point out that the example code above is missing a space. There should be a space before the not (!), as in the corrected version below:

RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.domain\.com
RewriteRule ^(.*)$ [domain.com...] [R=301,L]
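
Why the space matters, as far as I understand it: RewriteCond takes two whitespace-separated arguments, a test string and a pattern. A quick Python sketch (the helper name arguments is made up, and this only mimics the whitespace splitting, not Apache's real parser) shows that the version without the space supplies just one argument:

```python
broken = r"RewriteCond %{HTTP_HOST}!^www\.domain\.com"
fixed = r"RewriteCond %{HTTP_HOST} !^www\.domain\.com"


def arguments(line):
    """Split a directive line into its whitespace-separated arguments."""
    return line.split()[1:]  # drop the directive name itself


print(len(arguments(broken)), arguments(broken))
print(len(arguments(fixed)), arguments(fixed))
```

With only one argument there is no separate pattern to test against, so expect the server to reject the line rather than quietly do the right thing.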

claus

6:18 pm on Jun 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



khuntley and others, this is a hard statement:

external links to index without www will get you a penalty of no links to index counted

- i simply have to ask:

Q: Does this phenomenon occur due to

a) inconsistency (both www. and non-www links), or

b) some specific "dislike" of domains without www.

?

I am planning to use a modified version of the redirect code. The most important modification is to reverse it, that is: all incoming or crosslinking traffic to a URL "with www" will end up at the same domain "without www".

I will not do this to please the SE's, but for usability. Simply put, I will teach my users by experience, that they only need to enter the site name in their browser address bar, without "www.", "xyz." or other stuff placed in front of it.

Then, if a) above is the case, i have no problem. If b) is the case, at least one major SE will discriminate against my approach, and that might be a problem.

Thanks
/claus

WebMistress

7:40 pm on Jun 28, 2003 (gmt 0)

Maybe I'm not getting this, but with all the links to yahoo.com, wouldn't it likely be that many people link to yahoo.com and many link to www.yahoo.com? If so, and if this penalty/duplicate content theory is true, why isn't yahoo penalized?

James_Dale

7:56 pm on Jun 28, 2003 (gmt 0)

Because Yahoo are already using the same type of method. If you type yahoo.com into your browser, by default you are taken to [yahoo.com...]

WebMistress

8:15 pm on Jun 28, 2003 (gmt 0)

ahhhh, good point, James_Dale...

WebMistress

8:20 pm on Jun 28, 2003 (gmt 0)

James_Dale, however, georgewbush.com and pepsi.com do not do this

James_Dale

9:38 pm on Jun 28, 2003 (gmt 0)

pepsi.com PR0 (0 inbound links)
www.pepsi.com PR4 (382 inbound links)

georgebush.com PR3 (4 inbound links)
www.georgebush.com PR7 (3370 inbound links)

I don't believe there is a penalty, but very few sites have many inbound links without the www prefix. This update Google has had particular problems with consolidating PR levels (and numbers of inbound links) between the two types of domain. Using mod_rewrite allows Google to consolidate the totals.

To be honest, I don't fully understand myself...but I believe Google has been sporadically associating the link totals for domain.com with what is actually www.domain.com, unless:

1. A mod_rewrite is in place,
or ...
2. Google hasn't found inbound links in the form [domain.com...]

WebMistress

9:52 pm on Jun 28, 2003 (gmt 0)

James_Dale, just a sidenote, I actually used georgeWbush.com

georgewbush.com PR6 1230 backlinks
www.georgewbush.com PR6 1230 backlinks

They seem to be resolved same for that site....hhhhmmmm

James_Dale

9:59 pm on Jun 28, 2003 (gmt 0)

If Google has never come across any inbound links in the form [georgewbush.com,...] then it has never found a domain with which to confuse the link totals. It's pretty sporadic, as I say, but this does seem to be the case here - there are no inbound links to [georgewbush.com....] If you run that search in Google, it puts the www prefix in for you.

WebMistress

11:19 pm on Jun 28, 2003 (gmt 0)

I have been working on trying to redirect domain.com to www.domain.com, and because I host sites with guestbooks which use domain.com/subdomain/cgi-bin/guestbook.pl to process a guestbook entry, it gets all messed up with the rewrite in the htaccess file, even if I go into the individual subdomain cgi-bin and change the guestbook.pl to www.mydomain.com/subdomain/cgi-bin/guestbook.pl.

I admit I have no clue what I am doing. I don't even know how to describe exactly my problem. Any guesses as to what I am trying to say, and any solutions?

The problem is I have 187 links to domain.com by others. It's gonna be a lot of work to email all those folks and ask them to change them. I'm just kind of hoping since this wasn't an issue before Dominic even though those links existed then, that google will resolve it in a final update.

WebMistress

11:30 pm on Jun 28, 2003 (gmt 0)

Maybe this will make complete sense to somebody about why the rewrite in htaccess to redirect domain.com to www.domain.com messes up submitting to a guestbook. Here's what is in the guestbook.pl:

$guestbookurl = "http://domain.com/subdomain/guestbookpage.htm";
$guestbookreal = "/home/domain/public_html/subdomain/guestbookpage.htm";
$cgiurl = "http://domain.com/subdomain/cgi-bin/guestbook.pl";

Any suggestions as to how I could change this code, such that redirection in htaccess won't mess it up?

jdMorgan

12:58 am on Jun 29, 2003 (gmt 0)

WebmasterWorld Senior Member

Webmistress,

The question is, "how do you get to that guestbook subdomain, and from there to the guestbook subdirectory?"

If it is implemented using another redirect, then the two redirects can interfere with each other.

The code posted above redirects any domain that is NOT www.domain.com to www.domain.com. So, if your guestbooks are implemented as subdomains, i.e., guestbook.yourdomain.com, then requests for those subdomains will get rewritten to www.domain.com as well.

To avoid this, use a "less promiscuous" version of the code:


RewriteEngine on
RewriteCond %{HTTP_HOST} ^domain\.com [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

This redirects ONLY domain.com to www.domain.com.

You may need to add a few more domains to be redirected to www.domain.com. If so, you must specify each one that needs to be redirected when you use this more-specific version of the code:


RewriteEngine on
RewriteCond %{HTTP_HOST} ^domain\.com [NC,OR]
RewriteCond %{HTTP_HOST} ^temp\.domain\.com [NC,OR]
RewriteCond %{HTTP_HOST} ^foo\.domain\.com [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

Note that the last RewriteCond must not have an [OR] flag. The [NC] flag makes the test case-insensitive.
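
For anyone who wants to see how the flags combine, here is a simplified Python model of a RewriteCond chain. It is a sketch of the documented behaviour (conditions are ANDed unless joined with [OR]), not Apache's actual code, and the names CONDS and conds_match are made up for the example:

```python
import re

# (pattern, flags) pairs modeled on the three RewriteCond lines above.
CONDS = [
    (r"^domain\.com", {"NC", "OR"}),
    (r"^temp\.domain\.com", {"NC", "OR"}),
    (r"^foo\.domain\.com", {"NC"}),
]


def conds_match(host, conds=CONDS):
    """OR-flagged conditions form a group; every group must have a hit."""
    group_hit = False
    for pattern, flags in conds:
        re_flags = re.IGNORECASE if "NC" in flags else 0
        if re.search(pattern, host, re_flags):
            group_hit = True
        if "OR" not in flags:  # this condition closes the current OR group
            if not group_hit:
                return False
            group_hit = False
    return True


for host in ("domain.com", "TEMP.domain.com", "foo.domain.com", "www.domain.com"):
    print(host, conds_match(host))
```

In this model, putting [OR] on the last condition leaves the final group open, so nothing ever returns False and the rule would fire for every host, including www.domain.com. That is the reason for the "no [OR] on the last RewriteCond" note.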

Ref: Introduction to mod_rewrite [webmasterworld.com]

HTH,
Jim

James_Dale

1:02 am on Jun 29, 2003 (gmt 0)

Hey, do you think Google could manually consolidate the domains if we asked nicely? i.e. gave them the different document IDs for the sites? I've got two distinct IDs for domain.com and www.domain.com.

This is for sure damaging my SERPs + cashflow right now. Manual consolidation of the two IDs would be a great help!

jdMorgan

1:17 am on Jun 29, 2003 (gmt 0)

WebmasterWorld Senior Member

They used to consolidate the www subdomains automatically. I suspect they will do so again in the future - probably the (very) near future.

As to manual consolidation, I doubt it. There are hundreds of millions of us, and only hundreds of them. If all Google employees were conscripted to manual domain deduplication duty for a millennium, I doubt that they could finish it. And it would really delay the next update!

I'm taking this situation as a sign that although the recent update is nominally finished, it still has yet to stew for a while before everything is really done.

Jim

khuntley

1:17 am on Jun 29, 2003 (gmt 0)

Webmistress and others,
I didn't mean for the example I posted to be a catch-all that everyone can use. This can be tough stuff and is beyond making valid alt tags or blocking an IP...

Every site is different. For example the mod rewrite stuff I gave for my site wouldn't work if I had links coming into [domain.com...] (note "s") or [cname.domain.com....] And webmistress, cgi scripts tend to work to "override" some things in .htaccess because you add a new definition based on host and action.

I know this doesn't tell you much; you just have to go to:

[httpd.apache.org...]

and put in the six hours to see what works for your site. Just have notepad with .htaccess open and FTP to experiment. Unless you're Amazon.com any potential problems can be kept to a few seconds.

Also, I know that my own problem was also caused by a stupid mistake on my part of accidentally having some internal links without www.

Can anyone help WebMistress by altering my earlier htaccess text to accommodate calls to a cgi?

Kevin

edit - see a couple of these things answered after post

WebMistress

5:14 am on Jun 29, 2003 (gmt 0)

jdMorgan, thank you so much for your response. It didn't work for my guestbook problem, unfortunately.

I'm not sure if I understand your question, "how do you get to that guestbook subdomain, and from there to the guestbook subdirectory?"

Here's a little more info, in case it will help.

I go to a sign guestbook page at subdomain.domain.com/signguestbook.htm

The code to sign the guestbook is as follows:

<form name="sign" method="post" action="http://domain.com/subdomain/cgi-bin/guestbook.pl">

then as stated earlier, in guestbook.pl, the code is written:

$guestbookurl = "http://domain.com/subdomain/guestbookpage.htm";
$guestbookreal = "/home/domain/public_html/subdomain/guestbookpage.htm";
$cgiurl = "http://domain.com/subdomain/cgi-bin/guestbook.pl";

when I add

RewriteEngine on
RewriteCond %{HTTP_HOST} ^domain\.com [NC]
RewriteRule ^(.*)$ [domain.com...] [R=301,L]

to the htaccess file, I get this in the URL address:

[domain.com...]

And the page says, "The comment section in the guestbook fillout form appears to be blank and therefore the Guestbook Addition was not added. Please enter your comments below."

It thinks nothing was entered, although comments were entered.

Without the htaccess change, when I hit submit in the guestbook form, I get this URL address:

[domain.com...]

and it tells me my entry was added successfully, then I can click on a link to view the guestbook (this is as it should work).

The difference I see is that the URL address returned upon "submit" without redirect in htaccess has no www, but with redirect in htaccess has www. So, the redirect clearly redirects the action in the form:

<form name="sign" method="post" action="http://domain.com/subdomain/cgi-bin/guestbook.pl">

So, I went in and changed the form action to [domain.com...]

And everything works fine now. By process of elimination of changes I made, I found that no change was needed in guestbook.pl, but simply the form action needed to be changed to the www version of domain.com to work with the rewrite in htaccess. So, I have some work to do in all the sign guestbook forms...but at least it will alleviate this non-www vs www issue sooner rather than later in google, hopefully.
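
A possible explanation for the "blank comment" message, which nobody in this thread has confirmed: many HTTP clients follow a 301 on a POST by re-requesting the new URL with GET and without the form data. Python's urllib is one client that does this, and the sketch below asks it what it would send after the redirect. The form field name "comments" is an assumption for the example; no network access is needed to run it:

```python
import urllib.request

# Hypothetical form submission to the non-www host, as in the form action above.
post = urllib.request.Request(
    "http://domain.com/subdomain/cgi-bin/guestbook.pl",
    data=b"comments=Nice+site",
)

handler = urllib.request.HTTPRedirectHandler()
# Ask urllib which request it would issue after a 301 to the www host.
followed = handler.redirect_request(
    post, None, 301, "Moved Permanently", {},
    "http://www.domain.com/subdomain/cgi-bin/guestbook.pl",
)

print(post.get_method(), post.data)          # the original submission
print(followed.get_method(), followed.data)  # what gets sent after the 301
```

If browsers behave the same way, the script at the www address never receives the comment text. Pointing the form action straight at the www address avoids the redirect, which fits what I saw.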

I know this was a long post and written as I analyzed it, but I hope it will help anyone who runs into the same problem. Thank you all for your help. This is what I love about WW: the amazing support. So APPRECIATED!

jdMorgan

5:38 am on Jun 29, 2003 (gmt 0)

WebmasterWorld Senior Member

Webmistress,

Glad you got it working... The devil's always in the details!

An alternative solution, for those who don't want to (or can't) change their form's submit URL, would be:


RewriteEngine on
RewriteCond %{HTTP_HOST} ^domain\.com [NC]
RewriteCond %{REQUEST_URI} !^/guestbook_subdomain
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

This excludes the guestbook-subdomain-supporting subdirectory from the domain redirect.
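
As a sanity check on how the two conditions combine, here is a small Python model of the rule. It is a sketch of the logic only, and the function name should_redirect is made up for the example:

```python
import re


def should_redirect(host, uri):
    """Both conditions must hold, since RewriteCond lines are ANDed by default."""
    host_is_bare = re.search(r"^domain\.com", host, re.IGNORECASE) is not None
    in_guestbook = re.search(r"^/guestbook_subdomain", uri) is not None
    # The second RewriteCond is negated with "!", hence "not in_guestbook".
    return host_is_bare and not in_guestbook


print(should_redirect("domain.com", "/index.htm"))
print(should_redirect("domain.com", "/guestbook_subdomain/cgi-bin/guestbook.pl"))
print(should_redirect("www.domain.com", "/index.htm"))
```

Only the first request is redirected; anything under /guestbook_subdomain on the bare domain is left alone, and www requests never match the host condition.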

But your solution is better. As reflected in this thread and others, "neatness counts" and it's just not a good idea to have www- and non-www domain variants floating around if it can be avoided. Having a single domain name and redirecting all others to it also simplifies all the other rules in your .htaccess files, because you don't have to account for all possible variations any more. And when another webmaster finds your site and wants to link to it, chances are his/her browser has already been redirected to the "proper" domain name. If not, it's possible he/she will spot the changed (redirected) URL when testing the new link. It just saves a whole bunch of headaches.

...And echoing the theme of another thread today:
For even more good stuff, Subscribe to WebmasterWorld! [webmasterworld.com]

Jim
