Question for Apache Gurus: 301 Pagerank Solution

could potentially cause problems

         

Kamin

7:50 am on Nov 27, 2003 (gmt 0)


jdmorgan, I really appreciated your solution for writing an .htaccess when a site has a great PageRank for www.domain.com but no PageRank without the www (domain.com).

BUT, when I use the code shown here:

RewriteEngine on
RewriteCond %{HTTP_HOST}!^www\.domain\.com
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=permanent,L]

When running server checks to make sure things are operating properly, I get the following:

301 http://www.domain.com redirects to: http://www.domain.com/

Why does this cause a redirect from WWW to WWW!? Isn't this another potential problem, when a spider gets a redirect BOTH ways? Aren't we missing something here? Shouldn't there be a way to get a 200 OK when the site is accessed normally via www? Another line in the code that says "do nothing if accessed via http://www.domain.com"? It concerns me to have a redirect to another redirect - seems like a "loop" that could potentially cause a problem...

Any enlightenment would be greatly appreciated. Sure would feel better to see the following happen:

http://domain.com redirects to: http://www.domain.com (301 host header)

http://www.domain.com simply produces a normal 200 OK header

Thanks in advance

Kamin

Yidaki

8:01 am on Nov 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Kamin, you missed something important: a space!

RewriteCond %{HTTP_HOST}space!^www\.domain\.com

The space gets nuked if posted in a thread - WebmasterWorld filters spaces in front of a ! or a ? . Always remember to put this space in your .htaccess code or the line won't work at all!

Btw, no need to post the question twice in different forums. However, i also answered your (same) question at the supporters forum. :)

closed

8:21 am on Nov 27, 2003 (gmt 0)

10+ Year Member



It's not a problem.

To get the directory index, Apache needs to access /, so the 301 redirect is issued. That's the default behavior for Apache, and it would occur even without any code in .htaccess.

Yidaki

8:24 am on Nov 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



erm, closed, you got it. I forgot to think in this direction ... off to take some coffee, yawn. ;)

jdMorgan

8:51 am on Nov 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Kamin,

> Isn't this another potential problem, when a spider gets a redirect BOTH ways?

No, mod_rewrite can't redirect both ways; otherwise you'd have an infinite loop, and the browser would keep following redirects until it timed out. As closed says above, it's probably the fact that you requested what is, in reality, an illegal URL. Apache detected it and redirected to the legal URL. The requirement of a trailing slash for a directory request is a fine point of the HTTP protocol, and most servers have a mechanism to cover it up. In this case, Apache appends the slash and does a redirect.
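
One way to see why the rule on its own cannot loop is to model it in a few lines of Python. This is only a rough sketch of the two directives quoted at the top of the thread, not of Apache itself: the name apply_rule is made up for illustration, and it assumes the rule sits in a root-level .htaccess, where the path is matched without its leading slash.

```python
import re

# The RewriteCond pattern from the thread. The "!" in front of it in
# .htaccess means "the rule fires only when this does NOT match".
CANONICAL_HOST = re.compile(r"^www\.domain\.com")


def apply_rule(host, path):
    """Return the redirect target, or None when the rule does not fire.

    host: value of the Host request header (HTTP_HOST).
    path: request path without the leading slash (assumed per-directory
          context, i.e. a root-level .htaccess).
    """
    if CANONICAL_HOST.search(host):
        # Negated condition fails for the www host: no redirect at all.
        return None
    # RewriteRule ^(.*)$ captures the whole path as $1.
    return "http://www.domain.com/" + path


# First request arrives on the bare domain and is redirected.
print(apply_rule("domain.com", "page.html"))
# The follow-up request arrives on the www host and passes through.
print(apply_rule("www.domain.com", "page.html"))
```

Once the host starts with www.domain.com, the negated condition fails, so the second request is left alone. Any 301 seen on the www host therefore has to come from somewhere other than these two lines.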

Yidaki's point about the missing space is a good one, too. Posting code on this board deletes any space preceding the "!" character. To avoid that, put two spaces in there - the board deletes one, and we get to keep the second one. :)

To test your redirect, request http://example.com/ and you should be redirected to http://www.example.com/ .
If you request http://www.example.com/ you should not get redirected.
If you request http://www.example.com or http://example.com, you should be redirected to http://www.example.com/ .
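
If you'd rather not depend on a third-party checker, these requests can also be made directly. Below is a minimal Python sketch using only the standard library. The names request_target and check_headers are invented here, example.com stands in for your own domain, and it assumes plain http (no https). http.client does not follow redirects, so you see the first response exactly as the server sent it.

```python
import http.client
from urllib.parse import urlsplit


def request_target(url):
    """Split a URL into (host, path) as they go on the wire.

    An HTTP request line needs a non-empty path, so a URL with no path
    is sent as "/". Query strings are ignored in this sketch.
    """
    parts = urlsplit(url)
    return parts.netloc, parts.path or "/"


def check_headers(url, timeout=10):
    """Send one HEAD request and return (status, Location header or None)."""
    host, path = request_target(url)
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Calling check_headers("http://example.com/") and check_headers("http://www.example.com/") returns the status code and any Location header for each host, which is the same information the header-check tools report.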

Jim

Kamin

11:54 am on Nov 27, 2003 (gmt 0)



I think you guys kinda missed my point in all of this. The 301 redirect from [domain.com...] to [domain.com...] works without a problem. Here is what I was asking. I use a server check utility that checks host header information. Here is the contents of the header check.


Error 301 - [domain.com...] redirects to: [domain.com...]

Site Status Header (from your server)

301 Moved Permanently
Date: Thu, 27 Nov 2003 10:54:45 GMT
Server: Apache/1.3.27 (Unix) (Red-Hat/Linux) Resin/2.1.5 mod_perl/1.26 PHP/4.2.2 FrontPage/5.0.2 mod_ssl/2.8.12 OpenSSL/0.9.6b
Location: [domain.com...]
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>301 Moved Permanently</TITLE>
</HEAD><BODY>
<H1>Moved Permanently</H1>
The document has moved <A HREF="http://www.domain.com/">here</A>.<P>
<HR>
<ADDRESS>Apache/1.3.27 Server at www.domain.com Port 80</ADDRESS>
</BODY></HTML>

As you can see the rewrite is working perfectly when checking host headers on [domain.com...] but here is what I get when I check [domain.com...]

Error 301 - [domain.com...] redirects to: [domain.com...]

Site Status Header (from your server)

301 Moved Permanently
Date: Thu, 27 Nov 2003 10:58:05 GMT
Server: Apache/1.3.27 (Unix) (Red-Hat/Linux) Resin/2.1.5 mod_perl/1.26 PHP/4.2.2 FrontPage/5.0.2 mod_ssl/2.8.12 OpenSSL/0.9.6b
Location: [domain.com...]
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>301 Moved Permanently</TITLE>
</HEAD><BODY>
<H1>Moved Permanently</H1>
The document has moved <A HREF="http://www.domain.com/">here</A>.<P>
<HR>
<ADDRESS>Apache/1.3.27 Server at www.domain.com Port 80</ADDRESS>
</BODY></HTML>

As you can see, the host header check for [domain.com...] shows it redirecting to itself. Since this statement in the .htaccess, "RewriteCond %{HTTP_HOST}!^www\.domain\.com", is basically saying (any request that is not for www.domain.com, redirect it to www.domain.com), shouldn't the host header check on [domain.com...] return a 200 OK code, and not the 301 redirect that could confuse search engines and keep them in an endless loop when spidering a site?

Maybe I am wrong in thinking how this should work... but shouldn't

[domain.com...] have a 301 host header return, and
[domain.com...] should return a 200 code, not a 301.

Thanks
Kamin

Yidaki

12:05 pm on Nov 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Kamin, reread what closed and jd said:

http://www.domain.com -> redirects to -> http://www.domain.com/

withouttrailingslash -> redirects to -> withtrailingslash

What does your server return if you check http://www.domain.com/ (notice the trailingslash)?

http://www.domain.com/ -> should return 200
http://www.domain.com -> should return 301

Kamin

12:19 pm on Nov 27, 2003 (gmt 0)



I did as you suggested and here is what is returned on [domain.com...]

Warning - Cloaked or Virtual IP site detected
Some search engines will penalize Cloaked or Virtual IP sites

Page results mis-match at line 1

Site Status Header (from your server)

301 Moved Permanently
Date: Thu, 27 Nov 2003 11:26:00 GMT
Server: Apache/1.3.27 (Unix) (Red-Hat/Linux) Resin/2.1.5 mod_perl/1.26 PHP/4.2.2 FrontPage/5.0.2 mod_ssl/2.8.12 OpenSSL/0.9.6b
Location: [domain.com...]
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>301 Moved Permanently</TITLE>
</HEAD><BODY>
<H1>Moved Permanently</H1>
The document has moved <A HREF="http://www.domain.com/">here</A>.<P>
<HR>
<ADDRESS>Apache/1.3.27 Server at www.domain.com Port 80</ADDRESS>
</BODY></HTML>

This doesn't make much sense. It's still returning the 301, and now it's saying the site is on a cloaked or virtual IP even though the site has its own dedicated static IP.

Yidaki

12:35 pm on Nov 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Uhm, i'm lost ...

Do you have other lines in your htaccess file or is what you've posted above all?

Jim, ideas?

closed

4:06 pm on Nov 27, 2003 (gmt 0)

10+ Year Member



Just so we're all on the same page, what are you using to check the server headers?

Also, what Yidaki is getting at is that the code you posted above has nothing to do with the redirects you're getting for www.domain.com.

jdMorgan

2:34 am on Nov 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Jim, ideas?

I suspect other code in the .htaccess files and httpd.conf file above Kamin's two directories. Unfortunately, he started off telling us that he doesn't have access to them. :(

Jim