Dedicated IP addresses

         

bcc1234

12:29 am on Jul 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




Is having a dedicated IP address for each site important ?

Also, is there a chance that google would try to do a request without specifying the HOST, just to see if it still goes to the same page ?

przero2

1:47 am on Jul 3, 2002 (gmt 0)



I have read many people say that a dedicated IP is preferred over a shared IP. Until recently I was one of those believers, and I moved my domain to a static IP. However, it was a painful move: I realized that if you change the IP address without changing the nameservers, Googlebot gets thoroughly confused. The site in question dropped out of the Google index completely (except for a few Cobalt RaQ welcome pages in the month-old index). As of today, the bot is still looking for pages on the shared IP and getting nothing but 404s, per my access logs. On a good note, some Googlebot IPs are also spidering the site on the new static IP. So it is possible the change was noticed by a few Googlebot IPs while others still think my site is on the old shared IP address. Hence all the confusion.
If I were to move a well indexed site that is currently on a shared IP, I would not do it. If I must do it, I would ensure that there is a corresponding nameserver change also with it.
However, all my new sites that I think are important, I would keep on a static IP only going forward!

bcc1234

3:34 am on Jul 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



But is there a possibility that google would try to request a page without specifying the HOST field in the request ?

I'm using name based virtual hosts with different ip addresses for each domain, so it would be possible to connect to the IP of the domain1.com and request domain2.com.
Or, if the HOST is not specified, the server would return the default host, which is not the target site.

Should I fix this ?

przero2

4:34 am on Jul 3, 2002 (gmt 0)



bcc ..., let me tell you what i am seeing in access log. let say my host domain is hostdomain.com and my site is sitedomain.com - let me walk you through different steps:
1) sitedomain.com was a virtual host sharing IP of hostdomain.com and hostdomain.com was also accessible with the IP address
2) sitedomain.com has been moved to a different static IP address, i.e. the domain is accessible directly with the IP address also
3) the physical server or the location of the web files have not changed/moved .. the change was only the IP address
4) in the access log i see googlebot going crazy looking for my site web files under the hostdomain.com now instead of under sitedomain.com . needless to say it does not find files there at all! this is after full 2 months following the change!

based on the above infer whatever you can although it seems pretty ridiculous to me how a bot of a major search engine like google can be so confused;)

mbauser2

5:27 am on Jul 3, 2002 (gmt 0)

10+ Year Member



bcc1234:
But is there a possibility that google would try to request a page without specifying the HOST field in the request?

Technically it's possible and legal, but it strikes me as unlikely. According to my access logs, Googlebot makes HTTP 1.0 requests; the HTTP 1.0 spec (RFC 1945) doesn't require user agents to send a "Host" header as part of requests. But it doesn't forbid the "Host" header, either, and it would be very odd if Google left the header out.

przero2 said:

As of now, today, the bot is still trying to look for pages on the shared IP and encountering all 404s per my access logs.

Wait a minute, are you saying you moved an existing domain, but you're still seeing 404s on the server occupying the old IP address? If so, that's the wrong error code. RFC 2068 (the HTTP 1.1 spec) says servers that receive a request with a mismatched "Host" header MUST serve a generic "Status 400". If you're seeing 404s, either the server isn't running to spec, or Googlebot isn't including the "Host" header.

Now I think I've confused myself. Somebody needs to do an environment capture on Googlebot to see if it's sending a "Host" header with requests. I also recommend anybody having problems like przero2 double-check the configuration of both the old and new servers. There might be something wrong at the web server end.
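In the meantime, anyone can at least check how their own server answers with and without the header. Here is a minimal sketch in Python, nothing authoritative: the function names, the test user-agent string and the connection details are all made up for illustration, and it does not tell you what Googlebot itself sends.

```python
import socket

def build_request(path, host=None, version="1.0", user_agent="host-header-test"):
    """Return raw HTTP request bytes; the Host header is omitted when host is None."""
    lines = [f"GET {path} HTTP/{version}"]
    if host is not None:
        lines.append(f"Host: {host}")
    lines.append(f"User-Agent: {user_agent}")  # hypothetical UA string
    lines.append("Connection: close")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

def fetch_status_line(ip, path, host=None, port=80, timeout=10):
    """Connect to ip:port, send the request, and return the response status line."""
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(build_request(path, host))
        data = b""
        # Read only until the end of the first line of the response.
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling fetch_status_line once with host=None and once with host="sitedomain.com" against the same IP (substitute your own address and domain) shows whether the server falls back to a default site, redirects, or rejects the request.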

przero2

6:20 am on Jul 3, 2002 (gmt 0)



mbauser2, it was an existing domain, well indexed in google, and it was moved from a shared IP (also the server/webhost IP) to its own dedicated IP .. below find a snippet of the access log with the real host domain replaced with "hostdomain"

www.hostdomain.com 216.239.46.26 - - [02/Jul/2002:14:23:56 -0400] "GET /discount-hotels/doubletree.htm HTTP/1.0" 302 249 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
www.hostdomain.com 216.239.46.26 - - [02/Jul/2002:14:23:56 -0400] "GET /discount-hotels/doubletree.htm HTTP/1.0" 200 645 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
www.hostdomain.com 216.239.46.113 - - [02/Jul/2002:14:24:16 -0400] "GET /discount-hotels/mississauga.htm HTTP/1.0" 302 238 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
www.hostdomain.com 216.239.46.113 - - [02/Jul/2002:14:24:17 -0400] "GET /discount-hotels/mississauga.htm HTTP/1.0" 200 645 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
www.hostdomain.com 216.239.46.222 - - [02/Jul/2002:14:25:41 -0400] "GET /discount-hotels/oxnard.htm HTTP/1.0" 302 233 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
www.hostdomain.com 216.239.46.222 - - [02/Jul/2002:14:25:41 -0400] "GET /discount-hotels/oxnard.htm HTTP/1.0" 200 645 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

.........
obviously googlebot cannot find those files under the server IP address. as to how the server is configured, I am not sure, as I use a local web hosting provider!
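For anyone who wants to tally entries like these instead of eyeballing them, here is a rough Python sketch. It assumes the log format is exactly what the snippet above shows (virtual host first, then the usual combined-log fields); the regular expression and function names are illustrative, not taken from any particular tool.

```python
import re
from collections import Counter

# Assumed format: vhost, client IP, two ident fields, [time], "request",
# status, size, "referer", "user agent".
LOG_LINE = re.compile(
    r'^(?P<vhost>\S+) (?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def parse(lines):
    """Yield a dict of fields for every line that matches the assumed format."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            yield match.groupdict()

def summarize(lines, agent_substring="Googlebot"):
    """Count requests per (virtual host, status) for matching user agents."""
    counts = Counter()
    for entry in parse(lines):
        if agent_substring in entry["agent"]:
            counts[(entry["vhost"], entry["status"])] += 1
    return counts

# Two lines copied from the log snippet above.
SAMPLE = [
    'www.hostdomain.com 216.239.46.26 - - [02/Jul/2002:14:23:56 -0400] '
    '"GET /discount-hotels/doubletree.htm HTTP/1.0" 302 249 "-" '
    '"Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    'www.hostdomain.com 216.239.46.26 - - [02/Jul/2002:14:23:56 -0400] '
    '"GET /discount-hotels/doubletree.htm HTTP/1.0" 200 645 "-" '
    '"Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]

if __name__ == "__main__":
    for (vhost, status), count in sorted(summarize(SAMPLE).items()):
        print(vhost, status, count)
```

Grouping by virtual host and status makes it easier to see which host name the bot is actually requesting and whether it is getting redirects, real pages, or errors.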

ggrot

6:34 am on Jul 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, I'll put in my 2 cents. I would advise against hosting a site on an IP where you don't control the other sites on that IP. The problem is that sites could potentially be blacklisted by IP address. However, I wouldn't recommend using a different dedicated IP for every site you own. I don't see any problem in using 1 IP address for 10 or even 100 sites as long as you are running them all.

bcc1234

7:03 am on Jul 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't see any problem in using 1 ip address for 10 or even 100 sites as long as you are running them all.

What about linking from one site to another while having them on the same IP ? the same /24 net ?
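To be clear about what "the same /24 net" means: two addresses whose first three octets match. A small Python sketch for checking that, using the standard ipaddress module; the function name and the example addresses (documentation ranges) are purely illustrative.

```python
import ipaddress

def same_network(ip_a, ip_b, prefix=24):
    """Return True if ip_b falls inside the /prefix network that contains ip_a."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network

if __name__ == "__main__":
    print(same_network("192.0.2.10", "192.0.2.200"))    # same /24
    print(same_network("192.0.2.10", "198.51.100.10"))  # different /24
```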

mbauser2

8:37 am on Jul 3, 2002 (gmt 0)

10+ Year Member




www.hostdomain.com 216.239.46.26 - - [02/Jul/2002:14:23:56 -0400] "GET /discount-hotels/doubletree.htm HTTP/1.0" 302 249 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
www.hostdomain.com 216.239.46.26 - - [02/Jul/2002:14:23:56 -0400] "GET /discount-hotels/doubletree.htm HTTP/1.0" 200 645 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

That doesn't make any sense. Your log sample shows a page redirecting to itself. That's not just weird, that's the HTTP equivalent of a science-fiction "I'm my own grandpa" time paradox. Did you mess up the search-and-replace? Is that log meant to be alternating between sitedomain and hostdomain ?

If so, the status "302" is a problem. 302 is "Found", a.k.a. "temporary redirect". User agents are allowed to revisit 302'ed URLs, so Googlebot isn't completely wrong in requesting the old URL. If you want to permanently redirect a page, you should use status "301" so that Googlebot knows the old address is defunct. (Although, again, status 400 is what the specs require if there's a "Host" header mismatch. Even I think this is getting too complicated.)

Here's a key reference for HTTP status codes: [w3.org...]

By the way, what server software are you running these domains on? Might be relevant.
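To untangle the status codes a bit, here is how I would sketch the decision in Python. This is only an illustration of the rules discussed in this thread, not how Apache or any real server is implemented; the function name, the canonical_host and moved_hosts parameters, and the choice to serve the default site when no "Host" header is present are all assumptions.

```python
def decide_response(host_header, path, canonical_host, moved_hosts=()):
    """Return (status, headers) for a request, per the rules discussed above."""
    if host_header is None:
        # HTTP/1.0 client with no Host header: assume the default site is served.
        return 200, {}
    # Drop any port suffix and normalize case before comparing.
    host = host_header.split(":", 1)[0].lower()
    if host == canonical_host:
        return 200, {}
    if host in moved_hosts:
        # Permanent move: 301 tells clients the old address is defunct.
        return 301, {"Location": f"http://{canonical_host}{path}"}
    # Host header names a site this server does not serve.
    return 400, {}

if __name__ == "__main__":
    print(decide_response("www.hostdomain.com", "/page.htm",
                          "www.sitedomain.com", ("www.hostdomain.com",)))
```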

ciml

10:56 am on Jul 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Over the last year, more than 90% of my host-less requests came from Windows servers trying to give me Nimda.

None were from Googlebot or any other search engine that sends real traffic.

On the other hand, I have come across more than a few sites that have had problems because the server does accept additional host headers. This can lead to duplicate content problems including lost PageRank and rarely, hijacks.

richlowe

3:35 pm on Jul 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I had a site which was hosted on a shared host, then moved it to another shared host. I saw Googlebot visiting the first site for almost two months before it realized that the site had moved.

Richard Lowe

przero2

4:23 pm on Jul 3, 2002 (gmt 0)



mbauser2, thanks for your comments on the log entries. the server is a cobalt raq 4i running apache. i do not know if the hosting provider messed up with search and replace. how do i tell? all i know is that i moved from a shared IP to a static IP and a change was made by the hosting provider at that time. regardless of how the host is set up, the issue is why googlebot is even looking for my web site pages under the hostdomain instead of sitedomain. it is weird that it is only googlebot that is doing this and not any other spider like slurp, scooter, mercator, etc.