|Does DNS based round robin load balancing affect Google ranking?|
I am setting up a second server with the same scripts. The two servers will pull data from the MySQL server (a much more powerful machine).
The second server will host the sites on another set of dedicated IPs, and multiple A records will be added to the DNS to create a DNS-based round-robin load-balancing system.
Now I am worried: will this affect my SERPs on Google? Is round robin a good idea? It will certainly distribute the load 50/50, and the cron scripts and FTP can run from one of the servers.
Does anyone have experience with round-robin load balancing?
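(For reference, a round-robin DNS setup is just multiple A records for the same name. A minimal BIND-style zone fragment might look like the sketch below; the name and IPs are placeholders, not the poster's actual setup.)

```text
; hypothetical zone fragment for round-robin DNS (IPs are placeholders)
www     300   IN   A   192.0.2.10   ; server 1
www     300   IN   A   192.0.2.20   ; server 2
; a resolver returns both records; most clients pick one,
; and the short TTL (300s) limits how long caches hold a stale answer
```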
Although I don't recommend the solution you are implementing if your goal is high availability (you surely know that DNS-based load balancing does not handle various potential issues, such as a server going down, or DNS caching by other name servers), I think the issue you raise is very interesting.
Please correct me if I misinterpreted your question.
What will Googlebot do if, while querying www.foo.com, it finds the IP address of that site changing almost every time? The more IPs you put in the round robin, the more likely Googlebot is to see your site served from different IPs (servers).
Will this affect (or maybe even improve) your rankings? Think about serving the site from many IPs distributed across different geographic locations...
I would be interested in hearing from someone who has already experimented with this setup, as I think it could have quite significant implications.
Yes, you got the question right. Some of the problems you mentioned do have solutions - for example, the DNS caching issue is mitigated by the multiple A records themselves, since a cached response still lists every IP.
I would love to hear somebody's experience or knowledge on this subject as well... a very interesting question, and one I am directly concerned with. :)
IMHO, this can be very confusing.
If each of the servers in your DNS Round Robin Pool can be accessed via the internet by using an IP address, then you are leaving yourself open to duplicate content issues. If only one of the servers can be accessed at any time from the internet, then you should be OK.
BUT there are many considerations that are not covered by the most basic form of load balancing (DNS round robin). This method does not take into account the number of connections to a box, nor the processing load currently on a box. There is no monitoring of server responsiveness, so one server could be hung yet still accepting connections, or down with DNS still directing traffic to it. One server could end up with the heavy users and slow down as a result, while the other carries a much lighter load.
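To make that limitation concrete, here is a minimal Python sketch (the server IPs and the health check are hypothetical) contrasting blind round-robin rotation with the health-aware selection that plain DNS round robin cannot do:

```python
import itertools

# Hypothetical server pool; the IPs are placeholders.
SERVERS = ["203.0.113.10", "203.0.113.20"]

def round_robin(pool):
    """Plain DNS-style round robin: cycle through the pool blindly,
    with no knowledge of load or whether a server is even up."""
    return itertools.cycle(pool)

def pick_healthy(pool, is_up):
    """What DNS round robin does NOT do: skip servers that fail a
    health check (is_up is a stand-in for a real probe)."""
    healthy = [s for s in pool if is_up(s)]
    if not healthy:
        raise RuntimeError("no servers available")
    return healthy

# Blind rotation keeps handing out an address whether or not
# the box behind it is hung or down:
rr = round_robin(SERVERS)
first_four = [next(rr) for _ in range(4)]
print(first_four)

# A health-aware balancer would drop the dead box instead
# (here we pretend .20 has failed its probe):
print(pick_healthy(SERVERS, lambda s: s != "203.0.113.20"))
```

Real load balancers layer exactly this kind of monitoring on top of (or instead of) the DNS rotation.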
Back to Watching,
"If only one of the servers can be accessed at any time from the internet, then you should be OK."
OK, let's assume that roycerus is smart enough not to let his IPs serve his domains directly (i.e. by configuring his domains on those IPs as virtual hosts, so the sites are not reachable by simply typing an IP address into the browser window). There is still one point.
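(For anyone wondering how that virtual-host arrangement looks in practice, here is a sketch of an Apache-style configuration; the hostnames and paths are placeholders. A request for the bare IP matches the catch-all default host and gets an empty placeholder, while the real site is only served when the request asks for its hostname.)

```apache
# Hypothetical Apache 2.x configuration; names and paths are placeholders.
# The first VirtualHost is the default: a request for the bare IP
# (no matching Host header) gets an empty placeholder, not the content.
<VirtualHost *:80>
    ServerName default.invalid
    DocumentRoot /var/www/empty
</VirtualHost>

# The real site is served only when the request asks for its hostname.
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot /var/www/example
</VirtualHost>
```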
You say "then you should be OK" and I generally agree with you.
Unless: the different IPs in the DNS round robin are (for instance) in different countries.
If roycerus has a DNS round robin for load-balancing purposes (OK, we know this method doesn't provide higher availability; let's set that issue aside for a moment), and the IPs for his domain are:
IP xxx.xxx.xxx.xxx (hosted in the US)
IP yyy.yyy.yyy.yyy (hosted in the UK)
IP zzz.zzz.zzz.zzz (hosted in Australia)
This is a legitimate setting of the DNS.
Googlebot will see the domain "mydomain.com" hosted "sometimes in the US", "sometimes in the UK" and "sometimes in Australia".
The "residency" will depend on which server Googlebot hits, picked at random from the three described above.
Even if ranking is not affected, Googlebot will surely have to decide in which country the site mydomain.com is resident.
As you know, this plays an enormous role in the geotargeting side of the algo.
So Googlebot, provided it hits one of the three IPs at random every time it crawls mydomain.com, will see the site "changing residency" on every visit.
And therefore my question: to which of the three countries will Google assign the site's ranking and geo-ranking?
An interesting issue, I think. Especially if you want to play with it in order to build reputation in other countries where your rankings have been hammered by the geotargeting algo.
Will you or will you not succeed in telling Google "I'm a bit US-based, a bit UK-based and a bit Australia-based"?
Hey giuliorapetti, IMHO:
From the setup you described, I do not think that G would view the DNS round robin as duplicate content. If you leave only one way in to the boxes in the pool, then in the eyes of G there is always only one copy.
In answer to your question:
"will Google give ranking and geo-ranking in what Country of the 3 described"
My answer is:
That is a VERY good question, and I do not have a clue.
Doesn't Google work out the country from the IP address of the name servers, and NOT from the IP address of the web server?
|If each of the servers in your DNS Round Robin Pool can be accessed via the internet by using an IP address, then you are leaving yourself open to duplicate content issues. |
This is true even if you don't have load balancing. It's a problem that can affect a single site. If you're set up correctly, it shouldn't happen.
I don't believe that any decent SE will penalise you for having multi-hosted servers. And an SE cannot easily tell in any case if you are using round-robin or some more complicated geo-sensitive or load-balancing (etc) algorithm.
I use round robin for the two main (best-connected, UK and US) mirrors of my main site, and I see no evidence of any kind of penalty.
DO WHAT IS BEST FOR USERS and good SEs will aim to reward that.
"DO WHAT IS BEST FOR USERS"
Sure. But sometimes what's best for users is not what Google does and/or encourages.
If you choose to host in the US because it has the best and most reliable hosting industry in the world, and you sell worldwide, in English, a product that can effectively be bought from any part of the globe, then today, with Google, you have a problem. And the users have a problem too.
(Sure, sure, it's easy to say "design three sites: one in American English, one in British English and one in Australian English". But if you sell standard products or services, software for example, words like "software", "download" or "system requirements" are part of the content that you cannot translate: English is English. That might work for novels, where you can write American/British/Australian abstracts, but not for "plain" things like software downloadable online... I'd be curious to see how Pinocchio could be presented in three different Englishes anyway. :)
So, if you host your pages in the US, you will not rank well in, say, Germany: your pages will be "hidden" in the last pages of results even if the user sitting in Germany searches for things in English, i.e. objectively demonstrating a willingness to accept worldwide results.
Google's semantic maps will recognize that the "product" the user searched for is "Produkt" in German, and will start to fire out results about Produkte.de etc. first.
There is nothing "best for the user" in hiding even a world online leader in a worldwide-deliverable product or service on the third or fourth page of results.
This is not for the user's good: the user asked for "product", not "Produkt". He/she wanted the world as the range of results, not the biased vision of the world provided by the geotargeting algo...
So, anyway, coming back to the DNS issue with multiple, multi-located IPs, my point was: what is the residency of a domain that changes location, or even country, every time it is queried?
Will you be seen as belonging to all locations (i.e. based everywhere), or as "based in the last location crawled by Googlebot, until another crawling session happens, a new geolocated server is discovered and a new citizenship is applied"?
Also: I have no information indicating that Googlebot looks at the IP of the DNS server, rather than the IP of the web server, as proof of residency.
I'd like g1smd to point me to some info about it, as that would be a new discovery for me.
Sorry for the almost off-topic digression about geotargeting, but it was important to explain my point: DNS round robin *can* affect the SERPs (for good or ill, who knows) if you think about the internationalization of your servers...
I too have a direct interest in this area since I play with this type of software.
There is an alternative way to look at this.
It might be construed that a site served from multiple addresses is a *really, really big, important site*, since it uses the same techniques as the likes of Google themselves, IBM, Disney, WordPress, Wikipedia. You get the picture.
It is also important to remember two other things.
The content is attributed to a domain name, not an IP address.
And Google has in the past been criticised for caching DNS records far longer than the TTL issued with the record by the authoritative server.
From a *very* quick and unscientific test of my server with DNS records for machines in the US and the UK, Google seems to believe that the host is (correctly) in both locations.
So you have domain.com with multiple IP addresses as A records in the DNS, those IP addresses point to servers in the US and the UK, and when searching for "pages from the UK" or "pages from the US" on Google you see all your site's pages listed in both searches (i.e. the full site: set of results in both country searches)?
Please explain; that would be an interesting discovery.
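(For what it's worth, checking the full address set a resolver returns for a name is easy to script. A minimal Python sketch, using localhost as a stand-in since any real round-robin domain here would be an assumption:)

```python
import socket

def all_addresses(hostname):
    """Return every IPv4 address the resolver reports for hostname,
    i.e. the full set of A records visible to this client."""
    name, aliases, addrs = socket.gethostbyname_ex(hostname)
    return addrs

# With a round-robin domain this list would contain one entry per server;
# localhost is used here only so the example is self-contained.
print(all_addresses("localhost"))
```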
I don't know about geo-targeting issues, but:
Google seems to have no problem with DNS round robin.
The only snag is that Google Webmaster Tools "site verification" did not (still does not?) work, for some strange reason. I went back to a single server for a weekend, got verified, and then went back to two servers.
Roughly as you suggest, though not quite as clear/good.