|www1, www2, www3|
| 4:54 am on Jun 24, 2006 (gmt 0)|
I have the same domain name and the same content on URLs starting with www, www1, www2 and www3, and all the sites are live.
Example: I have www.mydomainname.com, www1.mydomainname.com, www2.mydomainname.com and www3.mydomainname.com, with the same content on all of them.
Is this treated as spam? Will it affect the SERPs for mydomainname.com?
Furthermore, today I ran a 'site:www2.domain-name.com' query on Google, and many pages have already been indexed. Will our main site be penalized by Google?
Looking for your valuable suggestions.
Thanks in advance,
| 8:58 am on Jun 24, 2006 (gmt 0)|
From what I understand, it is spam, and yes, it will be treated as spam.
If you are not intending to spam, why are you deliberately setting up duplicate sites?
| 10:13 am on Jun 24, 2006 (gmt 0)|
One good reason is to have mirrors to balance load, handle failures, etc, etc.
I do this so that users can pick a mirror "close" to them.
It is NOT in any way SPAM, and G doesn't seem to treat it as such.
| 11:41 am on Jun 24, 2006 (gmt 0)|
Thank you for all the inputs.
The web servers www1, www2 and www3 are intended to serve the same content. The plan is to share the load between www1, www2 and www3 through www.mydomainname.com.
I set up those sites for sales tracking, and each site opens when a user picks it.
Any ideas? Will www.mydomainname.com be penalized by Google? TIA
| 2:10 pm on Jun 24, 2006 (gmt 0)|
There are better, safer ways to do your tracking than setting up duplicate sites, which will very likely cause you problems.
To protect your main site, you could use robots.txt to block spiders from your duplicate sites.
In my view, from what you are saying, these would not be mirrors, but clones; Google, in particular, does not like duplication.
There are exceptions, but I would be very surprised if your scheme was one of them.
| 1:21 am on Jun 25, 2006 (gmt 0)|
Thanks for the reply.
Just a quick note to let you know that www1, www2 and www3 are a set of mirrors of mydomainname.com, not clone sites.
Now the serious problem is that on several mirrors (like www2.mydomainname.com), Google has indexed 90 pages. I don't know whether those mirrors will affect the SERPs for www.mydomainname.com. And how do I use robots.txt to block spiders from my mirror sites?
| 9:38 am on Jun 26, 2006 (gmt 0)|
| 10:34 am on Jun 26, 2006 (gmt 0)|
Google does not like duplication.
As I said above: "There are better, safer ways to do your tracking than setting up duplicate sites, which will very likely cause you problems. To protect your main site, you could use robots.txt to block spiders from your duplicate sites."
When Google finds duplicates, after a while, all except one item will be dropped or become a 'supplemental result'.
You have no control over which one will survive; could be the newest, oldest, blue-est, most linked-to, least linked-to ... no way to predict or control this.
So if you want one to survive, protect it by blocking the others.
| 11:02 am on Jun 26, 2006 (gmt 0)|
Thanks for your quick reply.
OK, could you give me directions on how to have WWW spidered while keeping Googlebot off WWW1, WWW2 and WWW3 with robots.txt?
| 10:45 am on Jun 27, 2006 (gmt 0)|
Any ideas? How do I use robots.txt to deny Googlebot access to my other servers?
I really appreciate your time. Thank you.
| 11:07 am on Jun 27, 2006 (gmt 0)|
It's just a couple of lines ... but if you are not familiar with robots.txt, then it's worth learning a little first - you may find other stuff to your advantage. Google "robots.txt" and spend a little time; you won't regret it.
| 5:49 am on Jun 29, 2006 (gmt 0)|
I really don't know whether my advice is of any use to you. Try one of the online robots.txt generators. There are many on the net, but I don't know which one is the best. After generating the robots.txt with such a tool, put the file in the root directory on your server. This will probably solve your problem.
| 11:24 am on Jun 30, 2006 (gmt 0)|
Most likely Google will recognize it as spam.
As for load balancing, it's not necessary to have www1, www2, etc.; you can set things up so that there is only one www and it runs on several servers.
| 8:31 am on Jul 1, 2006 (gmt 0)|
Thanks for all the help. But the multiple subdomains are all on one server (i.e. www1, www2, www3 and www on one server). So how do I use robots.txt to block Googlebot from my mirror sites (www1, www2, www3)?
| 9:33 am on Jul 1, 2006 (gmt 0)|
>>Most likely that Google will recognize it as spam
It isn't spam, but it doesn't ever make sense to confuse a bot.
To exclude the mirrors, in the root directory of each of the subdomains, in the same folder where the main index page of each of the subdomains sits, upload a text file called robots.txt with this entry to keep all robots out of the whole subdomain:
Make sure the file extension is .txt and don't link to those subdomains from anyplace.
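The entry itself does not appear in the post above. The standard robots.txt rules for keeping every robot out of a whole host are presumably what was meant, and they are the two lines held in DENY_ALL below. As a minimal sketch, Python's standard urllib.robotparser can be used to sanity-check that such a file really blocks Googlebot; the helper name is_blocked and the sample URL are illustrative assumptions, not part of the thread.

```python
from urllib.robotparser import RobotFileParser

# Standard "keep all robots out of the whole host" rules. Assumed to be
# the entry the post refers to; the original text of the entry is missing.
DENY_ALL = "User-agent: *\nDisallow: /\n"


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text forbids user_agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical mirror URL, following the hostnames used in the thread.
    print(is_blocked(DENY_ALL, "Googlebot", "http://www2.mydomainname.com/page.html"))
```

A file like this only helps if each mirror hostname serves it at /robots.txt while the main www host does not.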
| 11:44 am on Jul 2, 2006 (gmt 0)|
Google does not treat that as spam at all, even if you're hardcore spamming the SERPs with wildcard subdomains. At least Google hasn't treated that as spam in recent years, and I am quite sure they won't treat it as spam in the future, because their system cannot handle that.
| 6:59 am on Jul 3, 2006 (gmt 0)|
I hope so. But Google has indexed some pages of www2.mydomainname.com, and those duplicate copies might be penalized by Google. So I just want to set up a robots.txt file to block Googlebot from accessing my mirror sites. But the 4 subdomains all point to the same web folder on an IIS-based server, so how can I resolve this?
//Google does not treat that as spam at all, even if you're hardcore spamming the SERPs with wildcard subdomains. At least Google hasn't treated that as spam in recent years, and I am quite sure they won't treat it as spam in the future, because their system cannot handle that.//
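When all four hostnames share one web folder, a single static robots.txt cannot differ per hostname, so one common workaround is to generate the response for /robots.txt dynamically from the request's Host header. The sketch below shows only that selection logic in Python. It is not IIS configuration, and the names CANONICAL_HOST and robots_txt_for are hypothetical. How the function would be wired to /robots.txt on the server (for example through a script mapped to that URL) is left as an assumption.

```python
# Hostname that should stay crawlable (taken from the thread's example).
CANONICAL_HOST = "www.mydomainname.com"

# An empty Disallow permits everything; "Disallow: /" blocks the whole host.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
DENY_ALL = "User-agent: *\nDisallow: /\n"


def robots_txt_for(host_header: str) -> str:
    """Pick the robots.txt body to serve for the requested hostname."""
    # Normalise: drop surrounding whitespace, an optional ":port" suffix,
    # a trailing dot, and any upper-case letters.
    host = host_header.strip().lower().split(":", 1)[0].rstrip(".")
    # Deny by default, so any hostname other than the main one is blocked.
    return ALLOW_ALL if host == CANONICAL_HOST else DENY_ALL


if __name__ == "__main__":
    for name in ("www.mydomainname.com", "www2.mydomainname.com"):
        print(name)
        print(robots_txt_for(name))
```

Denying by default means a mirror hostname added later is blocked without further changes; only the one main hostname is ever opened to robots.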