Forum Moderators: Robert Charlton & goodroi


Does Google spider Yahoo and Bing?

         

latimer

8:27 pm on Sep 14, 2009 (gmt 0)

10+ Year Member



We have a site that we blocked Google from some years ago, when we built a new, identical site strictly for Google as a result of a penalty.

We fixed everything we could think of, and after waiting for Google to lift the penalty, we finally just put the site out on a new domain and blocked Google via robots.txt from the old one. This seems to have worked fine for the past 7 years.

Per a discussion on another thread, the concern has been raised that Google spiders Yahoo and Bing, and that having two sites, one for Google and the other for Yahoo and Bing, will trigger a duplicate-content filter. This doesn't seem likely, since one of the sites is blocked from Google. I'd like to hear what others have to say.
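For clarity, the kind of robots.txt on the old site that blocks only Google while leaving Yahoo and Bing free to crawl looks roughly like this (a sketch; the exact user-agent names should match the crawlers you actually see in your logs):

```
# Block Google's crawler entirely from this (old) site
User-agent: Googlebot
Disallow: /

# All other crawlers (Slurp, msnbot, etc.) may crawl everything
User-agent: *
Disallow:
```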

aristotle

1:02 am on Sep 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If the blocking is done properly, I think it will be safe.
Incidentally, did you also block Yahoo and Bing from the new site intended for Google?

tedster

1:33 am on Sep 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This line is in Bing's robots.txt file [bing.com]:

Disallow: /search?

This line is in Yahoo's robots.txt file [search.yahoo.com]:

Disallow: /search

Neither Bing nor Yahoo would be happy to see googlebot violate that directive ;)

Also, Googlebot does not want to spider search results (not even Site Searches on local websites). You can release your concern, I'd say.

latimer

2:06 am on Sep 15, 2009 (gmt 0)

10+ Year Member



Yes, we do block Google from the other site, which allows Yahoo and Bing.

The only question remaining is why Google shows a few of the pages from the other site in its index. These are the ones with no description, just the URLs. From other discussions, I gathered that these get into Google as a result of links pointing to them, but they are not fully indexed, even though they are technically blocked by robots.txt. Not a concern? What would you say, Tedster?

tedster

2:50 am on Sep 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Such listings in the SERPs are because of backlinks to those pages. Google cobbles together a "URL-only" listing in some cases where a page has good backlinks but no spidering is allowed.
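Worth noting: because robots.txt blocks crawling, Google never fetches those pages, so it can never see a noindex directive on them. If you ever wanted those URL-only listings gone completely, one approach (a sketch, assuming an Apache server with mod_headers; keep in mind this header affects every engine that fetches the pages, so it only fits if you want them out of all indexes) is to allow crawling and serve a noindex header instead:

```
# In the site's .htaccess: tell crawlers not to index
# anything they fetch (they must be allowed to crawl,
# or they never see this header)
Header set X-Robots-Tag "noindex, nofollow"
```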

bwnbwn

9:35 pm on Sep 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have a question about this post, as I made an error in stating that Google spidered Yahoo and MSN. I just never checked their robots.txt; that was my mistake.

Discussion [webmasterworld.com...] Need to be a member to view.

How many webmasters here have two identical sites live under two different domain names?

I feel that this, along with some other issues, is a big reason for the site's drop in rankings.

I am interested in others' responses to this question, as it would give me some ideas whether other members do the same and/or feel there isn't a possible issue here.

dusky

5:49 am on Sep 16, 2009 (gmt 0)

10+ Year Member



Many webmasters have a habit of testing on one site and going live on another, putting only a disallow on the test site, which is foolish in my opinion (I was foolish enough to do it myself). I was one of those who previously relied on the disallow alone. Y! completely ignores it most of the time, and so did G*, and I ended up with a duplicate-site penalty. G* just demoted the site: halved the PR and suppressed ranking for all pages, including the homepage. Y! completely banned and de-indexed both sites for three years. It took me 3 years of work to find out, and last month Y! responded by indexing and allowing the production site again, after I'd explained their mistake three times. I had to completely delete everything on the test site, leaving only an .htaccess, a robots.txt that disallows everything (though there is nothing to index except the index page), and an index page with no title but a link to the production site!

Y! even apologized, surprisingly; I could have gone public with their admission by email. The reason they indexed the site regardless was that some sites linked to the test site, which is a .org, whereas the production site is a .com. At one point when I finished testing, I deleted all content except an index page but did not have a robots.txt file, probably only for a few days. I don't know how they did it, but they still indexed a few hundred pages that are supposed to be on the production site, indexed under the test site. That site still suffers from some kind of penalty. On the bright side, I know what did it, after more than 50 reconsideration requests to G* and Y!. G* never responds, though I could decipher what Matt emailed me back, even though my emails to him were totally unrelated, as I managed to throw in a question regarding the site's penalty.

Conclusion: if you have two sites/domains with the same content and use one as a test site, slam an .htaccess on the test site and password-protect it. Only you can access it; everyone else will need a password to view anything, period!
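A minimal sketch of that protection, assuming Apache with HTTP Basic Auth (the file path and realm name here are placeholders):

```
# Test site .htaccess: require a login for every request.
# Crawlers cannot authenticate, so nothing gets indexed.
AuthType Basic
AuthName "Test site - restricted"
AuthUserFile /home/example/.htpasswd
Require valid-user
```

The password file itself is created with something like `htpasswd -c /home/example/.htpasswd yourname`.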

dusky

5:55 am on Sep 16, 2009 (gmt 0)

10+ Year Member



In your case, by the way, you can't password-protect one of the duplicate sites, since you want one search engine to index it and not another. That's a tough call, especially if other sites link to the one you don't want indexed. This requires careful thought!

bwnbwn

1:54 pm on Sep 16, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



dusky, I agree this requires careful thought. That said, I am really interested in how many webmasters here have two duplicate sites on the internet that are being used for revenue.

Tedster, do you have any clients with two duplicate sites on the net?

ecmedia

2:52 pm on Sep 16, 2009 (gmt 0)

10+ Year Member



It is perfectly all right to have websites designed exclusively for one search engine, as long as robots.txt is written the right way. Each search engine uses a different algorithm, and there is nothing black-hat about optimizing your website for a specific search engine.

aristotle

4:17 pm on Sep 16, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Maybe I'm misunderstanding the concept, but if you have two or more almost identical sites under different domain names, doesn't this mean that your incoming backlinks will be spread over both sites, and therefore their effect diluted? Wouldn't it be better to have all the backlinks concentrated on one site?

latimer

9:27 pm on Sep 16, 2009 (gmt 0)

10+ Year Member



aristotle:

Agreed. Definitely not ideal that the two sites split links.

Had the process of reinclusion been as straightforward and easy 7 years ago, when our site was removed from Google for no known reason, as it is now through Webmaster Tools, we would have used that process.

Google has made big improvements in communicating with webmasters about this sort of thing, but back then it wasn't quite as easy.

We couldn't just dump the old site and build a new one, because we had many links to the old site, and after all it is our trademarked company name.

So, we followed good advice here, blocked the old site from Google, built a new site, and let Google naturally crawl and index it. There have been no problems with that over the past 7 years, and based on the posts from Aristotle and Tedster, we now think it is unlikely that our drop in sales is a result of having two sites, or that Google is spidering and indexing content from Bing or Yahoo, as bwnbwn had raised concern about.

For the sake of discussion, say we decide to open up the blocked original site to Google and 301 the current Google site over to it. Any thoughts on how this could be done effectively, to funnel all the juice from the new site, which isn't doing as well, into the old one, which also has lots of links?
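If we went that route, my understanding is that the usual mechanism would be a site-wide 301 in the newer site's .htaccess (a sketch, assuming Apache with mod_rewrite; `old-domain.example` stands in for the real domain), after first removing the Googlebot block from the old site's robots.txt:

```
# Redirect every URL on this (newer) domain to the
# same path on the original domain, preserving paths
RewriteEngine On
RewriteRule ^(.*)$ http://old-domain.example/$1 [R=301,L]
```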

The other valid points that bwnbwn and others have raised regarding our site are appreciated. We are working to implement improvements.

Our sales have dropped, but 60 - 70% is still better than nothing. Given the ever changing g-landscape, the increasingly tough competition in our market, and the need for improvements to our site, we are probably doing well to maintain those levels.