


Get my site off Bing?


Sunnz

5:53 am on Apr 4, 2010 (gmt 0)

5+ Year Member



Just wondering, what are the ways to prevent my website from appearing on Bing?

I have disallowed SandCrawler, msnbot, and MSRBot in robots.txt, and I am also blocking the SandCrawler, msnbot, and MSRBot user agents in my web server configuration.
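For reference, the robots.txt entries look roughly like this (one block per bot; I am assuming those are the exact user-agent tokens each bot matches on):

# robots.txt - ask these crawlers to stay out of the whole site
User-agent: SandCrawler
Disallow: /

User-agent: msnbot
Disallow: /

User-agent: MSRBot
Disallow: /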

Are there other methods I should use as well? Maybe blocking the Bing bot IP addresses at the firewall?

Thanks.

goodroi

11:31 am on Apr 6, 2010 (gmt 0)

WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



That should do it.

For more information you may want to visit this old post [webmasterworld.com...]

Also remember that robots.txt is a voluntary protocol, and sometimes there are rare glitches. If you want to truly block anyone from accessing your website, you should use an .htaccess file.
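A minimal sketch of that kind of block (assuming an Apache server with mod_rewrite available, and matching on the bot names you listed - adjust the pattern as needed):

# .htaccess - return 403 Forbidden to those user agents
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (msnbot|MSRBot|SandCrawler) [NC]
RewriteRule .* - [F]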

tangor

11:52 am on Apr 6, 2010 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Not seeking to change your mind, just curious as to why no Bing? Even if one sees only 10% of traffic from there, that is quite a bit. Interested in the reasoning for denying Bing.

Sunnz

12:15 pm on Apr 6, 2010 (gmt 0)

5+ Year Member



Hey goodroi, thanks for the link.

Regarding .htaccess, what exactly do you mean by that? I know you can deny access, but based on what sort of rules? IP address? User-agent string?

I am not using Apache, but I have a similar setup in my web server that actually closes the connection for any client with the user-agent strings above... so I am just wondering if I am effectively doing the same thing?

This is for a personal web site, so I am not concerned about losing visitors from Bing; most visitors will probably arrive purely by word of mouth. I am just interested in learning how these things work, and Bing is a nice one to try this out on: it is a big enough search engine, yet I lose nothing by blocking it on a personal web site.

tangor

12:30 pm on Apr 6, 2010 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



If it is an experiment, then choose Teoma (Ask), Yandex, Baidu, Twiceler, etc. These are significant, yet not nearly as large as Bing for Western (EU, USA) websites. Bing is driving traffic these days, and it will continue to grow.

Bans are done by IP and UA. Your .htaccess might grow for Bing, which seems to bring on new IPs and UAs every day. See [webmasterworld.com...] for starters.

For background on .htaccess, see [webmasterworld.com...]

Sunnz

7:40 pm on Apr 6, 2010 (gmt 0)

5+ Year Member



Yes, I was reading that thread after this one. So there is no concrete way to ensure a web site doesn't end up on Bing?

Please note that I am not asking how .htaccess works, but what specific rules can be used to get Bing off a site, beyond what I have done in the initial post... and I was not asking which search engine to pick for trying things out for fun. Remember that driving traffic isn't always the goal, especially for non-commercial sites where bandwidth may be limited and one doesn't necessarily want as much traffic as possible.

Cheers.

tangor

10:41 pm on Apr 6, 2010 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



I doubt you'll find Bing-specific bans here at WW. If bandwidth is a problem, I'd ban Google. :)

Sunnz

8:15 am on Apr 7, 2010 (gmt 0)

5+ Year Member



I am not trying to save as much bandwidth as I can, either... neither visitors nor bandwidth is the focus...

tangor

10:23 am on Apr 7, 2010 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Ban IP ranges first, then UAs. The IP range will be more accurate; the UA rule will catch new IP addresses as they are introduced.
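A rough sketch of that layering in .htaccess (Apache 2.2 style; the CIDR range below is only a placeholder - verify the crawler's current ranges before using anything like it):

# 1. Deny a known crawler IP range (placeholder range - check the real ones)
Order Allow,Deny
Allow from all
Deny from 65.52.0.0/14

# 2. UA rule as a catch-all for new IPs the range above misses
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} msnbot [NC]
RewriteRule .* - [F]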

jdMorgan

12:45 pm on Apr 7, 2010 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I'd like to point out that Disallowing a robot in robots.txt, or blocking it using .htaccess or any other server-side code, will not prevent a site from appearing in most modern search engines. If they find links to that site anywhere on the Web, they may list the site by URL and link text, even if they are unable to fetch pages from that site. This is often called a "URL-only" listing, but major search engines now use the link text they find along with the URLs they discover to "build" a listing with more than just a URL in it.

The easiest way to prevent a site's URLs from appearing in search results is to NOT Disallow the robots in robots.txt, and to NOT block IP addresses or user-agents, but instead to allow full access and then use the on-page HTML <meta name="robots" content="noindex"> tag.
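For example, placed in the head section of each page you want kept out of the index:

<head>
  <title>Example page</title>
  <!-- compliant engines may fetch this page, but are asked not to index it -->
  <meta name="robots" content="noindex">
</head>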

Note that robots.txt prevents fetches by compliant robots, while the on-page meta-tag prevents indexing... not at all the same thing. And note that if a robot cannot fetch the page due to robots.txt or server access restrictions, then it cannot see the on-page noindex tag.

Jim

Sunnz

9:04 am on Apr 9, 2010 (gmt 0)

5+ Year Member



Hey jdMorgan, thanks for the informative post; that's very interesting.

Not all bots honor things like robots.txt and noindex, though... so I am thinking of making an empty web site, with robots.txt allowing robots to visit index.html, putting the noindex meta tag in there, and using server configuration to detect bots and redirect them there?
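Something along these lines, if the server were nginx (purely a sketch - /empty.html is just a placeholder for the bare noindex page, and the user-agent list is the one from my first post):

# inside the server {} block: flag the bots by user-agent
set $isbot 0;
if ($http_user_agent ~* "(msnbot|MSRBot|SandCrawler)") {
    set $isbot 1;
}
# still let them read robots.txt itself
if ($uri = "/robots.txt") {
    set $isbot 0;
}
# serve the bare noindex page to anything flagged as a bot
if ($isbot) {
    rewrite ^ /empty.html last;
}

(I used an internal rewrite rather than an external redirect so the bot doesn't loop on the redirect target.)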

What do you think of that approach?
 
