Yahoo Bots

         

sem4u

8:09 pm on Nov 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Does anyone know what these two bots are up to?

dev12.vc.corp.yahoo.com
snv-global1.corp.yahoo.com

Also, I'm getting a load of visits from hosts like this:

sbider4.sitebuildit.com

Is Sitebuildit some kind of website creation software?

joeking

7:19 am on Dec 4, 2006 (gmt 0)

10+ Year Member



sitebuildit.com - I'm getting sbiderX.sitebuildit.com visiting all of my sites too. And, as you say, when you go to the website it seems to be a website building tool.

Anybody have any information on this and whether it should be blocked or not?

And can it be blocked by name - deny from sitebuildit.com -?

Mokita

10:39 am on Dec 4, 2006 (gmt 0)

10+ Year Member



Anybody have any information on this and whether it should be blocked or not?

This one comes down purely to personal preference. It depends on whether it *might* bring you any traffic (it doesn't for any of our sites), or whether you feel benevolent towards it.

And can it be blocked by name - deny from sitebuildit.com -?

You can try blocking it via robots.txt like this:

User-agent: SBIder
Disallow: /

However, with this bot it is a bit of a lottery whether it obeys robots.txt or not.

If it bothers you enough, then block it via mod_rewrite (assuming you are hosted on an Apache server):

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^SBider [NC]
RewriteRule .* - [F,L]
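
On the literal question of blocking by hostname: a minimal sketch, assuming an Apache 2.2-style setup where the Order/Allow/Deny directives are permitted in .htaccess, and assuming the bot's IP addresses really do reverse-resolve to sitebuildit.com hostnames as they appear in the logs quoted in this thread. Apache has to look up the visitor's hostname (a double reverse DNS lookup) to apply a rule like this, so it can be slower than matching on the user agent.

```apache
# Allow everyone by default, then deny any host whose name
# is, or ends in, sitebuildit.com (e.g. sbider4.sitebuildit.com)
Order Allow,Deny
Allow from all
Deny from sitebuildit.com
```

Apache 2.4 and later replace these directives with Require, so treat this as a sketch for older servers only.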

joeking

7:25 am on Dec 5, 2006 (gmt 0)

10+ Year Member



Thanks for your advice.

Do I just paste this into my .htaccess?

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^SBider [NC]
RewriteRule .* - [F,L]

Mokita

7:54 am on Dec 5, 2006 (gmt 0)

10+ Year Member



Do I just paste this into my .htaccess?

Um, yes - unless you already have Rewrite rules in your .htaccess.
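
For anyone who does already have rewrite rules, here is a rough sketch of how the bot block might be merged in. The redirect at the bottom is a made-up placeholder standing in for whatever rules are already there. The assumptions are that RewriteEngine On only needs to appear once, and that the bot block sits above the existing rules so an earlier rule cannot let the request through first.

```apache
# Needed only once per .htaccess file
RewriteEngine On

# Bot block first, so it is evaluated before any other rewrites
RewriteCond %{HTTP_USER_AGENT} ^SBider [NC]
RewriteRule .* - [F]

# Hypothetical pre-existing rule (placeholder, not from this thread)
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
```

Each RewriteCond applies only to the RewriteRule that immediately follows it, so the user-agent condition here does not affect the redirect below it.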

Mokita

8:26 am on Dec 5, 2006 (gmt 0)

10+ Year Member



Amendment: If you don't currently have any mod_rewrite rules, I think this is the best code to start off with:

RewriteEngine On
RewriteBase /

RewriteCond %{REQUEST_URI}!^/robots\.txt$
RewriteCond %{HTTP_USER_AGENT} ^SBider [NC]
RewriteRule .* - [F,L]

<edit> Darn! The forum software has deleted the essential space before the exclamation mark, and nothing I have tried will bring it back. So be warned - ensure there is a space before the (!) in the preceding code, i.e. between %{REQUEST_URI} and !^/robots\.txt$ </edit>

This allows the SBider bot to access robots.txt (and then, hopefully, note and obey the fact that it is blocked from going further). But if it decides to disobey, the Rewrite rule comes into play, returning 403 Forbidden for every file except robots.txt.

[edited by: Mokita at 8:33 am (utc) on Dec. 5, 2006]

keyplyr

9:00 am on Dec 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The forum software has deleted the essential space before the exclamation mark, and nothing I have tried will bring it back - Mokita

The forum CMS only removes 1 space, so if you type 2 spaces, one will remain :)

RewriteEngine On
RewriteBase /

RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteCond %{HTTP_USER_AGENT} ^SBider [NC]
RewriteRule .* - [F,L]

Mokita

11:02 am on Dec 6, 2006 (gmt 0)

10+ Year Member



Thanks for the neat hint keyplyr! I'll try to remember that for next time.