
Forum Moderators: Ocean10000 & incrediBILL & keyplyr


New Blekkobot User Agent String

scoutjet / blekkobot

     
4:47 pm on May 27, 2012 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3148
votes: 4


The Blekko robot's UA has changed. It still includes the word scoutjet but now includes blekkobot and a new robot URL.

Mozilla/5.0 (compatible; Blekkobot; ScoutJet; +http://blekko.com/about/blekkobot)

The robots.txt keyword is now blekkobot, but the bot will still honour scoutjet, so there is no need to change existing rules.

Unusually for a bot page, blekko's has always given the IP ranges the bot uses. Probably worth a visit from time to time for updates. Current IP ranges are:

64.13.159.0 - 64.13.159.255
199.87.248.0 - 199.87.255.255
38.99.96.0 - 38.99.99.255
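
For anyone who keeps these ranges in a firewall or script, here is a minimal Python sketch that turns the first/last pairs above into CIDR blocks and tests an address against them. The names BLEKKO_RANGES, to_networks and in_blekko_ranges are made up for illustration, and the ranges are simply copied from this post, so check blekko's bot page before relying on them.

```python
import ipaddress

# Ranges copied from the post above; they may have changed since it was written.
BLEKKO_RANGES = [
    ("64.13.159.0", "64.13.159.255"),
    ("199.87.248.0", "199.87.255.255"),
    ("38.99.96.0", "38.99.99.255"),
]


def to_networks(ranges):
    """Convert (first, last) address pairs into a flat list of CIDR networks."""
    networks = []
    for first, last in ranges:
        networks.extend(
            ipaddress.summarize_address_range(
                ipaddress.ip_address(first), ipaddress.ip_address(last)
            )
        )
    return networks


BLEKKO_NETWORKS = to_networks(BLEKKO_RANGES)


def in_blekko_ranges(ip):
    """Return True if the address falls inside any of the listed ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLEKKO_NETWORKS)
```

Each of the three ranges happens to align with a single CIDR block, so BLEKKO_NETWORKS ends up with three entries that can be pasted into most block or allow lists.
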
7:20 pm on May 29, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 24, 2002
posts:894
votes: 0


Are they actually still crawling? I haven't seen them for a very long time.

Maybe that's why their search results are so poor, at least for the keywords I'm most familiar with.
7:30 pm on May 29, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:8945
votes: 408


I see scoutjet almost every week, sometimes performing a full crawl, but usually fetching just a dozen or so pages.
7:46 pm on May 29, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Blekkobot/Scoutjet turned up for the first time in a long time on several sites in the last 24 hours or so.

Another new one was the ahrefsbot.
9:30 pm on May 29, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13829
votes: 484


Another new one was the ahrefsbot.

I've had that one blocked for a long time due to misbehavior w/r/t robots.txt, so it was a surprise seeing the occasional "ahrefs" graphic in the top corner of wmt pages. What's it for? First showed up in November, with a peculiar habit of first getting pages and then getting robots.txt to, presumably, see if it was allowed to get the pages it had already gotten.
10:08 pm on May 29, 2012 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14663
votes: 99


You can blame this new UA on me ;)

I suggested they change it because ScoutJet didn't point people to using Blekko and IMO they were wasting literally millions of opportunities daily to let webmasters know they exist.

Anyway, they still honor the ScoutJet UA obviously so you don't have to change anything unless you serve robots.txt dynamically like I do, as I had to change my script just a bit so I didn't keep telling Blekko it was denied LOL!
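
To illustrate the kind of change a dynamically served robots.txt might need, here is a rough Python sketch. It is not the actual script mentioned above; it assumes a whitelist-style setup in which recognised bots get an allow-all file and everything else gets a deny-all file. BLEKKO_TOKENS, ALLOW_ALL, DENY_ALL and robots_txt_for are hypothetical names.

```python
# Tokens taken from the thread: accept the new name and the old one.
BLEKKO_TOKENS = ("blekkobot", "scoutjet")

# Assumed policies for illustration only: allow everything or deny everything.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
DENY_ALL = "User-agent: *\nDisallow: /\n"


def robots_txt_for(user_agent):
    """Return the robots.txt body to serve for a given User-Agent header."""
    ua = (user_agent or "").lower()
    # Case-insensitive substring match, so either token is recognised.
    if any(token in ua for token in BLEKKO_TOKENS):
        return ALLOW_ALL
    return DENY_ALL
```

A script that previously looked only for "scoutjet" would keep working with the new UA, since the string still contains it. One that compared against the full old UA string would start returning the deny-all file, which is presumably the situation described above.
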
10:19 pm on May 29, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Unusually for a bot, there's no appended version number.

It's just "Blekkobot; ScoutJet;" in
Mozilla/5.0 (compatible; Blekkobot; ScoutJet; +http://blekko.com/about/blekkobot)
with no appended version info (cf. Googlebot/2.1 etc.).
11:06 pm on May 29, 2012 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14663
votes: 99


Unless you have multiple active bots that have different criteria that need to be met, I really don't see the need for the version number except as a courtesy. It's actually a PITA for people who test for exact UA strings, which suddenly break when the version changes.
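
One way to avoid that breakage, sketched in Python under the assumption that you only care about a handful of bot names: match the token and treat any "/version" suffix as optional, recording it if present. The token list and the bot_tokens helper are illustrative, not anything a particular site uses.

```python
import re

# A known bot token, optionally followed by "/<version>"; case-insensitive.
BOT_TOKEN = re.compile(
    r"\b(blekkobot|scoutjet|googlebot)\b(?:/([\d.]+))?", re.IGNORECASE
)


def bot_tokens(user_agent):
    """Map each known bot token found in the UA to its version, or None."""
    found = {}
    for match in BOT_TOKEN.finditer(user_agent):
        # Keep the first occurrence; a later hit inside the info URL
        # must not overwrite a version seen earlier.
        found.setdefault(match.group(1).lower(), match.group(2))
    return found
```

With this approach a version bump changes the recorded value but not whether the bot is recognised, so the version can still be logged and watched for changes.
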
11:11 pm on May 29, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


I don't test for specific UA versions, but it is useful to see when the version number changes as it gives a big clue that the bot behaviour may have changed in some way too.

[edited by: g1smd at 11:28 pm (utc) on May 29, 2012]

11:25 pm on May 29, 2012 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14663
votes: 99


I agree, and you can see it's evolving and not some dead bot just running from a server until the end of time, wasting our bandwidth. However, my point was that it also breaks things for both novices and those doing strict matching.
9:04 pm on May 30, 2012 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3148
votes: 4


Wish you'd warned us, IncrediBill! :)

I test bot UAs against declared or otherwise known IPs, sometimes shortened UAs, sometimes (as in scoutjet/blekko) the full UA. This helps in prohibiting bad UAs from good bot IPs, which I'm seeing a lot more of lately.
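
As a rough illustration of that kind of cross-check (not the poster's actual setup), the Python sketch below pairs the UA tokens with the CIDR equivalents of the ranges listed earlier in the thread and sorts a request into one of four buckets. The bucket names and the classify function are invented for this example.

```python
import ipaddress

# Tokens and networks taken from earlier in the thread; both may be out of date.
BLEKKO_TOKENS = ("blekkobot", "scoutjet")
BLEKKO_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("64.13.159.0/24", "199.87.248.0/21", "38.99.96.0/22")
]


def classify(user_agent, ip):
    """Cross-check a claimed bot UA against the bot's known networks."""
    addr = ipaddress.ip_address(ip)
    ua = (user_agent or "").lower()
    claims_bot = any(token in ua for token in BLEKKO_TOKENS)
    from_bot_ip = any(addr in net for net in BLEKKO_NETWORKS)
    if claims_bot and from_bot_ip:
        return "verified"
    if claims_bot:
        return "bot-ua-from-unknown-ip"   # likely a spoofed UA
    if from_bot_ip:
        return "other-ua-from-bot-ip"     # the "bad UA from good bot IP" case
    return "unrelated"
```

The third bucket is the case described above: a request arriving from a known bot range with a UA the bot is not supposed to send.
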

I usually pick up changes fairly quickly but this UA change slipped by me for at least a couple of weeks.