Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
New Blekkobot User Agent String
scoutjet / blekkobot
dstiles




msg:4458412
 4:47 pm on May 27, 2012 (gmt 0)

The Blekko robot's UA has changed. It still includes the word scoutjet but now includes blekkobot and a new robot URL.

Mozilla/5.0 (compatible; Blekkobot; ScoutJet; +http://blekko.com/about/blekkobot)

The robots.txt keyword is now blekkobot but it will still honour scoutjet so no need to change.

Unusually for a bot page, blekko's has always given the IP ranges the bot uses. Probably worth a visit from time to time for updates. Current IP ranges are:

64.13.159.0 - 64.13.159.255
199.87.248.0 - 199.87.255.255
38.99.96.0 - 38.99.99.255
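
For anyone who wants to test visitor IPs against those ranges, here is a minimal Python sketch. It assumes the three ranges above are still current (check blekko's bot page before relying on them); the names BLEKKO_RANGES, to_networks and in_blekko_ranges are made up for this example.

```python
import ipaddress

# First/last addresses exactly as quoted in the post above.
# Assumption: these are still the ranges blekko publishes.
BLEKKO_RANGES = [
    ("64.13.159.0", "64.13.159.255"),
    ("199.87.248.0", "199.87.255.255"),
    ("38.99.96.0", "38.99.99.255"),
]


def to_networks(ranges):
    """Collapse (first, last) address pairs into CIDR networks."""
    networks = []
    for first, last in ranges:
        networks.extend(
            ipaddress.summarize_address_range(
                ipaddress.ip_address(first), ipaddress.ip_address(last)
            )
        )
    return networks


BLEKKO_NETWORKS = to_networks(BLEKKO_RANGES)


def in_blekko_ranges(ip):
    """Return True if the address falls inside any listed range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLEKKO_NETWORKS)


if __name__ == "__main__":
    for net in BLEKKO_NETWORKS:
        print(net)
    print(in_blekko_ranges("199.87.250.1"))
```

Each quoted range collapses to a single CIDR block, which is handy if your firewall or server config wants that notation rather than start/end addresses.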

 

Staffa




msg:4459142
 7:20 pm on May 29, 2012 (gmt 0)

Are they actually still crawling? I haven't seen them for a very long time.

Maybe that's why their search results are so poor, at least for the keywords I'm most familiar with.

keyplyr




msg:4459150
 7:30 pm on May 29, 2012 (gmt 0)

I see scoutjet almost every week, sometimes performing a full crawl, but usually fetching just a dozen or so pages.

g1smd




msg:4459164
 7:46 pm on May 29, 2012 (gmt 0)

Blekkobot/Scoutjet turned up for the first time in a long time on several sites in the last 24 hours or so.

Another new one was the ahrefsbot.

lucy24




msg:4459199
 9:30 pm on May 29, 2012 (gmt 0)

Another new one was the ahrefsbot.

I've had that one blocked for a long time due to misbehavior w/r/t robots.txt, so it was a surprise seeing the occasional "ahrefs" graphic in the top corner of wmt pages. What's it for? First showed up in November, with a peculiar habit of first getting pages and then getting robots.txt to, presumably, see if it was allowed to get the pages it had already gotten.

incrediBILL




msg:4459216
 10:08 pm on May 29, 2012 (gmt 0)

You can blame this new UA on me ;)

I suggested they change it because ScoutJet didn't point people to using Blekko and IMO they were wasting literally millions of opportunities daily to let webmasters know they exist.

Anyway, they obviously still honor the ScoutJet UA, so you don't have to change anything unless you serve robots.txt dynamically like I do. I had to change my script just a bit so I didn't keep telling Blekko it was denied LOL!
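
To illustrate the kind of script change involved, here is a rough Python sketch of a dynamically served robots.txt. It is not the actual script; it assumes a simple whitelist policy where recognised bots get an allow-all file and everything else gets a deny-all file. The names KNOWN_BLEKKO_TOKENS and robots_txt_for are hypothetical.

```python
# Hypothetical whitelist-style robots.txt generator.
# Assumption: matching is a case-insensitive substring test on the UA,
# and both the old and new Blekko tokens are accepted.
KNOWN_BLEKKO_TOKENS = ("blekkobot", "scoutjet")

ALLOW_ALL = "User-agent: *\nDisallow:\n"
DENY_ALL = "User-agent: *\nDisallow: /\n"


def robots_txt_for(user_agent):
    """Return the robots.txt body to serve for this User-Agent header."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in KNOWN_BLEKKO_TOKENS):
        return ALLOW_ALL
    return DENY_ALL


if __name__ == "__main__":
    new_ua = ("Mozilla/5.0 (compatible; Blekkobot; ScoutJet; "
              "+http://blekko.com/about/blekkobot)")
    print(robots_txt_for(new_ua))
```

A script that compared against one exact stored UA string is the sort that would start denying the bot after a change like this one; accepting either token avoids that.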

g1smd




msg:4459223
 10:19 pm on May 29, 2012 (gmt 0)

Unusually for a bot, there's no appended version number.

It's just "Blekkobot; ScoutJet;" in

Mozilla/5.0 (compatible; Blekkobot; ScoutJet; +http://blekko.com/about/blekkobot)

with no appended version info, cf. Googlebot/2.1 etc.

incrediBILL




msg:4459238
 11:06 pm on May 29, 2012 (gmt 0)

Unless you have multiple active bots with different criteria that need to be met, I really don't see the need for the version number except as a courtesy. It's actually a PITA for people who test for exact UA strings, because their tests suddenly break when the version changes.
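
As a sketch of the alternative to exact-string tests, the Python below matches on the bot's name token and treats any "/version" suffix as optional. The pattern and the function name bot_name_and_version are illustrative assumptions, and the list of bot names is just a sample.

```python
import re

# Match a known bot token, with an optional "/version" suffix.
# Assumption: the token list here is only a sample for illustration.
BOT_TOKEN = re.compile(
    r"\b(blekkobot|scoutjet|googlebot)\b(?:/([\d.]+))?",
    re.IGNORECASE,
)


def bot_name_and_version(user_agent):
    """Return (name, version) for the first known bot token, else None.

    version is None when the UA carries no version number.
    """
    match = BOT_TOKEN.search(user_agent)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2)


if __name__ == "__main__":
    print(bot_name_and_version(
        "Mozilla/5.0 (compatible; Blekkobot; ScoutJet; "
        "+http://blekko.com/about/blekkobot)"
    ))
    print(bot_name_and_version(
        "Mozilla/5.0 (compatible; Googlebot/2.1; "
        "+http://www.google.com/bot.html)"
    ))
```

The name still matches if the version is bumped, and the version is returned separately, so it can be logged without being part of the test.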

g1smd




msg:4459240
 11:11 pm on May 29, 2012 (gmt 0)

I don't test for specific UA versions, but it is useful to see when the version number changes as it gives a big clue that the bot behaviour may have changed in some way too.

[edited by: g1smd at 11:28 pm (utc) on May 29, 2012]

incrediBILL




msg:4459242
 11:25 pm on May 29, 2012 (gmt 0)

I agree, and it lets you see that the bot is evolving and isn't some dead bot just running from a server until the end of time, wasting our bandwidth. My point was that it also makes things break, both for novices and for those doing strict matching.

dstiles




msg:4459617
 9:04 pm on May 30, 2012 (gmt 0)

Wish you'd warned us, IncrediBill! :)

I test bot UAs against declared or otherwise known IPs, sometimes shortened UAs, sometimes (as in scoutjet/blekko) the full UA. This helps in prohibiting bad UAs from good bot IPs, which I'm seeing a lot more of lately.
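
A minimal Python sketch of that kind of paired UA/IP test, for illustration only. It assumes the Blekko ranges posted earlier in this thread (written here as the equivalent CIDR blocks) and the full UA string as the expected value; the function name classify and the result labels are invented for the example.

```python
import ipaddress

# Assumption: CIDR equivalents of the ranges listed earlier in the thread.
BLEKKO_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("64.13.159.0/24", "199.87.248.0/21", "38.99.96.0/22")
]

# Full UA string as posted at the top of the thread.
BLEKKO_FULL_UA = (
    "Mozilla/5.0 (compatible; Blekkobot; ScoutJet; "
    "+http://blekko.com/about/blekkobot)"
)


def classify(ip, user_agent):
    """Label a request by comparing its IP and UA against the known pair."""
    addr = ipaddress.ip_address(ip)
    from_bot_ip = any(addr in net for net in BLEKKO_NETWORKS)
    ua_lower = user_agent.lower()
    claims_bot = "blekkobot" in ua_lower or "scoutjet" in ua_lower

    if from_bot_ip and user_agent == BLEKKO_FULL_UA:
        return "verified"
    if from_bot_ip:
        return "bad-ua-from-bot-ip"  # known bot IP, unexpected UA
    if claims_bot:
        return "spoofed-ua"  # bot name in UA, IP outside known ranges
    return "other"


if __name__ == "__main__":
    print(classify("64.13.159.10", BLEKKO_FULL_UA))
    print(classify("64.13.159.10", "Mozilla/4.0"))
```

Because the full UA is compared exactly, a legitimate UA change like this one lands in the "bad-ua-from-bot-ip" bucket until the stored string is updated.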

I usually pick up changes fairly quickly but this UA change slipped by me for at least a couple of weeks.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved