---- MojeekBot


mojeek - 8:14 pm on Apr 30, 2012 (gmt 0)


I fully understand the problem, as I have my fair share of rogue bots and scrapers, including so-called "reputable search engines" that would rather scrape than use proper API access or build their own technology. I don't allow automated queries or let the results be crawled, so with a nearly limitless number of pages it can be a big problem.

I just find it a shame that genuine new or smaller engines can so easily be publicly associated with rogue bots. They are usually the first to be banned, as they're also the easiest to identify, without at least being given a chance or checked out. There's a thread on here talking about Mojeek in 2006 - [webmasterworld.com...] so we're not new and obviously not some page harvester.

With regard to genuine visitors, I suppose we'll never be able to send any if we're not allowed to index your site, nor provide our users with some results that we would otherwise have liked to.

dstiles - Thanks, although we do provide thorough info on our bot, including how to test that it's ours. I commented on the UK SE thread earlier; I'm always interested in any engine coming out of the UK, a rare thing!


Thread source: http://www.webmasterworld.com/search_engine_spiders/4444489.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com