Forum Moderators: Ocean10000 & incrediBILL

MojeekBot

     
2:45 am on Apr 24, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5408
votes: 2


Just a heads up.

Three 2005 references in the archives.
IP a UK server farm.

195.74.55.164 - - [24/Apr/2012:02:21:24 +0100] "GET /robots.txt HTTP/1.1" 200 2627 "-" "Mozilla/5.0 (compatible; MojeekBot/0.2; http://www.mojeek.com/bot.html)"

[edited by: incrediBILL at 4:18 am (utc) on Apr 24, 2012]
[edit reason] de-linked user agent [/edit]

4:41 am on Apr 24, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:5792
votes: 64




Thanks Don
7:10 am on Apr 24, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12689
votes: 243


:: shuffling papers ::

195.74.55.164, yup. That silly name must have made a huge impression on my memory-- I use the term loosely-- because I only find one visit in the past year. (Logs on HD where Spotlight can paw through them.) robots.txt, front page, yawn. Did they do anything nasty to you? Er, to your site.
8:56 am on Apr 24, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:5792
votes: 64



IP a UK server farm

and a colo - 'nuff said
12:19 pm on Apr 24, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5408
votes: 2


lucy,
RewriteCond %{REMOTE_ADDR} ^19[013-6]\. [OR]
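
For anyone reading that pattern cold, here is a rough sketch (in Python, not Apache) of which first octets `^19[013-6]\.` catches. It assumes Python's `re` treats this simple character class the same way mod_rewrite does; the helper name `first_octet_blocked` and every sample address other than 195.74.55.164 are made up for illustration.

```python
import re

# Same pattern as the RewriteCond above. It is tested against the dotted-quad
# client address, so in practice it only constrains the first octet.
PATTERN = re.compile(r"^19[013-6]\.")

def first_octet_blocked(ip: str) -> bool:
    """Return True if the address would satisfy the RewriteCond pattern."""
    return PATTERN.search(ip) is not None

# Every first octet that the class [013-6] permits after "19".
matched_octets = [n for n in range(256) if first_octet_blocked(f"{n}.0.0.1")]

print(matched_octets)                        # [190, 191, 193, 194, 195, 196]
print(first_octet_blocked("195.74.55.164"))  # True (the MojeekBot IP)
print(first_octet_blocked("192.0.2.1"))      # False (192 is skipped)
```

So the condition covers six /8 blocks, and the MojeekBot address falls inside one of them.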
5:11 pm on Apr 30, 2012 (gmt 0)

New User

joined:Apr 30, 2012
posts:6
votes: 0


Hi, I'm the developer of Mojeek. Can I ask what's wrong with colocation? Also, did our bot disobey your robots.txt or do anything to suggest it is anything other than a genuine SE bot?

Also, as we already provide a fairly comprehensive bot page, is there more we could add to it that would have persuaded you to give us a chance?

Thanks.

Marc
5:41 pm on Apr 30, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5408
votes: 2


can I ask what's wrong with colocation


Neither co-location nor server farms (shared hosting, VPNs and other similar website hosting) offer valid visitors; rather, they provide a web host and/or its server harvesting pages.

is there more we could add to it that would have persuaded you to give us a chance


Not from me.
I don't allow all of Europe into my sites. Except by special request/custom from widget contacts/references.

I'm sure somebody else will come along and provide an explanation more beneficial to you.
6:01 pm on Apr 30, 2012 (gmt 0)

New User

joined:Apr 30, 2012
posts:6
votes: 0


OK, thanks for the reply. It's not shared hosting etc., though; we have our own racks, and apart from having our own datacentre I'm not sure what other options there are.
6:19 pm on Apr 30, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5408
votes: 2


There are some very, very long threads here on Amazon, which you may find insightful.

IMO, a server is a server. It doesn't matter to me if it's a colo, shared, VPN or even a commercial internet provider who offers subnet ranges for hosting customers.
NONE of them offer valid visitors to my websites.
7:17 pm on Apr 30, 2012 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3089
votes: 2


Mojeek - In general I'll go along with wilderness on that. Only known genuine bots are allowed access to our sites from any kind of server.

It makes no difference whether the server range is rated good or bad. I even block all of the server IPs at the server farm where my own servers are leased from. :)

The simplest criterion is: does this bot benefit my site? Very few do, and I have to say I hadn't been aware of mojeek until this thread, although I may have blocked the IP range based on an unknown (eg your) bot.

In the spirit of looking for alternative UK SEs (I reside in UK) I have unblocked the IP 195.74.55.164 and added MojeekBot to my Allowed list. Let's see how it goes. :)

And incidentally, you may want to check the WebmasterWorld forum UK & Ireland Search Engines at [webmasterworld.com...] - a thread on a new UK SE has just begun there.
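
The policy described in this post (block server ranges wholesale, then let known bots back in by IP plus name) could be sketched roughly as below. This is only an illustration, not anyone's actual setup: `BLOCKED_SERVER_RANGES`, `ALLOWED_BOTS` and `is_allowed` are hypothetical names, and the /24 is a guess, since the thread only ever mentions the single address 195.74.55.164.

```python
from ipaddress import ip_address, ip_network

# ASSUMPTION: the thread never states the full range, so this /24 is a guess
# used only to make the example concrete.
BLOCKED_SERVER_RANGES = [ip_network("195.74.55.0/24")]

# Allowed list: User-Agent token -> the specific IPs unblocked for that bot.
ALLOWED_BOTS = {"MojeekBot": {ip_address("195.74.55.164")}}

def is_allowed(ip: str, user_agent: str) -> bool:
    """Decide whether a request gets through under the sketched policy."""
    addr = ip_address(ip)
    # A known bot gets in only from an IP explicitly unblocked for it.
    for token, ips in ALLOWED_BOTS.items():
        if token in user_agent and addr in ips:
            return True
    # Anything else coming from a blocked server range is refused.
    return not any(addr in net for net in BLOCKED_SERVER_RANGES)

ua = "Mozilla/5.0 (compatible; MojeekBot/0.2; http://www.mojeek.com/bot.html)"
print(is_allowed("195.74.55.164", ua))                 # True
print(is_allowed("195.74.55.164", "SomeScraper/1.0"))  # False
print(is_allowed("195.74.55.10", ua))                  # False
```

Pairing the name with the IP means a scraper that merely borrows the MojeekBot user agent from another address in the same range is still refused.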
8:14 pm on Apr 30, 2012 (gmt 0)

New User

joined:Apr 30, 2012
posts: 6
votes: 0


I fully understand the problem as I have my fair share of rogue bots and scrapers, including so-called "reputable search engines" trying to avoid using proper api access or creating their own technology. I don't allow automated queries or the results to be crawled, so with nearly limitless pages it can be a big problem.

I just find it a shame when a genuine new or smaller engine can so easily be publicly associated with rogue bots and are usually the first to be banned, as they're also the easiest to be identified, without at least being given a chance or checked out. There's a thread on here talking about Mojeek in 2006 - [webmasterworld.com...] so we're not new and obviously not some page harvester.

With regard to genuine visitors, I suppose we'll never be able to send any if we're not allowed to index your site, or provide some results to our users that we would otherwise have liked to.

dstiles - Thanks, although we do provide thorough info on our bot including how to test it's ours. I commented on the UK se thread earlier, always interested in any engine coming out of the UK, a rare thing!
5:58 pm on May 1, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 23, 2002
posts:659
votes: 0


SecRule HTTP_User-Agent "MojeekBot" "deny,log,status:403" 
7:33 pm on May 1, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12689
votes: 243


I just find it a shame when a genuine new or smaller engine can so easily be publicly associated with rogue bots and are usually the first to be banned, as they're also the easiest to be identified, without at least being given a chance or checked out.

That's a whole nother thread. If all the Big Sites routinely block all robots except the Privileged Few-- some of whom, ahem, behave almost as badly as your average Ukrainian-- then all that's left for an up-and-coming search engine is the Not So Big Sites. So your search results become something like "The best of the rest". Which in some cases could be quite interesting :)
2:16 pm on May 2, 2012 (gmt 0)

New User

joined:Apr 30, 2012
posts: 6
votes: 0


Interesting thought, but it would probably leave even fewer alternatives than there are now, unless they simply backfilled with a major. Anyway, definitely going off topic now so I'll shut up, sorry about that.
 
