Hurricane Electric

570 MB in two hour period

         

OptiRex

11:28 am on Apr 1, 2005 (gmt 0)



Hi

This morning 64.62.175.137 pulled in excess of 570 MB of bandwidth from my site. The IP belongs to Hurricane Electric in California; whether it is them directly, I have no idea.

I have written to them asking for an explanation. Has anyone else seen such activity?

Thanks.

ttkr1

1:14 am on Jun 22, 2005 (gmt 0)

10+ Year Member



A lot of hypothesizing but here is some research:

<snip>

[edited by: volatilegx at 2:13 pm (utc) on June 22, 2005]
[edit reason] no blog links, please [/edit]

soquinn

4:31 am on Jun 23, 2005 (gmt 0)

10+ Year Member



OmniExplorer_Bot/1.10 hit again, so I guess this didn't work...


AddType application/x-httpd-php .htm .html
Options -Indexes
RewriteEngine on
RewriteCond %{HTTP_HOST} ^site\.com [OR]
RewriteCond %{HTTP_HOST} ^anothersite\.com [OR]
RewriteCond %{HTTP_HOST} ^www\.anothersite\.com
RewriteRule ^(.*) [site.com...] [L,R=301]
RewriteCond %{HTTP_USER_AGENT} ^Missigua [OR]
RewriteCond %{HTTP_USER_AGENT} ^OmniExplorer
RewriteRule ^.* - [F]

Do I need to add "_Bot"?

wilderness

1:10 pm on Jun 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Do I need to add "_Bot"?

No.
You may even omit Explorer.

So the line would read:

RewriteCond %{HTTP_USER_AGENT} ^Omni

That is: if the user-agent begins with Omni.

mdreher

10:50 pm on Jun 23, 2005 (gmt 0)

10+ Year Member



The Omni-Explorer site is now (sort-of) online:

Relevant to our discussion:


Omni-Explorer.

Omni-Explorer is a stealth-mode venture-backed startup based in Silicon Valley. Stay tuned to this site; we plan on launching shortly.

The Omni-Crawler

If you have found this page because of the Omni-crawler, please bear with us. We hope to be able to point many more users to the valuable content on your site shortly.

If you are finding our crawler overly burdensome on your site or prefer not to be included, you can exclude our crawler. Simply insert the following into your robots.txt file (if you don't know what one is, see the Robotstxt.org site).

The Omni-Explorer agent is: OmniExplorer_Bot/1.09. To prevent it from crawling your site, please put the following in your robots.txt file:

User-Agent: OmniExplorer_Bot/1.09
Disallow: *

We will also obey the delay directive (in seconds) for how long to wait between page views on your site:

Crawl-delay: 2

We know some of you got hit by an earlier version of our crawler that was particularly...well, hungry. For that, we sincerely apologize.

Please feel free to send us feedback. If you feel that the crawlers are not matching the behavior stated on this page, please include the HTTP log file lines and your robots.txt file (or site) so we can verify the issue. Thank you.

wilderness

11:16 pm on Jun 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Welcome to WebmasterWorld.

The Omni-Explorer site is now (sort-of) online

So ;)
You expect webmasters to trust in the possibility of a change in policy and methods given the history of the bot and/or
"stealth-mode venture-backed startup"

mdreher

12:12 am on Jun 24, 2005 (gmt 0)

10+ Year Member



Wilderness,


So ;)
You expect webmasters to trust in the possibility of a change in policy and methods given the history of the bot and/or
"stealth-mode venture-backed startup"

Actually, no. Not really. They tried to hit one of my sites again after I excluded them. They didn't get very far. :)

I just thought the "stealth-mode venture-backed startup" an interesting choice of words myself, and figured others might agree.

I also saw that someone earlier in the thread posted that they were hit by a version 1.10 bot, so I'm wondering if it's the same "stealth-mode" group, or someone else who is mimicking their user-agent.

jd01

12:28 am on Jun 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



RewriteCond %{HTTP_USER_AGENT} ^Omni [NC]

Just a thought, I always use No Case when I am disallowing...

Justin

soquinn

6:36 am on Jun 24, 2005 (gmt 0)

10+ Year Member



The Omni-Explorer site is now (sort-of) online:

What does that mean? Where did you pull the info mdreher?

As of this week I see crazy crawling across 4-5 different sites I monitor, which we’ve tried to block. With no discernible benefit from this bot, and no proof from any research that it obeys a robots.txt file, we are left with little choice but to waste time figuring out who it is and whether they are really legit. So then we try to block it... am I wrong here?

wilderness

11:33 am on Jun 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



am I wrong here?

soquinn,
mdreher was only musing at the use of the term
"stealth-mode venture-backed startup" rather than endorsing the bot.

Don

mdreher

1:39 pm on Jun 24, 2005 (gmt 0)

10+ Year Member



They now have a page up at [omni-explorer.com...]

(URL necessary here to show original source.)

And no, I'm not with them - I don't endorse them. Just another person who got hit by them as well. Mine is a low-traffic site, and let's just say the sudden bandwidth spike made me take notice.

(Edit to clarify and reinforce Wilderness' understanding of my position.)

soquinn

3:17 pm on Jun 24, 2005 (gmt 0)

10+ Year Member



Thanks for the clarification, mdreher. I didn’t think you were with them; I just didn’t realise you were quoting their site.

In terms of using the robots.txt file (as they recommend), there seem to be many versions, so I’m guessing you would have to list each one?

I’m going to try…

RewriteCond %{HTTP_USER_AGENT} ^Omni [NC]

jd01, what's the significance of the “no case”?

wilderness

4:22 pm on Jun 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"no case"

[google.com...]
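
For anyone following along: [NC] tells mod_rewrite to compare without regard to letter case. Below is a minimal sketch of the same idea in Python. Python's re module is not the regex engine Apache uses, but a simple anchored pattern like this one should behave the same way. The lower-case user-agent string and the helper name blocked are made up for illustration and do not come from this thread.

```python
import re

# The pattern from the RewriteCond, without and with case-insensitivity.
case_sensitive = re.compile(r"^Omni")
no_case = re.compile(r"^Omni", re.IGNORECASE)  # rough equivalent of [NC]

real_ua = "OmniExplorer_Bot/1.09 (+http://www.omni-explorer.com)"
lower_ua = "omniexplorer_bot/1.09"  # hypothetical variant, illustration only


def blocked(pattern, user_agent):
    """Return True if the pattern matches, i.e. the rule would fire."""
    return pattern.search(user_agent) is not None


print(blocked(case_sensitive, real_ua))   # True
print(blocked(case_sensitive, lower_ua))  # False
print(blocked(no_case, lower_ua))         # True
```

So the flag only matters if the bot ever changes the capitalisation of its name; it costs nothing to include.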

incrediBILL

4:15 pm on Jul 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've tried blocking this pest by IP range but it seems to have a few C blocks handy.

OmniExplorer_Bot/3.06d (+http://www.omni-explorer.com) WorldIndexer

Came back this morning and found this batch:

65.19.150.212
65.19.150.231
65.19.150.227
65.19.150.244
65.19.150.244
65.19.150.222

Their web site claims the IP address range is 65.19.150.193 - 65.19.150.254 which is either incorrect or someone else is using their agent name to slide in under the radar. They also claim to be a venture backed Silicon Valley startup but the domain is registered in Oregon.

Additional info provided:

The Omni-Explorer agent is: OmniExplorer_Bot/1.09. To prevent it from crawling your site, please put the following in your robots.txt file:

User-Agent: OmniExplorer_Bot
Disallow: /

We will also obey the delay directive (in seconds) for how long to wait between page views on your site:

Crawl-delay: 2

Considering what crawled me today claims to be version 3.06d, the web site is horribly out of date even though it claims to have been updated this month.
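
On the earlier question of whether every version has to be listed in robots.txt: a record that names only the product token (OmniExplorer_Bot, with no version) is normally read as covering all versions. Here is a small sketch using Python's standard urllib.robotparser to show how one parser reads such a record. It says nothing about whether the real bot obeys it. The example.com URL and the SomeOtherBot agent are placeholders, and combining Disallow and Crawl-delay in one record is my assumption about how the two quoted snippets fit together.

```python
from urllib.robotparser import RobotFileParser

# Record modelled on the snippets quoted from the bot's site.
robots_txt = """\
User-Agent: OmniExplorer_Bot
Disallow: /
Crawl-delay: 2
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

agents = [
    "OmniExplorer_Bot/1.09 (+http://www.omni-explorer.com)",
    "OmniExplorer_Bot/3.06d (+http://www.omni-explorer.com) WorldIndexer",
]
page = "http://www.example.com/index.html"  # placeholder URL

for ua in agents:
    # can_fetch() is False when the record disallows the URL for this agent.
    print(ua.split()[0], parser.can_fetch(ua, page), parser.crawl_delay(ua))

# An unrelated, hypothetical agent is not covered by the record.
print(parser.can_fetch("SomeOtherBot/1.0", page))
```

Both Omni versions come back as not allowed with a delay of 2, while the unrelated agent is allowed. With Disallow: / in place the delay is moot, of course; it only matters if you let the bot in.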

volatilegx

3:08 pm on Jul 29, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In the last couple of months, I've seen the following user agents:

# UA "OmniExplorer_Bot/1.07 (+http://www.omni-explorer.com) Internet Categorizer"
# UA "OmniExplorer_Bot/1.09 (+http://www.omni-explorer.com) Internet Categorizer"
# UA "OmniExplorer_Bot/1.09 (+http://www.omni-explorer.com)"
# UA "OmniExplorer_Bot/1.10 (+http://www.omni-explorer.com) Jobs Crawler"
# UA "OmniExplorer_Bot/1.18 (+http://www.omni-explorer.com) Torrent Crawler"
# UA "OmniExplorer_Bot/2.3 (+http://www.omni-explorer.com) WorldIndexer"
# UA "OmniExplorer_Bot/2.57 (+http://www.omni-explorer.com) WorldIndexer"
# UA "OmniExplorer_Bot/2.67 (+http://www.omni-explorer.com) WorldIndexer"
# UA "OmniExplorer_Bot/2.69 (+http://www.omni-explorer.com) WorldIndexer"
# UA "OmniExplorer_Bot/2.70 (+http://www.omni-explorer.com) WorldIndexer"
# UA "OmniExplorer_Bot/2.71 (+http://www.omni-explorer.com) WorldIndexer"
# UA "OmniExplorer_Bot/2.73 (+http://www.omni-explorer.com) WorldIndexer"
# UA "OmniExplorer_Bot/2.78a (+http://www.omni-explorer.com) WorldIndexer"
# UA "OmniExplorer_Bot/2.82 (+http://www.omni-explorer.com) WorldIndexer"

coming from these IP addresses:

64.62.175.130
64.62.175.131
64.62.175.137
64.71.131.109
64.71.131.117
65.19.150.206
65.19.150.207
65.19.150.208
65.19.150.209
65.19.150.210
65.19.150.211
65.19.150.212
65.19.150.213
65.19.150.214
65.19.150.220
65.19.150.221
65.19.150.222
65.19.150.223
65.19.150.224
65.19.150.225
65.19.150.226
65.19.150.227
65.19.150.228
65.19.150.229
65.19.150.230
65.19.150.231
65.19.150.232
65.19.150.233
65.19.150.234
65.19.150.235
65.19.150.236
65.19.150.237
65.19.150.238
65.19.150.239
65.19.150.240
65.19.150.241
65.19.150.242
65.19.150.243
65.19.150.244
65.19.150.245
65.19.150.246
65.19.150.247
65.19.150.248
65.19.150.249
65.19.150.250
65.19.150.251
65.19.169.228
65.19.169.229
65.19.169.230
65.19.150.250
65.19.169.242
65.19.169.252
65.19.169.254

All of the recent spidering activity has come from the 65.19.150.* block.
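
To compare reported addresses against the range the bot's site claims (65.19.150.193 - 65.19.150.254), something like the following Python sketch would do. It uses only a sample of the addresses listed above, and the helper name in_claimed_range is my own.

```python
import ipaddress

# Range quoted from the bot's web site earlier in the thread.
claimed_first = ipaddress.ip_address("65.19.150.193")
claimed_last = ipaddress.ip_address("65.19.150.254")

# A sample of the addresses reported above (not the full list).
observed = [
    "64.62.175.137",
    "64.71.131.109",
    "65.19.150.212",
    "65.19.150.244",
    "65.19.169.228",
]


def in_claimed_range(addr):
    """True if addr lies inside the range the site claims to crawl from."""
    ip = ipaddress.ip_address(addr)
    return claimed_first <= ip <= claimed_last


outside = [a for a in observed if not in_claimed_range(a)]
print(outside)  # ['64.62.175.137', '64.71.131.109', '65.19.169.228']

# CIDR blocks that exactly cover the claimed range, e.g. for firewall rules.
for net in ipaddress.summarize_address_range(claimed_first, claimed_last):
    print(net)
```

The 64.62.175.*, 64.71.131.* and 65.19.169.* addresses fall outside the claimed range, which fits the observation that only the recent activity is confined to 65.19.150.*.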

soquinn

10:55 pm on Aug 2, 2005 (gmt 0)

10+ Year Member



I've been using this code for the last month trying to block OmniExplorer but it doesn't seem to be working:

RewriteCond %{HTTP_USER_AGENT} ^Missigua [OR]
RewriteCond %{HTTP_USER_AGENT} ^Omni
RewriteRule ^.* - [F]

Should they be able to get past that re-write? I see thousands of hits and still see many different names as volatilegx mentions like:

OmniExplorer_Bot/1.09
OmniExplorer_Bot/2.93 ..etc

So maybe "^Omni" on its own doesn't cover every combo?

wilderness

11:07 pm on Aug 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So maybe "^Omni" on its own doesn't cover every combo?

soquinn,
the line covers everything where the user-agent begins-with Omni

If the user-agent does not begin with Omni, then you'll need to find another term that works.

These methods (begins-with, ends-with and contains) were all explained in Msg#12 of this thread.
Perhaps you need to bookmark it or find a more thorough explanation?

soquinn

12:35 am on Aug 3, 2005 (gmt 0)

10+ Year Member



wilderness, no, I’ve read your explanation in Msg#12, and thank you: it was straightforward enough, and you are always very helpful with us rookies. But after reading other tutorials and testing for a month with no success, I wanted to double-check whether anyone else has successfully blocked it with the same code while being hit.

It begins with Omni and is not case sensitive so I’m puzzled. The ( ^ ) binds the match to the beginning of the User-Agent string but I’ve had trouble finding out if it still works with compound words and special characters, underscores, back slashes and the many versions like volatilegx listed? Teach a man to fish, right?
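
On whether ^ still works with underscores, slashes and version numbers: the anchor only cares about where the match starts, not about what follows. Here is a quick sketch in Python. Its re module is not Apache's regex engine, but it treats a plain anchored pattern like this the same way. The Mozilla-wrapped string is hypothetical and was not reported in this thread.

```python
import re

begins_with_omni = re.compile(r"^Omni")

# A few of the user-agent strings reported earlier in this thread.
seen_agents = [
    "OmniExplorer_Bot/1.09 (+http://www.omni-explorer.com) Internet Categorizer",
    "OmniExplorer_Bot/1.10 (+http://www.omni-explorer.com) Jobs Crawler",
    "OmniExplorer_Bot/2.78a (+http://www.omni-explorer.com) WorldIndexer",
    "OmniExplorer_Bot/3.06d (+http://www.omni-explorer.com) WorldIndexer",
]

# Hypothetical string where the bot name is not at the very start.
wrapped_agent = "Mozilla/4.0 (compatible; OmniExplorer_Bot/1.09)"

matches = [bool(begins_with_omni.search(ua)) for ua in seen_agents]
print(matches)                                       # [True, True, True, True]
print(bool(begins_with_omni.search(wrapped_agent)))  # False
print(bool(re.search(r"Omni", wrapped_agent)))       # True ("contains" match)
```

If the pattern matches every string you see in your logs and requests still appear to get through, the pattern itself is unlikely to be the problem.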

jdMorgan

4:19 am on Aug 3, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Perhaps the reason for confusion is the definition of "blocking a User-agent" and the expectations of some participants in this thread.

The corrected code posted above will block Omni as stated, but what does that mean? It means that Omni will get a 403-Forbidden response from your server and will not be able to access the requested page. However, you will still see Omni in your 'stats' as having visited your site. A closer look at your raw server access log files will show that Omni got 403-Forbidden responses to all requests.

The above assumes that you have privileges to use mod_rewrite, and that it is configured properly.
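
One way to check the point about 403 responses is to tally the status codes returned to Omni in the raw access log. Here is a rough sketch in Python, assuming the log is in Apache's "combined" format. The sample lines, the 192.0.2.10 address and the function name omni_status_counts are invented for illustration and are not taken from anyone's real logs.

```python
import re
from collections import Counter

# Hypothetical access-log lines in Apache "combined" format.
sample_log = '''\
65.19.150.212 - - [03/Aug/2005:04:01:10 +0000] "GET /index.html HTTP/1.1" 403 210 "-" "OmniExplorer_Bot/2.93 (+http://www.omni-explorer.com) WorldIndexer"
65.19.150.231 - - [03/Aug/2005:04:01:12 +0000] "GET /about.html HTTP/1.1" 403 210 "-" "OmniExplorer_Bot/2.93 (+http://www.omni-explorer.com) WorldIndexer"
192.0.2.10 - - [03/Aug/2005:04:02:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"
'''

# Captures the status code and the final quoted field (the user-agent).
line_re = re.compile(r'"[^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"$')


def omni_status_counts(log_text):
    """Count HTTP status codes for requests whose user-agent begins with Omni."""
    counts = Counter()
    for line in log_text.splitlines():
        m = line_re.search(line)
        if m and m.group(2).startswith("Omni"):
            counts[m.group(1)] += 1
    return counts


counts = omni_status_counts(sample_log)
print(counts)  # Counter({'403': 2})
```

If the tally for your own log shows only 403 for Omni, the rule is doing its job even though the hits still show up in your stats. Any 200s would mean the rule is not being applied.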

If you want to completely block all access to your server (so that they get no response at all, and you see no log entries) from that IP range, you'll need to do it at the server firewall.

If you want to completely block all access to your server (so that they get no response at all, and you see no log entries) by that user-agent, then you'll need a very expensive enterprise-class firewall.

Just wanted to adjust expectations, if needed...

Jim

wilderness

4:27 am on Aug 3, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Teach a man to fish, right?

Next requirement is teaching the man how to work the reel, :) then the depth finder, then showing him how to put the fish on the stringer and on and on.

EVERY UA example that Dan provided in Msg#44 BEGINS with Omni.

You could have anything from a simple syntax error to not having "RewriteEngine on".

On numerous occasions, I've made syntax errors in an IP range rewrite and then in a week or two I notice that something isn't working properly. The two days that follow of going line-by-line through my extensive htaccess are very humbling.

I have a question for you, and it's not my desire to be facetious.

If you're unable to get a simple begins-with rewrite functioning, how do you expect to implement more complicated rewrites?

My suggestion is to get what you currently have functioning before attempting to understand other options.

wilderness

4:49 am on Aug 3, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just wanted to adjust expectations, if needed...

Jim,
Is it possible that a webmaster attempting to implement rewrites doesn't understand access codes?

Don't answer ;)

[faqs.org...]
or
[members.tripod.com...]

wilderness

1:08 pm on Aug 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



On numerous occasions, I've made syntax errors in an IP range rewrite and then in a week or two I notice that something isn't working properly. The two days that follow of going line-by-line through my extensive htaccess are very humbling.

Some things. . .are better left unspoken ;)
Crap!
