Anyone Know Which Spider this is?

Got hit hard by them this morning

         

cmjohnston

5:50 pm on May 23, 2003 (gmt 0)

10+ Year Member



Any help?

5/23/2003,11:16:39 AM, ,208.63.87.197,Mozilla/3.0 (compatible),

fiestagirl

6:22 pm on May 23, 2003 (gmt 0)

10+ Year Member



Mozilla/3.0 (compatible) is a bogus user agent, and I block it because of that. The IP resolves to a web developer that works in real estate. Possibly the competition checking up on you with a bot.

Dreamquick

6:30 pm on May 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That style of UA is also often used by non-transparent proxy servers, so by blocking it you are effectively blocking all the people who use that proxy.

- Tony

fiestagirl

6:50 pm on May 23, 2003 (gmt 0)

10+ Year Member



Tony, do you mean the UA: Mozilla/3.0 (compatible;)? This is a legitimate UA as far as I can see in my logs. I've never seen Mozilla/3.0 (compatible) do anything but make multiple requests in rapid succession, quite often asking for pages that don't exist.

Dreamquick

9:21 pm on May 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To be honest I think it varies from implementation to implementation whether the ; gets used or not - either way I'm too unsure to ban that UA.

With bots it's easy - guilty (banned) until proven useful. However, you don't really have that luxury with browsers.

- Tony

fiestagirl

12:54 am on May 24, 2003 (gmt 0)

10+ Year Member



Well, as far as I'm concerned that UA is not welcome. Too often they request more than 10 pages a second. Never seen a browser do that.
Also I ALWAYS see the Mozilla/3.0 (compatible;) paired with another visit from the browser in the same second. The IPs from this UA can also be traced back to large corporations that are not trying to hide what they are doing, aside from the fact that they are logging every webpage that you view at work...
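
For anyone who wants to check the "more than 10 pages a second" pattern against their own logs, here is a minimal Python sketch. It is only an illustration: the function name burst_ips, the (ip, timestamp) input shape and the sample addresses are all assumptions made for the example, not anything taken from the logs in this thread.

```python
from collections import Counter

def burst_ips(requests, limit=10):
    """Return the set of IPs with more than `limit` requests in one second.

    `requests` is an iterable of (ip, timestamp) pairs, where the timestamp
    has one-second resolution (for example the time string from the log).
    """
    # Count hits per (ip, second) pair.
    per_second = Counter((ip, ts) for ip, ts in requests)
    return {ip for (ip, ts), n in per_second.items() if n > limit}

# Hypothetical sample: one client with 11 hits in a second, one with 2.
sample = (
    [("192.0.2.1", "24/May/2003:00:54:01")] * 11
    + [("192.0.2.2", "24/May/2003:00:54:01")] * 2
)
print(burst_ips(sample))
```

Counting per whole second is crude (a burst that straddles two seconds can slip under the limit), but it matches the resolution most access logs record.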

WarmGlow

3:44 am on May 24, 2003 (gmt 0)

10+ Year Member



I fully agree with fiestagirl. I permanently denied access to "Mozilla/3.0 (compatible)" about seven months ago after it continuously triggered my malicious robots script. I have never regretted this decision and feel perfectly comfortable that I am not denying access to legitimate proxy servers.
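
Since the two strings discussed above differ only by a semicolon, a block on this UA probably wants an exact match rather than a prefix or substring test. A minimal Python sketch of that idea follows; the names BLOCKED_USER_AGENTS and is_blocked are hypothetical, and this is not the malicious robots script mentioned in this post, which is not shown in the thread.

```python
# Exact user-agent strings to deny. Only the variant without the
# semicolon is listed, so "Mozilla/3.0 (compatible;)" is left alone.
BLOCKED_USER_AGENTS = {"Mozilla/3.0 (compatible)"}

def is_blocked(user_agent):
    """Return True only when the whole UA string matches a blocked entry."""
    return user_agent.strip() in BLOCKED_USER_AGENTS

print(is_blocked("Mozilla/3.0 (compatible)"))
print(is_blocked("Mozilla/3.0 (compatible;)"))
```

A substring test on "Mozilla/3.0 (compatible" would catch both variants, which is exactly the proxy concern raised earlier in the thread.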

balam

8:46 pm on May 28, 2003 (gmt 0)

10+ Year Member



FWIW...

Until this thread, I'd never seen this UA come from anywhere except the RIPE region, usually from Israel & the UK (and then, when they could be ID'ed, from dialup/DSL accounts).

Ban as you will, but I'm not so sure that all visitors with this UA are malicious. Almost all visits grabbed only the homepage and then moved on... The exceptions are two visits that each grabbed only a specific page (different IPs, times, pages), and one visit - the truly unusual one - that came and grabbed three specific pages.

This last visit, like I said, is rather interesting. While the IP goes to a dialup account in a small French town near the German/Luxembourgish border, the pages were spidered for inclusion in "Africa's premier search engine" - WoYaa. (www.woyaaonline.com)

It was a semi-polite visit; no robots.txt check, but they did take twenty minutes to grab the pages... How they found me, who knows, but there was definitely a human behind it. My pages are now part of the WoYaa directory & nicely categorized. :)

balam

wilderness

9:22 pm on May 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've found this thread quite interesting. As overbearing as I am, added to that is the occasional single-page grabbing I get from this UA.
I found the following in my logs today:
64.133.97.147 - - [28/May/2003:10:34:18 -0700] "GET /mypage.htm HTTP/1.0" 200 11438 "http ://search.yahoo.com/bin/search?p=dimmig+horses&ei=UTF-8" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; GEP IE 5.5 SP2 Standard; GEP IE 5.5 SP1 Standard)"
64.133.97.147 - - [28/May/2003:10:34:18 -0700] "GET /myimage.jpg HTTP/1.0" 200 1162 "-" "Mozilla/3.01 (compatible;)"
64.133.97.147 - - [28/May/2003:10:34:19 -0700] "GET /myimage2.gif HTTP/1.0" 200 2457 "-" "Mozilla/3.01 (compatible;)"

To me, this somehow presents the use of a VALID UA?
The UA was only used for the images?
Is it a trait of NT or one of the other parts of the UA?

Don
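
The pattern in the log entries above (one IP sending two different UAs within the same second) can be picked out of a combined-format access log with a short script. The Python below is a rough sketch under assumptions: the regular expression covers only the common combined log layout, the names LOG_RE, agents_by_ip and mixed_agent_ips are made up for the example, and the sample lines are simplified versions of the entries quoted above (referer replaced by "-", UA shortened).

```python
import re
from collections import defaultdict

# Combined log format: ip, identd, user, [time], "request", status, size,
# "referer", "user agent".
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def agents_by_ip(lines):
    """Map each client IP to the distinct user agents it sent, in order seen."""
    seen = defaultdict(list)
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        if m["agent"] not in seen[m["ip"]]:
            seen[m["ip"]].append(m["agent"])
    return dict(seen)

def mixed_agent_ips(lines):
    """Keep only the IPs that sent more than one distinct user agent."""
    return {ip: uas for ip, uas in agents_by_ip(lines).items() if len(uas) > 1}

sample = [
    '64.133.97.147 - - [28/May/2003:10:34:18 -0700] "GET /mypage.htm HTTP/1.0" '
    '200 11438 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"',
    '64.133.97.147 - - [28/May/2003:10:34:18 -0700] "GET /myimage.jpg HTTP/1.0" '
    '200 1162 "-" "Mozilla/3.01 (compatible;)"',
]
for ip, uas in mixed_agent_ips(sample).items():
    print(ip, uas)
```

An IP that shows up here is not proof of anything by itself; it only flags the page-plus-images split described in this post so the entries can be looked at by hand.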

fiestagirl

3:08 am on May 30, 2003 (gmt 0)

10+ Year Member



This activity follows the same pattern as BorderManager. It fetches the images so that "you can control, accelerate and monitor your users' Internet activities." (read: the boss is spying on you) This is supposedly a safeguard against undesirable Internet content.
I've gotten visits from plenty of people surfing at work from large corporations that probably don't even know that big bro' is protecting them from all the undesirable stuff out there.

jim_w

3:34 am on May 30, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



wilderness

I had just the opposite today from the Navy. They came in from a search engine with the compatible UA and then it switched to IE 6. So, beats me. But on my index.html, there is only one graphic, but there is some java.

wilderness

3:37 am on May 30, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



fiesta
as a matter of coincidence. . .
Today I was curious about BorderManager contained in a UA and googled it.

I've yet to determine what category this product fits in my little realm ;)
It's not plagiarism nor is it one of those copyright chaser tools.

My first instinct on all that content/parental control software was to deny it, but I changed my perspective rather quickly.

Perhaps I'm turning over a new leaf (choke! Choke! gag! Gag!) as I let back in some IP ranges today that I've had banned for over two years. Planning on more also.

Don

wilderness

3:41 am on May 30, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



then it switched to IE 6

Jim
was NT in the UA?

Don

jim_w

3:52 am on May 30, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



wilderness

Yes.
1st.
Mozilla/3.01 (compatible;)

2nd
Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)

Note: The above came from my AXS log, but it is the same as access_log, just in a slightly different format. And I was wrong, it was not IE 6.

SeanL

7:03 am on Jun 14, 2003 (gmt 0)

10+ Year Member



I get this bot several times a week. It generally comes from vastly different IPs, but it almost always requests the same two files:
/.../elepahnt-jokes.html
/extras/elephant-/
The real file name is a combination of the good parts of those two.
Sometimes it asks for the correct page name. And sometimes it will ask for other files, but never with any graphics or scripts. And once, it listed www.iaea.org as the referer. Usually, no referer.
So it's banned.

WarmGlow

10:05 am on Jun 14, 2003 (gmt 0)

10+ Year Member



And once, it listed www.iaea.org as the referer.

As you may know, this is the "calling card" left by Atomic Harvester 2000 which is primarily used for harvesting E-Mail addresses.
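
If that referer really is a reliable marker, it can be tested for directly. Below is a minimal Python sketch, assuming the referer field has already been pulled out of the log line; SUSPECT_REFERER_HOSTS and has_suspect_referer are hypothetical names, and the host list contains only the one value mentioned in this thread.

```python
from urllib.parse import urlsplit

# Referer hosts treated as a harvester "calling card" (per this thread).
SUSPECT_REFERER_HOSTS = {"www.iaea.org"}

def has_suspect_referer(referer):
    """Return True when the referer's host is on the suspect list."""
    if not referer or referer == "-":
        return False
    # urlsplit only fills in the host when a scheme or "//" is present,
    # so bare values like "www.iaea.org" get a "//" prefix first.
    parts = urlsplit(referer if "//" in referer else "//" + referer)
    return (parts.hostname or "") in SUSPECT_REFERER_HOSTS

print(has_suspect_referer("www.iaea.org"))
print(has_suspect_referer("-"))
```

Matching on the host rather than the raw string keeps a real visitor arriving from a search for that site name from being flagged by accident.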