Forum Moderators: mack


Someone at MS just got banned!

Was Bill Gates Surfing My site?

         

carfac

5:21 pm on Apr 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi:

Just saw this guy, fell into a spider trap:

131.107.137.47 - - [11/Apr/2003:01:31:08 -0600] "GET /a/deep/link.html HTTP/1.1" 200 12589 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"

No referer, came in on a deep link (like from a SE), and d/l pages but no images. After about 5 hits, he tried to grab a trap, and got banned. Grabbed a page every 5 secs or so...
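For anyone who wants to check their own logs for the same signs (no referer, HTML only, a steady few seconds between hits), here is a rough Python sketch. It assumes the Apache "combined" log format shown above; the names `parse_line` and `looks_like_scraper`, the image extension list, and the thresholds are all made up for illustration and are not anything the poster described using.

```python
import re
from datetime import datetime

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed list of image extensions; adjust for your own site.
IMAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")

SAMPLE = (
    '131.107.137.47 - - [11/Apr/2003:01:31:08 -0600] '
    '"GET /a/deep/link.html HTTP/1.1" 200 12589 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    hit = m.groupdict()
    hit["time"] = datetime.strptime(hit["time"], "%d/%b/%Y:%H:%M:%S %z")
    hit["no_referer"] = hit["referer"] in ("", "-")
    hit["is_image"] = hit["path"].lower().endswith(IMAGE_EXTENSIONS)
    return hit


def looks_like_scraper(hits, min_hits=5, max_gap_seconds=10):
    """Heuristic for the hits of ONE address: no referers, no images, steady pace."""
    if len(hits) < min_hits:
        return False
    if any(h["is_image"] or not h["no_referer"] for h in hits):
        return False
    times = sorted(h["time"] for h in hits)
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return all(gap <= max_gap_seconds for gap in gaps)
```

Feed `looks_like_scraper` the parsed hits for a single IP. It is only a hint: a human with images turned off would trip it too, which is why a trap page remains the stronger signal.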

IP resolves to Redmond.... did Bill just get himself banned?

dave

wilderness

5:32 pm on Apr 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Dave,
this is another number in the lower block of that 131 range. The other day I expanded upwards to prevent future expansion.
deny from 131.107.
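For reference, a partial address like that matches every host whose address starts with those two octets, i.e. 131.107.0.0/16. A small Python sketch to check whether a given address falls inside a banned range before you commit to it; the function name `is_denied` is just an illustration, and `ipaddress` is the standard library module.

```python
import ipaddress

# "deny from 131.107." written out as a network
BLOCKED = ipaddress.ip_network("131.107.0.0/16")


def is_denied(addr, networks=(BLOCKED,)):
    """True if addr falls inside any of the given networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in networks)


print(is_denied("131.107.137.47"))  # the address from the log line above
print(BLOCKED.num_addresses)        # how many addresses the ban covers
```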

Don

jmccormac

11:56 pm on Apr 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've seen an IP try to continually snag a few pages every hour on my main site. The activity was bursty: hits appeared in clusters around the last quarter of each hour, from what I remember. For a while, I wasn't sure if it was a bot or a Microsoftie. Since it did not present any user agent, I just banned it.

Regards...jmcc

carfac

1:11 am on Apr 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



jmccormac:

I would say that pretty much clinches it as a bot.... on my site, d/l a bunch of HTML and NO Images is WEIRD (My site is PICTURES of WIDGETS!)

Wilderness:

You blocked the whole IP- 131.107.xxx.xxx?

What do you think they are doing?

Not that I am at all surprised that a 'bot out of Redmond is not polite (that is, does not respect robots.txt), but you would think MS of ALL people would put every possible doo-hicky and thingamabob into something they write. So, do you think the "Redmond Robot" is about 20 megs in source code size? :)

dave

wilderness

1:30 am on Apr 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<snip>Wilderness! You blocked the whole IP- 131.107.xxx.xxx?</snip>

:) :)
Dave
I might better appreciate your despair if I went all the way up to 131. ;)

It doesn't much matter to me what they (whether it's MS or somebody falsely representing themselves as such) are doing.
For me the determining factor is threefold:
1) First they began visits with both referer and UA blank.
2) When the denies began as a result of their actions in item 1 above, they switched to sending a UA to get around that, while still not saying whether they are an MS bot or providing a link back to a bot page that would give us the answer we want.
3) Now they change IPs.
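The three signs above can be tallied from the logs per address prefix. This is only a sketch of one way to do it in Python: the function names, the tuple layout `(ip, referer, agent)`, and the "widen the ban" rule are assumptions made for illustration, not the method the poster actually uses.

```python
from collections import defaultdict


def summarize_by_prefix(hits):
    """Group (ip, referer, agent) hits by their first two octets."""
    summary = defaultdict(
        lambda: {"ips": set(), "agents": set(), "blank_hits": 0}
    )
    for ip, referer, agent in hits:
        prefix = ".".join(ip.split(".")[:2]) + "."
        entry = summary[prefix]
        entry["ips"].add(ip)
        if agent not in ("", "-"):
            entry["agents"].add(agent)
        if referer in ("", "-") and agent in ("", "-"):
            entry["blank_hits"] += 1  # sign 1: referer and UA both blank
    return summary


def should_widen_ban(entry):
    """Hypothetical rule: blank-header visits seen, and more than one address."""
    return entry["blank_hits"] > 0 and len(entry["ips"]) > 1
```

A prefix that shows blank-header hits, a later UA, and several addresses is the pattern described in items 1 to 3.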

I've learned time and again that every time I deny short that it comes back to haunt me. In this instance the 131.107. may even be short :(

Don

carfac

2:20 am on Apr 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>> appreciate your despair if I went all the way up to 131

LOL!

Do you run a spider trap? For me, that has worked really well. I do not want to get too much into it here, but I run a few different versions. The overlap works great, and it catches them in process, not after the fact. I guess that is why I do not go for blocking whole C, B or -GASP- A blocks. I do, on occasion, when traffic is all over an IP range, and I know I do not have to care at all about the range (read Malaysia, Cybervalence, IA, etc)...
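To put numbers on those block sizes, here is a quick Python sketch. It assumes "C, B and A blocks" are meant in the loose sense of /24, /16 and /8 prefixes, and the example networks are simply the ones around the address from the log line; `block_sizes` is a made-up name.

```python
import ipaddress

# Loose "class" labels mapped to example prefixes (an assumption, see above)
BLOCKS = {
    "C": "131.107.137.0/24",
    "B": "131.107.0.0/16",
    "A": "131.0.0.0/8",
}


def block_sizes(blocks=BLOCKS):
    """Return how many addresses each labelled block covers."""
    return {
        label: ipaddress.ip_network(cidr).num_addresses
        for label, cidr in blocks.items()
    }


for label, count in block_sizes().items():
    print(f"{label} block: {count} addresses")
```

Each step up bans 256 times as many addresses, which is the trade-off between catching a hopping bot and locking out real visitors.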

dave

wilderness

3:04 am on Apr 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've considered a trap. Jim has attempted to sway me more than a few times and it would likely save me a bunch of time. I may eventually.
I have sort of a crummy trap which shows in the logs which caught somebody the second day it was in and it's only on two of my pages :)
I stumbled across the thing while browsing for something else.

I'm sure there are some bots I haven't seen and perhaps never will due to both the content of my sites and the narrow market. Most of the other malicious ones are already denied.

Thanks for the hint :-)
Don

carfac

5:32 am on Apr 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Don:

Well, between Jim and me, I think we have perfected the trap! He comes up with a new idea.... then I add something else.... we have it so it is pretty darn foolproof now. And it gets a lot of what gets past the IP and UA blocks. But then there are some that get by that, too.... and those I catch with a bandwidth or CPU throttle. If they get by all that (and I just discovered one that did!) they deserve to get whatever they can! (Kidding)

Jim is a bit more cautious than I am in regards to the trap... I am a bit more, uh, proactive. I am ALWAYS banning Ask Jeeves (which is a very poorly behaved spider), and I know Jim makes allowances for that one.

Anyway, I just see it as another line of defense, and I would recommend you do it!

dave

martinibuster

6:08 am on Apr 12, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



another line of defense

Against what?

Are you defending your security or three-cents-worth of bandwidth?

wilderness

12:47 pm on Apr 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<snip>your security or three-cents-worth of bandwidth?</snip>

Martin,
You have a more varied participation than Dave and myself in these forums.
Not sure how you either miss or do not understand the concept or method.

Each webmaster makes a determination as part of the goals for their website on visitors and use of their content. In the end it's the overall scheme of things rather than a solitary portion, whether it's pennies or buttons ;)

"My bandwidth", rather than defining pennies, might better be interpreted as boundaries.
Don
