Bingbot getting banned

Bing IP ranges receiving 403 errors

         

grandma genie

5:13 pm on Jun 27, 2012 (gmt 0)

10+ Year Member



Hello,

I just noticed that, starting yesterday (June 26), all requests from the Bing IP ranges 157.55 and 157.56 are getting 403 errors in my server logs, and I can find no reason for it. They are not blocked in htaccess. Does anyone have any ideas why this could be happening?

GG

g1smd

7:18 pm on Jun 27, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Is the requested URL matching some other pattern that you block?

Are they GET or POST or something else?

grandma genie

7:27 pm on Jun 27, 2012 (gmt 0)

10+ Year Member



Not that I can see. I contacted my host to see if they were blocking them, but they said no. Here is a sample:

157.55.16.57 - - "GET /example.html HTTP/1.1" 403 - "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

I am not blocking the IP in any form. I am not blocking bing anything. This is just too odd.

grandma genie

7:55 pm on Jun 27, 2012 (gmt 0)

10+ Year Member



Here is a bingbot from a different IP:

65.52.109.26 - - "GET /robots.txt HTTP/1.1" 200 1500 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

It seems to be the IP that is the issue. All other ranges are OK. But I do not have any of the 157.55 or 157.56 ranges blocked.

lucy24

9:03 pm on Jun 27, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



But that's a request for robots.txt. You may have an all-purpose override so even horrible robots from Kazakhstan can see what they're disobeying.

Conversely it would make sense if it was the plainclothes bingbot getting locked out. I know I'm not the only one who blocks bing ranges if the UA is anything other than bingbot or msnbot-media; or anyone requesting Bing Site Authorization if needed.

Look at your raw logs. Are all requests from these ranges getting 403'd, or are you just noticing some getting locked out?

:: off to investigate vaguely related issue of msnbot-media arriving from unexpected though legitimate location ::

grandma genie

9:18 pm on Jun 27, 2012 (gmt 0)

10+ Year Member



It happened all at once. Here is the last bingbot from that range that got in OK:

157.55.16.57 - - [26/Jun/2012:01:08:43 -0400] "GET /example.html HTTP/1.1" 200 23194 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

the next one was blocked:

157.55.16.221 - - [26/Jun/2012:01:49:26 -0400] "GET /example.html HTTP/1.1" 403 - "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

About 40 minutes went by between the two. I have two htaccess files that I use all the time. One is in the root with all the allow/denies. I also block by user agent and referer. But there is nothing in that UA that I blocked. No bings or bingbots. I have blocked one IP range that begins with 157, but the next octets are not even close to Bing and it is from Thailand. Nothing else in the 157 range. Every IP that is 157.55 or 157.56 is getting the 403.

g1smd

9:21 pm on Jun 27, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"I have blocked one IP range that begins with 157"

I had wondered if there was an unanchored pattern, but now suspect a typo in one of your patterns.

wilderness

9:59 pm on Jun 27, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"I had wondered if there was an unanchored pattern, but now suspect a typo in one of your patterns. "

I agree with g1smd.

Syntax errors can produce very unpredictable results, and the error may have no relationship to (or even be anywhere near) the line that stops working.

lucy24

4:03 am on Jun 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



But wait. Did you change anything in either of those htaccess files during that 40-minute interval? If, like me, you're constantly tweaking your htaccess, take a quick look at the timestamp. That will tell you the last time you uploaded the file, and then you'll know which one to look at.

uneasily thinking that you're probably supposed to keep backups showing a history of even the teeniest little change, though in my case this would require buying an additional hard drive

Do your error logs say anything useful, or just the generic (and slightly misleading) "denied by server configuration"?

grandma genie

4:14 pm on Jun 28, 2012 (gmt 0)

10+ Year Member



Sorry for the delay in replying. That IP range doesn't come in all the time. Maybe once an hour, so it takes a while to see if what I do has any impact. I was going to ask if anything in the file would affect just that range, but g1smd and Don answered that one. And I do tweak pretty much daily. And I'm sure I made a few changes. So, I will have to just do some experiments and see what happens. At least the other Bing ranges are OK. So far I do not see anything that would cause that to happen. Here is an example of the way I block in the allow,deny area of htaccess:

order allow,deny
deny from 1.176.158.0/17
deny from 2.82
deny from 27
allow from all

I don't know if mixing and matching like that could be a problem. Those are not real numbers, just examples. After that long list of IPs, I then block by UA and by Referer. Then there are some things like stopping hotlinking and stopping some types of exploits. But the only section I do the most tweaking is in the allow,deny area. So, would the mixing and matching syntax be the problem? I removed the only 157.nnn block I had and that made no difference.

grandma genie

4:24 pm on Jun 28, 2012 (gmt 0)

10+ Year Member



Is it possible for something in the block by UA or Referer to impact the allow,deny area? Or should I only be looking in the allow,denies to find the problem? The only htaccess file that has the IP blocks is in the root directory.

wilderness

4:51 pm on Jun 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



gg,
Although the mod_authz_host and mod_rewrite modules work separately, it is entirely possible for one syntax error to cause havoc across the entire htaccess.

Thus you must check and verify the syntax of every line.

Trailing or missing spaces, missing [OR] flags, and extra, missing, or mismatched opening and closing brackets are quite common, as are incomplete or incorrect IP ranges.

For mod_rewrite the most effective locating solution I've found is the following method:
1) save a copy of your original file.
2) create a replacement file that breaks everything down into smaller sections. Eventually you'll confine your error to one of these smaller sections.
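
For the allow/deny list specifically, part of that line-by-line check could be scripted. What follows is only a rough sketch, assuming Python 3 is available: `lint_deny_lines` and its `min_prefix` threshold of /8 are hypothetical names and values chosen for illustration, and the script does not reproduce Apache's own parser. It only looks at single-argument `deny from a.b.c.d/n` lines and flags entries that are suspiciously broad or that have host bits set.

```python
import ipaddress
import re

# Matches a single-argument "deny from <target>" line only.
# Lines listing several addresses, or hostnames, are not handled here.
DENY_RE = re.compile(r"^\s*deny\s+from\s+(\S+)\s*$", re.IGNORECASE)

def lint_deny_lines(text, min_prefix=8):
    """Return (line_number, target, reason) for suspicious CIDR deny entries."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = DENY_RE.match(line)
        if match is None:
            continue
        target = match.group(1)
        if "/" not in target:
            continue  # partial or full addresses such as "2.82" are skipped
        try:
            net = ipaddress.ip_network(target, strict=False)
        except ValueError as exc:
            findings.append((number, target, f"cannot parse: {exc}"))
            continue
        if net.prefixlen < min_prefix:
            findings.append((number, target, f"very broad: covers {net}"))
        elif str(net.network_address) != target.split("/")[0]:
            findings.append((number, target, f"host bits set: effective range is {net}"))
    return findings

sample = """order allow,deny
deny from 1.176.158.0/17
deny from 2.82
deny from 176.34.216.0/2
deny from 176.34.216.0/24
allow from all
"""

for finding in lint_deny_lines(sample):
    print(finding)
```

A check like this does not replace splitting the file into sections, but it can point at the kind of entry that is hard to spot by eye in a long list.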

grandma genie

5:51 pm on Jun 28, 2012 (gmt 0)

10+ Year Member



It is definitely part of the allow,deny area. I am making headway. Took out section of numbers and the 157 range is now being allowed back in. I just need to find the segment that is causing the problem. Will let you all know what it was when I find it. Thank you for all your help.

I do keep a copy of the latest file on my desktop.

grandma genie

8:50 pm on Jun 28, 2012 (gmt 0)

10+ Year Member



OK, I found it and will share it with you all. Have you ever been typing away and thought you might have accidentally deleted something but couldn't tell if you did or what it was? Well, here it is:

deny from 176.34.216.0/2

Notice anything missing?

What that has to do with bing I will never know.

For newbies who don't know what I am talking about, it should be:

deny from 176.34.216.0/24

Last time I looked bing was happily spidering away. Thank you all for helping me figure this out.

By the way, that is an Amazon aws block.

g1smd

8:54 pm on Jun 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you can keep old copies of the file from time to time, then DIFF is your friend.
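
As a small illustration of that idea, here is a sketch using Python's standard `difflib` module (any diff tool would do the same job). The function name `htaccess_diff` and the labels `htaccess.old` / `htaccess.new` are made up for this example, and the two file versions are shortened stand-ins rather than a real htaccess file.

```python
import difflib

def htaccess_diff(old_text, new_text):
    """Return unified-diff lines between two versions of an htaccess file."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="htaccess.old",  # hypothetical label for the backup copy
            tofile="htaccess.new",    # hypothetical label for the live copy
            lineterm="",
        )
    )

old_version = "order allow,deny\ndeny from 176.34.216.0/24\nallow from all\n"
new_version = "order allow,deny\ndeny from 176.34.216.0/2\nallow from all\n"

for line in htaccess_diff(old_version, new_version):
    print(line)
```

With a backup taken before the 403s started, the lines prefixed with `-` and `+` would show exactly which entry changed.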

lucy24

9:07 pm on Jun 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"Is it possible for something in the block by UA or Referer to impact the allow,deny area?"

UA and Referer are in mod_rewrite. (OK, they could be in mod_setenvif but you started out saying rewrite.) Allow,Deny is core. You can't be 100% sure what order the modules will run in, but afaik you can be 100% sure they will all run before the core.

deny from 1.176.158.0/17
deny from 2.82
deny from 27
allow from all

I don't know if mixing and matching like that could be a problem.

I'm not sure that even counts as mixing and matching. It's all CIDR ranges, nothing like "Deny from .ca" that requires a whole new type of activity.
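
To make the equivalence concrete: Apache's documentation describes a partial address as the first 1 to 3 bytes of an IP address, so "2.82" and "27" cover the same addresses as a /16 and a /8. The sketch below (Python 3, standard `ipaddress` module) spells that out; `partial_to_network` is a hypothetical helper written for this example, not anything Apache provides.

```python
import ipaddress

def partial_to_network(entry):
    """Expand a partial IPv4 address (1 to 3 octets) to the network it covers."""
    octets = entry.rstrip(".").split(".")
    if not 1 <= len(octets) <= 3:
        raise ValueError(f"expected 1 to 3 octets, got {entry!r}")
    # Pad the missing octets with zeros; each given octet adds 8 prefix bits.
    padded = octets + ["0"] * (4 - len(octets))
    return ipaddress.ip_network(f"{'.'.join(padded)}/{8 * len(octets)}")

for entry in ("2.82", "27"):
    print(entry, "->", partial_to_network(entry))
```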

btw... If your order is "Allow,Deny" I would put the "Allow" statement at the top. To the server it makes no difference, but to a human reader it's intuitively easier to follow.

"Took out section of numbers and the 157 range is now being allowed back in. I just need to find the segment that is causing the problem."

Oh, go ahead and post them. I assume they're all evil robots that everyone else would block too, unless you've got a lot of "Oh, that one's my mother-in-law, I just don't like her." (Sounds goofy but I swear there was once a thread hereabouts wanting to do something of the kind.)

wilderness

9:18 pm on Jun 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



deny from 176.34.216.0/2

Notice anything missing?

What that has to do with bing I will never know.


gg,
I would suggest that you continue reviewing your lines for errors.

Frequently, one syntax error will just trigger another, undetected error that has been lying dormant (never fully implementing what you originally intended it to do).

Large files and their lines are so eye-straining that frequently in haste, it is easy to locate one error and "assume" that was the culprit, when in fact, it may not be.

FWIW, I back up my htaccess files monthly, like clockwork.
You never know when your server or your machine will crash and you'll lose everything. These are hard learned lessons.

grandma genie

9:23 pm on Jun 28, 2012 (gmt 0)

10+ Year Member



I think I would freak everybody out if I showed you what I block. I pretty much block the whole planet. That list gets so doggone long when you try to do little segments, so I just block the whole range, you know all of 177 or all of 178, etc. I am just letting the USA and Canada in, and as you know, even then I get pestered by those little nasties. Then I have to do surgery on the USA ranges, looking for server farms. Haven't blocked any in-laws.....yet.

grandma genie

9:30 pm on Jun 28, 2012 (gmt 0)

10+ Year Member



Don, I did check to make sure that IP range (157.55 and 157.56) was getting in and they are getting the usual 200s and 301s, so I assume that fixed the problem. And I update the current htaccess file on my desktop regularly. You're right. Anything can happen. I'll keep an eye out to see if what I fixed broke something else.

Were you suggesting that the problem shouldn't have caused the bing bot to get blocked? Yeah, I thought that was weird.

I also back up all my server files about once a month.

wilderness

9:47 pm on Jun 28, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"Were you suggesting that the problem shouldn't have caused the bing bot to get blocked? Yeah, I thought that was weird."


No and quite the contrary.

It's more like the frequently misquoted "Murphy's Law" ;)

I had a syntax error of a missing space preceding the [OR] flag that took me forever to find.

grandma genie

3:41 pm on Jun 29, 2012 (gmt 0)

10+ Year Member



Ewwwww. That would hurt. Breaking down the file into segments is a good idea. That is what worked for me. Thanks again.

lucy24

5:38 pm on Jun 29, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



deny from 176.34.216.0/2

Notice anything missing?

OUCH !

Don't know how I missed that.

176.34.216.0/2 = 128.0.0.0 through 191.255.255.255 inclusive.
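
The arithmetic can be double-checked with Python's standard `ipaddress` module. This is a minimal sketch that assumes the server masks off the host bits of a CIDR entry, which is what the behaviour seen in this thread implies; `covered_range` and `is_blocked` are names invented for the example.

```python
import ipaddress

def covered_range(cidr):
    """Return the first and last address covered by a CIDR entry."""
    # strict=False masks off host bits instead of raising an error.
    net = ipaddress.ip_network(cidr, strict=False)
    return str(net.network_address), str(net.broadcast_address)

def is_blocked(ip, cidr):
    """True if the address falls inside the range the CIDR entry covers."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr, strict=False)

print(covered_range("176.34.216.0/2"))
print(covered_range("176.34.216.0/24"))
print(is_blocked("157.55.16.57", "176.34.216.0/2"))   # Bing range from the logs
print(is_blocked("65.52.109.26", "176.34.216.0/2"))   # the Bing IP that got a 200
print(is_blocked("157.55.16.57", "176.34.216.0/24"))  # the corrected entry
```

A /2 keeps only the top two bits of the first octet. 176 and 157 both start with binary 10, so 157.55 and 157.56 land in the same block, while 65 starts with 01 and stays outside it.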