moTi - 11:46 pm on Oct 29, 2012 (gmt 0)
Hey guys, I'm new to this forum, so excuse me if this is common knowledge or has already been discussed, but...
I do a reverse/forward DNS lookup on my server and send a 403 to any bot that doesn't pass the test. Recently I noticed that a lot of requests from search.msn.com get a 403 as well.
My initial thought was that it was a fake Bingbot. But on further investigation I discovered the following pattern:
When IP address 18.104.22.168 crawls my web pages, I sometimes get, within the same second, a 200 for one file request followed by a 403 for another file request, and so on. Strange. So I tested the reverse DNS, and it's always msnbot-157-55-33-113.search.msn.com. But a forward DNS lookup on that name gives me two results (you can check for yourself with a lookup tool):
In practice, about 50 percent of this bot's requests resolve to 22.214.171.124 and pass the lookup, whereas the rest resolve to 126.96.36.199 and (rightly) get a 403 from my server.
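For anyone who wants to reproduce the check, here is a minimal Python sketch of a reverse/forward DNS test. It is not my actual server code. The function names and the allowed suffix list are assumptions made up for illustration, and the lookups are passed in as parameters so the logic can be tried without touching real DNS. Unlike a check that looks at a single forward result, this one compares the client IP against every address the forward lookup returns, which might behave differently when a hostname has two records.

```python
import socket

# Assumption for illustration: hostname suffixes accepted as a genuine crawler.
ALLOWED_SUFFIXES = (".search.msn.com",)


def forward_confirmed(ip, reverse_lookup, forward_lookup, suffixes=ALLOWED_SUFFIXES):
    """Return True if ip reverse-resolves to an allowed hostname and that
    hostname's forward records include ip."""
    try:
        hostname = reverse_lookup(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False
    # Normalize case and a possible trailing dot before the suffix check.
    if not hostname.lower().rstrip(".").endswith(suffixes):
        return False
    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False
    # Accept a match against any returned address, not only the first one.
    return ip in addresses


def system_reverse(ip):
    # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
    return socket.gethostbyaddr(ip)[0]


def system_forward(hostname):
    # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist), IPv4 only
    return socket.gethostbyname_ex(hostname)[2]
```

On a live server this would be called as `forward_confirmed(client_ip, system_reverse, system_forward)`, with a 403 sent when it returns False.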
So, given that the IP officially belongs to Microsoft, why do they do that? Have you experienced this before? How do you handle it? I certainly won't let Bingbot in if it can't verify itself. It produces a massive number of 403 errors in my logs.