Forum Moderators: open
I'm confused: today I had three additions to my .htaccess denying
access to 209.2.34.114 after it had violated robots.txt and fallen
into my spider trap. The first two occasions happened within about
a second of each other, so they might be explained by the server
still working with the old .htaccess before the update took effect,
but the third entry came some 45 minutes later. How is that possible?
I was blocking 209.2.34.114
and yet it managed to fall into my spider trap again (in all three
cases on different web pages). Is this some new sort of IP address
cloaking or am I overlooking something obvious? I now have three
identical lines in my .htaccess reading
deny from 209.2.34.114
and by the third time it definitely should not even have been able
to fall into the spider trap again. A quick Google search indicates
that it is some sort of bad bot, but that still does not explain
how it can fall into a spider trap multiple times. It comes with
strange user agent strings like "lB6xxomcggnw6pp h 6tcymlxx" and
"ophqj5mivnn h nvv5tdho5savb", but I ban by the IP address 209.2.34.114.
Sorry if this is a stupid oversight on my side, but I have never
seen this happen before, except when two pages with traps were
visited within the same second or so.
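For anyone wondering, a minimal .htaccess fragment of the kind I'm describing would look like the sketch below. The Order/Allow lines reflect the classic Apache 2.2 / mod_access_compat style; the Require block is the Apache 2.4 equivalent and is only relevant if your host runs 2.4 without mod_access_compat (I'm not asserting either is what your server needs, just illustrating the two syntaxes):

```apache
# Apache 2.2 style (also works on 2.4 when mod_access_compat is loaded)
Order allow,deny
Allow from all
Deny from 209.2.34.114

# Apache 2.4 native equivalent (mod_authz_core)
<RequireAll>
    Require all granted
    Require not ip 209.2.34.114
</RequireAll>
```

One thing worth knowing: .htaccess is re-read on every request, so a "deny from" line should take effect immediately once the file is saved; there is no restart or cache delay on the Apache side.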
My automatic addition of banned IP addresses normally seems
to work fine apart from this particular exception. Of course
it is always possible that something special happened on the
shared server side, such as a restore after a crash, and I
have not yet seen 209.2.34.114 return in later days. I did
notice that it had changed its random-looking user agent
string on its third visit, but I ban by IP address, not by
user agent, so this ought to play no role and only suggests
that this bot is trying to be smart - and maybe it is.
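In case it helps anyone building a similar auto-ban routine: a small sketch (in Python, purely illustrative; my actual setup differs) of a de-duplicating append that would avoid piling up three identical "deny from" lines, and that also treats an existing CIDR range as covering the new address:

```python
import ipaddress

def add_ban(htaccess_lines, ip):
    """Return htaccess_lines extended with a 'deny from' entry for ip,
    unless an existing deny line (single address or CIDR range)
    already covers that address."""
    addr = ipaddress.ip_address(ip)
    for line in htaccess_lines:
        parts = line.split()
        if len(parts) == 3 and parts[0].lower() == "deny" and parts[1].lower() == "from":
            try:
                # strict=False lets a bare address parse as a /32 network
                if addr in ipaddress.ip_network(parts[2], strict=False):
                    return htaccess_lines  # already banned, skip duplicate
            except ValueError:
                continue  # deny target is not an IP/CIDR (e.g. a hostname)
    return htaccess_lines + [f"deny from {ip}"]
```

With a check like this, a repeat trap hit by an already-banned address at least would not grow the file, and the log of *attempted* additions becomes a clearer signal that the deny itself is not being enforced.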
I'll keep an eye on 209.2.34.114 in case it manages to hit
my trap again despite the "deny from" ban, and in the meantime
I have banned the suspect's IP range 209.2.34.112/28.
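For anyone unsure what that /28 covers: it is a 16-address block running from 209.2.34.112 through 209.2.34.127, which you can confirm with Python's ipaddress module:

```python
import ipaddress

net = ipaddress.ip_network("209.2.34.112/28")
print(net.num_addresses)                              # 16 addresses in a /28
print(net[0], "-", net[-1])                           # 209.2.34.112 - 209.2.34.127
print(ipaddress.ip_address("209.2.34.114") in net)    # True
```

So the range ban catches the offending address plus its immediate neighbours, on the assumption that the bot may hop to a nearby address on the same subnet.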