weird log file entries

         

roshaoar

1:26 pm on Apr 3, 2014 (gmt 0)

10+ Year Member



Hello,

Can anyone point me to a site or information which explains what's going on with some of the really weird page log entries I get on my site?

There's so much talk of malicious bots, bad IPs, hacking and whatnot that I don't always know what I'm looking at or why: 404s for images and page strings that have never existed, automated scripts walking a tree of URLs that don't exist, backend commands in URLs, all that.

Anyone?

Thx

wilderness

3:41 pm on Apr 3, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This may help:
Default User Agents of Programming Libraries and Command Line Tools [webmasterworld.com]. Beyond that, learning to recognize standard visitors, and comparing the non-standard ones against them, is an ongoing process.

lucy24

6:41 pm on Apr 3, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Remember this line: "Never attribute to malice that which can be adequately explained by stupidity." The weirder requests can generally be attributed to stupid robots following bad programming. They get a shopping list that names such-and-such pages-- and then the domain name gets garbled and they ask for the same pages at some other site. Sometimes the bogus URLs are so specific, you can even look them up and figure out what site the robot thought it was on! (I don't think there's any reason for doing this other than personal amusement, though.)

If you get requests for anything involving /wp-admin/ or similar, that's another type of robotic shopping list. They simply take the 50 or so likeliest filenames and attach them to any domain under the sun.
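To make the "shopping list" idea concrete, here is a minimal Python sketch that flags request paths which look like generic CMS probes. The pattern list is purely an assumption for illustration (only /wp-admin/ comes from the post above); you would build your own list from what actually shows up in your logs.

```python
import re

# Hypothetical list of commonly probed path fragments; only /wp-admin/
# is taken from the thread, the rest are illustrative assumptions.
PROBE_PATTERNS = re.compile(
    r"/(wp-admin|wp-login\.php|xmlrpc\.php|phpmyadmin|administrator)(/|$|\?)",
    re.IGNORECASE,
)

def is_cms_probe(path: str) -> bool:
    """Return True if the requested path looks like a generic CMS probe."""
    return PROBE_PATTERNS.search(path) is not None
```

On a site that runs none of these packages, any hit is almost certainly a robot working through its list, so counting hits per IP is a quick way to spot them.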

One-off requests with weird query strings are preliminary tests. If the request gets the desired response-- such as being able to PUT something they've no business PUTting-- they'll be back later with a much longer list.
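One rough way to spot this kind of preliminary test is to pull out requests that use a method your site never expects. The sketch below assumes Apache common/combined log format and assumes that GET, HEAD and POST are the only methods you normally see; both are assumptions to adjust for your own setup.

```python
import re

# Matches the quoted request and the status code in a common/combined log line.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)

# Assumption: methods a typical content site expects to receive.
EXPECTED_METHODS = {"GET", "HEAD", "POST"}

def unusual_method(log_line: str):
    """Return (method, path, status) if the method is unexpected, else None."""
    m = REQUEST_RE.search(log_line)
    if m is None or m.group("method") in EXPECTED_METHODS:
        return None
    return m.group("method"), m.group("path"), int(m.group("status"))
```

The status code is the part worth checking: anything other than a refusal on a PUT is the "desired response" the robot is hoping for.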

Personally: I don't block IP ranges upfront, unless it's something like Hetzner or OVH where you already know that nothing good will come of it. Also the entire nation of China. If I do get an offensive visitor, I look up what I know. If it turns out to be a server farm, or it belongs to Eastern Europe-- this is obviously subjective-- they get blocked.
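If you keep your own list of blocked ranges, checking a visitor against it can be sketched with Python's standard ipaddress module. The ranges below are placeholders from the reserved documentation blocks, not the real ranges of any hosting company or country; substitute whatever you have looked up yourself.

```python
import ipaddress

# Placeholder CIDR ranges (reserved documentation addresses), standing in
# for whatever server-farm ranges you have decided to block.
BLOCKED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/25")
]

def is_blocked(ip: str) -> bool:
    """Return True if the address falls inside any blocked range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_RANGES)
```

This only tells you whether an address is already on your list; deciding which ranges belong there is still the subjective part.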

roshaoar

7:09 pm on Apr 3, 2014 (gmt 0)

10+ Year Member



Yeah, this sort of sums up what I do as well, Lucy. Not all of China though, just about 25 or so spans!

You've hit it on the head with stupidity btw. Some of the entries do look like a bot or browser having a really bad day, programming errors of sorts.

I also have a list of about 50 of these CMS wp-admin type requests that I just block. But then they all 404 on "File does not exist: /var/www/error" - not sure what I'm not doing that I should be :)

lucy24

8:09 pm on Apr 3, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That sounds like the server looking for a custom 403 page and not finding it. But it would only come through in logs as a 404 if the ErrorDocument directive was malformed, such as by including a full protocol-plus-domain.
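A quick way to check for that kind of malformed directive is to scan the config or .htaccess text for ErrorDocument lines whose target is a full URL. When the target is a full URL, Apache redirects the client there instead of serving a local file, so the original status never reaches the logs as intended. The function name and the sample paths here are hypothetical, and this is only a rough sketch, not a real config parser.

```python
import re

# Matches "ErrorDocument <status> <target>" at the start of a line
# (commented-out lines do not match).
ERRDOC_RE = re.compile(r"^\s*ErrorDocument\s+(\d{3})\s+(\S+)", re.IGNORECASE)

def remote_error_documents(config_text: str):
    """Return (status, target) pairs whose target is a full http(s) URL."""
    hits = []
    for line in config_text.splitlines():
        m = ERRDOC_RE.match(line)
        if m and re.match(r"https?://", m.group(2), re.IGNORECASE):
            hits.append((int(m.group(1)), m.group(2)))
    return hits
```

Anything this returns is a candidate for rewriting as a local path such as /error/403.html, assuming that file actually exists under the document root.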