nlpproject.info

         

grouchy sysadmin

3:10 am on Mar 15, 2016 (gmt 0)

10+ Year Member



Seems to be another slightly broken bot.

UA: \x22nlpproject.info research\x22
Protocol: GET
Robots.txt: Nope
Host: 93.113.124.0 - 93.113.125.255

Adding this just because it was weird to see. The request below is slightly shortened for brevity.

<IP removed> - - [15/Mar/2016:02:41:24 +0000] "GET /we_are_looking_for_not_found_pages_we_are_looking_for_not_found_pages_we_are_looking_for_not_found_pages_we_are_looking_for_not_found_pages HTTP/1.0" 444 0 "-" "\x22nlpproject.info research\x22" "-"0.000- -

The _we_are_looking_for_not_found_pages bit was repeated a couple dozen times.
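For anyone who wants to flag this pattern in their own logs, here is a rough Python sketch. The function name `repeated_unit` and the 4-character minimum are my own assumptions, and it assumes the repeats are joined by underscores as in the request above.

```python
import re

# Matches a path made of one unit repeated two or more times, joined by "_".
# The 4-character minimum is an arbitrary assumption to skip trivial repeats.
REPEAT_RE = re.compile(r"^/(.{4,}?)(?:_\1)+$")

def repeated_unit(path):
    """Return the repeated unit if the whole path is one unit repeated, else None."""
    match = REPEAT_RE.match(path)
    return match.group(1) if match else None

# Rebuild a shortened version of the request path from the log line.
path = "/" + "_".join(["we_are_looking_for_not_found_pages"] * 4)
print(repeated_unit(path))
print(repeated_unit("/index.html"))
```

This only catches paths that consist entirely of the repeated unit, so a bot that adds a prefix or suffix would slip past it.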

keyplyr

3:23 am on Mar 15, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@grouchy sysadmin - Just an FYI...

"GET" is a request Method
"HTTP/1.0" or "HTTP/1.1" is the Protocol (useful in a few rules & header evaluation)

Of course you could always include the Method if you think it pertinent :)
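In case it helps to see the split, a minimal Python sketch that pulls the Method and Protocol out of a logged request line. The helper name `split_request_line` is made up for illustration, and it assumes the usual "METHOD target PROTOCOL" layout.

```python
def split_request_line(request_line):
    """Split 'METHOD target PROTOCOL' into (method, target, protocol)."""
    # The method is everything before the first space.
    method, rest = request_line.split(" ", 1)
    # The protocol is everything after the last space, so a target that
    # contains stray spaces still ends up in one piece.
    target, protocol = rest.rsplit(" ", 1)
    return method, target, protocol

method, target, protocol = split_request_line("GET /robots.txt HTTP/1.0")
print("Method:", method)
print("Protocol:", protocol)
```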

Host: ddnet.ro looks like a multi-service company including server admin, media & security services.

grouchy sysadmin

3:39 am on Mar 15, 2016 (gmt 0)

10+ Year Member



Oops. See, this is why I should not post before bedtime.

Method: GET
Protocol: HTTP/1.0

Sorry for the confusion.

lucy24

6:43 am on Mar 15, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Uhm, you just talked about this in another thread, didn't you? But now I can't find it. Is the 444 something you've chosen to return manually (the way my host returns 418 "teapot error" for mod_security violations), or does it have a particular meaning?

tangor

7:09 am on Mar 15, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just checking ... has HTTP/1.0 meant anything for a while?

keyplyr

8:17 am on Mar 15, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"Just checking ... has HTTP/1.0 meant anything for a while?"
Still in use quite a bit with GET tools and older programming languages. I tried blocking it as a filtering method but found that a lot of 3rd world countries are still using old browsers set to HTTP/1.0, so I removed the block.

In this forum (search_engine_spiders) you'll probably find 20% of UAs using HTTP/1.0, which is why I try to include it when posting. Also, as noted above, knowing the protocol helps with other validation methods.

grouchy sysadmin

1:33 pm on Mar 15, 2016 (gmt 0)

10+ Year Member



@lucy24
I mentioned returning a 444 error code to fake Google bots in the OnPageBot thread (https://www.webmasterworld.com/search_engine_spiders/4795795.htm). 444 is a non-standard code specific to Nginx: it closes the connection without sending a response header. It is something I set specifically because I find it works better at deterring malware scanners than a 403 or 500.

@tangor
I ran a check through my logs and I still see a lot of 1.0 requests. For example, the WordPress cron function and the Wordfence plugin are both still using HTTP/1.0.
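If anyone wants to run a similar check, something along these lines should work. It is only a sketch, not the exact check I ran: the function name `count_protocols` is my own, the sample lines are invented, and it assumes the request is logged in quotes as "METHOD target PROTOCOL".

```python
import re
from collections import Counter

# Pulls the protocol token out of the quoted request, e.g. "GET / HTTP/1.0".
REQUEST_RE = re.compile(r'"[A-Z]+ \S+ (HTTP/\d(?:\.\d)?)"')

def count_protocols(lines):
    """Tally how many requests used each HTTP protocol version."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Invented sample lines, for illustration only.
sample = [
    '198.51.100.7 - - [15/Mar/2016:02:41:24 +0000] "GET /a HTTP/1.0" 444 0 "-" "-"',
    '198.51.100.8 - - [15/Mar/2016:02:41:25 +0000] "GET /b HTTP/1.1" 200 512 "-" "-"',
    '198.51.100.9 - - [15/Mar/2016:02:41:26 +0000] "POST /wp-cron.php HTTP/1.0" 200 0 "-" "-"',
]
counts = count_protocols(sample)
for protocol, total in sorted(counts.items()):
    print(protocol, total)
```

To use it on a real log, pass an open file object instead of the sample list, since the function only needs something it can iterate over line by line.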