Mediatoolkitbot

         

TorontoBoy

8:25 pm on Jun 20, 2018 (gmt 0)

5+ Year Member Top Contributors Of The Month



UA: Mediatoolkitbot (complaints@mediatoolkit.com)
Protocol: HTTP/1.1
Robots.txt: Yes
Host: omonia.hr
213.186.0.0 - 213.186.31.255
213.186.0.0/19
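
As a quick cross-check that the dotted range and the CIDR notation above describe the same block, here is a minimal Python sketch using the standard `ipaddress` module. The helper name `in_range` and the test addresses are illustrative assumptions, not addresses from the logs (those are obscured in this thread).

```python
import ipaddress

# Range as reported in the post
first = ipaddress.ip_address("213.186.0.0")
last = ipaddress.ip_address("213.186.31.255")

# summarize_address_range yields the minimal set of CIDR blocks covering first..last
blocks = list(ipaddress.summarize_address_range(first, last))
print(blocks)  # [IPv4Network('213.186.0.0/19')]

network = ipaddress.ip_network("213.186.0.0/19")

def in_range(ip):
    """Return True if ip falls inside the /19 (hypothetical helper)."""
    return ipaddress.ip_address(ip) in network

# Made-up addresses, chosen only to exercise the boundaries
print(in_range("213.186.4.1"))   # True
print(in_range("213.186.32.0"))  # False
```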

2017 [webmasterworld.com...]
2016 [webmasterworld.com...]

Touched me only twice on a doc I updated, with 2 different IPs, same range:

213.186.1.*** [18/Jun/2018:03:10:33 GET /robots.txt HTTP/1.1 200 1519 - Mediatoolkitbot (complaints@mediatoolkit.com)
213.186.4.** [18/Jun/2018:03:14:02 GET /wp-acme/2017/09/28/computer-and-smartphone-security-something-else/ HTTP/1.1 200 53208 - Mediatoolkitbot (complaints@mediatoolkit.com)

I would have preferred to append to the 2017 thread, but it is closed.

- - -
(Please only post ranges, not specific computer addresses)

[edited by: keyplyr at 8:38 pm (utc) on Jun 20, 2018]
[edit reason] obscured IP address [/edit]

keyplyr

8:37 pm on Jun 20, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Has anything changed with this UA?

TorontoBoy

8:47 pm on Jun 20, 2018 (gmt 0)

5+ Year Member Top Contributors Of The Month



It read my robots.txt with one IP, then fetched a post with another IP in the same range. In both older threads, the bot disregarded robots.txt.

keyplyr

9:15 pm on Jun 20, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It may have added the request for robots.txt or maybe in the previous posts I just failed to detect it. Still, it's a site monitoring service so I would doubt it follows robots.txt directives (as you noted); probably just added the request to pass some filters.

lucy24

11:22 pm on Jun 20, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I wonder what information it’s extracting from robots.txt? The only time I’ve ever seen it was a cluster of visits last July. Each time it read robots.txt and nothing else. I had to go check my robots.txt file to confirm that it isn’t disallowed, so that isn’t the reason, and I don’t see anyone else from the same /19 in that time period. (Besides, it seems to be more common to use your real name for page fetches, reserving the alias for robots.txt.)

tangor

1:54 am on Jun 21, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I feel left out. Looking back over the last six months, not a visit at all. But thanks for the heads-up!

keyplyr

9:31 am on Jun 30, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So if it's truly a site monitoring service, requesting robots.txt would make sense. Almost every visitor gets access to robots.txt, even one that is blocked from other pages or directories.

lucy24

5:21 pm on Jun 30, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So if it's truly a site monitoring service, requesting robots.txt would make sense.
It has similarly occurred to me that if a robot's function is to verify uptime, a 403 response is as useful as any other, because it's still a response.
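
To illustrate that point, here is a rough Python sketch of an uptime probe that counts any HTTP response, including a 403, as "up". It is only a guess at how such a checker might reason, not a description of what Mediatoolkitbot actually does. The function name `is_up` and the injectable `opener` parameter are assumptions made for the example.

```python
import urllib.error
import urllib.request

def is_up(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if the server sent any HTTP response at all.

    `opener` is injectable only so the logic can be exercised without
    network access; by default it is urllib.request.urlopen.
    """
    try:
        with opener(url, timeout=timeout):
            return True          # 2xx/3xx: the server answered
    except urllib.error.HTTPError:
        return True              # 4xx/5xx (e.g. 403): still an answer
    except OSError:
        return False             # DNS failure, refused connection, timeout
```

A call such as `is_up("https://example.com/robots.txt")` (placeholder URL) would then return True even when the server answers 403.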

Incidentally, I went back and checked one place I hadn't thought of checking before. Turns out the reason this bot isn't disallowed ... is that I've actually poked a hole for it. But, as far as I can tell, it never came back to find out :)

keyplyr

6:59 pm on Jun 30, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



if a robot's function is to verify uptime, a 403 response is as useful as any other, because it's still a response
Rare, but some servers may be set up to return 4** or 5** responses under certain circumstances, regardless of whether an account is active or not.

...poked a hole for it. But, as far as I can tell, it never came back to find out
That's the web's version of 'wash your car then it rains.'

I have a bat file with some grep commands I run on my logs every day. When I poke a hole for a new UA, I add a line to check for it. I have 2 that haven't come back in over a month and I've forgotten which ranges I moved to which rules.
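
For anyone without a grep setup, the same daily check can be sketched in Python. This is not the actual bat file, just a minimal equivalent under assumptions: the watch list is hypothetical (`ExampleTestBot` is a made-up name), and the sample lines are the two obscured log entries quoted at the top of this thread.

```python
# Hypothetical watch list: UA substrings for agents that were recently allowed through
WATCHED_AGENTS = ["Mediatoolkitbot", "ExampleTestBot"]

def count_watched(log_lines, agents=WATCHED_AGENTS):
    """Return {agent: number of log lines mentioning it}, case-insensitively."""
    counts = {agent: 0 for agent in agents}
    for line in log_lines:
        lowered = line.lower()
        for agent in agents:
            if agent.lower() in lowered:
                counts[agent] += 1
    return counts

def missing_agents(log_lines, agents=WATCHED_AGENTS):
    """Agents on the watch list that never appear in the given log lines."""
    return [a for a, n in count_watched(log_lines, agents).items() if n == 0]

# The two log lines from the first post, IPs obscured as in the thread
SAMPLE_LOG = [
    "213.186.1.*** [18/Jun/2018:03:10:33 GET /robots.txt HTTP/1.1 200 1519 - "
    "Mediatoolkitbot (complaints@mediatoolkit.com)",
    "213.186.4.** [18/Jun/2018:03:14:02 GET "
    "/wp-acme/2017/09/28/computer-and-smartphone-security-something-else/ "
    "HTTP/1.1 200 53208 - Mediatoolkitbot (complaints@mediatoolkit.com)",
]

print(count_watched(SAMPLE_LOG))
print(missing_agents(SAMPLE_LOG))
```

In daily use, `log_lines` could be an open log file object instead of the sample list, since the functions only iterate over lines.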

lucy24

8:16 pm on Jun 30, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



To say nothing of the agents who requested robots.txt often enough that I added a Disallow to test for hole-poking ... and I've never seen them since.
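
A Disallow test like this can be scored mechanically once the agent does come back. Below is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents are hypothetical (nobody's actual file), and `violates` is an assumed helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the agent under test is disallowed everywhere,
# everyone else is allowed everywhere
ROBOTS_TXT = """\
User-agent: Mediatoolkitbot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def violates(agent, path):
    """True if `agent` requesting `path` ignores the Disallow rules.

    Fetching /robots.txt itself is never counted as a violation.
    """
    return path != "/robots.txt" and not parser.can_fetch(agent, path)

print(violates("Mediatoolkitbot", "/robots.txt"))  # False
print(violates("Mediatoolkitbot", "/some-page/"))  # True
print(violates("SomeOtherBot", "/some-page/"))     # False
```

Any logged page fetch for which `violates` returns True means the agent read (or skipped) robots.txt and crawled anyway.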

:: shuffling papers ::

Yeah, haha, the “currently testing” list has 35 names on it--most of which I'm reasonably confident I will never set eyes on again, seeing as how the oldest names go back to March 2016.