Forum Moderators: Ocean10000 & incrediBILL

Mailru-net

stopped by snagging images only.

   
3:11 am on Nov 21, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



inetnum: 217.69.128.0 - 217.69.135.255
netname: MAILRU-NET

definitely not an email referral
9:20 pm on Nov 21, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



I have a note on the range 217.69.128.0 - 217.69.143.255 (which is blocked): "may include proxies".
10:43 pm on Nov 21, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



I kinda think they're a legitimate search engine-- for a given definition of "legitimate" at least. But I finally got tired and blocked them just the same. Same images over and over again. Tiny little ones of no use to anybody. Yawn. Maybe it's a common search term that brings up the same set of hotlinks in a package each time.
3:31 am on Nov 22, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



"I kinda think they're a legitimate search engine"

Absolutely. Mail.ru is the largest portal/SE in Russia. In Eastern Europe, Mail.ru is similar to Yahoo, whereas Yandex is similar to Google.

That's not to say they do not engage in "iffy" behaviors (by our standards anyway) but they are a legit organization and an important player.
6:54 am on Nov 22, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



That's reassuring, since I had nothing to go by except gut feeling. Well, and their site looks exactly like Yahoo or any of those other ISP mail sites.

Maybe if I give them a month or so they'll lose their morbid appetite for that particular fistful of pictures and go for something else. Yandex tends to bring up pictures of rats. (To the point where I can recognize the word in Cyrillic ;))
10:23 am on Nov 22, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Having said that, my records show I put a block in place to stop them from scraping image files over a year ago :)

217.69.128.0 - 217.69.135.255
217.69.128.0/21
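
For anyone who wants to double-check that the dotted range and the CIDR form above describe the same block, here is a minimal Python sketch using the standard `ipaddress` module. The helper name `range_to_cidrs` is made up for illustration.

```python
import ipaddress
from typing import List


def range_to_cidrs(first: str, last: str) -> List[str]:
    """Return the smallest list of CIDR blocks that exactly covers first..last."""
    start = ipaddress.IPv4Address(first)
    end = ipaddress.IPv4Address(last)
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]


# The range quoted in the post collapses to a single /21.
print(range_to_cidrs("217.69.128.0", "217.69.135.255"))  # ['217.69.128.0/21']
```

The same helper applied to 217.69.128.0 - 217.69.143.255 gives the wider /20 mentioned elsewhere in the thread.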

I do let them crawl, just not retrieve image files.

I let many SEs take my images for their image search *if* they create a thumbnail that links to my image. When a user follows that link and connects to my server, I have a script that pulls them to the parent page, my web page == traffic!

A few of the 2nd & 3rd level SEs just steal my images without linking to my site, so I block those since I don't gain anything from them.
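
The "crawl yes, images no" policy can be expressed as a small decision function. This is only a sketch of the idea in Python, not the poster's actual setup (which is not shown in the thread): the function name `should_block` and the list of image extensions are assumptions, and a real site would more likely do this in the web server config.

```python
import ipaddress
import posixpath
from urllib.parse import urlsplit

# Range taken from the post above.
BLOCKED_NET = ipaddress.ip_network("217.69.128.0/21")
# Assumption: which file extensions count as "image files".
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def should_block(client_ip: str, request_target: str) -> bool:
    """Block only image requests coming from the listed range."""
    path = urlsplit(request_target).path
    extension = posixpath.splitext(path)[1].lower()
    in_range = ipaddress.ip_address(client_ip) in BLOCKED_NET
    return in_range and extension in IMAGE_EXTENSIONS


print(should_block("217.69.135.91", "/games/images/SultanPic.jpg"))  # True
print(should_block("217.69.135.91", "/index.html"))                  # False
```

Pages stay reachable for the crawler; only requests that are both from the range and for an image get refused.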
9:29 pm on Nov 22, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



They used one of those tripartite systems with me. Seems to be popular with ex-Soviet robots in general; in the mail.ru version you get five sets of three requests (each set for a different image, but always the same UA-and-referer pattern):

217.69.135.91 - - [19/Nov/2012:05:53:31 -0800] "GET http://www.example.com/games/images/SultanPic.jpg HTTP/1.1" 403 1442 "-" "Mozilla/5.0 (compatible; Mail.RU/2.0c)" 
217.69.135.91 - - [19/Nov/2012:05:53:32 -0800] "GET http://www.example.com/games/images/SultanPic.jpg HTTP/1.1" 403 1442 "-" "Mozilla/5.0 (compatible; Mail.RU/2.0c)"
217.69.135.91 - - [19/Nov/2012:05:53:34 -0800] "GET http://www.example.com/games/images/SultanPic.jpg HTTP/1.1" 403 1442 "http://go.mail.ru/search_images" "Mozilla/5.0 (Windows NT 6.1; rv:12.0) Gecko/20120403211507 Firefox/12.0"

where the requested images are barely bigger (2-3K) than the 403. Many are almost literally thumbnail-sized.
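
To spot this three-request pattern in a log, something along these lines might do. It is a rough Python sketch that assumes the usual Apache combined log format; the regex, the `classify` helper and its labels are made up for illustration.

```python
import re

# Assumption: logs are in Apache "combined" format, as in the lines quoted above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def classify(line):
    """Label a log line by which half of the Mail.RU pattern it matches."""
    match = LOG_RE.match(line)
    if match is None:
        return None  # not a combined-format line
    if "Mail.RU" in match["agent"]:
        return "bot-ua"
    if match["referer"].startswith("http://go.mail.ru/"):
        return "browser-ua-with-mailru-referer"
    return "other"


SAMPLE = [
    '217.69.135.91 - - [19/Nov/2012:05:53:31 -0800] "GET http://www.example.com/games/images/SultanPic.jpg HTTP/1.1" 403 1442 "-" "Mozilla/5.0 (compatible; Mail.RU/2.0c)"',
    '217.69.135.91 - - [19/Nov/2012:05:53:32 -0800] "GET http://www.example.com/games/images/SultanPic.jpg HTTP/1.1" 403 1442 "-" "Mozilla/5.0 (compatible; Mail.RU/2.0c)"',
    '217.69.135.91 - - [19/Nov/2012:05:53:34 -0800] "GET http://www.example.com/games/images/SultanPic.jpg HTTP/1.1" 403 1442 "http://go.mail.ru/search_images" "Mozilla/5.0 (Windows NT 6.1; rv:12.0) Gecko/20120403211507 Firefox/12.0"',
]

for entry in SAMPLE:
    print(classify(entry))
```

Matching on the substring "Mail.RU" is deliberately loose so that it catches both the "Mail.RU/2.0c" string in these logs and other variants of the UA.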

Matter of fact, I could exclude them from /games/images/ alone and it would have pretty much the same effect. Why on earth would someone want the "Made with FutureBasic" logo from 1997?

Oddly I've got them down as /20, not /21. But the actual crawling is from a still narrower range. Probably something like ..132.0/22.
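
A quick way to see how the /20, /21 and /22 relate is to let Python's `ipaddress` module do the arithmetic. This sketch reads "..132.0/22" as 217.69.132.0/22, which is an assumption; the variable names are illustrative only.

```python
import ipaddress

wide = ipaddress.ip_network("217.69.128.0/20")     # 217.69.128.0 - 217.69.143.255
blocked = ipaddress.ip_network("217.69.128.0/21")  # 217.69.128.0 - 217.69.135.255
# Assumption: "..132.0/22" is shorthand for 217.69.132.0/22.
guess = ipaddress.ip_network("217.69.132.0/22")
logged = ipaddress.ip_address("217.69.135.91")     # IP from the log lines above

# Each narrower block sits inside the next wider one.
print(guess.subnet_of(blocked), blocked.subnet_of(wide))  # True True
# The address seen in the logs falls inside the narrowest guess.
print(logged in guess)                                    # True
```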
9:48 pm on Nov 22, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



I have just "allowed" the mail.ru bot to see what happens (can't be worse than G, can it?).

As far as I can tell there is only one IP range for mail.ru (if anyone has others I'd be interested)...

217.69.128.0 - 217.69.143.255

Bots, according to a DNS scan and grep for "spider" and "fetcher"...

217.69.133.67 - 217.69.133.70
217.69.134.53 - 217.69.134.56
217.69.134.79 - 217.69.134.79
217.69.134.113 - 217.69.134.113
217.69.134.165 - 217.69.134.179
217.69.135.91 - 217.69.135.91
217.69.136.29 - 217.69.136.32
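
It may be worth checking this bot list against the narrower /21 quoted earlier in the thread. The Python sketch below does that; the helper name `ranges_outside` is hypothetical, and the ranges are copied from the list above.

```python
import ipaddress

# Bot ranges copied from the list above (first, last).
BOT_RANGES = [
    ("217.69.133.67", "217.69.133.70"),
    ("217.69.134.53", "217.69.134.56"),
    ("217.69.134.79", "217.69.134.79"),
    ("217.69.134.113", "217.69.134.113"),
    ("217.69.134.165", "217.69.134.179"),
    ("217.69.135.91", "217.69.135.91"),
    ("217.69.136.29", "217.69.136.32"),
]


def ranges_outside(network_cidr, ranges):
    """Return the ranges that are not fully contained in network_cidr."""
    net = ipaddress.ip_network(network_cidr)
    outside = []
    for first, last in ranges:
        # A CIDR block is contiguous, so checking both endpoints is enough.
        if ipaddress.ip_address(first) not in net or ipaddress.ip_address(last) not in net:
            outside.append((first, last))
    return outside


print(ranges_outside("217.69.128.0/21", BOT_RANGES))  # [('217.69.136.29', '217.69.136.32')]
print(ranges_outside("217.69.128.0/20", BOT_RANGES))  # []
```

So a block on the /21 alone would miss the 217.69.136.x bots, while the /20 covers every range in the list.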

Bot UA is...

Mozilla/5.0 (compatible; Mail.RU_Bot/2.0)