

Search Engine Spider and User Agent Identification Forum

    
YandexImages
lucy24




msg:4604258
 12:21 am on Aug 23, 2013 (gmt 0)

I have never met this before in my life, though free lookup says it's been registered forever:

5.255.253.89 - - {time} "GET /robots.txt HTTP/1.1" 200 1051 "-" "Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)"
5.255.253.89 - - {time} "GET /paintings/rats/bigfish1.png HTTP/1.1" 304 237 "-" "Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)"

There's a bunch more, probably going on as we speak, judging by the most recent timestamp. Note the 304 response: the image request was conditional, so the robot already holds a copy from an earlier visit. It's clearly the same robot.

Full IP range is
5.255.192.0/18
Yup, RIPE range. Don't see much of those lately, especially for images.
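For anyone who wants to test addresses against that range, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `in_range` is made up for illustration; only the /18 and the sample address come from the post.

```python
import ipaddress

# Range quoted in the post above.
YANDEX_RANGE = ipaddress.ip_network("5.255.192.0/18")

def in_range(ip, network=YANDEX_RANGE):
    """Return True if the dotted-quad address falls inside the network."""
    return ipaddress.ip_address(ip) in network

# First and last address covered by the /18.
print(YANDEX_RANGE[0], "-", YANDEX_RANGE[-1])
# The address from the two log lines.
print(in_range("5.255.253.89"))
```

The /18 spans 5.255.192.0 through 5.255.255.255, so the logged address sits inside it.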

 

dstiles




msg:4604718
 7:34 pm on Aug 24, 2013 (gmt 0)

Perfectly normal range for Yandex, which is originally Russian and now spreading into the US.

I've blocked the YandexImages bot for a long time: as long as I've enabled Yandex, in fact. Again, a normal thing.

Check out the URL in the UA - they have a good bot support page.

lucy24




msg:4604724
 9:44 pm on Aug 24, 2013 (gmt 0)

Yandex Images counts as "no skin off my nose". I can now recognize the Russian word for "rat" in Cyrillic :)

Their robots page has a good listing of UAs but they've always refused to name IPs. (Although not as bad as, say, Majestic, which tends to get locked out purely because they operate from other people's server farms.)
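Without a published IP list, one alternative is a reverse-then-forward DNS check, which as far as I recall is the method the Yandex bot page itself suggests. The sketch below is an assumption-laden illustration, not their official procedure: the suffix list is from memory and should be checked against the bot page, and the function names are made up. The resolvers are passed in as parameters so the logic can be tried without touching the network.

```python
import socket

# Assumption: hostname suffixes for genuine Yandex crawlers, quoted from memory.
YANDEX_SUFFIXES = (".yandex.ru", ".yandex.net", ".yandex.com")

def reverse_lookup(ip):
    """PTR lookup: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]

def forward_lookup(host):
    """A lookup: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]

def is_verified_yandex(ip, reverse=reverse_lookup, forward=forward_lookup):
    """True only if the PTR name has a Yandex suffix AND resolves back to ip."""
    try:
        host = reverse(ip).lower().rstrip(".")
        if not host.endswith(YANDEX_SUFFIXES):
            return False
        return ip in forward(host)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return False
```

The forward lookup matters because anyone who controls their own reverse DNS can make an address claim any hostname; only the round trip ties the name back to the IP.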

I don't get any fake yandexbots worth mentioning, but I do get fake "yandsearch" referers. If I ever find their contact address I will ask what, if anything, "lr=213" means.

not2easy




msg:4604764
 4:24 am on Aug 25, 2013 (gmt 0)

I see their image bots at 100.43.64.0/19 and 37.140.128.0/18 also.

dstiles




msg:4604831
 8:21 pm on Aug 25, 2013 (gmt 0)

Lucy - yandex IPs are fairly easy...

5.45.202.0 - 5.45.202.255
37.140.141.0 - 37.140.141.63
77.88.0.0 - 77.88.63.255
87.250.224.0 - 87.250.255.255
93.158.128.0 - 93.158.191.255
95.108.128.0 - 95.108.255.255
100.43.64.0 - 100.43.95.255
178.154.128.0 - 178.154.255.255
199.21.96.0 - 199.21.99.255
199.36.240.0 - 199.36.243.255
213.180.192.0 - 213.180.223.255

Add new ones as they are discovered. Not all IPs in these ranges claim to be bots, but coupled with the actual (acceptable) UA I've found no problems.
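The start/end pairs above can be turned into CIDR blocks and combined with a UA check roughly as follows. This is a sketch only: `looks_like_yandex` is a hypothetical name, and the plain "yandex" substring test stands in for whatever UA rule you actually consider acceptable.

```python
import ipaddress

# Start/end pairs copied from the list above.
RANGES = [
    ("5.45.202.0", "5.45.202.255"),
    ("37.140.141.0", "37.140.141.63"),
    ("77.88.0.0", "77.88.63.255"),
    ("87.250.224.0", "87.250.255.255"),
    ("93.158.128.0", "93.158.191.255"),
    ("95.108.128.0", "95.108.255.255"),
    ("100.43.64.0", "100.43.95.255"),
    ("178.154.128.0", "178.154.255.255"),
    ("199.21.96.0", "199.21.99.255"),
    ("199.36.240.0", "199.36.243.255"),
    ("213.180.192.0", "213.180.223.255"),
]

# summarize_address_range yields the CIDR block(s) covering each pair.
NETWORKS = [
    net
    for start, end in RANGES
    for net in ipaddress.summarize_address_range(
        ipaddress.ip_address(start), ipaddress.ip_address(end)
    )
]

def looks_like_yandex(ip, user_agent):
    """True if the UA mentions Yandex AND the IP is in a listed range."""
    addr = ipaddress.ip_address(ip)
    ua_ok = "yandex" in user_agent.lower()  # assumption: simplistic UA test
    return ua_ok and any(addr in net for net in NETWORKS)

for net in NETWORKS:
    print(net)
```

Each pair in this list happens to align to a single CIDR block, so the output is eleven networks. Note that 5.255.192.0/18 from the start of the thread is not on this list, so an address there fails the check until the range is added.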

Given the current state of SEs in general, yandex is a fairly good one and seems to give good results whilst not annoying webmasters (eg me). Its bots and SEs are based in Russia and USA and I let all their ordinary bots in, excluding image and a few others.

Only a guess but could "lr=213" be a general reference to the bot range that originally discovered it? Maybe not, but a possibility.

lucy24




msg:4604887
 6:20 am on Aug 26, 2013 (gmt 0)

The exact string "lr=213" only occurs in fake-referer requests. Most are already blocked by IP; I've retained the referer check just for insurance. I've never seen "lr=any-number-at-all" in a legitimate YandSearch visit; that's what makes it mystifying.

Far as I can tell, I've never seen 213.180 at all, ever, and didn't know the Yandex range existed :) Is this maybe a range Yandex used to crawl from, so "lr=213" is something you'd get in very, very old search results, now only preserved in fake referers?

There are several ranges on your list I've never seen in my life. Contrariwise I've got the 5.255 range noted above, and also 37.9.64.0/18. (37.9.84.253 YandexFavicon gets front page + favicon, 37.9.69-70 YandexImageResizer, UA list says it's for mobiles)

thetrasher




msg:4604951
 12:37 pm on Aug 26, 2013 (gmt 0)

[api.yandex.com...]
The region to give preference to when generating search results is defined by the value of the lr parameter of the search query.

[search.yaca.yandex.ru...]
213 = Moscow

lucy24




msg:4604954
 1:36 pm on Aug 26, 2013 (gmt 0)

You're a better man than I, thetrasher. I looked everywhere and never managed to find that.

It must be one of those prefs you can set by default. The overwhelming majority of my lr=213 searches come with a referer string that says in full

http:/ /yandex.ru/yandsearch?text=www.example.com&lr=213

OK, so they're looking up my domain, but specifically that version of it which is most applicable to Moscow? :)
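For the referer check mentioned earlier in the thread, pulling `lr` out of a referer in this shape might look like the sketch below. The function name and the one-entry region table are illustrative assumptions; only 213 = Moscow is confirmed in the thread. Whether a present `lr` means "fake" is a site-specific observation, not a general rule.

```python
from urllib.parse import urlsplit, parse_qs

# Only region code confirmed in this thread; anything else is unknown here.
REGIONS = {"213": "Moscow"}

def yandsearch_region(referer):
    """Return the lr value of a yandex.ru/yandsearch referer, else None.

    None covers both "not a yandsearch referer" and "no lr parameter".
    """
    parts = urlsplit(referer)
    host = parts.hostname or ""
    is_yandex = host == "yandex.ru" or host.endswith(".yandex.ru")
    if not is_yandex or parts.path != "/yandsearch":
        return None
    values = parse_qs(parts.query).get("lr")
    return values[0] if values else None

ref = "http://yandex.ru/yandsearch?text=www.example.com&lr=213"
lr = yandsearch_region(ref)
print(lr, REGIONS.get(lr, "unknown"))
```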

:: detour to spell out non-213 results (I don't read Cyrillic) ::

Wonder why St Petersburg (which I am sorry to say I still think of as Leningrad) is listed twice?

dstiles




msg:4606136
 2:11 pm on Aug 30, 2013 (gmt 0)

Another yandex IP range with some bots. New to me today but no indication in DNS as to when it was registered.

5.255.192.0 - 5.255.255.255
5.255.192.0/18
Russia

Bots found so far:

5.255.253.0/24
Probably not all IPs are bots, but checked against the UA it should be safe to enable all.

Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)

lucy24




msg:4606221
 8:01 pm on Aug 30, 2013 (gmt 0)

Did you copy-and-paste the wrong numbers? 5.255.192.0/18 is what we've been talking about all along :)
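A quick way to confirm that the /24 with the bots sits inside the /18 already discussed, sketched with Python's `ipaddress` module (`subnet_of` needs Python 3.7 or later):

```python
import ipaddress

parent = ipaddress.ip_network("5.255.192.0/18")  # range from the first post
child = ipaddress.ip_network("5.255.253.0/24")   # "bots found so far" block

# True when every address in child is also in parent.
print(child.subnet_of(parent))
# Address counts: the /18 holds 64 blocks of /24 size.
print(parent.num_addresses, child.num_addresses)
```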

dstiles




msg:4606467
 8:21 pm on Aug 31, 2013 (gmt 0)

Er... Hmmm.

I cross-checked the IP in DNS to see what we were talking about and forgot to add it to my database. It was new to me with your original posting: my posting was because I found a bot hitting from an IP range I hadn't (but should have) logged. :(


WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved