Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Yandex gets fatter
lucy24




 
Msg#: 4517886 posted 2:05 am on Nov 11, 2012 (gmt 0)

37.140.128.0/18
Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)

Like the other yandexbots I've met, they pick an IP and stick with it. Here it was 37.140.141.13.

I've yet to see them on my own site, but a rare visit to art studio's logs

:: insert boilerplate about finding something while looking for something else (in this case iPad misbehavior, see nearby post) ::

found them all over the place, alternating with the familiar 199.21.64.0/20 and a handful of 95.108.128.0/17.

They didn't go anywhere they weren't supposed to.
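
For anyone wanting to test log entries against the three ranges mentioned in this post, here is a minimal Python sketch using the standard `ipaddress` module. The list holds only the ranges named above, so it is an assumption that it is complete; the function name `in_known_range` is made up for illustration.

```python
import ipaddress

# Ranges mentioned in this thread only; Yandex may well crawl from others.
YANDEX_RANGES = [
    ipaddress.ip_network("37.140.128.0/18"),
    ipaddress.ip_network("199.21.64.0/20"),
    ipaddress.ip_network("95.108.128.0/17"),
]


def in_known_range(ip):
    """Return the first listed network containing ip, or None if none match."""
    addr = ipaddress.ip_address(ip)
    for net in YANDEX_RANGES:
        if addr in net:
            return net
    return None


print(in_known_range("37.140.141.13"))  # the IP seen in the logs above
print(in_known_range("192.0.2.1"))      # documentation address, not in any range
```

A match only says the address sits inside one of these blocks; it says nothing about which bot sent the request.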

 

dstiles




 
Msg#: 4517886 posted 9:54 pm on Nov 11, 2012 (gmt 0)

Thanks for the new range, Lucy. Just ran a DNS check on the full range and the word "spider" occurs only in the range 37.140.141.1 - 37.140.141.36

Of these, 37.140.141.1 - 37.140.141.12 are basic "spider"; the others are "image-spider".

TypicalSurfer




 
Msg#: 4517886 posted 10:11 pm on Nov 11, 2012 (gmt 0)

Yandex has a nice index, it kinda reminds me of a search engine. I have seen a few referrals and noticed that dogpile is tossing them into their meta results. They do crawl a lot but I'll cut them slack.

keyplyr




 
Msg#: 4517886 posted 10:49 pm on Nov 11, 2012 (gmt 0)

I get a fair amount of traffic from Yandex. Their bot behaves and I have never had a problem.

"Yandex has a nice index, it kinda reminds me of a search engine."

Probably because they are.

lucy24




 
Msg#: 4517886 posted 5:04 am on Jan 31, 2013 (gmt 0)

:: bump ::

Here's another one:

37.9.64.0/18
possibly constrained to
37.9.84.0/22 (37.9.84-87)

Only met them twice so far.
exact IP: 37.9.84.253
full UA: Mozilla/5.0 (compatible; YandexFavicons/1.0; +http://yandex.com/bots)

Along with the favicon (first visit only), they picked up the front page both times.

No relation to 37.9.0.0/18 (main activity in the 37.9.50's, I think) which has been mentioned in a few threads hereabouts. But while cross-checking the UA I found this Yandex page [help.yandex.com] (in English) which I don't remember seeing before.
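
The range arithmetic in this post can be checked with Python's `ipaddress` module. This is only a sketch of the address math (what a /18 and a /22 cover, and whether the blocks overlap); it says nothing about who actually operates the addresses. The helper name `describe` is made up.

```python
import ipaddress

outer = ipaddress.ip_network("37.9.64.0/18")
inner = ipaddress.ip_network("37.9.84.0/22")
other = ipaddress.ip_network("37.9.0.0/18")


def describe(net):
    """Return (first address, last address, number of addresses) for a network."""
    return (str(net[0]), str(net[-1]), net.num_addresses)


print(describe(outer))   # the /18 spans 37.9.64.x through 37.9.127.x
print(describe(inner))   # the /22 spans 37.9.84.x through 37.9.87.x
print(inner.subnet_of(outer))                        # is the /22 inside the /18?
print(ipaddress.ip_address("37.9.84.253") in inner)  # the exact IP seen
print(outer.overlaps(other))                         # do the two /18 blocks share addresses?
```

A /22 is four consecutive /24 blocks, which is where the "37.9.84-87" shorthand comes from.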

"Yandex has a nice index, it kinda reminds me of a search engine."
"Probably because they are."

I thought he was being satirical ;)

keyplyr




 
Msg#: 4517886 posted 8:20 am on Jan 31, 2013 (gmt 0)


"There are many IP addresses that Yandex robots can originate from, and these IP addresses are subject to change. We are therefore unable to offer a list of IP addresses and we do not recommend using a filter based on IP addresses."

This is interesting.
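
If IP lists are discouraged, the usual alternative is a forward-confirmed reverse DNS check: look up the hostname for the visiting IP, check that it sits under a domain you trust, then resolve that hostname and confirm it maps back to the same IP. Below is a minimal Python sketch of that idea. The suffix list is an assumption and should be checked against Yandex's current documentation; the hostname in the demo is invented, and the demo uses stand-in resolvers so it runs without touching the network.

```python
import socket

# Assumption: suffixes treated as Yandex-owned. Verify against current Yandex docs.
TRUSTED_SUFFIXES = (".yandex.ru", ".yandex.net", ".yandex.com")


def reverse_lookup(ip):
    """PTR lookup through the system resolver; hostname or None on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(host):
    """Set of IPv4 addresses the hostname resolves to (empty on failure)."""
    try:
        return set(socket.gethostbyname_ex(host)[2])
    except OSError:
        return set()


def is_verified_bot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """True only if the PTR name is under a trusted suffix and resolves back to ip."""
    host = reverse(ip)
    if not host:
        return False
    host = host.rstrip(".").lower()
    if not host.endswith(TRUSTED_SUFFIXES):
        return False
    return ip in forward(host)


# Offline demo with stand-in resolvers; the hostname is made up.
fake_reverse = lambda ip: "spider-example.yandex.com"
fake_forward = lambda host: {"37.140.141.13"}
print(is_verified_bot("37.140.141.13", fake_reverse, fake_forward))
print(is_verified_bot("203.0.113.9", fake_reverse, fake_forward))
```

Calling `is_verified_bot(ip)` with no extra arguments uses the real system resolver. The forward step matters because anyone who controls reverse DNS for their own IP block can put any name they like in a PTR record.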

lucy24




 
Msg#: 4517886 posted 8:34 am on Jan 31, 2013 (gmt 0)

At least so far the IPs do seem to belong to Yandex. It isn't as hopeless as, say, the MJ12bot which can come from absolutely anywhere-- including a good many blocked server farms. And, conversely, you don't see a lot of Yandexbot spoofers. Fake yandsearch* in referers, yes. Fake UAs, not so much.


* I did some quickie experimenting. The fake referer isn't any different from a real one. Except that currently mine all end in &lr=213. I don't know what "lr" is, only that it doesn't correspond to "cd".
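
To pull parameters such as `lr` out of logged referers for comparison, the standard `urllib.parse` module is enough. A minimal sketch follows; the referer URL is invented for illustration (only the `yandsearch` path and `lr=213` come from this post), and the code makes no claim about what `lr` means.

```python
from urllib.parse import urlsplit, parse_qs


def referer_params(referer):
    """Split a referer into (hostname, path, dict of query parameters)."""
    parts = urlsplit(referer)
    return parts.hostname, parts.path, parse_qs(parts.query)


# Hypothetical referer string, made up for illustration.
host, path, params = referer_params("http://yandex.ru/yandsearch?text=example&lr=213")
print(host, path, params.get("lr"))
```

`parse_qs` returns a list per key because a parameter can appear more than once in a query string.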

bigtoga




 
Msg#: 4517886 posted 11:57 am on Jan 31, 2013 (gmt 0)

dstiles: "Just ran a DNS check on the full range and the word "spider" occurs only in the range 37.140.141.1 - 37.140.141.36"

Could you share how you did that? I've asked that question here before and never really gotten a good response.

dstiles




 
Msg#: 4517886 posted 10:31 pm on Jan 31, 2013 (gmt 0)

I use dig on Linux, run from a bash shell script. I run the checks against one of the relevant DNS servers (e.g. Yandex's for Yandex, Microsoft's for bingbot, etc.).

Slow but a) I'm in no hurry for the results and b) I worry that a fast DNS scan would get me blocked.

Not sure how to run this under windows but there is probably an equivalent to dig.
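
A rough cross-platform equivalent of the slow range scan described above, as a Python sketch. This is not the bash/dig script from this post: it uses the system resolver via `socket.gethostbyaddr` rather than querying a specific DNS server, and the names `ptr_name` and `scan_range` are made up. The demo uses a stand-in lookup table with invented hostnames so it runs offline.

```python
import ipaddress
import socket
import time


def ptr_name(ip):
    """Reverse (PTR) lookup through the system resolver; None on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def scan_range(first, last, keyword="spider", delay=1.0, lookup=ptr_name):
    """Yield (ip, hostname) for addresses first..last whose PTR name contains keyword."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    for n in range(start, end + 1):
        ip = str(ipaddress.IPv4Address(n))
        host = lookup(ip)
        if host and keyword in host.lower():
            yield ip, host
        time.sleep(delay)  # throttle so the scan stays slow and polite


# Offline demo: stand-in lookup table with invented hostnames.
fake_table = {
    "192.0.2.2": "spider-example.yandex.com",
    "192.0.2.3": "mail.example.net",
}
for ip, host in scan_range("192.0.2.1", "192.0.2.4", delay=0, lookup=fake_table.get):
    print(ip, host)

# Real use, one lookup per second against the range from this thread:
#   for ip, host in scan_range("37.140.141.1", "37.140.141.36"):
#       print(ip, host)
```

The `delay` parameter is there for the reason given above: a fast scan of someone's reverse DNS may get you blocked.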

Leosghost




 
Msg#: 4517886 posted 11:14 pm on Jan 31, 2013 (gmt 0)

"Not sure how to run this under windows but there is probably an equivalent to dig."


I was going to suggest "sam spade" ..(was a "multi tools" for win ) ..and then went for a look ..and "sam" is gone :( .."where have all the..."

So ..going to pour myself a "leapfrog"..and make sure there are no kids on my lawn..

dstiles




 
Msg#: 4517886 posted 8:30 pm on Feb 1, 2013 (gmt 0)

I used to use sam when I was using windows machines for everyday activities but gave up when various bits of it stopped working. At the time there was still a web version running but I haven't looked for several years. It's possible that NS-Batch might still be available and capable of doing this but I've only used that once (recently) for tracking a list of IPs and can't recall its full facilities.

lucy24




 
Msg#: 4517886 posted 9:43 pm on Feb 1, 2013 (gmt 0)

"At the time there was still a web version running"

Heh. I didn't know there was anything but a www version. It must have closed up shop eons ago; I even took it off my bookmarks. Currently dot com times out and dot org gives you the "It Works!" header which I think is generated by testing something-or-other but, er, I can't remember what. Cursory whois'ing says that dot org is the one we want :(.

SevenCubed




 
Msg#: 4517886 posted 9:57 pm on Feb 1, 2013 (gmt 0)

...the "It Works!" header...

It's the default message that an Apache server homepage displays once it's set up properly.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved