
Forum Moderators: Ocean10000 & incrediBILL & keyplyr


Yandex.ru now running yandex.com spider

Wanted to let others know they may want to update their black/white list.

     
7:09 am on Nov 23, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member jab_creations is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 26, 2004
posts: 3159
votes: 15


I noticed a higher-than-average number of rejects, so I went through the reject log for my site and found that a bot running from yandex.com was being denied. I did a little research (and even found Brett posting about yandex.com, though not specifically about the associated bot). It looks legit as far as I can tell, though someone mentioned that the IPs are in a block associated with some undesirables. The main point is that you may or may not want to update your black/white lists accordingly. I would not mind hearing about the associated undesirables here, though.

- John
9:25 pm on Dec 14, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13210
votes: 347


> I'm not getting Russian visitors, EU and US mostly.

Well, they may be immigrants or vacationers sticking with what's familiar ;) I haven't bothered to check where mine come from, but the queries are always in Russian.
11:25 pm on Dec 14, 2011 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:6519
votes: 114


> I'm not getting Russian visitors, EU and US mostly.
>
> Well, they may be immigrants or vacationers sticking with what's familiar ;) I haven't bothered to check where mine come from, but the queries are always in Russian.

All mine are wearing furry hats.
11:41 pm on Dec 14, 2011 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14650
votes: 94


I'm now seeing both .ru and .com bots, too lazy, er busy, to update whitelist yet ;)
10:25 pm on Dec 15, 2011 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3121
votes: 3


> All mine are wearing furry hats.

Ah. Northern part of USA and Canada, then. :)
8:35 am on Jan 12, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13210
votes: 347


Ooh, I've passed the Minimum Size Threshold. As of January 1, I too have started getting visits from the YandexBot at its US address, 199.21.99.nn. I noticed it while

:: insert boilerplate here ::

checked back and it really did start precisely on January 1. Well, maybe it was already the 2nd in their time zone; I'm on the west coast.

They are unequivocally the same robot. In addition to the identical UA and same behavior, the US one is drawing 304's from pages that it has never visited before from that IP.

At some point when I wasn't looking, the YandexBot started using what I'd always thought of as the imagebot's IP at 95.108. This has bumped the imagebot over to 178.154.243.83, an address I don't remember seeing before. But I must not have been paying attention; Yandex paid a couple of visits from 178.154 way back in May (thank you, Spotlight) and started using it sporadically in November.

What ever will they think of next? :)
2:25 pm on Jan 31, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


FWIW: the second of two Yandex bot 'sessions' within six hours, from two different hosts, completely ignored a blanket Disallow in robots.txt:

spider-199-21-99-95.yandex.com
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
19:46:47 /robots.txt [200]

sticker03.yandex.ru [93.158.147.8]
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
01:36:21 /robots.txt [200]
01:36:21 /robots.txt [200]
01:36:21 / [403]
01:36:22 / [403]
01:36:22 / [403]
01:36:22 / [403]
9:18 pm on Jan 31, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13210
votes: 347


Bummer. I've found them very well behaved ever since I let them back in a few months ago.

In fact, as long as we're here, I had a "D'oh" moment recently.

-- posts in assorted forums complaining about the ever-increasing clutter in g### SERPs
-- discussion of Yandex

1 + 1 =

Yup. An absolutely clean SERP. Nothing but results, as far as the eye can see. No big suspicious white spaces implying that my Ad Blocker is doing its stuff.
12:54 pm on Mar 12, 2012 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 7, 2003
posts:358
votes: 0


Saw this one yesterday.

It didn't bother to look at robots.txt.

From Palo Alto, CA:

100.43.83.136
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
9:40 pm on Mar 12, 2012 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3121
votes: 3


Thanks for the heads-up, mslina.

A reverse lookup across the range 100.43.64.0 - 100.43.95.255, looking for the word "spider" in the hostname, puts the current bot range at 100.43.83.129 - 100.43.83.161.
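
For anyone who would rather script that check than eyeball it, here is a rough Python sketch of a reverse lookup followed by a forward confirmation. It is only an illustration: the function name `is_yandex_spider`, the suffix list (taken from the hostnames posted in this thread) and the injectable `reverse`/`forward` resolvers are all assumptions, not anything Yandex publishes here.

```python
import socket

# Assumption: suffixes copied from hostnames seen in this thread.
YANDEX_SUFFIXES = (".yandex.com", ".yandex.ru")


def is_yandex_spider(ip, reverse=None, forward=None, word="spider"):
    """True if ip reverse-resolves to a Yandex host containing `word`
    and that hostname resolves forward to the same ip."""
    # Default resolvers use the standard library; they can be swapped
    # for stubs when no network is available.
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip).lower().rstrip(".")
    except OSError:  # no PTR record or lookup failure
        return False
    if word not in host or not host.endswith(YANDEX_SUFFIXES):
        return False
    try:
        # Forward confirmation: the claimed hostname must map back to ip.
        return ip in forward(host)
    except OSError:
        return False
```

The forward confirmation matters because anyone who controls the reverse DNS for their own addresses can make them claim any hostname; only the forward lookup is under the control of whoever runs the yandex.com zone.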
4:55 pm on June 8, 2012 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 29, 2003
posts: 126
votes: 0


I'm no expert, but Yandex did this today:
199.21.99.91 - - [08/Jun/2012:03:11:22 -0700] "GET /robots.txt HTTP/1.1" 200 26 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
199.21.99.91 - - [08/Jun/2012:03:11:22 -0700] "GET /robots.txt HTTP/1.1" 200 26 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
199.21.99.91 - - [08/Jun/2012:05:41:46 -0700] "GET / HTTP/1.1" 200 26386 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
199.21.99.91 - - [08/Jun/2012:07:54:44 -0700] "GET / HTTP/1.1" 200 26386 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"

Do my logs tell me that Yandex ignored my robots.txt?
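
One way to start on that question is to pull just the bot's requests out of the access log and look at the order of paths and status codes. The sketch below assumes the Apache combined log format shown above; the regex and the helper name `bot_requests` are illustrative only. Note that the log shows what was fetched, not what robots.txt said at the time.

```python
import re

# Assumption: lines follow the Apache "combined" format seen above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def bot_requests(lines, agent_token="YandexBot"):
    """Return (time, path, status) for each request whose UA contains agent_token."""
    hits = []
    for line in lines:
        m = LOG_RE.match(line)
        if m and agent_token in m.group("agent"):
            hits.append((m.group("time"), m.group("path"), int(m.group("status"))))
    return hits


# Two of the lines quoted in the post above.
SAMPLE = [
    '199.21.99.91 - - [08/Jun/2012:03:11:22 -0700] "GET /robots.txt HTTP/1.1" 200 26 "-" '
    '"Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"',
    '199.21.99.91 - - [08/Jun/2012:05:41:46 -0700] "GET / HTTP/1.1" 200 26386 "-" '
    '"Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"',
]

for when, path, status in bot_requests(SAMPLE):
    print(when, path, status)
```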

thanks
5:39 pm on June 8, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


That depends on what is in your robots.txt file.
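
To make that concrete, Python's standard `urllib.robotparser` can tell you what a given robots.txt permits for a given user agent. This is only a sketch of how a standards-following parser reads the file, not a statement about how Yandex's own parser behaves; the `User-agent: Yandex` group and the helper name `allowed` are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url_path, agent="YandexBot"):
    """Return True if robots_txt permits `agent` to fetch `url_path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url_path)


# Hypothetical file that shuts out one robot entirely.
BLOCK_YANDEX = """\
User-agent: Yandex
Disallow: /
"""

print(allowed(BLOCK_YANDEX, "/"))  # blocked for YandexBot
print(allowed("", "/"))            # an empty or missing file blocks nothing
```

If the file is empty or missing, every fetch is permitted, so a bot requesting pages after reading it has not ignored anything.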


Yandex always uses the exact same two IPs: one for the regular bot, one for Yandex Images.

For me, one in
178.154.243.nnn
and one in
77.88.30.nnn

Yes, quite consistent for
YandexBot/3.0
6:35 pm on June 8, 2012 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 29, 2003
posts: 126
votes: 0


Whoops! I forgot I had just rebuilt my site & accidentally deleted my robots.txt file! I've now denied their robot, so I'll wait & see if they obey.

Thanks for the reminder g1smd!
9:21 pm on June 8, 2012 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3121
votes: 3


I allow yandex - have done for some time - but their bot does seem to have one bug: it ignores some folder exclusions in robots.txt IF it finds, within one of the site's pages, a link to a file that lives there.

At least, that seems to be the case here.

Otherwise it's a good bot; better than some I could mention for SEs of much higher prominence. :(
1:10 am on June 9, 2012 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 29, 2003
posts: 126
votes: 0


Thanks for the note dstiles, I'll keep an eye on their behaviour. Don't really like a pounding by some bots. Cheers!
2:51 pm on Aug 20, 2012 (gmt 0)

New User

5+ Year Member

joined:Mar 28, 2010
posts:9
votes: 0


I block them by UA.

Today there was a new one from a known (by me at least) bad network.

184.82.128.0/18 Scranton NOC. I have never seen any legitimate traffic from that nest of evil. ;)
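
If you keep your blocklist as CIDR ranges like that one, testing an address against it takes a few lines with Python's standard `ipaddress` module. A minimal sketch follows; the list name `BLOCKED_NETS` and the helper `is_blocked` are made up for the example.

```python
import ipaddress

# The /18 mentioned above; add further ranges as needed.
BLOCKED_NETS = [ipaddress.ip_network("184.82.128.0/18")]


def is_blocked(ip, networks=BLOCKED_NETS):
    """Return True if ip falls inside any network on the blocklist."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)
```

A /18 covers 184.82.128.0 through 184.82.191.255, so anything in that span matches while neighbouring addresses do not.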
11:58 pm on Aug 20, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:6519
votes: 114




For the first time I'm seeing triple-digit daily human traffic coming from Yandex SERPs. A few of these users have German IPs, so it's not just Russian users who use their SE.