
Forum Moderators: Ocean10000 & incrediBILL

Yandex.ru now running yandex.com spider

Wanted to let others know they may want to update their black/white list.

   
7:09 am on Nov 23, 2011 (gmt 0)

WebmasterWorld Senior Member jab_creations is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I noticed a higher-than-average number of rejects and checked my glorious reject log, where I found that a bot running from yandex.com was being denied. I did a little research (and even found Brett posting about yandex.com, though not specifically about the associated bot). It looks legit as far as I can tell, though someone mentioned that the IPs are in a block associated with some undesirables. The main point is that you may or may not want to update your black/white lists accordingly. I would not mind hearing about the associated undesirables here, though.

- John
9:25 pm on Dec 14, 2011 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



> I'm not getting Russian visitors, EU and US mostly.

Well, they may be immigrants or vacationers sticking with what's familiar ;) I haven't bothered to check where mine come from, but the queries are always in Russian.
11:25 pm on Dec 14, 2011 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



> I'm not getting Russian visitors, EU and US mostly.
>
> Well, they may be immigrants or vacationers sticking with what's familiar ;) I haven't bothered to check where mine come from, but the queries are always in Russian.

All mine are wearing furry hats.
11:41 pm on Dec 14, 2011 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I'm now seeing both .ru and .com bots, too lazy, er busy, to update whitelist yet ;)
10:25 pm on Dec 15, 2011 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



> All mine are wearing furry hats.

Ah. Northern part of USA and Canada, then. :)
8:35 am on Jan 12, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Ooh, I've passed the Minimum Size Threshold. As of January 1, I too have started getting visits from the YandexBot at its US address, 199.21.99.nn. I noticed it while

:: insert boilerplate here ::

checked back and it really did start precisely on January 1. Well, maybe it was already the 2nd in their time zone; I'm on the west coast.

They are unequivocally the same robot. In addition to the identical UA and same behavior, the US one is drawing 304's from pages that it has never visited before from that IP.

At some point when I wasn't looking, the YandexBot started using what I'd always thought of as the imagebot's IP at 95.108. This has bumped the imagebot over to 178.154.243.83, an address I don't remember seeing before. But I must not have been paying attention; Yandex paid a couple of visits from 178.154 way back in May (thank you, Spotlight) and started using it sporadically in November.

What ever will they think of next? :)
2:25 pm on Jan 31, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



FWIW: The second of two Yandex bot 'sessions' in six hours from two Hosts totally ignored a total Disallow in robots.txt:

spider-199-21-99-95.yandex.com
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
19:46:47 /robots.txt [200]

sticker03.yandex.ru [93.158.147.8]
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
01:36:21 /robots.txt [200]
01:36:21 /robots.txt [200]
01:36:21 / [403]
01:36:22 / [403]
01:36:22 / [403]
01:36:22 / [403]
9:18 pm on Jan 31, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Bummer. I've found them very well behaved ever since I let them back in a few months ago.

In fact, as long as we're here, I had a "D'oh" moment recently.

-- posts in assorted forums complaining about the ever-increasing clutter in g### SERPs
-- discussion of Yandex

1 + 1 =

Yup. An absolutely clean SERP. Nothing but results, as far as the eye can see. No big suspicious white spaces implying that my Ad Blocker is doing its stuff.
12:54 pm on Mar 12, 2012 (gmt 0)

10+ Year Member



Saw this one yesterday.

Didn't bother to look at robots.txt

From Palo Alto, CA:

100.43.83.136
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
9:40 pm on Mar 12, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Thanks for the heads-up, mslina.

Doing reverse lookups across the range 100.43.64.0 - 100.43.95.255 and keeping only hostnames that contain the word "spider", the bot range is currently 100.43.83.129 - 100.43.83.161
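
For anyone who wants to script that kind of check, here is a minimal sketch of a forward-confirmed reverse lookup. The function name `is_yandex_host` and the suffix list are my own assumptions, based only on the hostnames quoted in this thread (yandex.com and yandex.ru); adjust them to whatever you actually see in your logs. The resolver functions are passed in as parameters so the logic can be tried without touching DNS.

```python
import socket

# Hostname suffixes treated as Yandex-owned. This list is an assumption
# drawn from the hostnames quoted in this thread, not an official list.
YANDEX_SUFFIXES = (".yandex.com", ".yandex.ru")


def is_yandex_host(ip, reverse=socket.gethostbyaddr,
                   forward=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a Yandex-looking hostname
    and that hostname resolves forward to the same ip."""
    try:
        # gethostbyaddr returns (hostname, aliases, addresses)
        hostname = reverse(ip)[0].rstrip(".").lower()
    except OSError:
        return False  # no PTR record or lookup failure
    if not hostname.endswith(YANDEX_SUFFIXES):
        return False
    try:
        # gethostbyname_ex returns (hostname, aliases, addresses)
        addresses = forward(hostname)[2]
    except OSError:
        return False
    # The forward lookup must list the original address, otherwise the
    # PTR record could have been set by anyone.
    return ip in addresses
```

Called as `is_yandex_host("100.43.83.136")` it would do live DNS lookups, so the answer depends on whatever the records say at the time.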
4:55 pm on Jun 8, 2012 (gmt 0)

10+ Year Member



Yandex did this (I'm no expert) today.
199.21.99.91 - - [08/Jun/2012:03:11:22 -0700] "GET /robots.txt HTTP/1.1" 200 26 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
199.21.99.91 - - [08/Jun/2012:03:11:22 -0700] "GET /robots.txt HTTP/1.1" 200 26 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
199.21.99.91 - - [08/Jun/2012:05:41:46 -0700] "GET / HTTP/1.1" 200 26386 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
199.21.99.91 - - [08/Jun/2012:07:54:44 -0700] "GET / HTTP/1.1" 200 26386 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"

Do my logs tell me that Yandex ignored my robots.txt?

thanks
5:39 pm on Jun 8, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



That depends on what is in your robots.txt file.
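
To make that concrete, here is a rough sketch using Python's standard `urllib.robotparser` to ask whether a given robots.txt would let the bot fetch a path. The helper name `yandex_allowed` and the two sample files are hypothetical, not taken from anyone's site, and Python's parser may not interpret every rule exactly the way Yandex does.

```python
from urllib.robotparser import RobotFileParser


def yandex_allowed(robots_txt, path, agent="YandexBot"):
    """Return True if the robots.txt text lets `agent` fetch `path`,
    according to Python's own robots.txt parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Hypothetical file contents for illustration only.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_FOLDER = "User-agent: Yandex\nDisallow: /private/\n"

print(yandex_allowed(BLOCK_ALL, "/"))
print(yandex_allowed(BLOCK_FOLDER, "/index.html"))
```

If the file is empty or has no rule that applies, the parser treats every path as allowed, so a "GET /" after fetching such a file is not a violation.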


Yandex always uses the exact same two IPs: one for the regular bot, one for Yandex Images.

For me, one in
178.154.243.nnn
and one in
77.88.30.nnn

Yes, quite consistent for
YandexBot/3.0
6:35 pm on Jun 8, 2012 (gmt 0)

10+ Year Member



whoops! I forgot I had just rebuilt my site & accidentally deleted my robots.txt file! I've now denied their robot, so I'll wait & see if they obey.

Thanks for the reminder g1smd!
9:21 pm on Jun 8, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



I allow yandex - have done for some time - but their bot does seem to have one bug: it ignores some folder exclusions in robots.txt IF it finds, within one of the site's pages, a link to a file that lives there.

At least, that seems to be the case here.

Otherwise it's a good bot; better than some I could mention for SEs of much higher prominence. :(
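
One way to check for that behaviour in your own logs is sketched below. It assumes the combined log format shown earlier in the thread; the function name `disallowed_hits` and the folder prefixes you pass in are placeholders for whatever your robots.txt actually excludes.

```python
import re

# Matches the combined log format seen in the lines quoted in this thread.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def disallowed_hits(lines, prefixes, agent_token="YandexBot"):
    """Return (time, path, status) for each request whose user agent
    contains agent_token and whose path starts with an excluded prefix."""
    prefixes = tuple(prefixes)
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines in some other format
        if agent_token in m["agent"] and m["path"].startswith(prefixes):
            hits.append((m["time"], m["path"], m["status"]))
    return hits
```

Feed it the lines of an access log plus something like `["/private/"]`; anything it returns is a request the bot made inside a folder you meant to exclude.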
1:10 am on Jun 9, 2012 (gmt 0)

10+ Year Member



Thanks for the note dstiles, I'll keep an eye on their behaviour. Don't really like a pounding by some bots. Cheers!
2:51 pm on Aug 20, 2012 (gmt 0)

5+ Year Member



I block them by UA.

Today there was a new one from a known (by me at least) bad network.

184.82.128.0/18 Scranton NOC. I have never seen any legitimate traffic from that nest of evil. ;)
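
If you keep your lists as CIDR blocks like that one, a quick membership test can be sketched with Python's standard `ipaddress` module. The list name `BLOCKED` and the helper `in_any` are made up for the example; whether any given range belongs on a block list is your own call.

```python
import ipaddress

# The range quoted above. Treating it as blocked is this example's
# assumption, not a recommendation.
BLOCKED = [ipaddress.ip_network("184.82.128.0/18")]


def in_any(ip, networks):
    """Return True if ip falls inside at least one of the networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


print(in_any("184.82.130.5", BLOCKED))
print(in_any("184.82.192.1", BLOCKED))
```

A /18 starting at 184.82.128.0 runs through 184.82.191.255, so the first address is inside the block and the second is just past the end of it.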
11:58 pm on Aug 20, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month





For the first time I'm seeing triple-digit daily human traffic coming from Yandex SERPs. A few of these users have German IPs, so it's not just Russian users who use their SE.
This 46 message thread spans 2 pages.
 
