Yandex (redux)

Pfui

2:30 am on Oct 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



htest0n.yandex.ru
Yandex/1.01.001 (compatible; Win16; H)

robots.txt? Yes, BUT it was then immediately, rapidly and repeatedly ignored:

18:31:17 /robots.txt
18:31:18 /robots.txt
18:31:19 /
18:31:20 /
18:31:21 /
18:31:22 /

(If the word "test" in the subdomain is any indication, this one flunked BIG time.)

Prior posts here [webmasterworld.com], from a few months ago. UA versions all over the map.

FWIW: With the exception of robots.txt access, I've 403'd any Host or UA containing the word "yandex" for years.
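
For anyone wanting to try the same rule, here is a minimal sketch in Python of the decision described above. The function name and arguments are hypothetical, "host" is assumed to be the reverse-DNS name of the client, and a real setup would more likely live in the web server's own configuration.

```python
def status_for(host: str, user_agent: str, path: str) -> int:
    """Return the HTTP status this sketch would send for a request."""
    # Assumption: robots.txt is always served, so any crawler can read the rules.
    if path == "/robots.txt":
        return 200
    # Case-insensitive match on either the remote host name or the UA string.
    if "yandex" in host.lower() or "yandex" in user_agent.lower():
        return 403
    # Everything else is served normally in this sketch.
    return 200
```

With the host and UA from this post, a request for /robots.txt would get 200 and a request for / would get 403.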

GaryK

3:02 pm on Oct 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That's too bad, Pfui. My experiences with Yandex have been much better than yours. I wonder if it has something to do with several sites I manage getting lots of traffic from Yandex? It always reads and respects robots.txt.

keyplyr

6:42 pm on Oct 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



No problems ever with authentic Yandex; reads/respects robots.txt and delivers a little traffic. There may be spoofs out there doing the dirty deeds.

Pfui

9:51 pm on Oct 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If a spoof, then yandex.ru is spoofing itself.
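
One common way to separate a spoofed UA from a visit that really comes from a .yandex.ru host is a forward-confirmed reverse DNS check: resolve the IP to a name, check the domain, then resolve that name back and confirm it returns the same IP. The sketch below is only an illustration. The function name is hypothetical, the resolvers are passed in so the logic can be tried without a network, and the defaults use the standard library's socket.gethostbyaddr and socket.gethostbyname_ex.

```python
import socket

def is_forward_confirmed(ip, suffix=".yandex.ru", reverse=None, forward=None):
    """Return True if ip reverse-resolves under suffix and that name maps back to ip."""
    # Default resolvers use the standard library; tests can inject fakes instead.
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda name: socket.gethostbyname_ex(name)[2])
    try:
        host = reverse(ip)
        # Step 1: the reverse name must sit under the expected domain.
        if not host.lower().endswith(suffix):
            return False
        # Step 2: the forward lookup of that name must include the original IP.
        return ip in forward(host)
    except OSError:
        # Lookup failures (socket.herror, socket.gaierror) count as unconfirmed.
        return False
```

A UA string alone proves nothing, since anyone can send it; a host name that passes both lookups is much harder to fake.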

GaryK

1:50 am on Oct 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I wonder if Yandex offers any of the same kinds of proxy-type services that Google does?

Pfui

5:41 am on Oct 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I don't believe any of Google's proxy services run Googlebot per se.

As far as Yandex running Yandex is concerned, here's another visit from within the past hour. This time it was a single hit, to robots.txt, and thus robots.txt-compliant:

spider6n.yandex.ru
Yandex/1.01.001 (compatible; Win16; I)

robots.txt? Yes

Note the non-"test" subdomain, plus the one-letter-different UA compared to that used in the OP:

Yandex/1.01.001 (compatible; Win16; H)
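
To make the one-letter difference easy to spot across many log entries, the two UA strings can be split into fields and compared. This is a rough sketch: the field names (version, platform, variant) are my own labels for the parts of the string, not anything Yandex documents, and the pattern only covers the layout seen in this thread.

```python
import re

# Assumed layout: Yandex/<version> (compatible; <platform>; <variant>)
UA_PATTERN = re.compile(
    r"^Yandex/(?P<version>[\d.]+) \(compatible; (?P<platform>\w+); (?P<variant>\w+)\)$"
)

def split_ua(ua):
    """Return the UA's fields as a dict, or None if it does not fit the assumed layout."""
    match = UA_PATTERN.match(ua)
    return match.groupdict() if match else None

def differing_fields(ua_a, ua_b):
    """Return {field: (value_a, value_b)} for every field where the two UAs differ."""
    a, b = split_ua(ua_a), split_ua(ua_b)
    if a is None or b is None:
        raise ValueError("UA does not match the assumed Yandex layout")
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}
```

Comparing the two UAs quoted in this thread reports a difference only in the last field.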

I realize you guys want to, and do, trust Yandex from .yandex.ru. Cool. My experience with same simply leads me to the opposite conclusion.

GaryK

8:22 pm on Oct 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What I was suggesting was that perhaps someone other than Yandex was using a Yandex proxy, with a Yandex UA, to crawl sites in hopes of slipping past webmasters less vigilant than our little group.

Pfui

4:19 am on Nov 10, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Deja vu all over again. Same server, UA, even hit pattern:

htest0n.yandex.ru
Yandex/1.01.001 (compatible; Win16; H)

[Time-URI-Status Code]
20:02:42 /robots.txt 200
20:02:43 /robots.txt 200
20:02:44 / 403
20:02:45 / 403
20:02:46 / 403
20:02:47 / 403
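
A hit pattern like this can be pulled out of a log automatically. The sketch below assumes the simple "time URI status" layout shown above, with hypothetical function names, and lists every request made after the first robots.txt fetch. Whether those requests actually break the rules depends on what the robots.txt says, which the script does not check.

```python
from datetime import datetime

LOG = """\
20:02:42 /robots.txt 200
20:02:43 /robots.txt 200
20:02:44 / 403
20:02:45 / 403
20:02:46 / 403
20:02:47 / 403
"""

def parse(line):
    """Split one 'HH:MM:SS URI status' line into (time, uri, status)."""
    time_s, uri, status = line.split()
    return datetime.strptime(time_s, "%H:%M:%S"), uri, int(status)

def hits_after_robots(lines):
    """Return entries for non-robots.txt URIs requested after the first robots.txt fetch."""
    seen_robots = False
    later = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        entry = parse(line)
        if entry[1] == "/robots.txt":
            seen_robots = True
        elif seen_robots:
            later.append(entry)
    return later

entries = hits_after_robots(LOG.splitlines())
# Seconds between the first and last follow-up request.
span = (entries[-1][0] - entries[0][0]).total_seconds()
```

For the visit above, entries holds the four requests for / and span covers the time between the first and the last of them.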