
this time it's Sogou

61.135 returns

     
11:32 pm on Dec 9, 2013 (gmt 0)

lucy24 (WebmasterWorld Senior Member, US; joined Apr 9, 2011)


October 2010 [webmasterworld.com] (Soso)
August 2011 [webmasterworld.com] (Yodao)
July 2012 [webmasterworld.com] (Baidu)
September 2012 [webmasterworld.com] (thread next door in Apache)

The current incarnation looks like this (spacing as shown):*

61.135.189.106 - - [23/Sep/2013:09:50:10 -0700] "GET /robots.txt HTTP/1.1" 200 1014 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)" 
61.135.189.106 - - [23/Sep/2013:09:50:11 -0700] "GET /ebooks/paston/paston6b.html HTTP/1.1" 403 2963 "-" "New-Sogou-Spider/1.0 (compatible; MSIE 5.5; Windows 98)"
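For anyone who wants to spot this kind of UA switch without eyeballing logs, here is a rough Python sketch. It assumes the Apache "combined" log format shown above; the regex and the function names (`agents_by_ip`, `ua_switchers`) are made up for illustration and are not from any existing tool.

```python
import re
from collections import defaultdict

# Assumed layout: Apache "combined" format,
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)

def agents_by_ip(lines):
    """Map each client IP to the distinct UA strings it sent, in first-seen order."""
    seen = defaultdict(list)
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that are not in the assumed format
        if m.group("ua") not in seen[m.group("ip")]:
            seen[m.group("ip")].append(m.group("ua"))
    return dict(seen)

def ua_switchers(lines):
    """Keep only the IPs that presented more than one UA string."""
    return {ip: uas for ip, uas in agents_by_ip(lines).items() if len(uas) > 1}

# The two log lines quoted in this post.
SAMPLE = [
    '61.135.189.106 - - [23/Sep/2013:09:50:10 -0700] "GET /robots.txt HTTP/1.1" 200 1014 "-" '
    '"Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"',
    '61.135.189.106 - - [23/Sep/2013:09:50:11 -0700] "GET /ebooks/paston/paston6b.html HTTP/1.1" 403 2963 "-" '
    '"New-Sogou-Spider/1.0 (compatible; MSIE 5.5; Windows 98)"',
]

if __name__ == "__main__":
    for ip, uas in ua_switchers(SAMPLE).items():
        print(ip, uas)
```

Fed a real access log (one line per list item), `ua_switchers` would also flag ordinary shared IPs with several human visitors, so treat the result as a list of candidates and not a verdict.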

What is it with Chinese robots anyway? They always seem to put on UA strings that would get them blocked even from a previously unknown IP.

Personal hunch: the idea is to lull servers into complacency by first asking for robots.txt. It isn't very determined though; it goes away after one or two 403s. (If anyone has been asleep for the last five years, the uber-range is 61.128.0.0/10. If only Ukrainian robots lived in such nice fat /10 blocks!)
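To check whether a given address falls inside that /10, Python's standard `ipaddress` module does the arithmetic. This is only a quick sketch; the names `UBER_RANGE` and `in_uber_range` are invented here.

```python
import ipaddress

# The range named in this post.
UBER_RANGE = ipaddress.ip_network("61.128.0.0/10")

def in_uber_range(ip):
    """Return True if the dotted-quad string falls inside 61.128.0.0/10."""
    return ipaddress.ip_address(ip) in UBER_RANGE

if __name__ == "__main__":
    # First and last address of the block, then the two IPs from this thread.
    print(UBER_RANGE[0], UBER_RANGE[-1])
    for ip in ("61.135.189.106", "220.181.125.155"):
        print(ip, in_uber_range(ip))
```

The block runs from 61.128.0.0 through 61.191.255.255, so 61.135.189.106 is inside it and 220.181.125.155 is not.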

Cursory log search tells me they also show up at 220.181.125.155** with the same behavior pattern except that they don't change UAs after getting robots.txt. I don't know if either one is legit; free lookup is uninformative on both.
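One generic way to test whether a crawler IP is what it claims to be is a forward-confirmed reverse DNS check: look up the PTR name, make sure it ends in a domain you trust, then resolve that name forward and see whether the original IP comes back. A sketch follows, with loud caveats: the function name is invented, and the suffix `.sogou.com` in the usage lines is purely an assumption on my part, not something confirmed by Sogou's documentation or by this thread.

```python
import socket

def forward_confirmed(ip, allowed_suffixes, reverse=None, forward=None):
    """Forward-confirmed reverse DNS check.

    reverse/forward are injectable lookups (handy for testing offline);
    by default they use the system resolver via the socket module.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record, or the lookup failed
    # Suffixes should start with a dot so "evilsogou.com" cannot match ".sogou.com".
    if not any(host.endswith(suffix) for suffix in allowed_suffixes):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False  # PTR name does not resolve forward

if __name__ == "__main__":
    # ASSUMPTION: ".sogou.com" is a guess at the crawler's domain; substitute
    # whatever the operator actually documents.
    for ip in ("61.135.189.106", "220.181.125.155"):
        print(ip, forward_confirmed(ip, [".sogou.com"]))
```

No expected output is given because the answer depends entirely on what DNS returns when you run it.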


* The referenced page is in Chinese except for the recurring phrases "sogou spider" and "robots.txt". Rumor has it they're compliant, but who gives a ###.
** Not to be confused with 220.181.108.78, which sometimes claims to be Baidu.
 
