Forum Moderators: open
OutfoxBot
For internet experiments outfoxbot. ... OutfoxBot For internet experiments. GaryK #:401697, 2:39 pm on Nov. 13, 2005 (utc 0). User Agent: OutfoxBot/0.3 (For internet experiments; http://; outfox.agent@gmail.com). IP Address: 220.181.8.* ...
[webmasterworld.com...]
OutfoxBot/0.3
outfoxbot/0.3. ... 220.181.8.102 - - [08/Nov/2005:15:16:56 -0700] "GET /robots.txt HTTP/1.0" 403 292 "-" "OutfoxBot/0.3 (For internet experiments; ... Agent: OutfoxBot/0.3 (For internet experiments; http://; outfox.agent@gmail.com) ...
[webmasterworld.com...]
What is this? OutfoxBot
I found this agent crawling around my site what is this? outfoxbot.
[webmasterworld.com...]
OutfoxBot
Disobeys robots.txt outfoxbot. ... 220.181.8.121 - - [25/Jun/2006:00:02:19 -0400] "GET /robots.txt HTTP/1.1" 200 2966 "-" "OutfoxBot/0.1 (For internet experiments; http://www.outfox.com; outfoxbot@gmail.com)" ...
[webmasterworld.com...]
OutfoxBot /0.1?
Anyone know who what where? outfoxbot /0.1? ... Referer: - Agent: OutfoxBot/0.1 (For internet experiments; http://www.outfox.com; outfoxbot@gmail.com). / Http Code: 200 Date: Apr 28 15:24:59 Http Version: HTTP/1.1 Size in Bytes: 31663 ...
[webmasterworld.com...]
The IP comes from Beijing, China, and the UA info gives nothing else except a Gmail address: outfox@gmail.com.
Last weekend (15th Dec) a new Chinese search engine was released, Yodao.com, and OutfoxBot has been renamed YodaoBot.
I tried my crawler-identification query, site:example.com crawledby, and found that the crawler is indeed OutfoxBot:
phpMan: Unix Man page/ Perldoc / Info page Web Interface
On Apache/1.3.37 (Unix) mod_perl/1.29 mod_gzip/1.3.26.1a PHP/4.4.4 Under GNU General Public License 2006-12-14 04:53 @60.191.80.39 CrawledBy OutfoxBot/0.5 (for internet experiments; http://; outfoxbot@gmail.com)
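For anyone who wants to pull these UA strings apart in their own logs, here is a minimal sketch. It assumes the "Name/version (purpose; url; contact)" layout seen in the OutfoxBot strings quoted above; the names UA_PATTERN, BotUA and parse_bot_ua are made up for this example and do not come from any tool mentioned in the thread.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, based only on the UA strings quoted in this thread:
#   Name/version (purpose; url; contact)
UA_PATTERN = re.compile(
    r"^(?P<name>[A-Za-z]+)/(?P<version>[\d.]+)\s*"
    r"\((?P<purpose>[^;]*);\s*(?P<url>[^;]*);\s*(?P<contact>[^)]*)\)$"
)


class BotUA(NamedTuple):
    name: str
    version: str
    purpose: str
    url: str
    contact: str


def parse_bot_ua(user_agent: str) -> Optional[BotUA]:
    """Split a bot UA into its fields, or return None if the layout differs."""
    match = UA_PATTERN.match(user_agent.strip())
    if match is None:
        return None
    # groups() yields the fields in the same order as the BotUA definition.
    return BotUA(*(field.strip() for field in match.groups()))


if __name__ == "__main__":
    ua = "OutfoxBot/0.5 (for internet experiments; http://; outfoxbot@gmail.com)"
    print(parse_bot_ua(ua))
```

A UA that does not follow this layout (an ordinary browser string, for instance) simply returns None, so the function is only useful for bots that announce themselves this way.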
Here is the new yodaobot detection for AWStats, along with updates for other Chinese browsers and spiders:
diff -r1.44 robots.pm
100d99
< # added TencentTraveler
180,181d178
< # added sogou spider http://corp.sohu.com/20051130/n240842344.shtml
< # added sogou test http://corp.sohu.com/20051130/n240842344.shtml
351a349
> 'lilina',
462a461
> 'gougou',
472a472,474
> 'iaskspider',
> 'hl_ftien_spider',
> 'sogou',
835d836
< 'tencenttraveler', # Must be before msiecrawler
863c864
< 'outfoxbot',
---
> 'yodaobot',
899,900d899
< 'sogou\sspider',
< 'sogou\stest',
973a973
> 'zhuaxia',
1006a1007
> 'lilina','Lilina',
1115a1117
> 'gougou','GouGou',
1125a1128,1130
> 'iaskspider','<a href="http://www.iask.com/" target="_blank">Sina Iask Spider</a>',
> 'hl_ftien_spider','<a href="http://www.hylanda.com/" target="_blank">Hylanda</a>',
> 'sogou','<a href="http://www.sogou.com/" target="_blank">Sogou Spider</a>',
1463d1467
< 'tencenttraveler','TencentTraveler', # Must be before msiecrawler.
1491c1495
< 'outfoxbot','<a href="mailto:outfox.agent@gmail.com?subject=Outfox Bot Information" title="Bot e-mail.">OutfoxBot</a>',
---
> 'yodaobot','<a href="http://www.yodao.com/help/webmaster/spider/" title="Bot e-mail.">OutfoxBot/YodaoBot</a>',
1527,1528d1530
< 'sogou\sspider','<a href="http://corp.sohu.com/20051130/n240842344.shtml" title="Bot home page [new window]" target="_blank">sogou spider</a>',
< 'sogou\stest','<a href="http://corp.sohu.com/20051130/n240842344.shtml" title="Bot home page [new window]" target="_blank">sogou test</a>',
1601a1604
> 'zhuaxia','<a href="http://www.zhuaxia.com/" target="_blank">ZhuaXia</a>',
Che Dong
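The diff above pairs an ordered list of lowercase patterns with a table of display labels. A rough Python sketch of that lookup idea follows; it is not the AWStats code itself, the names ROBOT_PATTERNS, ROBOT_LABELS and identify_robot are invented for illustration, and keeping both 'outfoxbot' and 'yodaobot' in the list is my own assumption so that older log lines still match.

```python
import re
from typing import Optional

# Order matters: the first pattern that matches wins.
# Patterns and labels are taken from the diff; keeping 'outfoxbot'
# alongside 'yodaobot' is an assumption made for this sketch.
ROBOT_PATTERNS = [
    "iaskspider",
    "hl_ftien_spider",
    "sogou",
    "yodaobot",
    "outfoxbot",
]

ROBOT_LABELS = {
    "iaskspider": "Sina Iask Spider",
    "hl_ftien_spider": "Hylanda",
    "sogou": "Sogou Spider",
    "yodaobot": "OutfoxBot/YodaoBot",
    "outfoxbot": "OutfoxBot/YodaoBot",
}


def identify_robot(user_agent: str) -> Optional[str]:
    """Return the label of the first pattern found in the UA, else None."""
    ua = user_agent.lower()
    for pattern in ROBOT_PATTERNS:
        if re.search(pattern, ua):
            return ROBOT_LABELS[pattern]
    return None


if __name__ == "__main__":
    print(identify_robot(
        "OutfoxBot/0.5 (for internet experiments; http://; outfoxbot@gmail.com)"
    ))
```

Because matching is by substring on the lowercased UA, this only catches bots that announce themselves honestly.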
[edited by: encyclo at 2:48 am (utc) on Dec. 25, 2006]
[edit reason] examplified, fixed formatting [/edit]
[edited by: Leosghost at 11:50 pm (utc) on Dec. 24, 2006]
the home pages are different in their raison d'etre ..
yodao.com ( the bot home )..is interesting ..and doesn't appear to be running as a scraper ..just omnivore ..like the chinese people it eats everything but the squeal when it sits down to dinner on a site ..knows no bounds :)
I just ran some searches on it in English ..I am pleased ;-) I do very well in its index there ..:)..( with pages that shouldn't do very well anywhere ..and usually don't ..they are the ones that I have been too lazy to change for a few years now ) ..don't expect it to bring me many clients though :)
Its algo looks to be very basic ..sort of AV just before the death ..all on-page ..it's susceptible to keyword stuffing, bless it ..:)
Doesn't appear to be wholesale ripping ..mainly taking home pages and then random internal stuff ..from what I saw of mine and some others that I know ..most of its own index appears to be taken for now from inside the "wall" ( some of my stuff I know is linked to from inside ) ..would probably have to ask in the Asian area here on WebmasterWorld to get any real hard info on it, as its "about" section is a blog .
However ..
Just ran some other searches on it for things that would normally only be found in areas of a site that are off limits to bots ..its cache holds some very interesting things for the less well intentioned to exploit ..on the basis of a quick look at what its pointy lil nose does get into and then hangs on the washing line of "cache" ..
I think most people would want to block it entirely ..which might be challenging as it appears to run in various disguises including browsers and Gbots etc ..and mobile devices ..
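Since the UA alone cannot be trusted if the bot really does run in disguise, one rough way to spot it is to compare the declared UA with the source address. The sketch below assumes the 220.181.8.* range reported in the earlier threads is still in use (the thread does not confirm that it covers all of the bot's traffic); classify_request and the category names are hypothetical.

```python
import ipaddress

# Assumption: the 220.181.8.* range quoted in the earlier threads.
KNOWN_BOT_NETWORK = ipaddress.ip_network("220.181.8.0/24")
BOT_TOKENS = ("outfoxbot", "yodaobot")


def classify_request(remote_addr: str, user_agent: str) -> str:
    """Label a request as 'declared', 'suspected-disguise' or 'other'."""
    ua = user_agent.lower()
    declared = any(token in ua for token in BOT_TOKENS)
    try:
        in_range = ipaddress.ip_address(remote_addr) in KNOWN_BOT_NETWORK
    except ValueError:
        # Malformed address in the log line: treat it as outside the range.
        in_range = False
    if declared:
        return "declared"
    if in_range:
        # Comes from the bot's range but claims to be something else.
        return "suspected-disguise"
    return "other"


if __name__ == "__main__":
    print(classify_request("220.181.8.121", "Mozilla/4.0 (compatible; MSIE 6.0)"))
```

This only flags candidates for a closer look in the logs; a real visitor sharing that range would be flagged too.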
Another case of: if you don't expect customers, block the PRC entirely ..?
Its rise is meteoric ..but it is very indiscreet ..
[edited by: Leosghost at 12:43 am (utc) on Dec. 25, 2006]
Another case of: if you don't expect customers, block the PRC entirely ..?
EDIT: Hi Mokita. Merry Christmas. Sadly I could barely make sense of what the OP was trying to convey to us. All I picked up on was the sub-title about OutfoxBot becoming Yodaobot.
[edited by: GaryK at 1:15 am (utc) on Dec. 25, 2006]
and apparently there is indeed a link between it and outfox ..the yodao.com page asks to set a cookie "OUTFOX_SEARCH_USER_ID" ..duration until 2036 ..with a "91.164.217.249" return ..
keyplyr and incrediBILL tagged some outfox IPs here [webmasterworld.com...]
in one of the threads mentioned by the OP .. earlier this year ..
Again, input from someone in the Asian forums would be enlightening maybe ..Bill ( when you've finished with the turkey ..like to comment? ) ..or any other regulars from Asian search ..
I wondered that too Mokita ..figured admins would get to it after the mince pies :)
[edited by: Leosghost at 1:21 am (utc) on Dec. 25, 2006]
Merry Christmas! I thought the OP was understandable - the whole essence is in the title. But he muddied it by including too much pasted info from previous threads and the many lines for updating Awstats' detection of Chinese browsers and spiders.
I blocked large swathes of PRC IPs quite some time ago. They only provide scrapers, harvesters and spambots. Also, I have Taiwan, Korea and to a lesser extent Japan on a similar footing. None of our product sites ship goods to Asia and our information sites would be of no interest to 99.9% of them either.
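Blocking by address range boils down to a membership test against a list of networks. Here is a minimal sketch using Python's standard ipaddress module; the only range listed is the 220.181.8.* block quoted earlier in this thread, written as a /24, and BLOCKED_NETWORKS and is_blocked are names made up for the example. A real deny list would be much longer and would normally live in the server or firewall config instead.

```python
import ipaddress

# Illustrative deny list: only the range quoted in this thread.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("220.181.8.0/24"),
]


def is_blocked(remote_addr: str) -> bool:
    """Return True if the address falls inside any blocked network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a valid IP address; nothing to match against.
        return False
    return any(addr in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("220.181.8.102", "220.181.9.1"):
        print(ip, is_blocked(ip))
```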
Leosghost wrote:
...figured admins would get to it after the mince pies
Dan has been eating mince pies since 19th Dec? WOW! He'll need to do some serious dieting and exercise when he gets back :O :D
[edited by: Mokita at 2:04 am (utc) on Dec. 25, 2006]