Any info anyone?
I don't recognise the results - is it a spidering engine or directory?
Openfind data gatherer, Openbot/3.0+(robot-response@openfind.com.tw;+http://www.openfind.com.tw/robot.html)
There's some basic info at [openfind.com.tw...]
Yep, that's them all right. I had to ban them a few months ago because they wouldn't stop. I think they have one of the most abusive spiders out there. They were generating log files 50-60 MB in size. I never even submitted to them, so I think they just follow every link they find.
If you have a small site there is nothing to worry about.
Voted best Chinese website by various authorities, SINA.com is now partnering with Taiwan's leading search engine service provider Openfind to upgrade its powerful and user-friendly search engine for SinaNews and SinaFinance. This new venture follows the recent success SINA.com enjoyed through its partnership with US' largest search engine Alta Vista.
2) Openfind.com.tw is the leading search engine service provider in Taiwan;
Baidu.com is the leading search engine service provider in China.
3) Openfind had been providing search engine service to sina.com.cn, sina.com.tw, kimo.com and cn.yahoo.com for some time.
4) But now Google is providing search engine service to kimo.com and cn.yahoo.com,
and Baidu.com is providing search engine service to sina.com.cn.
5) BTW, Baidu.com is providing search engine service to sina.com.cn, sohu.com, cn.tom.com, hk.tom.cn, 263.com, 21cn.com, chinaren.com, etc.
That's about 70-80% of the search engine market in China.
They are retrieving 5-10 pages per minute and have been for the whole day.
66.7.131.148 <snip> Openfind data gatherer, Openbot/3.0+(robot-response@openfind.com.tw;+http://www.openfind.com.tw/robot.html)
Most of my site is dynamically generated (pedigree searches) and the total of possible pages is around 1 million, which it seems adamant about retrieving.
It does not seem to retrieve my robots.txt. I've searched my logs (dating back 4 months) and it never retrieved it. (It could have an older copy, but it should have fetched a newer version within that timespan.)
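For anyone wanting to run the same check on their own logs, here is a rough Python sketch. It assumes Apache "combined" log format, which may not match your server. The sample lines (dates, paths, sizes and the second bot) are made up for illustration; only the IP and the Openbot user agent string come from this thread. The names `LOG_LINE` and `summarize_bot` are just ones I picked.

```python
import re

# Apache "combined" log format is assumed; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# User agent string as quoted in this thread.
OPENBOT_UA = (
    'Openfind data gatherer, Openbot/3.0+(robot-response@openfind.com.tw;'
    '+http://www.openfind.com.tw/robot.html)'
)

# Made-up sample lines: dates, paths, sizes and ExampleBot are illustrative only.
SAMPLE_LOG = [
    '66.7.131.148 - - [01/Jan/2000:10:00:01 +0000] '
    '"GET /pedigree?id=1 HTTP/1.0" 200 5120 "-" "' + OPENBOT_UA + '"',
    '66.7.131.148 - - [01/Jan/2000:10:00:09 +0000] '
    '"GET /pedigree?id=2 HTTP/1.0" 200 5044 "-" "' + OPENBOT_UA + '"',
    '192.0.2.10 - - [01/Jan/2000:10:00:12 +0000] '
    '"GET /robots.txt HTTP/1.0" 200 64 "-" "ExampleBot/1.0"',
]


def summarize_bot(lines, needle="openbot"):
    """Return (request_count, fetched_robots_txt) for agents containing needle."""
    hits = 0
    fetched_robots = False
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        if needle not in match.group("agent").lower():
            continue  # request came from some other client
        hits += 1
        # Ignore any query string when comparing against /robots.txt.
        if match.group("path").split("?", 1)[0] == "/robots.txt":
            fetched_robots = True
    return hits, fetched_robots


if __name__ == "__main__":
    hits, fetched = summarize_bot(SAMPLE_LOG)
    print("requests:", hits, "fetched robots.txt:", fetched)
```

To use it on a real log, pass an open file object instead of `SAMPLE_LOG`. Matching on the user agent substring is a simplification; a spider that changes its agent string would slip past it.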
I hate to have to resort to banning the IP/domain but it seems I have no other choice.
Olaf
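Before banning outright, it may be worth confirming that the robots.txt rule itself is written correctly. Below is a small Python sketch using the standard library's `urllib.robotparser`. The `Openbot` token is taken from the user agent string quoted above; whether Openfind's crawler actually matches on that token, or reads robots.txt at all (the post above suggests it does not), is an assumption. The example.com URL and the `is_allowed` helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Candidate robots.txt: shut out Openbot, leave everything open to other crawlers.
ROBOTS_TXT = """\
User-agent: Openbot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Check a URL against robots.txt text using Python's own parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.example.com/pedigree?id=1"  # hypothetical URL
    print("Openbot allowed:", is_allowed(ROBOTS_TXT, "Openbot/3.0", url))
    print("Other bot allowed:", is_allowed(ROBOTS_TXT, "ExampleBot/1.0", url))
```

This only shows how Python's parser reads the file. A spider that never requests robots.txt will ignore it regardless, in which case blocking by user agent or IP at the server is the remaining option.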