Welcome to WebmasterWorld Guest from 54.80.115.140
Forum Moderators: goodroi
User-agent: baiduspider
Disallow: /
So I added this one and it appeared to keep the bot at bay.
User-agent: baiduspider+
Disallow: /
Looking at my logs this morning, I found the spider back again trying to request each of the 50,000+ pages I have.
It's using this user agent string:
Baiduspider+(+http://www.baidu.com/search/spider.htm)
from ip address 220.181.32.53
My robots.txt file has become very long - (20kB) and the entry is toward the bottom. Would this cause it to be skipped?
What else can I try?
This issue was discussed in 2005. Sorry to hear the issue is still around. Good news is that the solution suggested then will still work - use htaccess or isapi rewrite to deny it. Check out the old discussion:
[webmasterworld.com...]