We have discussed Baidu at length here, and I've tried to give them the benefit of the doubt. However, they don't seem to be capable of coding their spider to correctly fetch and parse robots.txt. I'm leaving the ban in place until they get it right.
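For anyone wanting to check what a well-behaved spider ought to conclude from a given robots.txt, here is a minimal sketch using Python's standard-library parser. The robots.txt contents, the example.com URL, and the `is_allowed` helper are assumptions for illustration only; nobody in this thread posted their actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out Baiduspider, allow everyone else.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Baiduspider", "Googlebot"):
        print(agent, is_allowed(agent, "http://example.com/page.html"))
```

If the spider keeps fetching pages that a check like this says are disallowed, the problem is on its side, and a server-level ban is the only thing left.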
This is one I can say, "I TOLD YOU!"
Baiduspider is the spider for a search engine... I think in Japan (might be China)... I often found this bad boy misbehaving.
I am halfway like Don on this one... It behaved so badly on my site (about 6 months ago) that I completely banned it.
In recent weeks it has requested several times the file
My server is Unix-based, so requests are case-sensitive. A file /sitetech/global.css does exist, but only in lower case, so the request currently generates a 404 because /SIteTECh/global.Css doesn't actually exist.
There are no links on any of my pages to the above, and I really doubt there are external links to a non-existent file, so why is it making a request for this file which has never existed?
Like I said, I banned it, and have not unbanned it.... I just do not see the need to be indexed in this one. But that is my own decision based on my goals, and my sites. YMMV!
"We are testing if your site is case sensitive or not. So we change some character in the filename to uppercase to get it. If we can get it, your site is not case sensitive, and we will change all characters in the urls of your site to avoid get duplicate page in your site. I am very sorry to trouble you."
-- Personally I think it is unacceptable to deliberately cause errors on other people's sites, time after time. If they want to check for duplicate pages they should just do a comparison of two files...
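A rough sketch of the kind of comparison suggested above: hash the bodies of pages the crawler has already fetched and group URLs whose content is identical. This only illustrates the idea and says nothing about how Baidu actually works; the `content_digest` and `find_duplicates` names and the sample URLs are made up for the example.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def content_digest(body: bytes) -> str:
    """Fingerprint a page body so identical content yields identical digests."""
    return hashlib.sha256(body).hexdigest()


def find_duplicates(pages: Dict[str, bytes]) -> List[List[str]]:
    """Group URLs whose fetched bodies are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for url, body in pages.items():
        by_digest[content_digest(body)].append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in by_digest.values() if len(urls) > 1]


if __name__ == "__main__":
    # Hypothetical pages that were already fetched through normal crawling.
    fetched = {
        "/a.html": b"<html>same</html>",
        "/A.html": b"<html>same</html>",
        "/b.html": b"<html>different</html>",
    }
    print(find_duplicates(fetched))
```

This works only on pages the spider has legitimately fetched, so it never has to request a made-up URL and never puts a 404 in anyone's logs.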