Forum Moderators: phranque
Ok, so here's the setup: on this server I have a directory called /webreports/
This directory (and everything beneath it) is password protected by an .htaccess file within /webreports/ (I'm on Apache/1.3.37). So basically you can't read anything without a username and password.
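For reference, it's the standard Apache basic-auth setup. A minimal sketch of what's in the .htaccess (the AuthUserFile path and realm name here are made up for illustration, not my actual config):

```apache
# /webreports/.htaccess -- minimal HTTP basic auth sketch
# (AuthUserFile path and AuthName below are hypothetical)
AuthType Basic
AuthName "Web Reports"
AuthUserFile /path/to/.htpasswd
Require valid-user
```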
So, stupidly, in my robots.txt file, I had listed:
User-agent: *
Disallow: /webreports/
Ok, fine, not the best way to hide a directory. Anyway, within that webreports directory are Webalizer stats for about 5 sites. For example:
/webreports/site1/
/webreports/site2/
/webreports/site1/usage_200604.html
and so on and so on.
So the question is: if the /webreports/ directory is password protected, how does Google list 80 URLs within that directory as being restricted by a robots.txt file?
These pages aren't linked from anywhere; they're just standard Webalizer reports. But the 80 URLs Google is listing look as if it was able to read the directory and crawl it.
Hope this makes sense.
-k