lucy24 - 2:03 am on Sep 29, 2012 (gmt 0)
Bing crawls websites to build an index. If the URLs are not in the website, how can it find URLs and also post valid CGI parameters to them?
parts of a file system that are not published anywhere
"not published on my own site" is not the same as "not published anywhere".
Somewhere earlier in this thread you said that Bing knows the names of pages in the directory. If so, whether the directory itself is linked becomes irrelevant. Once you've got example.com/directory/secretdirectory/filename.html you don't need a separate link to know that /directory/secretdirectory/ exists. Robots don't have a lot of brains, but they -- or, ahem, their human programmers -- can figure out that much.
In my own logs, I routinely see Bing asking for /ebooks/sometitle/ although these files (that is, an /ebooks/title/index.html file for the assorted titles) don't exist. At one time I assumed it was my fault for ::cough-cough:: goofing on some relative links. But now I realize they'd be asking for these files anyway. Heck, even humans will sometimes try it. Same goes for other directories that don't have an index file* -- and they're not all attributable to mistakes I made in the past.
* <ot>In cases where an index.html file seems a reasonable thing to look for, I've put in individual redirects.</ot>
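To make the directory point concrete: here is a minimal Python sketch of how any crawler could derive directory URLs from a single known file URL. This is only an illustration of the general idea, not a description of what Bing actually runs; the function name candidate_directory_urls is made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def candidate_directory_urls(url):
    """Return the ancestor directory URLs implied by one known URL.

    Hypothetical helper for illustration only; it does not fetch anything.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]

    # If the path does not end in "/", the last segment is a file name,
    # so drop it and keep only the directory segments.
    if not parts.path.endswith("/"):
        segments = segments[:-1]

    candidates = []
    while segments:
        path = "/" + "/".join(segments) + "/"
        # Query string and fragment are dropped; only the directory remains.
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
        segments.pop()  # move one level up
    return candidates


if __name__ == "__main__":
    known = "http://example.com/directory/secretdirectory/filename.html"
    for guess in candidate_directory_urls(known):
        print(guess)
    # prints:
    # http://example.com/directory/secretdirectory/
    # http://example.com/directory/
```

Requesting each of those guesses is all it takes to produce log entries like the /ebooks/sometitle/ ones, whether or not an index file exists there.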