jatar_k

msg:1528621 | 2:55 pm on Dec 22, 2004 (gmt 0) |
Spaces aren't really valid in directory names or URLs, so I'm not sure there is a valid format for this in robots.txt. I looked around at a bunch of tutorials and saw no references.
|
Jon_King

msg:1528622 | 3:12 pm on Dec 22, 2004 (gmt 0) |
Yes, I wouldn't have used the space, but it's something tough to get changed in this particular organization. I couldn't find a reference to it either.
|
encyclo

msg:1528623 | 3:27 pm on Dec 22, 2004 (gmt 0) |
I would disallow both versions to cover both bases, but I would put the version with the space last in the robots.txt file in case it causes a parse error for some bots. The trouble is, with no certainty about exactly how this situation is supposed to be handled, you can expect problems. Is it possible to add a robots meta tag to those pages as well?
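
For illustration, the "disallow both versions" idea could be sketched like this in Python. It is only a rough sketch under assumptions: the directory name `/my folder/` and the helper `disallow_rules` are hypothetical (the thread never gives the real path), and the percent-encoded form is listed first so that the line with the literal space comes last.

```python
from urllib.parse import quote


def disallow_rules(path):
    """Return Disallow lines for a path: percent-encoded form first, raw form last.

    Hypothetical helper for illustration only.
    """
    encoded = quote(path)  # a space becomes %20; "/" is left untouched
    rules = [f"Disallow: {encoded}"]
    if encoded != path:
        # Only add the raw variant when it actually differs (i.e. it had a space).
        rules.append(f"Disallow: {path}")
    return rules


# "/my folder/" is an assumed example path, not the one from the thread.
robots_lines = ["User-agent: *"] + disallow_rules("/my folder/")
print("\n".join(robots_lines))
```

Whether any particular bot accepts the line with the literal space is exactly the open question here, so this only shows the ordering, not a guaranteed fix.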
|
Jon_King

msg:1528624 | 1:48 am on Jan 6, 2005 (gmt 0) |
A follow-up. It seems the %20 does indeed hang or confuse some bots. There are repeated requests for robots.txt, and the pages in the disallowed directory are indexed.
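
One way to sanity-check a robots.txt like this locally is to feed it to a parser and ask whether a URL is allowed. The sketch below uses Python's standard `urllib.robotparser`; the path `/my folder/` and the page names are assumed examples, and this only shows how that one parser behaves, not how the bots that were getting confused behave.

```python
from urllib.robotparser import RobotFileParser

# Assumed example rules: percent-encoded variant first, literal space last.
rules = [
    "User-agent: *",
    "Disallow: /my%20folder/",
    "Disallow: /my folder/",
]

parser = RobotFileParser()
parser.parse(rules)  # parse() accepts an iterable of lines

# Ask the parser about both spellings of a page inside the directory.
blocked_encoded = not parser.can_fetch("*", "/my%20folder/page.html")
blocked_raw = not parser.can_fetch("*", "/my folder/page.html")
allowed_other = parser.can_fetch("*", "/other/page.html")

print(blocked_encoded, blocked_raw, allowed_other)
```

This parser normalises percent-encoding on both the rules and the URL, so a clean result here says nothing about crawlers that compare the strings literally.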
|