|Not sure how to handle space character|
| 2:44 pm on Dec 22, 2004 (gmt 0)|
When disallowing a subdirectory with a space in its name, which is correct:
| 2:55 pm on Dec 22, 2004 (gmt 0)|
Spaces aren't really valid in URLs (they have to be percent-encoded as %20), so I am not sure there is a valid format for this in robots.txt.
I looked around at a bunch of tutorials and saw no references to it.
| 3:12 pm on Dec 22, 2004 (gmt 0)|
Yes, I wouldn't have used the space, but it's something tough to get changed in this particular organization.
I couldn't find a reference to it either.
| 3:27 pm on Dec 22, 2004 (gmt 0)|
I would disallow both versions to cover both bases, but I would put the version with the space last in the robots.txt file in case it causes a parse error for some bots.
The trouble is, with no certainty of exactly how this situation is supposed to be handled, you can expect problems. Is it possible to add a robots meta tag to those pages also?
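For what it's worth, here is a rough sketch of the "disallow both versions" idea, checked against Python's standard-library robots.txt parser. The directory name `/my dir/` and the `example.com` URLs are made up for illustration, since the real name isn't given in this thread, and `build_robots_txt` and `is_blocked` are just hypothetical helper names. It only shows how this one parser treats the two spellings; it says nothing about how any particular crawler will behave.

```python
from urllib.parse import quote
from urllib.robotparser import RobotFileParser

# Hypothetical directory name; the thread never states the real one.
DIR_WITH_SPACE = "/my dir/"
# Percent-encode the space; quote() leaves "/" untouched by default.
DIR_ENCODED = quote(DIR_WITH_SPACE)


def build_robots_txt():
    """Disallow both spellings, with the literal-space form last."""
    return "\n".join([
        "User-agent: *",
        f"Disallow: {DIR_ENCODED}",
        f"Disallow: {DIR_WITH_SPACE}",
    ])


def is_blocked(robots_txt, url):
    """Return True if Python's stdlib parser refuses the URL for any bot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("*", url)


if __name__ == "__main__":
    rules = build_robots_txt()
    print(rules)
    for url in ("http://example.com/my%20dir/page.html",
                "http://example.com/other/page.html"):
        print(url, "blocked" if is_blocked(rules, url) else "allowed")
```

Putting the literal-space line last means a bot that gives up at that line has at least already read the %20 rule. A robots meta tag on the pages themselves is still a sensible backup.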
| 1:48 am on Jan 6, 2005 (gmt 0)|
A follow-up. It seems the %20 does indeed hang or confuse some bots. They make repeated requests for robots.txt, and the pages in the disallowed directory are still being indexed.
All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld ® and PubCon ® are a Registered Trademarks of Pubcon Inc.
© Pubcon Inc. 1996-2012 all rights reserved