|robots.txt - allow folder within disallowed one|
| 12:40 am on Aug 10, 2003 (gmt 0)|
I have the following link structure:
I want to keep search engines out of each city folder itself, but still let them into the id folder beneath it. (The id folder is also linked from other parts of the site.)
| 4:10 am on Aug 10, 2003 (gmt 0)|
Nasty problem, that.
Google supports an "Allow" extension to the robots exclusion standard, but that's not much help with the other search engine spiders.
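As a rough illustration of the "Allow" idea, here is a minimal sketch that checks a rule set with Python's standard urllib.robotparser. The city path /find/us/ca/fresno/ and the function name are made up for the example. Python's parser applies the first rule that matches, so the Allow line is listed before the Disallow line; real crawlers may resolve the conflict differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for one example city; the paths are assumptions.
# Allow comes first because Python's parser uses the first matching rule.
ALLOW_RULES = """\
User-agent: Googlebot
Allow: /find/us/ca/fresno/id/
Disallow: /find/us/ca/fresno/
"""


def googlebot_may_fetch(path: str) -> bool:
    """Return True if the rules above let Googlebot fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ALLOW_RULES.splitlines())
    return parser.can_fetch("Googlebot", path)


if __name__ == "__main__":
    for path in ("/find/us/ca/fresno/", "/find/us/ca/fresno/id/"):
        print(path, googlebot_may_fetch(path))
```

Without wildcards, each city would need its own pair of lines, which is part of why this gets unwieldy for thousands of cities.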
I presume you have thousands of city and id pages, so on-page meta robots tags may not be of much use either, unless you can generate them dynamically with SSI or PHP, etc.
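For what dynamic generation might look like, here is a small sketch in Python rather than SSI or PHP. It assumes city pages live at /find/<country>/<state>/<city>/ and that "noindex,follow" is the wanted setting for them; both are guesses for illustration, and the function name is hypothetical.

```python
def robots_meta_tag(path: str) -> str:
    """Build a meta robots tag for a page, based only on its URL path."""
    # Split into non-empty segments, e.g. ['find', 'us', 'ca', 'fresno'].
    parts = [segment for segment in path.split("/") if segment]
    # Assumed layout: exactly four segments under /find/ means a city page.
    is_city_page = len(parts) == 4 and parts[0] == "find"
    content = "noindex,follow" if is_city_page else "index,follow"
    return f'<meta name="robots" content="{content}">'


if __name__ == "__main__":
    print(robots_meta_tag("/find/us/ca/fresno/"))
    print(robots_meta_tag("/find/us/ca/fresno/id/"))
```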
Another approach would be to move the id pages out from under cities, and put them at the same directory/URL depth. In other words, /find/country/state/city/ and /find/country/state/id/. Then you can use robots.txt without any problems. But that requires massive link editing, and possibly 301 redirects to let everyone know the pages have moved.
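The redirect side of that move could be driven by a mapping like the sketch below, which drops the city segment from an old-style path. The function name and the sample paths are hypothetical, and it assumes every old id URL has the exact shape /find/<country>/<state>/<city>/id/...

```python
from typing import Optional


def relocated_path(old_path: str) -> Optional[str]:
    """Map /find/<country>/<state>/<city>/id/... to /find/<country>/<state>/id/...

    Returns None when the path does not match the assumed old layout,
    so the caller knows no 301 redirect is needed.
    """
    parts = old_path.split("/")
    # A matching path splits into ['', 'find', country, state, city, 'id', ...].
    if len(parts) >= 6 and parts[0] == "" and parts[1] == "find" and parts[5] == "id":
        # Drop the city segment (index 4) and keep everything else.
        return "/".join(parts[:4] + parts[5:])
    return None


if __name__ == "__main__":
    print(relocated_path("/find/us/ca/fresno/id/42"))
    print(relocated_path("/find/us/ca/fresno/"))
```

A request handler would send a 301 to the returned path whenever it is not None.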
I can't think of any easy or quick solutions. <bump>
| 5:45 am on Aug 10, 2003 (gmt 0)|
Allow is not an option, because I do not want to be dependent on Google :)
Tags in each page are very easy to implement: it's all dynamic, so I only need to change one file. But that doesn't solve my problem. I don't want bots to eat my bandwidth for no reason, and they would still need to fetch each page to see the meta tag.
Moving the folder can't be done the way you described because of a mod_rewrite issue in this situation, THOUGH that's an excellent idea. I will just move it somewhere else.
Instead of /country/state/city/id/ I can just do /something/id/ :)
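With the id pages moved out, a plain Disallow with no Allow line might be enough. The sketch below checks that with Python's standard urllib.robotparser. It assumes the whole location tree sits under one prefix (/find/ is borrowed from the earlier reply) and that a single Disallow on that prefix suits the real site; the bot name and sample paths are made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the location tree, leave /something/id/ open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /find/
"""


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the rules above let the given agent fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/find/us/ca/fresno/", "/something/id/42"):
        print(path, allowed(path))
```

Because this uses only User-agent and Disallow, it does not depend on any one search engine's extensions.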