Question: when disallowing paths in robots.txt, do I have to disallow both the real path and the rewritten path?
For instance:
real path: /stores/products/electronics/plasma.php?=33
mod_rewritten virtual path: /televisions/plasma-33-sony-trinitron.html
If I wanted to stop search bots from indexing either of the above paths, do I have to disallow both, or just the physical path?
So if I use Disallow: /stores/products/electronics/plasma.php?=33
will it still index /televisions/plasma-33-sony-trinitron.html?
I'm wondering whether I have to disallow both the physical path AND the virtual path.
Thank you for your response.
Regards, frogz
Assuming the physical path is externally accessible as a URL:
What you probably want to do, for URL canonicalization reasons, is externally redirect requests for the physical path to the virtual path.
The physical path should stay allowed so the robot can make the request and get the 301 or 302 response.
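robots.txt is matched against the URL the robot requests, not the file behind it, so the virtual path is the one to disallow. When the robot follows the redirect to the virtual path, it finds that path disallowed and never makes the request that the internal rewrite would ultimately have served.
Check out this WebmasterWorld thread for more information: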
Robots.txt and Mod Rewrite [webmasterworld.com]
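If it helps to see why one rule does not cover the other, here is a rough sketch using Python's standard urllib.robotparser. The two rule sets and the "ExampleBot" user agent are made up for illustration, and real crawlers may interpret rules slightly differently than this parser does. It only shows that a Disallow rule is a prefix match on the requested URL, so a rule for the physical path says nothing about the virtual one.

```python
from urllib.robotparser import RobotFileParser

# The two paths from the question above.
PHYSICAL = "/stores/products/electronics/plasma.php?=33"
VIRTUAL = "/televisions/plasma-33-sony-trinitron.html"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Hypothetical rule set 1: only the physical script is disallowed.
physical_only = build_parser(
    "User-agent: *\n"
    "Disallow: /stores/products/electronics/plasma.php\n"
)

# Hypothetical rule set 2: only the rewritten (virtual) directory is disallowed.
virtual_only = build_parser(
    "User-agent: *\n"
    "Disallow: /televisions/\n"
)

# "ExampleBot" is an invented user agent; it falls under the "*" entry.
for name, rules in (("physical_only", physical_only), ("virtual_only", virtual_only)):
    for label, path in (("physical", PHYSICAL), ("virtual", VIRTUAL)):
        allowed = rules.can_fetch("ExampleBot", path)
        print(f"{name}: {label} path allowed = {allowed}")
```

With the second rule set, the robot can still request the physical path and receive the redirect, but it is not allowed to fetch the virtual path, which is the setup described in the reply above.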