|robots.txt and mod_rewrite|
just double checking
| 1:20 pm on Mar 9, 2004 (gmt 0)|
Okay, I did a mod_rewrite that changes about half my URLs from
Now I want to make sure the spiders follow the /catalog/ URLs and not the /cgi-bin/ URLs.
If I disallow /cgi-bin/ in robots.txt, they will still follow the URLs that are mod_rewritten to /catalog/, even though the pre-rewritten URL is under /cgi-bin/.
Am I correct in this?
| 1:44 pm on Mar 9, 2004 (gmt 0)|
Have you changed the links in the HTML to point to the new, spider-friendly URLs? If not, you will need to, or the bots will keep requesting the old ones.
| 2:35 pm on Mar 9, 2004 (gmt 0)|
The old URLs that spiders are currently crawling will stay the same, because I cannot change them. I am creating new pages with links to the modified URLs.
That's the reason I want to double check the robots.txt. I want to force the spiders into the new area, as the old URLs are tripping spiders up a bit because they are dynamic. Unfortunately, there is no way to change the old URLs right now, so I am stuck with this course of action.
| 5:26 am on Mar 12, 2004 (gmt 0)|
No. If they can't access /cgi-bin/, they can't follow the redirect.
| 1:57 pm on Mar 13, 2004 (gmt 0)|
It isn't a redirect, though. It's a rewrite. Do they act the same?
| 2:24 am on Mar 14, 2004 (gmt 0)|
Redirect, rewrite, same difference in this case. :-)
Put it this way: the rewrite is in a room. If you don't allow access to the room, no one will actually know what's in it.
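One way to see which URLs a given robots.txt actually closes off is to test it directly. The following is only a minimal sketch using Python's standard urllib.robotparser. The host example.com, both URL paths, and the robots.txt contents are hypothetical placeholders, not taken from the site discussed above. It shows only how a rule is matched against the URL a spider requests. It says nothing about whether the spider can discover that URL in the first place.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only the old dynamic URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URLs: an old dynamic URL and a rewritten /catalog/ URL.
old_url = "http://www.example.com/cgi-bin/store.cgi?item=42"
new_url = "http://www.example.com/catalog/item42.html"

# can_fetch() compares the path of the requested URL with the Disallow rules.
old_allowed = parser.can_fetch("*", old_url)
new_allowed = parser.can_fetch("*", new_url)

print("old URL allowed:", old_allowed)
print("new URL allowed:", new_allowed)
```

Under these assumptions the /cgi-bin/ URL is reported as blocked and the /catalog/ URL as allowed. That still leaves the room problem: if the only links or redirects leading to the /catalog/ URLs sit on pages under /cgi-bin/, a spider that obeys the Disallow never gets to see them.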