Google spiders the main page that carries these links, but it doesn't actually spider the "thisproduct" pages!
The only explanation I can think of is that Google doesn't work with page URLs created by RewriteRules, but that can't be correct.
Also, the Slurp and Scooter bots are indexing these pages correctly; only Google isn't, which I cannot fathom.
BUT if Google has no issues with RewriteRule, then perhaps the bot hasn't scanned this far into my site just yet. All I really need is confirmation that RewriteRule does not trip up Google when it spiders static pages. Has anyone here had such pages indexed OK?
I know of a lot of sites with rewritten URLs that are indexed in Google. That shouldn't be an issue at all. It doesn't even prevent amazon.com from having a PR9...
When you use a properly written RewriteRule, nobody is even aware the URL has been rewritten: neither Googlebot nor any other bot or visitor. Simply make sure you don't forget the [L] (last) flag at the end of the rule when applicable.
Dan
<add> Maybe, as you suggested, Googlebot has not crawled that "deep" into your site yet.</add>
The fact that requests for your rewritten URLs result in a 200-OK response indicates that you used an internal (transparent) redirect. As such, the rewrite takes place entirely inside the file system of your server, and is not externally visible or detectable. So, you should not have a problem.
In order for a user or robot to detect a rewrite, you'd have to use the [R] flag, and return a 301 or 302 "Moved" response. This would tell the user-agent to repeat the request with the supplied new URL, and so inform the user-agent that a rewrite/redirect is needed. But an internal rewrite resulting in a 200-OK doesn't involve that interaction with the user-agent.
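To make the difference concrete, here is a minimal sketch in Python rather than Apache configuration. It only imitates the two behaviours on a throwaway local server; it is not how mod_rewrite itself is implemented. The URL patterns, the `/product.php?id=` target, and the names `Handler`, `fetch` and `run_demo` are all hypothetical, chosen for illustration. The client deliberately does not follow redirects, so it sees exactly what a robot would see on its first request.

```python
import http.client
import re
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical URL scheme, for illustration only.
INTERNAL = re.compile(r"^/products/(\d+)\.html$")  # answered in place, like a rule ending in [L]
EXTERNAL = re.compile(r"^/old/(\d+)\.html$")       # redirected, like a rule ending in [R=301,L]


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        external = EXTERNAL.match(self.path)
        if external:
            # External redirect: the client is handed the new URL and must ask again.
            self.send_response(301)
            self.send_header("Location", f"/products/{external.group(1)}.html")
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        internal = INTERNAL.match(self.path)
        if internal:
            # Internal rewrite: the server maps the URL itself and answers directly.
            body = f"served by /product.php?id={internal.group(1)}".encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            return
        self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(port, path):
    """Return (status, Location header) without following redirects."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def run_demo():
    """Start the toy server on a free local port and request both URL styles."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return {
            "internal": fetch(port, "/products/42.html"),
            "external": fetch(port, "/old/42.html"),
        }
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` returns the status and `Location` header for each request. The internally rewritten URL comes back as a plain 200 with no `Location` header, so nothing in the response reveals the mapping; the redirected URL comes back as a 301 carrying the new address.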
I suspect Googlebot is just taking its time - as usual - about finding these "new pages" through the links on your site. As with new sites/new pages, worry only after two complete update cycles have passed. But in your case, the 200-OK says it all.
Jim
My (limited) experience when introducing large quantities of new material is that it takes at least two deep crawls after the index page is indexed before the content is crawled properly.