Google Bot + RewriteRule

         

Blade

12:25 pm on Apr 5, 2003 (gmt 0)

10+ Year Member



According to my logs, other spiders have been crawling my pages correctly since 18 March, but Googlebot is not spidering pages created from a RewriteRule redirection. I thought Google would follow through to these correctly. Perhaps not?

garylo

12:29 pm on Apr 5, 2003 (gmt 0)

10+ Year Member



What do you mean by "created from a rewrite rule redirection"?

Blade

12:38 pm on Apr 5, 2003 (gmt 0)

10+ Year Member



Sorry, I mean the pages are actually like this: /go/11.html
With the RewriteRule, the pages are presented like this: /go/thisproduct.html

Google spiders the main page that carries these links but doesn't actually spider the "thisproduct" pages!

The only explanation is that Google doesn't work with page URLs created from RewriteRules, but that can't be correct.

hetzeld

12:41 pm on Apr 5, 2003 (gmt 0)

10+ Year Member



Blade,

Could you post the RewriteRule exactly as it appears in your .htaccess file? The rule itself could be "problematic" ;)

Dan

Blade

1:01 pm on Apr 5, 2003 (gmt 0)

10+ Year Member



It should be OK, as the HTTP header returns "HTTP/1.1 200 OK", which is correct.

Also, the Slurp and Scooter bots are indexing these pages correctly, just not Google, which I cannot fathom.

But if Google has no issues with RewriteRule, then perhaps the bot hasn't scanned this far into my site just yet. All I really need is confirmation that RewriteRule does not trip up Google when spidering static pages. Has anyone here had such pages indexed OK?

hetzeld

2:11 pm on Apr 5, 2003 (gmt 0)

10+ Year Member



Blade,

I know a lot of sites that have rewritten URLs indexed in Google. That shouldn't be an issue at all. It doesn't even prevent amazon.com from having a PR9...

When a RewriteRule is properly written, nobody is even aware the URL has been rewritten, Googlebot no more than any other bot or visitor. Simply make sure you don't forget the [L] (last) flag at the end of the rule when applicable.

Dan

<add>Maybe, as you suggested, Googlebot has not crawled that "deep" into your site yet.</add>

jdMorgan

2:23 pm on Apr 5, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Blade,

The fact that requests for your rewritten URLs result in a 200-OK response indicates that you used an internal (transparent) rewrite rather than an external redirect. As such, the rewrite takes place entirely inside your server, and is not externally visible or detectable. So, you should not have a problem.

In order for a user or robot to detect a rewrite, you'd have to use the [R] flag, and return a 301 or 302 "Moved" response. This would tell the user-agent to repeat the request with the supplied new URL, and so inform the user-agent that a rewrite/redirect is needed. But an internal rewrite resulting in a 200-OK doesn't involve that interaction with the user-agent.
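To make the difference concrete, here is a minimal .htaccess sketch, assuming mod_rewrite is enabled and the file sits in the site root. The first rule uses the paths mentioned earlier in this thread; the "oldproduct" name in the second rule is hypothetical and only there for illustration.

```apache
RewriteEngine On

# Internal rewrite: the client asks for /go/thisproduct.html and gets
# the content of /go/11.html with a 200 response. The URL the client
# sees never changes, so a robot cannot tell a rewrite happened.
RewriteRule ^go/thisproduct\.html$ /go/11.html [L]

# External redirect (hypothetical old URL): the server answers with a
# 301 and a Location header, and the client must repeat the request
# for the new URL. This is the only case a robot can detect.
RewriteRule ^go/oldproduct\.html$ /go/thisproduct.html [R=301,L]
```

Requesting each URL and looking at the status line (200 versus 301) should show which kind of rule handled it.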

I suspect Googlebot is just taking its time - as usual - about finding these "new pages" through the links on your site. As with new sites/new pages, worry only after two complete update cycles have passed. But in your case, the 200-OK says it all.

Jim

Blade

3:46 pm on Apr 5, 2003 (gmt 0)

10+ Year Member



Thanks very much for the info, I'll see what happens over the next 8 weeks.

Unversed

3:56 pm on Apr 5, 2003 (gmt 0)

10+ Year Member



I have thousands of RewriteRule pages indexed by Google - I'm sure that is not your problem. If the pages, or the RewriteRules, are new then it may take a couple of updates before Google spiders them.

My (limited) experience when introducing large quantities of new material is it takes at least 2 deep crawls after the index page is indexed before the content is crawled properly.

nirelan2

5:06 pm on Apr 5, 2003 (gmt 0)



Well Blade, this is due to the fact that Google has a good ranking algorithm and nothing else. Its technology is clearly outdated.

Rhadamanthus

7:46 pm on Apr 5, 2003 (gmt 0)

10+ Year Member



I use a rewrite rule which is very similar to what you're doing, and Googlebot sees everything just fine. I use mine to map from /stuff/12345.html to /showstuff.html?id=12345. Googlebot has never had trouble finding any pages this way.
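A mapping like that could be written roughly as follows. This is only a sketch, assuming a .htaccess file in the site root and purely numeric IDs; the actual rule was not posted, so the pattern here is an assumption.

```apache
RewriteEngine On

# Internally map /stuff/12345.html to /showstuff.html?id=12345.
# ([0-9]+) captures the numeric ID and $1 inserts it into the
# query string. With no [R] flag the client gets a plain 200
# and never sees the showstuff.html URL.
RewriteRule ^stuff/([0-9]+)\.html$ /showstuff.html?id=$1 [L]
```

Restricting the pattern to digits keeps other files under /stuff/ from being swept into the script by accident.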