I've got a working .htaccess file for my shopping cart system, where various 'items' all get redirected to a single page, like so:
RewriteRule ^item_saw.html$ item.php?18?%{QUERY_STRING} [L]
RewriteRule ^item_hammer.html$ item.php?19?%{QUERY_STRING} [L]
RewriteRule ^item_screwdriver.html$ item.php?20?%{QUERY_STRING} [L]
This works great. But Google is not only adding the pages called
/item_saw.html
/item_hammer.html
/item_screwdriver.html
etc., but it's also adding
/item.php?18?
/item.php?19?
/item.php?20?
etc... which I don't want it to do! I suspect that breaks up my PageRank somewhat, and it's just plain unattractive.
I have no links to the underlying pages on my site, so how come Google is able to tell what page I'm internally serving? Surely the item.php bit never makes it out of my server?
Any idea how Google is able to work out the underlying page name? And how to stop it?
Note this is supposed to be an internal rewrite, not a redirect.
However, you may have some other rules that are interfering with this. All redirects should be listed first, and all rewrites should be listed last.
You need to use Live HTTP Headers to examine the server response. My guess is that you'll see a 302 returned somewhere in the system.
Your internal filepath is also likely invalid. You have two question marks in it. Maybe one of those should be an ampersand?
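If the second question mark was only there to separate the item ID from whatever query string the visitor sent, one possible fix is to let mod_rewrite do the joining with the [QSA] flag. This is only a sketch: it assumes item.php can cope with the ID arriving as the first bare parameter (18&colour=red rather than 18?colour=red), which you would need to check in your own script. The dots are also escaped here so that they match a literal "." rather than any character.

```apache
# Internal rewrites only: no R flag, so the client never sees item.php.
# QSA appends the visitor's original query string with "&" instead of a second "?".
RewriteRule ^item_saw\.html$ item.php?18 [QSA,L]
RewriteRule ^item_hammer\.html$ item.php?19 [QSA,L]
RewriteRule ^item_screwdriver\.html$ item.php?20 [QSA,L]
```

With these rules a request for /item_saw.html?colour=red (a made-up parameter) would be served internally as item.php?18&colour=red, and a request with no query string as item.php?18.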
You should also set up a series of redirects so that any direct request for a URL with parameters (item.php?18? and so on) gets redirected, making the browser issue a new request for the correct URL.
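A rough sketch of that kind of redirect follows. It tests %{THE_REQUEST}, which holds the original request line sent by the client, so it should fire only when something asks for item.php directly and not after your own internal rewrite. The hostname www.example.com and the ID-to-name pairs are taken from the posts above; the exact patterns are an assumption and should be tested on your own server before relying on them.

```apache
# Assumes "RewriteEngine On" already appears earlier in this .htaccess file.

# External 301 redirects go first. Each one matches only a direct client
# request for exactly item.php?NN or item.php?NN? (no extra parameters).
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /item\.php\?18\??\ HTTP/
RewriteRule ^item\.php$ http://www.example.com/item_saw.html? [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /item\.php\?19\??\ HTTP/
RewriteRule ^item\.php$ http://www.example.com/item_hammer.html? [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /item\.php\?20\??\ HTTP/
RewriteRule ^item\.php$ http://www.example.com/item_screwdriver.html? [R=301,L]

# The existing internal rewrites (item_*.html to item.php) go below this point.
```

The trailing "?" on each target URL clears the old query string, so the visitor lands on the clean .html address rather than /item_saw.html?18?.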
I've only got a few other rules in my .htaccess file, and I'm pretty sure they're not interfering...
I used an online app to test the HTTP headers I get back, and there's no redirection at all! But, in the HTML that's returned, I do have this:
<!-- PASS THE POST-MOD-REWRITE URL TO JAVASCRIPT -->
<script type="text/javascript">
current_url = "http://www.example.com/item.php?27?";
</script>
Google can't fish URLs out of JavaScript variables, can it?
It appears that Google can now execute JavaScript code and follow links contained within it. I found another chap who seems to have experienced the same situation:
I can't post the URL here, but the number 1 result for a google search for 'new reality google follows links' should get you the article I just read. :)
Guess I'll have to cloak my Javascript like you say! Yurgh... site... getting... messier.
Thanks for the advice guys.
Like simple JS, Ajax is 'cool' and all that, but it should not be used for critical functions that will break your site if they aren't executed. I see a lot of folks who think that, for example, they can *either* use client-side scripting like Ajax, *or* they can use server-side scripting such as PHP. The truth is that they should use both, with the selection made based on what is most appropriate and most robust, not on what 'their favorite language' is. This situation reminds me of the old saw, "If all you have is a hammer, then every problem looks like a nail."
Jim