Forum Moderators: goodroi
I want to stop Google from indexing my review pages, but I am using SEO URLs, so I don't really know how to do it. My real review URL would be:
http://www.example.com/product_reviews.php?products_id=72
but with SEO URLs, it is:
http://www.example.com/vga-splitter-duplicator-puertos-pc-monitores-pr-72.html
so I do not know how to add that to my robots.txt. What all the review URLs have in common is the *-pr-number.html ending. Their RewriteRule is the following:
RewriteRule ^(.*)-pr-([0-9]+)\.html$ product_reviews.php?products_id=$2&%{QUERY_STRING}
So I was wondering what the right way of disallowing all these review pages would be, since I do not think that adding "Disallow: /product_reviews.php" will do the trick.
Disallow: /*-pr-*
Would this work OK? I am not sure I am using the right syntax, and I do know that this will also disallow any other URLs that have -pr- in them (but I can live with that, I doubt I am going to use the word "pr" a lot around hehe)
Many thanks! :)
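For what it's worth, here is a rough Python sketch of how a crawler that supports the `*` and `$` pattern extensions (Googlebot documents both) would evaluate a rule like the one above. `rule_matches` is a made-up helper for illustration only, not part of any crawler or library, and crawlers that only do plain prefix matching will not treat `*` as a wildcard at all.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Hypothetical helper: test a robots.txt path pattern against a URL path.

    Assumes the wildcard extension: '*' matches any run of characters,
    and a trailing '$' anchors the pattern to the end of the URL.
    Without '$', the pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' in place of each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    matcher = re.fullmatch if anchored else re.match
    return matcher(regex, path) is not None

review = "/vga-splitter-duplicator-puertos-pc-monitores-pr-72.html"
print(rule_matches("/*-pr-*", review))               # the rule from the post
print(rule_matches("/*-pr-*", "/index.php"))         # no "-pr-" in the path
print(rule_matches("/*-pr-*.html$", review))         # a tighter variant
```

The tighter `/*-pr-*.html$` form is only a suggestion: it narrows the match to paths that end in .html, which reduces the chance of catching unrelated URLs that happen to contain -pr-.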
[edited by: jatar_k at 6:34 pm (utc) on Nov. 13, 2007]
[edit reason] please use example.com [/edit]
You should exemplify your URLs in this forum (use example.com).
The correct way to exclude this would be "Disallow: /product_reviews.php", since wildcarding and globbing aren't officially supported for robots.txt.
The problem is that the URLs I want to exclude are not /product_reviews.php?products_id=72, but /vga-splitter-duplicator-puertos-pc-monitores-pr-72.html
I have a contribution installed on my store that rewrites the URLs into the second form, to make them friendlier for search engines. So I guess that if I add "Disallow: /product_reviews.php" to my robots.txt, URLs like example.com/vga-splitter-duplicator-puertos-pc-monitores-pr-72.html will still be spidered, which is what I want to avoid.
Thank you!
Aitor
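The concern above can be checked with Python's standard `urllib.robotparser`, which does plain prefix matching on the path. This is only a sketch: the robots.txt content is the single rule suggested in this thread, and `is_crawlable` is a helper name invented here. A crawler sees only the URL it requests, never the script the rewrite maps it to.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt, containing only the rule suggested in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /product_reviews.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def is_crawlable(url: str) -> bool:
    """Hypothetical helper: may a generic crawler fetch this URL?"""
    return parser.can_fetch("*", url)

real_url = "http://www.example.com/product_reviews.php?products_id=72"
seo_url = "http://www.example.com/vga-splitter-duplicator-puertos-pc-monitores-pr-72.html"

print(is_crawlable(real_url))  # path starts with the disallowed prefix
print(is_crawlable(seo_url))   # rewritten path does not start with it
```

Under these assumptions the real URL is blocked and the rewritten one is still crawlable, which is exactly the situation described.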
/pr-vga-splitter-duplicator-puertos-pc-monitores-72.html
/pr-72-vga-splitter-duplicator-puertos-pc-monitores.html
URL systems should be designed with the limitations of robots.txt URL prefix-matching in mind.
Jim
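Both example layouts above begin with pr-, so a plain prefix rule such as "Disallow: /pr-" would cover them without any wildcard. Below is a minimal Python sketch of the second layout. The names `build_review_path` and `parse_review_path` are invented for illustration and are not part of the store software or the URL contribution.

```python
import re

# Assumed layout: /pr-<products_id>-<slug>.html (the second example above).
REVIEW_PATH = re.compile(r"^/pr-([0-9]+)-(.+)\.html$")

def build_review_path(products_id: int, slug: str) -> str:
    """Hypothetical helper: build a review URL path that starts with /pr-."""
    return f"/pr-{products_id}-{slug}.html"

def parse_review_path(path: str):
    """Hypothetical helper: recover products_id from a path, or None if it is not a review path."""
    match = REVIEW_PATH.match(path)
    return int(match.group(1)) if match else None

path = build_review_path(72, "vga-splitter-duplicator-puertos-pc-monitores")
print(path)
print(path.startswith("/pr-"))   # what a prefix rule "Disallow: /pr-" relies on
print(parse_review_path(path))
```

Keep in mind that a "Disallow: /pr-" rule would also block any other URL whose path happens to start with /pr-, so the prefix should be one the rest of the site does not use.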
After all, if I do not want them indexed, I couldn't care less how nice they look to search engines... I will try this route: stop rewriting the review URLs, disallow product_reviews.php in robots.txt, and wait for the old -pr- URLs to vanish from Google.
Thanks! :D
<?php
// strstr() returns a string or FALSE, never TRUE, so test with strpos() instead
if (strpos($_SERVER['REQUEST_URI'], "-pr-") !== false) {
    echo "<meta name=\"robots\" content=\"noindex,nofollow,noarchive\" />";
}
?>
Justin
<added>
If you want some SE credit for the pages, you might consider changing the line to:
echo "<meta name=\"robots\" content=\"noindex,follow,noarchive\" />";
</added>
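To confirm that a review page really emits the tag, something like the following Python sketch could be run against the page's HTML. It uses only the standard `html.parser` module. The class and function names are made up for this example, and the sample HTML is hand-written here, not fetched from a real page.

```python
from html.parser import HTMLParser

class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<meta ... />) also end up here by default.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]

def robots_directives(html: str) -> list:
    """Hypothetical helper: return the robots directives found in an HTML string."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives

sample = '<html><head><meta name="robots" content="noindex,follow,noarchive" /></head></html>'
print(robots_directives(sample))
print("noindex" in robots_directives(sample))
```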
That is such a simple solution that it just couldn't get any easier. No messing with URLs, no messing with .htaccess, and almost nothing else to touch. I only had to create a new variable in my meta tags module so that it adds "noindex,follow,noarchive" to my product_reviews.php and product_reviews_info.php pages, and leaves it as "all" everywhere else.
Thank you! :D
I just checked with Google Webmaster Tools, and I have a long list of URLs restricted by robots.txt (386). All the review pages seem to show up there, so people won't find a review page before the page for the product itself :D
Thx again! ^^