Forum Moderators: phranque


block only some dynamic URL's

How to block only some dynamic URL's-- htaccess and robots.txt

         

illlogical

12:50 pm on Apr 15, 2006 (gmt 0)

10+ Year Member



Hi,

I have a page that runs a product rating script, and there are a lot of products and a lot of possible ratings. Most robots figure this out quickly, so I was letting them run through my site, but the other day I thought the site was under a full-out attack. It turned out to be a few lost spiders.

There is no reason for spiders to follow the rating links on the products, so I would like to block them from the ratings while letting them follow the other links on the pages. Since all of the pages are dynamic, there is no directory to block with robots.txt. I tried blocking all dynamic pages using a fix from a thread in these forums, but that did not work for two reasons: not all bots read wildcards, and I don't really want to block bots from ALL dynamic pages.

I just want to keep bots out of some areas of dynamic content.

Here is a hypothetical URL:

...www.rumblebumble.com/index.php?option=dddddddd&task=rate&Itemid=&startrow=1&endrow=2&cat=&pixidx=381&rate=2

I would like to find a way to use both robots.txt and .htaccess to keep robots out of any link that contains "task=rate".

All other dynamic content, even URLs that are identical except for that one parameter, still needs to be indexed. If anyone has any suggestions, please let me know. Thanks a million!

-Andrew

Little_G

1:17 pm on Apr 15, 2006 (gmt 0)

10+ Year Member



Hi,

I don't think it's possible to do what you want using robots.txt. I would recommend using PHP to add a robots meta tag, e.g.:


<?php
// Tell crawlers not to index or follow the rating URLs.
if (isset($_GET['task']) && $_GET['task'] == "rate") {
    echo "<meta name=\"robots\" content=\"noindex,nofollow\">";
}
?>
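For the .htaccess side of the question: the meta tag only stops indexing after the page has been fetched, so if you also want to keep crawlers from requesting the rating URLs at all, something like the following mod_rewrite sketch could work (assuming Apache with mod_rewrite enabled; the bot names in the condition are just examples, extend the list to taste):

```apache
# Return 403 Forbidden to known crawlers on any URL whose query
# string contains task=rate, leaving all other dynamic URLs alone.
RewriteEngine On
RewriteCond %{QUERY_STRING} (^|&)task=rate(&|$) [NC]
RewriteCond %{HTTP_USER_AGENT} (Googlebot|Slurp|msnbot) [NC]
RewriteRule .* - [F]
```

Regular visitors never match the user-agent condition, so the ratings keep working for them; only the listed bots get blocked, and only on task=rate links.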

Andrew