

Robots.txt help


Northstar

4:46 pm on Sep 10, 2006 (gmt 0)

10+ Year Member



I would like to use robots.txt to block some duplicate pages that my script is producing.

I want to block this page: http://www.example.com/cgi-bin/pseek/dirs.cgi?lv=2&ct=category_widgets

But want to keep this page: http://www.example.com/cgi-bin/pseek/dirs2.cgi?cid=147

Would this work to block the first URL without hurting the second one?

User-Agent: *
Disallow: /cgi-bin/pseek/dirs.cgi?lv

Or would it be better to write out the full URL of each page I want to block, like this:

User-Agent: *
Disallow: /cgi-bin/pseek/dirs.cgi?lv=2&ct=category_widgets

I need to be very careful not to block the second URL (dirs2.cgi). Is there any danger that either of the above robots.txt Disallow rules would block the second URL?
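One way to sanity-check this before going live is with Python's bundled robots.txt parser. This is only a rough check -- the standard-library parser is just an approximation of how the major search engines match rules, and it assumes the Disallow line above with the "?" in place -- but it does show the plain left-anchored prefix match blocking the dirs.cgi URL while leaving dirs2.cgi alone:

# Rough sanity check using only Python's standard library.
from urllib.robotparser import RobotFileParser

rules = [
    "User-Agent: *",
    "Disallow: /cgi-bin/pseek/dirs.cgi?lv",
]

rp = RobotFileParser()
rp.parse(rules)

# The duplicate page should come back blocked (False)...
print(rp.can_fetch("*", "http://www.example.com/cgi-bin/pseek/dirs.cgi?lv=2&ct=category_widgets"))

# ...while the page to keep should stay fetchable (True).
print(rp.can_fetch("*", "http://www.example.com/cgi-bin/pseek/dirs2.cgi?cid=147"))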

goodroi

7:50 pm on Sep 20, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Hi Northstar,

I would not recommend using robots.txt to block individual pages one by one. On a very large site you could end up with a 1 MB robots.txt file, and trust me, search engines don't like that. Have you thought about using .htaccess to resolve the situation?
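For example, something along these lines in the site's root .htaccess would 301 the duplicate URL to the page you want indexed. Treat it as a sketch only: it assumes Apache with mod_rewrite enabled, it assumes dirs2.cgi?cid=147 really is the version you want to keep, and the exact pattern depends on how /cgi-bin/ is mapped on your server (e.g. via ScriptAlias), so you may need to adjust where the rule lives.

# Hypothetical sketch: redirect the duplicate dirs.cgi URL to the canonical page.
# Assumes this file sits in the document root and mod_rewrite is available.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^lv=2&ct=category_widgets$
RewriteRule ^cgi-bin/pseek/dirs\.cgi$ /cgi-bin/pseek/dirs2.cgi?cid=147 [R=301,L]

Because the substitution carries its own query string, the original lv/ct parameters are dropped rather than appended to the redirect target.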