Forum Moderators: goodroi

Don't want to index dynamic pages

But would it be a mistake to block them?


chopin2256

1:49 am on Jul 30, 2005 (gmt 0)

10+ Year Member



Hi,

I run a forum in which the dynamic pages are converted to html using htaccess. I may also run Mambo on my site, and I will be using htaccess to convert its dynamic pages into html as well. Google and other search engines still spider the dynamic files like crazy, and I was hoping the bots would spider only the html files. What syntax do I need in the robots.txt file to exclude all of the dynamic pages from being spidered? The forum is Invision Board (php). I believe Mambo will be in php as well.

Would it be a mistake to block the search engines from spidering these dynamic pages, given how heavily they spider them? Why aren't they spidering the html pages as much?

encyclo

1:54 am on Jul 30, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just list the filenames of the dynamic URLs in your robots.txt block. Try the robots.txt tutorial [searchengineworld.com] for more specific help, or post back with your robots.txt if you are unsure whether the syntax is correct. :)
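As a sketch of that advice: assuming the board's dynamic pages all go through a single entry script (the filename index.php below is an assumption, not taken from the poster's site), the robots.txt might block just that script while leaving the rewritten .html URLs crawlable. Python's standard-library `urllib.robotparser` can confirm how crawlers would read it:

```python
from urllib import robotparser

# Hypothetical robots.txt: block the dynamic entry script for all
# crawlers. The filename index.php is an assumption -- substitute
# your board's real dynamic filenames.
rules = """\
User-agent: *
Disallow: /index.php
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# The dynamic URL (query string and all) is blocked...
print(rp.can_fetch("AnyBot", "http://example.com/index.php?showtopic=1"))  # False
# ...but a rewritten, static-looking URL is still allowed.
print(rp.can_fetch("AnyBot", "http://example.com/topic-1.html"))  # True
```

`Disallow` is a prefix match, so one line covers every query-string variation of the script.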

chopin2256

2:10 am on Jul 30, 2005 (gmt 0)

10+ Year Member



Let's say I want to disallow Googlebot and Googlebot only. If I do this:

User-agent: googlebot
Disallow: /

Will the rest of the search engines be allowed to spider my site?

Because according to the tutorial, if you do this:

User-agent: googlebot
Disallow:

It will allow only Google to spider my site, and no one else.
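Under the original robots.txt convention, a crawler that matches no record is allowed everywhere by default, so a file containing only a googlebot record says nothing about other bots. Both variants can be checked with Python's `urllib.robotparser` (the example.com URL is a placeholder):

```python
from urllib import robotparser

def allows(rules, agent, url="http://example.com/page.html"):
    """Parse a robots.txt string and ask whether `agent` may fetch `url`."""
    rp = robotparser.RobotFileParser()
    rp.parse(rules.splitlines())
    return rp.can_fetch(agent, url)

# Variant 1: block Googlebot. Other bots match no record,
# so they fall back to the default of "allowed".
block_google = "User-agent: googlebot\nDisallow: /\n"
print(allows(block_google, "googlebot"))     # False
print(allows(block_google, "SomeOtherBot"))  # True

# Variant 2: an empty Disallow for Googlebot. This explicitly allows
# Googlebot but, again, says nothing about anyone else.
allow_google = "User-agent: googlebot\nDisallow:\n"
print(allows(allow_google, "googlebot"))     # True
print(allows(allow_google, "SomeOtherBot"))  # True
```

So the first variant does answer the question: other engines remain free to spider the site. To shut everyone else out, a catch-all `User-agent: *` record is needed as well.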

chopin2256

2:11 am on Jul 30, 2005 (gmt 0)

10+ Year Member



And... here is the robots.txt I would probably need:

User-agent: *
Disallow: /forum/
Disallow: /mambo/ #not installed yet

I want all search engine spiders to be blocked from spidering my php dynamic pages. But is this safe? Can you take a look at my site to make sure? I will sticky you the URL.

encyclo

2:12 am on Jul 30, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you want to exclude everyone* but Googlebot, you need two records - one blocking everyone, followed by a specific record for Googlebot allowing it access:

User-agent: *
Disallow: /

User-agent: googlebot
Disallow:

For your example code, you would be excluding everything in the forum and mambo directories. Is that what you want? What do your static forum URLs look like?

* - everyone, that is, who respects robots.txt
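Both points above - the Googlebot exception and the directory-wide block - can be sanity-checked with Python's `urllib.robotparser` before deploying (all URLs below are placeholders):

```python
from urllib import robotparser

# The two-record file: a catch-all block, then a Googlebot exception.
rules = """\
User-agent: *
Disallow: /

User-agent: googlebot
Disallow:
"""
rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

url = "http://example.com/forum/topic-1.html"
print(rp.can_fetch("googlebot", url))     # True:  matches its own record
print(rp.can_fetch("SomeOtherBot", url))  # False: falls to the catch-all

# A directory-wide Disallow is a prefix match, so it catches the
# dynamic script *and* any rewritten .html URL under the same path.
rp2 = robotparser.RobotFileParser()
rp2.parse("User-agent: *\nDisallow: /forum/\n".splitlines())
print(rp2.can_fetch("AnyBot", "http://example.com/forum/index.php?showtopic=1"))  # False
print(rp2.can_fetch("AnyBot", "http://example.com/forum/topic-1.html"))           # False
```

That second result is the catch: if the rewritten static-looking URLs also live under /forum/, blocking the directory blocks them too - hence the question about what the static forum URLs look like.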