I run a forum in which the dynamic pages are converted to html using htaccess. I may also run mambo on my site, and I will be using htaccess to convert its dynamic pages into html as well. Google and other search engines still spider the dynamic files like crazy, and I was hoping the bots would spider only the html files. What syntax do I need in the robots.txt file to exclude all of the dynamic pages from being spidered? The forum is invision board (php). I believe mambo will be in php as well.
Would it be a mistake to block the search engines from spidering these dynamic pages, given that they spider these pages like crazy? Why aren't they spidering the html pages as much?
User-agent: googlebot
Disallow: /
Will the rest of the search engines be allowed to spider my site?
Because according to the tutorial, if you do this:
User-agent: googlebot
Disallow:
It will only allow google to spider my site, and no one else.
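One way to check how these two records behave is to feed them to Python's standard `urllib.robotparser` module. This is only a sketch: the second bot name (`msnbot`) and the path `/forum/index.php` are assumptions made up for the example, and a real crawler may treat edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules, agent, path):
    """Return True if `agent` may fetch `path` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(rules)  # parse() accepts an iterable of robots.txt lines
    return parser.can_fetch(agent, path)


# The two records from the post above.
block_google = ["User-agent: googlebot", "Disallow: /"]
open_google = ["User-agent: googlebot", "Disallow:"]

PATH = "/forum/index.php"  # hypothetical dynamic URL, not taken from the real site

for label, rules in (("Disallow: /", block_google), ("Disallow:", open_google)):
    for agent in ("googlebot", "msnbot"):  # msnbot stands in for "everyone else"
        verdict = "allowed" if allowed(rules, agent, PATH) else "blocked"
        print(f"{label:12} {agent:10} {verdict}")
```

Under this parser, the first record shuts out only googlebot, and the second record restricts nobody. A robot that finds no record matching its name, and no `User-agent: *` record, is treated as allowed.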
User-agent: *
Disallow: /forum/
Disallow: /mambo/ #not installed yet
I want all search engine spiders to be blocked from spidering my php dynamic pages. But is this safe? Can you take a look at my site to make sure? I will sticky you the url.
User-agent: *
Disallow: /
User-agent: googlebot
Disallow:
For your example code, you would be excluding everything in the forum and mambo directories. Is that what you want? What do your static forum URLs look like?
*- everyone, that is, that respects robots.txt
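To see what the rules in this thread do in practice, here is a rough check with Python's standard `urllib.robotparser`. The paths (`/forum/index.php?showtopic=1`, `/forum/topic-1.html`, `/index.html`) and the bot name `msnbot` are invented for illustration, since the real URL layout isn't shown here. Treat it as a sketch of how one parser reads the rules, not a guarantee of how every spider behaves.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules, agent, path):
    """Return True if `agent` may fetch `path` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(agent, path)


# Block everyone except googlebot (the record with the empty Disallow).
google_only = [
    "User-agent: *",
    "Disallow: /",
    "User-agent: googlebot",
    "Disallow:",
]

# The directory rules from the original question.
block_dirs = [
    "User-agent: *",
    "Disallow: /forum/",
    "Disallow: /mambo/",
]

# Hypothetical URLs: one dynamic, one rewritten to .html, one outside the forum.
DYNAMIC = "/forum/index.php?showtopic=1"
STATIC = "/forum/topic-1.html"
HOME = "/index.html"

print("google_only, googlebot:", allowed(google_only, "googlebot", DYNAMIC))
print("google_only, msnbot:   ", allowed(google_only, "msnbot", DYNAMIC))
for path in (DYNAMIC, STATIC, HOME):
    print("block_dirs,", path, "->", allowed(block_dirs, "msnbot", path))
```

With these assumed paths, the directory rules block the rewritten .html page along with the dynamic one, because both sit under /forum/. That is why the shape of the static URLs matters before choosing what to disallow.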