Robot.txt and Templates

The <meta> robots tag in templates, with specific disallow in robot.txt

juniperwasting

11:22 pm on Mar 10, 2003 (gmt 0)

The store I manage uses a template/database-based language, meaning I have only 3 main pages in the directory (with info I would really want indexed): index.itml, prod.itml, and level.itml. I have a <meta name="robots" content="index, follow"> on all three of these templates, as I encourage as much traffic as possible. The templates then generate pages based on MySQL database node ID numbers and the info stored in each directory. I do have a couple of development regions in the database, and I have attempted to disallow the specific URLs relating to them. The pages in question are still being visited by all the "nice" bots (and a slew of nasty ones). I am wondering if my robot.txt is useless due to the template/database format I am operating with.

heini

12:33 am on Mar 11, 2003 (gmt 0)

juniperwasting, welcome to WebmasterWorld

What you are describing is not robots.txt, but the meta name="robots" tag. In short: that's not the appropriate solution to your problem.
Robots.txt is a file that you put in the root directory of your site. With robots.txt you can set rules telling spiders which directories and files they are allowed to visit.
Since robots.txt is based on a convention, spiders should adhere to those rules, but rogue bots often just don't care.
Anyhow, here's all about robots.txt [searchengineworld.com]
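
To make the robots.txt rules above concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The /dev/ directory, the www.example.com host, and the "ExampleBot" user agent are assumptions made up for illustration; they are not taken from the store in question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block one development directory for all bots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dev/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a rule-abiding bot may fetch url under these rules."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/index.itml",      # hypothetical public page
        "http://www.example.com/dev/index.itml",  # hypothetical dev page
    ):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

This only models what a well-behaved spider would do; a rogue bot simply never runs such a check.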

I would suggest, however, looking into a solution with .htaccess.

juniperwasting

12:52 am on Mar 11, 2003 (gmt 0)

heini

I do have a robot.txt file; in fact, I have had it much longer than the meta tag. I am wondering if I am causing bot confusion.

jdMorgan

2:12 am on Mar 11, 2003 (gmt 0)

juniperwasting,

If a page is disallowed in robots.txt, no 'good' bot will be confused, since it will never fetch that page and thus never see the <meta robots> tag.

Validate your robots.txt file [searchengineworld.com]. If it is valid, I'd suspect some other problem, such as alias link-paths to the files you have disallowed.
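
As a rough illustration of the alias problem: Disallow rules match on the start of the URL path, so a second path to the same page is not covered by a rule written for the first. The sketch below uses Python's standard urllib.robotparser; the paths, host, and bot name are hypothetical, and uncovered_aliases is a helper name invented for this example.

```python
from urllib.robotparser import RobotFileParser


def uncovered_aliases(robots_txt, user_agent, urls):
    """Return the URLs that a rule-abiding bot would still be allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical rules and two hypothetical paths leading to the same dev page.
RULES = """\
User-agent: *
Disallow: /dev/
"""

CANDIDATES = [
    "http://www.example.com/dev/prod.itml",        # matches the /dev/ prefix
    "http://www.example.com/store/dev/prod.itml",  # alias path, prefix differs
]

if __name__ == "__main__":
    for url in uncovered_aliases(RULES, "ExampleBot", CANDIDATES):
        print("still crawlable:", url)
```

Any URL this reports for a page you meant to block would need its own Disallow line.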

Bad 'bots pay no attention to either method of robots control, so do not base your debugging decisions on their behaviour.

HTH,
Jim

juniperwasting

4:12 pm on Mar 11, 2003 (gmt 0)

Thanks jdMorgan,

My file validates, so all I can do now is wait.