
Forum Moderators: goodroi


Robot.txt and Templates

The <meta>Robot tag in templates, with specific disallow in robot.txt

     
11:22 pm on Mar 10, 2003 (gmt 0)

10+ Year Member



The store I manage uses a template/database-based language, meaning I only have three main pages in the directory (with info I would really want indexed): index.itml, prod.itml, and level.itml. I have a <meta name="robots" content="index, follow"> on all three of these templates, as I want to encourage as much traffic as possible. The templates then generate pages based on MySQL database node ID numbers and the info stored in each directory. I do have a couple of development regions in the database, and I have attempted to disallow the specific URLs relating to them. The pages in question are still being visited by all the "nice" bots (and a slew of nasty ones). I am wondering if my robot.txt is useless due to the template/database format I am operating with.
12:33 am on Mar 11, 2003 (gmt 0)

WebmasterWorld Senior Member heini is a WebmasterWorld Top Contributor of All Time 10+ Year Member



juniperwasting, welcome to WebmasterWorld

What you are using is not even robots.txt, but the meta name="robots" tag. In short: that's not the appropriate solution to your problem.
Robots.txt is a file that you put in the root of your site. With robots.txt you can set rules for spiders as to which directories and files they are allowed to visit.
Since robots.txt is only a convention, well-behaved spiders adhere to those rules, but rogue bots often just don't care.
Anyhow, here's all about robots.txt [searchengineworld.com]
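
To make the rule mechanism concrete, here is a minimal sketch using Python's standard urllib.robotparser. The /dev/ and /staging/ paths, the www.example.com host, and the "ExampleBot" user agent are all made up for illustration; they are not the poster's real URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the disallowed paths are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dev/
Disallow: /staging/
"""


def build_parser(robots_txt):
    """Parse robots.txt text and return a ready-to-query parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A rule-abiding spider asks this question before fetching a URL.
blocked = parser.can_fetch("ExampleBot", "http://www.example.com/dev/index.itml")
allowed = parser.can_fetch("ExampleBot", "http://www.example.com/index.itml")
print(blocked, allowed)
```

Matching is a simple prefix test on the URL path, so anything under /dev/ is refused while the main templates stay open to spiders.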

I would suggest, however, looking into a solution with .htaccess.

12:52 am on Mar 11, 2003 (gmt 0)

10+ Year Member



heini

I do have a robot.txt file; in fact, I have had it much longer than the meta tag. I am wondering if I am causing bot confusion.

2:12 am on Mar 11, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



juniperwasting,

If a page is disallowed in robots.txt, no 'good' bot will be confused, since it will never fetch that page and thus never see the <meta robots> tag.

Validate your robots.txt file [searchengineworld.com]. If it is valid, I'd suspect some other problem, such as alias link-paths to the files you have disallowed.
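
One rough way to hunt for alias paths is to run every URL form that reaches the disallowed content through a robots.txt parser and see which ones slip past the rules. The sketch below uses Python's standard urllib.robotparser; the rule, the candidate URLs, and the "ExampleBot" user agent are hypothetical stand-ins, not the actual site's.

```python
from urllib.robotparser import RobotFileParser


def unblocked_urls(robots_txt, urls, user_agent="ExampleBot"):
    """Return the URLs that a rule-abiding bot would still be allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical rule covering only one spelling of the development area.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dev/
"""

# Hypothetical alias forms that might all serve the same development pages.
CANDIDATES = [
    "http://www.example.com/dev/level.itml",
    "http://www.example.com/DEV/level.itml",
    "http://www.example.com/store/dev/level.itml",
]

leaks = unblocked_urls(ROBOTS_TXT, CANDIDATES)
for url in leaks:
    print("not covered by robots.txt:", url)
```

Because rules are matched as case-sensitive path prefixes, only the first candidate is blocked here; the other two would need their own Disallow lines.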

Bad 'bots pay no attention to either method of robots control, so do not base your debugging decisions on their behaviour.

HTH,
Jim

4:12 pm on Mar 11, 2003 (gmt 0)

10+ Year Member



Thanks jdMorgan,

My file validates, so all I can do now is wait.

 
