Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

Robots.txt and Templates
The <meta> robots tag in templates, with specific Disallow rules in robots.txt

 11:22 pm on Mar 10, 2003 (gmt 0)

The store I manage uses a template/database-based language, meaning I have only three main pages in the directory (with info I would really want indexed): index.itml, prod.itml, and level.itml. I have a <meta name="robots" content="index, follow"> tag on all three of these templates, as I want to encourage as much traffic as possible. The templates then generate pages based on MySQL database node ID numbers and the info stored in each directory. I do have a couple of development regions in the database, and I have attempted to disallow the specific URLs relating to them. The pages in question are still being visited by all the "nice" bots (and a slew of nasty ones). I am wondering if my robots.txt is useless due to the template/database format I am operating with.



 12:33 am on Mar 11, 2003 (gmt 0)

juniperwasting, welcome to WebmasterWorld

What you are describing is not even robots.txt, but the <meta name="robots"> tag. In short: that's not the appropriate solution to your problem.
robots.txt is a file that you put in the root of your site. With robots.txt you can set rules for spiders as to which directories and files they are allowed to visit.
Since robots.txt is based on a convention, spiders should adhere to those rules, but rogue bots often just don't care.
Anyhow, here's all about robots.txt [searchengineworld.com]
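
To make the rule matching concrete, here is a minimal sketch using Python's standard urllib.robotparser. The /dev/ path, the ExampleBot user agent, and the example.com URLs are assumptions made up for illustration; they are not taken from the store in question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: /dev/ stands in for a development region.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dev/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant bot may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The disallowed prefix matches the first URL but not the second.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.example.com/dev/index.itml"))  # False
print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.example.com/index.itml"))      # True
```

This only models what a well-behaved spider would do; nothing in it stops a bot that ignores the convention.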

I would suggest, however, looking into a solution with .htaccess.


 12:52 am on Mar 11, 2003 (gmt 0)


I do have a robots.txt file; in fact, I have had it much longer than the meta tag. I am wondering if I am confusing the bots.


 2:12 am on Mar 11, 2003 (gmt 0)


If a page is disallowed in robots.txt, no 'good' bot will be confused, since it will never fetch that page and thus never see the <meta robots> tag.
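
The order of those two checks can be sketched roughly as follows. This is an illustration of how a compliant crawler might behave, not any real search engine's code; crawl_decision, MetaRobots, and the fetch_page callback are hypothetical names, and the /dev/ rule is an assumed example.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobots(HTMLParser):
    """Collects the content of a <meta name="robots"> tag, if present."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            self.content = attributes.get("content")


def crawl_decision(robots_txt, user_agent, url, fetch_page):
    """Check robots.txt first; only read the meta tag if the fetch is allowed.

    `fetch_page` is a hypothetical callback that returns the page's HTML.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(user_agent, url):
        # The page is never requested, so its meta tag is never seen.
        return "skipped: disallowed by robots.txt"
    meta = MetaRobots()
    meta.feed(fetch_page(url))
    return f"fetched: meta robots = {meta.content!r}"
```

With a rule such as "Disallow: /dev/", a URL under /dev/ returns the "skipped" result without fetch_page ever being called, which is why the meta tag on a disallowed page cannot conflict with robots.txt.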

Validate your robots.txt file [searchengineworld.com]. If it is valid, I'd suspect some other problem, such as alias link-paths to the files you have disallowed.
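
As a rough illustration of the alias problem: Disallow rules match on the URL path prefix, so the same page reached through a different path is not covered. The sketch below uses Python's standard urllib.robotparser; every path and the ExampleBot name are invented for the example, since the real URLs were not posted.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule covering one path to a development region.
RULES = """\
User-agent: *
Disallow: /dev/
"""

# Hypothetical URLs that might all serve the same development page.
CANDIDATES = [
    "http://www.example.com/dev/level.itml",        # the path that was disallowed
    "http://www.example.com/Dev/level.itml",        # same path, different case
    "http://www.example.com/store/dev/level.itml",  # alias through another directory
]


def still_crawlable(robots_txt, user_agent, urls):
    """Return the URLs a compliant bot would still be allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


for url in still_crawlable(RULES, "ExampleBot", CANDIDATES):
    print("not covered by robots.txt:", url)
```

Running your own list of dev-region URLs, as they appear in your access logs, through a check like this would show which ones the existing rules actually cover.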

Bad 'bots pay no attention to either method of robots control, so do not base your debugging decisions on their behaviour.



 4:12 pm on Mar 11, 2003 (gmt 0)

Thanks jdMorgan,

My file validates, so all I can do now is wait.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved