Sitemaps, Meta Data, and robots.txt Forum

    
Robots.txt and htaccess
Script to write these files, is there such a thing?
mack
12:05 pm on Sep 25, 2002 (gmt 0)
I was just wondering if there was such a thing as a Perl script that could be used to automatically write to your robots.txt file or your .htaccess file.

This would be very useful if it had some form of admin page where you could specify a user-agent and check a box for allow or disallow, or where you could specify directories that are to be protected with robots.txt.

To be honest I don't even know if this is possible, but it would be a very useful tool for the web developer.
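As a rough sketch of what such an admin script might look like in Perl (this is not an existing tool; the form field names and the robots.txt path are hypothetical examples, and a real script would parse and merge existing records rather than just appending):

#!/usr/bin/perl
# Sketch of the admin script described above: a form posts a user-agent,
# a path, and an allow/disallow choice, and the script appends a matching
# record to robots.txt. Field names and file path are hypothetical.
use strict;
use warnings;
use CGI;

my $q          = CGI->new;
my $robots_txt = '/var/www/html/robots.txt';    # assumed document-root path

my $agent  = $q->param('user_agent') || '*';
my $action = $q->param('action')     || 'disallow';   # 'allow' or 'disallow'
my $path   = $q->param('path')       || '/';

# Append a new record. Note: 'Allow' is a common extension; the original
# robots.txt spec only defines 'Disallow'.
open my $fh, '>>', $robots_txt or die "Cannot open $robots_txt: $!";
print $fh "User-agent: $agent\n";
print $fh ($action eq 'allow' ? "Allow: $path\n\n" : "Disallow: $path\n\n");
close $fh;

print $q->header('text/plain'), "Updated $robots_txt\n";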

 

Damian
12:53 pm on Sep 25, 2002 (gmt 0)

Do you mean a script for people who do not know how to edit these files by hand, or a script that adds entries to these files automatically?

Both are certainly possible, and neither should be hard to write.

For the first case I think it's probably best to just learn the theory and do it by hand, which gives maximum control and minimal bugs :) ... at least if it's only one or a few files that need to be edited. It's not much to learn; a day of study should teach you all you need to know.

The second case could save some time, e.g. blocking rogue spiders automatically.
I know of one script that does part of that: it can automatically block an IP by adding it to an .htaccess file.

It's called Apache Guardian, from xav.com. The feature is called 'blacklist'. It isn't mentioned as one of the main features of the script on the site, I guess because blocking IPs automatically has some drawbacks: blocking innocent IPs, blocking proxy servers, blocking spiders you do not want to block, .htaccess files growing very large, and so on.

The script is not 'industrial strength' though.
If you know some Perl you could probably tweak it to do exactly what you need in your case: clear the .htaccess file automatically as needed, build in better proxy detection, write to robots.txt instead of an .htaccess file, and so on.
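To illustrate the general technique (this is not Apache Guardian's actual code, just a sketch of the "append a deny rule" idea; the .htaccess path is a hypothetical example):

#!/usr/bin/perl
# Sketch of the 'blacklist' idea: append a Deny rule for one IP to an
# .htaccess file. Not Apache Guardian's code; path is hypothetical.
use strict;
use warnings;

my $htaccess = '/var/www/html/.htaccess';
my $bad_ip   = shift @ARGV or die "Usage: $0 <ip-address>\n";

# Basic sanity check so we never write garbage into the server config.
die "Not a valid IPv4 address: $bad_ip\n"
    unless $bad_ip =~ /^\d{1,3}(\.\d{1,3}){3}$/;

# A bare 'Deny from' line relies on Apache's default 'Order deny,allow'.
open my $fh, '>>', $htaccess or die "Cannot open $htaccess: $!";
print $fh "Deny from $bad_ip\n";
close $fh;

print "Blocked $bad_ip in $htaccess\n";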

Maybe others know of better solutions..

circuitjump
3:57 pm on Sep 25, 2002 (gmt 0)

You know who might like that? People who run web hosting companies. They could offer it to their customers, because not many have the time to go and learn how to write an .htaccess file and a robots.txt file. So if you made an app that lets them exclude spiders kept in a database, and also add other spiders that are not yet in the database, that would be neat.

I don't know, really; it's just a thought.
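A minimal sketch of that idea in Perl, assuming the "database" is just a plain text file with one user-agent per line (the file names and paths are hypothetical examples):

#!/usr/bin/perl
# Regenerate robots.txt from a list of spiders to exclude. A plain text
# file stands in for the "database" here; names are hypothetical.
use strict;
use warnings;

my $spider_list = 'blocked_spiders.txt';        # one user-agent per line
my $robots_txt  = '/var/www/html/robots.txt';

open my $in, '<', $spider_list or die "Cannot read $spider_list: $!";
chomp(my @agents = grep { /\S/ } <$in>);
close $in;

open my $out, '>', $robots_txt or die "Cannot write $robots_txt: $!";
print $out "User-agent: $_\nDisallow: /\n\n" for @agents;
close $out;

print "Wrote ", scalar @agents, " records to $robots_txt\n";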

mack
12:55 am on Sep 26, 2002 (gmt 0)

Thanks for both your replies. I was just thinking along the lines of making it easier to do without having to upload your files and without needing a lot of knowledge. A simple, form-based interface that explains everything would be excellent for the beginner.
