I've been away from webmastering for a few years, and I see from all the activity here that much has changed since then.
I was a victim of Email Siphons, and I am now looking for either a place to learn how to write an updated Spider.txt file, or a freeware/shareware tool or sample file I can modify, to keep out the pests while still allowing weekly updates from all the SE spiders.
Will anyone here help this 55 yr old returning mayor/webmaster, or provide a map to locate updated education? TIA if U do!
I think you are looking for help with a robots.txt file, and there has been a great deal of discussion about them here over the last few months, particularly one memorable thread started by Brett with what could be the ultimate robots.txt file.
Try using the search function above to search for robots.txt and see how you go. Holler if you get lost and we'll help out.
>55 yr old
Ah, so there's hope for me yet. ;)
Onya
Woz
OK, I'll bite (not myself or U) where wood I find the ultimate spider.txt :) or do you provide maps ?
I use Putz cuz it fits me well :)
Although we are in the habit of referring to them as "spiders", officially they are web robots. It's just one of those bits of jargon, I guess.
There is an excellent resource at [robotstxt.org...] including links to the spec documents
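As a rough illustration of what a robots.txt file does, here is a minimal sketch that checks one against Python's standard `urllib.robotparser`. The rules, the `/cgi-bin/` path, the `www.example.com` host and the `is_allowed` helper are all assumptions made up for the example, not anything taken from Brett's thread. Keep in mind that robots.txt is only a request: well-behaved robots honor it, while email harvesters usually ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut one harvester out entirely and keep
# every other robot out of /cgi-bin/ only.
ROBOTS_TXT = """\
User-agent: EmailSiphon
Disallow: /

User-agent: *
Disallow: /cgi-bin/
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("EmailSiphon", "Googlebot"):
        for path in ("/index.html", "/cgi-bin/form.pl"):
            url = "http://www.example.com" + path
            print(agent, path, is_allowed(agent, url))
```

The file itself goes in the top-level directory of the site, so that it is reachable as /robots.txt.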
I was sent a file like this...
<Files .htaccess>
deny from all
</Files>
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule ^.* - [F]
RewriteCond %{HTTP_REFERER} ^http://www.iaea.org$
RewriteRule !^http://[^/.]\.your-site.com.* - [F]
What exactly is this, how duz it work and is there a top to it, what duz it look like and where can I find it? I no, a bunch of questions, but the help here is of such great quality, thanks for all the help!
Named appropriately Putz, sumtimes with an R added...;)
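For what it's worth, the listing above is an Apache .htaccess file using mod_rewrite. Each RewriteCond line tests the visitor's User-Agent header against a regular expression, the [OR] flags chain the tests together, and the RewriteRule with [F] answers any match with a 403 Forbidden. The Files block at the top stops the .htaccess file itself from being served. A rough sketch of the user-agent matching logic, written in Python purely for illustration: the `is_forbidden` helper and the sample agent strings are assumptions, only a handful of the patterns are copied over, and Apache's regex engine may differ from Python's in edge cases.

```python
import re

# A few of the user-agent patterns from the listing above, copied verbatim.
BLOCKED_AGENT_PATTERNS = [
    r"^EmailSiphon",
    r"^EmailWolf",
    r"^Mozilla.*NEWT",
    r"^[Ww]eb[Bb]andit",
    r"^Wget",
]


def is_forbidden(user_agent):
    """Mimic the [OR] chain: True if any pattern matches the User-Agent."""
    return any(re.search(pattern, user_agent) for pattern in BLOCKED_AGENT_PATTERNS)


if __name__ == "__main__":
    samples = [
        "EmailSiphon",
        "Wget/1.8.2",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    ]
    for agent in samples:
        print(agent, "-> 403 Forbidden" if is_forbidden(agent) else "-> served")
```

Because every pattern starts with ^, it only matches when the User-Agent begins with that name. Unlike robots.txt, this is enforced by the server, so it works on robots that ignore polite requests, though only while they keep announcing themselves under those names.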