Home / Forums Index / Search Engines / Sitemaps, Meta Data, and robots.txt
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

Robots.txt - What's the Point?

 4:49 pm on May 30, 2003 (gmt 0)

I've been devouring every piece of information I can on the site regarding robots.txt, and I still come to the same question: Why bother?

Isn't it better to have no robots.txt at all and let all the spiders in?



 4:51 pm on May 30, 2003 (gmt 0)

Some spiders harvest email addresses. Do you want them grabbing yours?

Others bombard your server with rapid back-to-back requests and bring it to its knees.

Sometimes you have pseudo-sensitive data you don't want crawled.

Other times you have a development server online that you don't want crawled and indexed; only your production server should be.
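
For illustration, here is a minimal sketch of how a well-behaved spider would read rules covering the cases above. The robots.txt content, the paths `/private/` and `/dev/`, the bot name `ExampleBot`, and the example.com URLs are all made up for this example. It uses Python's standard `urllib.robotparser`. Note that `Crawl-delay` is a non-standard directive that only some crawlers honor.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep every crawler out of two directories
# and ask for a 10-second pause between requests.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
Disallow: /dev/
"""


def build_parser(text):
    """Parse robots.txt text the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler checks each URL before requesting it.
blocked = parser.can_fetch("ExampleBot", "https://www.example.com/private/members.html")
allowed = parser.can_fetch("ExampleBot", "https://www.example.com/index.html")
delay = parser.crawl_delay("ExampleBot")

print(blocked, allowed, delay)
```

For a development server, the same idea would apply with a single `Disallow: /` line in that server's own robots.txt.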



 4:52 pm on May 30, 2003 (gmt 0)

Depends on whether you want to let all the spiders in.

There are places that people just do not want indexed.

Personal data or subscriber data, for example.



 4:53 pm on May 30, 2003 (gmt 0)

You may have pages that you do not want the spiders to bother indexing. It's a waste of bandwidth and time.

I have a few. Another reason on our site is that we duplicate content (with permission) from another website. That would get us penalised by Google if we let its spider in.
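
A rough sketch of that kind of rule, assuming the duplicated content lives under a hypothetical `/syndicated/` directory: one group applies only to Googlebot, while every other spider is left unrestricted. The rules are checked here with Python's standard `urllib.robotparser`; the paths and the `OtherBot` name are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: only Googlebot is kept out of the duplicated content.
# An empty "Disallow:" line means nothing is disallowed for that group.
PER_AGENT_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /syndicated/

User-agent: *
Disallow:
"""

rules = RobotFileParser()
rules.parse(PER_AGENT_ROBOTS_TXT.splitlines())

duplicate_page = "https://www.example.com/syndicated/article1.html"
home_page = "https://www.example.com/index.html"

googlebot_duplicate = rules.can_fetch("Googlebot", duplicate_page)
googlebot_home = rules.can_fetch("Googlebot", home_page)
otherbot_duplicate = rules.can_fetch("OtherBot", duplicate_page)

print(googlebot_duplicate, googlebot_home, otherbot_duplicate)
```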



 4:54 pm on May 30, 2003 (gmt 0)

gibble : Some spiders harvest email addresses

And of course they would obey the robots.txt file ;)



 6:56 pm on May 30, 2003 (gmt 0)

Many rogue spiders don't obey robots.txt. You may need to ban them via .htaccess.


 6:59 pm on May 30, 2003 (gmt 0)

well...yeah...you have a point. .htaccess is much more effective for actually STOPPING a spider

hehe oops :p
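
The point above is that robots.txt is only a request, so actually stopping a spider has to happen on the server. The thread suggests .htaccess for this. The sketch below is not an .htaccess rule but a Python analogue of the same idea: a small WSGI wrapper that refuses requests whose User-Agent header contains a blocked substring. The names `block_rogue_agents`, `hello_app`, and the blocked agent strings are hypothetical, and a spider can spoof its User-Agent, so treat this as an illustration only.

```python
# Hypothetical substrings identifying unwanted spiders (lowercase).
BLOCKED_AGENT_SUBSTRINGS = ("emailharvester", "badbot")


def block_rogue_agents(app, blocked=BLOCKED_AGENT_SUBSTRINGS):
    """Wrap a WSGI app so that blocked user agents receive 403 Forbidden."""

    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(name in agent for name in blocked):
            # Enforced by the server, whether or not the spider reads robots.txt.
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)

    return middleware


def hello_app(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


app = block_rogue_agents(hello_app)
```

Unlike a robots.txt entry, the refusal here does not depend on the spider's cooperation.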

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved