
Forum Moderators: goodroi


Robots.txt - What's the Point?

     
4:49 pm on May 30, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:May 5, 2003
posts:72
votes: 0


I've been devouring every piece of information I can on the site regarding robots.txt, and I still come to the same question: Why bother?

Isn't it better to have no robots.txt at all and let all the spiders in?

4:51 pm on May 30, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 13, 2002
posts:662
votes: 0


Some spiders harvest email addresses. Do you want them grabbing yours?

Others bombard your server with back-to-back requests, fast enough to bring it to its knees.

Sometimes you have pseudo-sensitive data you don't want crawled.

Other times you have a development server online that you don't want crawled and indexed; only your production server should be.

Etc...
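
To make those reasons concrete, here is a minimal sketch of how a well-behaved spider would read such rules, using Python's standard `urllib.robotparser`. The robots.txt content, the paths (`/dev/`, `/private/`) and the bot names (`BadBot`, `GoodBot`) are all made up for illustration. `Crawl-delay` is a non-standard directive that only some crawlers honor.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the paths and the "BadBot" name are assumptions.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /dev/
Disallow: /private/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A polite spider asks before every fetch.
allowed_home = parser.can_fetch("GoodBot", "/index.html")
allowed_dev = parser.can_fetch("GoodBot", "/dev/page.html")
allowed_bad = parser.can_fetch("BadBot", "/index.html")

# Seconds the spider is asked to wait between requests, if it honors the hint.
delay = parser.crawl_delay("GoodBot")
```

Nothing here stops a request: the file only tells a cooperating spider what the site owner would like it to do.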

4:52 pm on May 30, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 5, 2001
posts:2466
votes: 0


It depends on whether you want to let all the spiders in.

There are places that people just do not want indexed:

personal data or subscriber data.

dave

4:53 pm on May 30, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 15, 2003
posts:7242
votes: 0


You may have pages that you do not want the spiders to bother indexing. Crawling them is a waste of bandwidth and time.

I have a few. Another reason on our site is that we duplicate content (with permission) from another website. That would get us penalised by Google if we let the spiders crawl it.

TJ

4:54 pm on May 30, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 5, 2001
posts:2466
votes: 0


gibble: Some spiders harvest email addresses

And of course they would obey the robots.txt file ;)

DaveN

6:56 pm on May 30, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 23, 2000
posts:1277
votes: 0


Many rogue spiders don't obey robots.txt. You may need to ban them via .htaccess.

6:59 pm on May 30, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 13, 2002
posts:662
votes: 0


Well... yeah... you have a point. .htaccess is much more effective for actually STOPPING a spider.

hehe oops :p
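
The difference between asking and enforcing can be sketched in Python as a small WSGI wrapper that refuses requests by User-Agent. This is not an .htaccess rule, only an analogue of the same server-side idea, and the blocked names (`badbot`, `emailharvester`) and helper names are hypothetical. A rogue spider can also fake its User-Agent, so treat this as a rough illustration rather than a complete defence.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical substrings to refuse; real lists depend on your own logs.
BLOCKED_AGENTS = ("badbot", "emailharvester")


def block_bad_bots(app, blocked=BLOCKED_AGENTS):
    """Wrap a WSGI app so matching User-Agents get 403 instead of content."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(name in agent for name in blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware


def site(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Welcome\n"]


def status_for(user_agent):
    """Send one fake request through the wrapper; return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = user_agent
    seen = []

    def start_response(status, headers):
        seen.append(status)

    body = b"".join(block_bad_bots(site)(environ, start_response))
    return seen[0], body
```

Unlike robots.txt, the server makes the decision here, so it does not matter whether the spider chose to read the rules.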

 
