
Forum Moderators: goodroi


No Robots.txt page

what's the harm?

   
9:02 pm on Feb 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Will not having a robots.txt page cause a bot not to spider a site?

And other than an occasional psycho bot... any other harm in not running a robots.txt page?

9:08 pm on Feb 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Will not having a robots.txt page cause a bot not to spider a site?

Robots.txt is designed to EXCLUDE robots - not to invite them in.
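To make that concrete, here is a minimal sketch using Python's standard-library urllib.robotparser. It feeds the parser a file with no rules and checks that nothing is excluded. The bot name "ExampleBot" and the paths are made up for illustration, and this only shows how one standards-following parser behaves, not what any particular search engine does.

```python
from urllib.robotparser import RobotFileParser

# Simulate a robots.txt that contains no rules at all.
parser = RobotFileParser()
parser.parse([])  # no lines, so no exclusions are recorded

# "ExampleBot" and both paths are hypothetical names for illustration.
home_allowed = parser.can_fetch("ExampleBot", "/")
deep_allowed = parser.can_fetch("ExampleBot", "/some/page.html")

print(home_allowed, deep_allowed)  # True True
```

With nothing to exclude, every path is open. A bot that walks away after a 404 on robots.txt is making its own choice, not following the standard.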

9:19 pm on Feb 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Mardi_Gras, that was always my understanding. I have a bot that stops at robots.txt looking for instructions... it gets a 404, leaves, and doesn't pick up any other pages.

I guess I'm just irritated that it leaves all the time... wondering if I can use the robots.txt page as an invite using index/follow instructions

<meta name="robots" content="index,follow">

then create a link to my site map?

9:28 pm on Feb 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just drop in a simple robots.txt and see what happens - it can't hurt to try :)
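For anyone wondering what "simple" could look like, here is a rough sketch. Note that the meta robots tag quoted above belongs in the head of an HTML page, not in robots.txt; the usual permissive robots.txt is a wildcard User-agent line with an empty Disallow. The helper name write_robots and the bot name "SomeBot" are hypothetical, and the check uses Python's standard urllib.robotparser rather than any real crawler.

```python
from urllib.robotparser import RobotFileParser

# Minimal permissive robots.txt: applies to every bot, disallows nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def write_robots(path="robots.txt", body=ALLOW_ALL):
    """Write the file locally; upload it to the site root afterwards."""
    with open(path, "w", encoding="ascii", newline="\n") as f:
        f.write(body)


# Sanity check: parse the same text and confirm nothing is blocked.
parser = RobotFileParser()
parser.parse(ALLOW_ALL.splitlines())

# "SomeBot" and the paths are made-up names for illustration.
sitemap_allowed = parser.can_fetch("SomeBot", "/sitemap.html")
home_allowed = parser.can_fetch("SomeBot", "/")
```

Both checks come back True, so a file like this should not shut anyone out. It just gives bots a 200 instead of a 404.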

9:35 pm on Feb 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Can't hurt to try LOL... actually that's why I was asking, because most of the pages rank fairly well anyway on the SEs I am in, and I didn't want to screw up any of those rankings by "playing around" in the robots.txt.

99% sure it shouldn't cause any problems... it's just that 1% uncertainty that keeps nagging at me.

I'll try it out on one of my other, less important sites.

thanks for your help

7:26 am on Feb 29, 2004 (gmt 0)

10+ Year Member



I'd put one in there just to prevent the error_log growing with 404 errors due to search engines requesting robots.txt.

4:11 am on Mar 2, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I agree... it's so easy and you'll gain from the experience anyway. Not bad to have a task to do that has no risk, no deadline, no cost, eh?

It's easy. Just point your browser to your favorite website and put in the domain name followed by /robots.txt like this:

[webmasterworld.com...] <enter>

and you'll get their robots.txt file. Edit it and upload it to your root directory next to the INDEX file. Here's a clip from Brett's. He mentioned last week that he had to exclude unknown bots because they were hitting his site so hard and costing him bandwidth. Not a bad idea IMHO to start with this one...

paybacksa-----

#
# WebmasterWorld.com: robots.txt
# GNU Robots.txt Feel free to use with credit given to WebmasterWorld.
# Please, we do NOT allow nonauthorized robots any longer.
# [searchengineworld.com...]
# Yes, feel free to copy and use the following.

User-agent: msnbot
Disallow: /

User-agent: scooter
Disallow: /

User-agent: naver
Disallow: /

User-agent: dumbot
Disallow: /

User-agent: Hatena Antenna
Disallow: /
-----truncated by paybacksa
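Before copying a file like this, it may help to see what the rules actually do. The sketch below runs only the truncated clip above through Python's standard urllib.robotparser; the full file presumably had more rules, which are not reproduced here. "ExampleBot" and the path /index.html are made-up names for illustration.

```python
from urllib.robotparser import RobotFileParser

# Only the rules visible in the truncated clip above.
CLIP = """\
User-agent: msnbot
Disallow: /

User-agent: scooter
Disallow: /

User-agent: naver
Disallow: /

User-agent: dumbot
Disallow: /

User-agent: Hatena Antenna
Disallow: /
"""

parser = RobotFileParser()
parser.parse(CLIP.splitlines())

# Which of the named bots are shut out of the whole site?
named = ("msnbot", "scooter", "naver", "dumbot")
blocked = [bot for bot in named if not parser.can_fetch(bot, "/index.html")]

# A bot the clip does not mention ("ExampleBot" is hypothetical).
unlisted_ok = parser.can_fetch("ExampleBot", "/index.html")

print(blocked)      # ['msnbot', 'scooter', 'naver', 'dumbot']
print(unlisted_ok)  # True
```

So as clipped, every bot named in the file is locked out of the entire site and bots not named are unaffected. Anyone who wants msnbot to keep crawling should remove that entry before uploading.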
