Forum Moderators: goodroi

Robots.txt

Do you use it if you want to allow all robots?

     
2:14 pm on Jan 2, 2001 (gmt 0)

New User

10+ Year Member

joined:May 26, 2005
posts:16
votes: 0


Do you use Robots.txt if you want to allow all robots from all engines? Or is it only used when you want to disallow something?
3:48 pm on Jan 2, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member macguru is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Dec 30, 2000
posts:3300
votes: 0


Hi WebSeeker,

I use this file mainly to disallow access to some robots, files or folders. Also to cut down on those 404 errors.
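
For anyone who wants to see the difference between "allow everything" and "disallow something", here is a minimal sketch using Python's standard urllib.robotparser module. The robot name BadBot, the /private/ folder and the example.com URLs are made up for illustration; they are not from anyone's real site.

```python
from urllib.robotparser import RobotFileParser

# An "allow everything" file: an empty Disallow line blocks nothing.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# Hypothetical rules: shut out one robot entirely and keep
# every other robot out of a single folder.
RESTRICTED = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(can_fetch(ALLOW_ALL, "AnyBot", "http://example.com/any/page.html"))
    print(can_fetch(RESTRICTED, "BadBot", "http://example.com/index.html"))
    print(can_fetch(RESTRICTED, "GoodBot", "http://example.com/private/x.html"))
```

So an allow-all file is perfectly valid; it simply contains no restrictions.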

GWJ

3:23 pm on Jan 3, 2001 (gmt 0)

Full Member

joined:June 21, 2000
posts:339
votes: 0


Hi Macguru,

>>Also to cut down on those 404 errors.

I'm sorry, could you go into more detail on this statement? How does it cut down on them?

TIA,

Brian

3:33 pm on Jan 3, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member macguru is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Dec 30, 2000
posts:3300
votes: 0


Hi GWJ,

Maybe I was unclear on that, sorry. My mother tongue is French. (I try hard to make sense... ;) )

When most robots crawl your site, they look for robots.txt at the root level.
If they don't find it, a 404 error is written to the error log. When they do find it, they read it and abide by it.

With an error log filled with 404 errors from robots looking for this file, it is harder to concentrate on "real" 404 errors.
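
To show how much noise this can add, here is a rough sketch that separates robots.txt 404s from the other 404s in a log. It assumes an access log in Common Log Format, which may not match your server's setup, and the sample lines and the function name split_404s are invented for the example.

```python
import re

# Pulls the requested path and the status code out of a Common Log Format
# line, e.g. ... "GET /robots.txt HTTP/1.0" 404 ...
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+)[^"]*" (\d{3})')


def split_404s(log_lines):
    """Return (robots_404s, other_404s) as two lists of requested paths."""
    robots, other = [], []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match or match.group(2) != "404":
            continue  # not a request line we understand, or not a 404
        path = match.group(1)
        (robots if path == "/robots.txt" else other).append(path)
    return robots, other


# Made-up sample lines, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [03/Jan/2001:15:33:00 +0000] "GET /robots.txt HTTP/1.0" 404 210',
    '10.0.0.2 - - [03/Jan/2001:15:34:10 +0000] "GET /old-page.html HTTP/1.0" 404 210',
    '10.0.0.3 - - [03/Jan/2001:15:35:20 +0000] "GET /index.html HTTP/1.0" 200 5120',
]

if __name__ == "__main__":
    robots_404s, other_404s = split_404s(SAMPLE)
    print(len(robots_404s), "robots.txt 404s;", len(other_404s), "other 404s")
```

Putting even an empty robots.txt at the root makes the first list disappear, so only the "real" 404s are left to look at.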

GWJ

12:43 pm on Jan 4, 2001 (gmt 0)

Full Member

joined:June 21, 2000
posts:339
votes: 0


Gotcha MacGuru. Very good English by the way.

Brian

8:30 am on Apr 16, 2001 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38059
votes: 13


One thing I noticed while running the robots.txt crawl/validator over at SEW was how many people were sending robots.txt requests to a 404 page. Often that 404 was "seamless", with no redirect or forward of any kind: you request robots.txt and get their 404 page back. A search engine then has to figure out whether what it received is really a robots.txt or an HTML page. That is usually easy to do (look for <html> or <body> tags), but a frameset page can be confusing. One trick I did find was someone using a frameset page that actually worked as both a robots.txt AND the frameset page. That knowledge will be kept under lock and key; it's the last thing we need on the web.
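
The "look for <html> or <body> tags" check can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular search engine or the SEW validator does it. The function name looks_like_html is made up, and also checking for a <frameset> tag is an added assumption.

```python
import re

# Opening tags that suggest the response is an HTML page rather than a
# robots.txt file. Including "frameset" here is an assumption.
HTML_TAG_RE = re.compile(r"<\s*(?:html|body|frameset)\b", re.IGNORECASE)


def looks_like_html(body: str) -> bool:
    """Guess whether a response fetched from /robots.txt is really HTML."""
    return bool(HTML_TAG_RE.search(body))


if __name__ == "__main__":
    print(looks_like_html("<HTML><body>Page not found</body></html>"))
    print(looks_like_html("User-agent: *\nDisallow:\n"))
```

A check this simple cannot settle a page that is deliberately built to read as both, which is the ambiguous case described above.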