
Robots.txt

Do you use it if you want to allow all robots?

     

WebSeeker

2:14 pm on Jan 2, 2001 (gmt 0)




Do you use Robots.txt if you want to allow all robots from all engines? Or is it only used when you want to disallow something?

Macguru

3:48 pm on Jan 2, 2001 (gmt 0)




Hi WebSeeker,

I use this file mainly to disallow access to some robots, files or folders. Also to cut down on those 404 errors.
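For illustration, a robots.txt of the kind Macguru describes, shutting one robot out entirely and keeping all robots out of a couple of folders, might look like the lines below. The user-agent name and the paths are only placeholders, not taken from anyone's real file.

User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /private/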

GWJ

3:23 pm on Jan 3, 2001 (gmt 0)



Hi Macguru,

>>Also to cut down on those 404 errors.

I'm sorry, could you go into more detail on this statement? How does it cut down?

TIA,

Brian

Macguru

3:33 pm on Jan 3, 2001 (gmt 0)

WebmasterWorld Senior Member macguru is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Hi GWJ,

Maybe I was unclear on that, sorry; my mother tongue is French. (I try hard to make sense... ;) )

When most robots crawl your site, they look for robots.txt at the root level.
If they don't find it, a 404 error is registered in the error log; when they do find it, they read it and abide by it.

With an error log filled with 404 errors from robots looking for this file, it is harder to concentrate on "real" 404 errors.
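To tie this back to the original question: a robots.txt that allows every robot into everything, and whose mere presence stops those 404 log entries, can be as small as the two lines below. An empty Disallow line means nothing is disallowed; an empty file at /robots.txt would do the same job.

User-agent: *
Disallow: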

GWJ

12:43 pm on Jan 4, 2001 (gmt 0)



Gotcha, MacGuru. Very good English, by the way.

Brian

Brett_Tabke

8:30 am on Apr 16, 2001 (gmt 0)




One thing I noticed while doing the robots.txt crawl/validator over at SEW was that so many people were redirecting to a 404. Often that 404 was "seamless", with no redirect or forward of any nature: you just pull the robots.txt and get their 404 page. A search engine then has to figure out whether it is really a robots.txt or an HTML page. That is pretty easy to do (look for <html> or <body> tags), but if it is a frameset page, it can be confusing. One trick I did find was someone using a frameset page that actually worked as both a robots.txt AND as the frameset page. This knowledge will be kept under lock and key - it's the last thing we need on the web.
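A rough sketch of the kind of check Brett describes is shown below, in Python. It simply looks for <html>, <body> or <frameset> markers before treating the fetched text as a robots file; the function names and the sample strings are made up for this example and are not any search engine's actual code.

def looks_like_html(text):
    # A fetched "robots.txt" that contains HTML markup is almost
    # certainly an error or frameset page, not a real robots file.
    lowered = text.lower()
    return "<html" in lowered or "<body" in lowered or "<frameset" in lowered

def parse_if_robots(text):
    # Keep only lines that look like robots.txt directives; if the
    # response looks like HTML, treat it as "no robots.txt at all".
    if looks_like_html(text):
        return []
    directives = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" in line:
            directives.append(line)
    return directives

fake_404 = "<html><body><h1>Not Found</h1></body></html>"
real_file = "User-agent: *\nDisallow: /cgi-bin/"
print(parse_if_robots(fake_404))   # [] - treated as missing
print(parse_if_robots(real_file))  # ['User-agent: *', 'Disallow: /cgi-bin/']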
 
