


MSN and Yahoo both missing robots.txt

     

bppilot

4:43 am on Jul 14, 2004 (gmt 0)

10+ Year Member



I found it interesting, when I used the robots validator (http://www.searchengineworld.com/cgi-bin/robotcheck.cgi), that neither Yahoo nor MSN has a robots.txt file at all: the request just 404s. Google has a very extensive one.

There was an earlier debate about whether the lack of a robots.txt file could lead to problems, but that doesn't seem to be the case if these major players aren't using one.

jdMorgan

5:57 am on Jul 14, 2004 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



The only problem a missing robots.txt leads to is an error log full of 404 errors. That can be cured by putting up a blank file named robots.txt. A blank file is equivalent to allowing all robots access to all pages, and it stops the 404 errors.
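For what it's worth, here is a minimal sketch of how one parser treats a blank file, using Python's standard urllib.robotparser module. The helper name allows() and the example.com URL are made up for illustration, and other crawlers may differ in the details.

```python
from urllib.robotparser import RobotFileParser


def allows(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    `allows` is a hypothetical helper for this illustration. It parses the
    text locally and makes no network request.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


page = "http://example.com/any/page.html"  # placeholder URL, an assumption

blank = ""                                   # an empty robots.txt
allow_all = "User-agent: *\nDisallow:\n"     # explicit "allow everything"
block_all = "User-agent: *\nDisallow: /\n"   # for contrast: block everything

print(allows(blank, page))      # True
print(allows(allow_all, page))  # True
print(allows(block_all, page))  # False
```

The blank file and the explicit "Disallow:" with no path give the same answer here. The only practical difference is that the file exists, so the request no longer shows up as a 404.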

Jim

baggiho

9:42 pm on Jul 14, 2004 (gmt 0)

10+ Year Member



I tried using [searchengineworld.com...] and entered my URL.

It reports lots of errors. Why?

wkitty42

3:26 pm on Jul 29, 2004 (gmt 0)

10+ Year Member



baggiho,

well, assuming that the domain listed in your profile is the one you are talking about, it appears that you have the problem fixed, since you grabbed a copy of the allow-everything robots.txt from searchengineworld...

how do i know where it was copied from? because it says so right in it ;)


==== domain obfuscated for TOS ===================================
07/29/04 11:16:45 Browsing http://****-xxxxxxxx-xxxx.com/robots.txt
Fetching http://xxx-xxxxxxxx-xxxx.com/robots.txt ...
GET /robots.txt HTTP/1.1
Host: xxx-xxxxxxxx-xxxx.com
Connection: close
User-Agent: Sam Spade 1.14

HTTP/1.1 200 OK
Date: Thu, 29 Jul 2004 15:16:25 GMT
Server: Apache/1.3.31 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.3.3 FrontPage/5.0.2.2634a mod_ssl/2.8.18 OpenSSL/0.9.7a
Last-Modified: Thu, 15 Jul 2004 05:08:56 GMT
ETag: "1a093-7b-40f6117e"
Accept-Ranges: bytes
Content-Length: 123
Connection: close
Content-Type: text/plain

# Robots.txt file from http://www.searchengineworld.com
#
# All robots will spider the domain

User-agent: *
Disallow:
==================================================================


;)

anyway, i guess you have this all sussed out now?

 
