Forum Moderators: goodroi

Allow search spider - error 503 socket.gaierror: (7, 'getaddrinfo failed')


shahindastur

8:04 pm on Sep 8, 2005 (gmt 0)

10+ Year Member



Hi,

I need to allow a search spider (Ultraseek) to index content on my server, but so far no luck! I think I've tried everything. Any help is appreciated.

The spider returns this error:
503 socket.gaierror: (7, 'getaddrinfo failed') (config.py:3919):

Copied below is the content of the robots.txt file (I also tried a blank file, which didn't work):

User-agent: Ultraseek
Disallow:

User-agent: *
Disallow: /

Any ideas?

bill

4:24 am on Sep 9, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Welcome to WebmasterWorld, shahindastur.

If this is your robots.txt file:

User-agent: *
Disallow: /

That means disallow all robots from all directories on the site. Ultraseek and any other bot that follows robots.txt will see that and leave your site.

Keep in mind that robots.txt is used to restrict access to areas of your site, not to grant access. We have an entire forum dedicated to this topic: robots.txt [webmasterworld.com].

shahindastur

1:25 pm on Sep 9, 2005 (gmt 0)

10+ Year Member



Hi Bill,

Thanks for your reply.

The robots.txt file does have a record to allow Ultraseek, the spider I would like to let in; everything else is blocked. I also tried a blank file, with no luck.

file content:

User-agent: Ultraseek
Disallow:

User-agent: *
Disallow: /
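
For what it's worth, the record can be sanity-checked with Python's standard-library robots.txt parser. This is just a rough sketch: the URL and the second user-agent string are placeholders, and on older Python 2 installs the module is named robotparser rather than urllib.robotparser.

from urllib.robotparser import RobotFileParser

# The exact file content posted above.
robots_txt = """\
User-agent: Ultraseek
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Ultraseek matches its own record; an empty Disallow blocks nothing.
print(parser.can_fetch("Ultraseek", "http://www.example.com/index.html"))     # True

# Any other crawler falls through to the catch-all record and is blocked.
print(parser.can_fetch("SomeOtherBot", "http://www.example.com/index.html"))  # False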

bill

7:05 am on Sep 11, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



To the best of my knowledge, robots.txt doesn't work that way. Your file as written will disallow everything on your entire site.

tedster

5:39 am on Sep 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I just looked it up, bill, and it looks like the robots.txt that shahindastur posted is correct. The authoritative site, robotstxt.org, has an exactly parallel example:

To allow a single robot

User-agent: WebCrawler
Disallow:

User-agent: *
Disallow: /

[robotstxt.org...]

I'd say the issue appears to be a server configuration problem, more than a problem in the robots.txt file itself. I can't find anything detailed enough on the web, but I would suggest looking into that specific error message. Python is out of my comfort zone, so I can't offer much more than that.
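
One thing that can be ruled in or out quickly: socket.gaierror raised from getaddrinfo() is Python's way of saying a hostname could not be resolved, so the 503 most likely points at DNS or name-resolution configuration on the machine running Ultraseek, not at robots.txt. Here is a minimal check, assuming www.example.com stands in for whatever hostname the spider has been configured to crawl:

import socket

# Substitute the host name Ultraseek is pointed at; example.com is a placeholder.
host = "www.example.com"

try:
    # getaddrinfo() is the call that raises socket.gaierror when resolution fails.
    info = socket.getaddrinfo(host, 80)
    print("Resolved OK:", info[0][4])
except socket.gaierror as err:
    # A failure here reproduces the error class from the 503 message and points
    # at DNS or hosts-file configuration on the Ultraseek machine.
    print("Name resolution failed:", err)

If this fails on the Ultraseek box but the same hostname resolves fine elsewhere, the fix belongs in that server's DNS settings rather than in the robots.txt file.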