Getting 403 Forbidden for robots.txt

Can anyone help

GodLikeLotus

11:17 pm on Dec 17, 2003 (gmt 0)

Last week I read up on robots.txt files and added one to one of our sites. Since then, pages have been vanishing from Google's index.

Today "The Contractor" looked into our missing pages and spotted that robots.txt was returning a "403 Forbidden". I removed the file, but oursite.com/robots.txt was still returning the dreaded 403. I contacted the so-called hosting company, who say:

We have found the root of your cause. You are getting 403 Forbidden error for robots.txt because at times the search engines cache are not cleared and also the ISPs cache are not cleared. They get refreshed periodically, so we need to wait till then. Once the cache gets cleared you will receive appropriate 404 File not found error since you have deleted robots.txt file as you had mentioned earlier.

Can anyone help me? This is just beyond belief now.

panic

11:20 pm on Dec 17, 2003 (gmt 0)

Make sure permissions allow the file to be read in the first place. If you're running *nix (which I'm pretty sure you are), make the file world-readable as follows:

chmod 644 robots.txt

That should solve your problem, but if it doesn't, post a follow up :)
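If you'd rather confirm the current permissions before changing anything, here is a minimal Python sketch. It assumes the web server reads the file as an "other" user (not the owner or group), which is common on shared hosting but not guaranteed; the function names are just illustrative.

```python
import os
import stat


def is_world_readable(path):
    """Return True if the 'other' read bit is set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


def describe_mode(path):
    """Return the mode in ls-style form, e.g. '-rw-r--r--'."""
    return stat.filemode(os.stat(path).st_mode)
```

Calling is_world_readable("robots.txt") from the site's document root should return True once the file is readable by everyone. This only looks at the file's own mode; the directories above it also have to be traversable by the server.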

GodLikeLotus

11:30 pm on Dec 17, 2003 (gmt 0)

The robots.txt has been deleted so we should get the 404. We are getting 403 Forbidden for a file that is not there.

I did try exactly what you said earlier today but still got the 403.

panic

11:33 pm on Dec 17, 2003 (gmt 0)

Hrrrm... are other files (like the index, for instance) getting 403's as well, or is it just the robots.txt?

GodLikeLotus

11:34 pm on Dec 17, 2003 (gmt 0)

just robots.txt

panic

11:39 pm on Dec 17, 2003 (gmt 0)

To see whether this was a server-wide setting, I checked robots.txt on other sites hosted on the same server as the domain you stickied me. Those domains aren't returning 403s, which leads me to believe it's either a permissions problem or an .htaccess problem.

Also try deleting robots.txt altogether and see if it returns a 404 when you try to access it.
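For anyone wanting to repeat this kind of comparison without a browser (and without a browser cache getting in the way), a rough Python sketch using only the standard library follows. The function names are made up for illustration, and it assumes the server answers HEAD requests the same way it answers GET, which is usual but not universal.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, treating 4xx/5xx as results."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 403/404, but the code is exactly what we want to see.
        return err.code


def compare_paths(base_url, paths):
    """Map each path to its status code so robots.txt can be compared with other files."""
    base = base_url.rstrip("/")
    return {path: fetch_status(base + path) for path in paths}
```

Something like compare_paths("http://oursite.com", ["/robots.txt", "/no-such-file.txt"]) would show whether a deleted robots.txt is treated differently from any other missing file; the second path is a placeholder for a name known not to exist.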

GodLikeLotus

11:49 pm on Dec 17, 2003 (gmt 0)

Panic,

Thanks for looking at my problem. I have forwarded what you said and am waiting to hear from my so-called host.

The file is deleted so why we are getting 403 instead of 404 is just crazy.

jdMorgan

12:51 am on Dec 18, 2003 (gmt 0)

A 403-Forbidden response can be specified for any requested resource, whether it exists or not.

On Apache servers, this is typically done in one of two files, or possibly in both. The files are httpd.conf - the server configuration file, and .htaccess - a user-level configuration file that can exist in any or all of your directories.

In those files, the Deny from or RewriteRule directives can be used to block access to various files based on requestor IP address, remote hostname, http_referer and other parameters. You should check to see if you have an .htaccess file that is unintentionally blocking these accesses.
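As a starting point for that check, here is a small Python sketch that flags lines in an .htaccess file using the two directives mentioned above. It is a heuristic under stated assumptions, not a parser: it only recognizes "Deny from" and a RewriteRule whose flag list contains F (or forbidden), and the names DENY_FROM, FORBIDDEN_REWRITE and find_suspect_lines are invented for this example. Other ways of producing a 403 will not be caught.

```python
import re

# Heuristic patterns only; Apache has other ways to return 403.
DENY_FROM = re.compile(r"Deny\s+from\b", re.IGNORECASE)
FORBIDDEN_REWRITE = re.compile(
    r"RewriteRule\b.*\[(?:[^\]]*,)?\s*(?:F|forbidden)\s*(?:,[^\]]*)?\]\s*$",
    re.IGNORECASE,
)


def find_suspect_lines(htaccess_text):
    """Return (line_number, text) pairs for directives that can produce a 403."""
    hits = []
    for number, line in enumerate(htaccess_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if DENY_FROM.match(stripped) or FORBIDDEN_REWRITE.match(stripped):
            hits.append((number, stripped))
    return hits
```

Reading the file with open(".htaccess").read() and passing the text to find_suspect_lines gives a short list of lines worth reading by hand. Whether any of them actually applies to robots.txt still depends on the surrounding conditions (RewriteCond lines, Files sections, and so on).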

On MS servers, similar functionality is available using ISAPI Filters and the control panel.

Jim

panic

1:14 am on Dec 18, 2003 (gmt 0)

On Apache servers, this is typically done in one of two files, or possibly in both. The files are httpd.conf - the server configuration file, and .htaccess - a user-level configuration file that can exist in any or all of your directories.

The domain in question is being hosted by a rather large host. I checked other domains that are hosted on that server, and their robots.txt isn't giving a 403. Is it safe to strike the httpd.conf as a possibility?

GodLikeLotus

1:16 am on Dec 18, 2003 (gmt 0)

All seems to be working fine now, just about to upload a new robots.txt.

jdMorgan

4:59 am on Dec 18, 2003 (gmt 0)

panic,
Yeah, just being thorough... :)
But let's ask...

GLL,
If it's working now, what changed?

Jim