
Getting 403 Forbidden for robots.txt

Can anyone help?

11:17 pm on Dec 17, 2003 (gmt 0)

10+ Year Member



Last week I read up on robots.txt files and added one to one of our sites. Since then, we have had pages vanishing from Google's index.

Today "The Contractor" was looking into our problem of missing pages and spotted that the robots.txt was returning a "403 Forbidden". I then removed the file, but when we looked at oursite.com/robots.txt it was still returning the dreaded 403. I contacted the so-called hosting company, who say:

We have found the root of your cause. You are getting 403 Forbidden error for robots.txt because at times the search engines cache are not cleared and also the ISPs cache are not cleared. They get refreshed periodically, so we need to wait till then. Once the cache gets cleared you will receive appropriate 404 File not found error since you have deleted robots.txt file as you had mentioned earlier.

Can anyone help me? This is just beyond belief now.

11:20 pm on Dec 17, 2003 (gmt 0)

10+ Year Member



Make sure permissions allow the file to be read in the first place. If you're running *nix (which I'm pretty sure you are), set the permissions as follows:

chmod 777 robots.txt

That should solve your problem, but if it doesn't, post a follow up :)
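
For what it's worth, the server only needs read access to serve robots.txt, so 644 should be enough (777 also makes the file world-writable). A quick way to check, assuming you have shell access:

ls -l robots.txt      # "other" needs at least read, e.g. -rw-r--r--
chmod 644 robots.txt  # owner read/write, everyone else read-only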

11:30 pm on Dec 17, 2003 (gmt 0)

10+ Year Member



The robots.txt has been deleted, so we should get a 404. Instead, we are getting 403 Forbidden for a file that is not there.

I did try exactly what you said earlier today but still got the 403.
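
One way to see exactly what the server sends back (a quick sketch, assuming shell access and curl):

curl -I http://oursite.com/robots.txt   # -I asks for headers only; the first line shows the status code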

11:33 pm on Dec 17, 2003 (gmt 0)

10+ Year Member



Hrrrm... are other files (like the index, for instance) getting 403's as well, or is it just the robots.txt?
11:34 pm on Dec 17, 2003 (gmt 0)

10+ Year Member



just robots.txt
11:39 pm on Dec 17, 2003 (gmt 0)

10+ Year Member



I've checked other sites on the same server (from the info you sticky'ed me), just to see if maybe this was a server-wide setting giving 403's for robots.txt on all domains. It isn't giving 403's for the other domains, so that leads me to believe it's either a permissions problem or a .htaccess problem.

Also, try deleting robots.txt altogether and see if it returns a 404 when you try to access it.

11:49 pm on Dec 17, 2003 (gmt 0)

10+ Year Member



Panic,

Thanks for looking at my problem. I have forwarded what you said and am waiting to hear back from my so-called host.

The file is deleted, so why we are getting a 403 instead of a 404 is just crazy.

12:51 am on Dec 18, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



A 403-Forbidden response can be specified for any requested resource, whether it exists or not.

On Apache servers, this is typically done in one of two files, or possibly in both. The files are httpd.conf - the server configuration file, and .htaccess - a user-level configuration file that can exist in any or all of your directories.

In those files, the Deny from or RewriteRule directives can be used to block access to various files based on requestor IP address, remote hostname, http_referer and other parameters. You should check to see if you have an .htaccess file that is unintentionally blocking these accesses.
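
For illustration, here is a hypothetical set of rules, in either httpd.conf or a root-level .htaccess, that would produce exactly this symptom. The access check runs before Apache ever looks for the file on disk, which is why the 403 comes back whether robots.txt exists or not:

# deny all requests for robots.txt, existing or not
<Files "robots.txt">
    Order allow,deny
    Deny from all
</Files>

# or the mod_rewrite equivalent; the [F] flag forces a 403 Forbidden
RewriteEngine On
RewriteRule ^robots\.txt$ - [F]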

On MS servers, similar functionality is available using ISAPI filters and the control panel.

Jim

1:14 am on Dec 18, 2003 (gmt 0)

10+ Year Member



"On Apache servers, this is typically done in one of two files, or possibly in both. The files are httpd.conf - the server configuration file, and .htaccess - a user-level configuration file that can exist in any or all of your directories."

The domain in question is being hosted by a rather large host. I checked other domains that are hosted on that server, and their robots.txt isn't giving a 403. Is it safe to strike the httpd.conf as a possibility?

1:16 am on Dec 18, 2003 (gmt 0)

10+ Year Member



All seems to be working fine now; just about to upload a new robots.txt.
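
For reference, a minimal robots.txt that allows everything (just a sketch; adjust the rules to whatever you actually want crawled or blocked):

User-agent: *
Disallow:
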
4:59 am on Dec 18, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



panic,
Yeah, just being thorough... :)
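
(Worth noting, though, that httpd.conf can carry per-site sections on a shared box, so one domain can still behave differently from its neighbors. A hypothetical example:)

<VirtualHost *:80>
    ServerName oursite.com
    <Files "robots.txt">
        Order allow,deny
        Deny from all
    </Files>
</VirtualHost>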
But let's ask...

GLL,
If it's working now, what changed?

Jim

 
