
Sitemaps, Meta Data, and robots.txt Forum

    
Getting 403 Forbidden for robots.txt
Can anyone help
GodLikeLotus

Msg#: 220 posted 11:17 pm on Dec 17, 2003 (gmt 0)

Last week I read up on robots.txt files and added one to one of our sites. Since then, pages have been vanishing from Google's index.

Today "The Contractor" has been looking into our problem of missing pages and spotted the problem with the robots.txt and the fact that it was giving a "403 Forbidden". I then removed the file and then found that when we looked at oursite.com/robots.txt it was still returning the dreaded 403 still. I contacted the so-called hosting company who say:

"We have found the root of your cause. You are getting 403 Forbidden error for robots.txt because at times the search engines cache are not cleared and also the ISPs cache are not cleared. They get refreshed periodically, so we need to wait till then. Once the cache gets cleared you will receive appropriate 404 File not found error since you have deleted robots.txt file as you had mentioned earlier."

Can anyone help me? This is just beyond belief now.
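
For reference, here is roughly how we have been checking it (oursite.com standing in for the real domain; the headers just ask any caches along the way not to answer from a stale copy):

# Request robots.txt directly, asking intermediate caches to
# revalidate rather than serve a cached response.
curl -sI -H "Cache-Control: no-cache" -H "Pragma: no-cache" http://oursite.com/robots.txt
# The status line shows what the origin server actually returns;
# a 403 Forbidden here cannot be blamed on search engine or ISP caches.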

 

panic

Msg#: 220 posted 11:20 pm on Dec 17, 2003 (gmt 0)

Make sure permissions allow the file to be read in the first place. If you're running *nix (which I'm pretty sure you are), set the permissions so the file is world-readable:

chmod 644 robots.txt

That should solve your problem, but if it doesn't, post a follow up :)
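
One more thing worth checking: the directories above the file, since the server must be able to traverse (execute) each of them to reach robots.txt. Roughly like this (the path is only an example; adjust it for your account):

# Check the file and the directories leading to it; a 403 can
# also come from a directory the web server can't traverse.
ls -l  /home/youracct/public_html/robots.txt
ls -ld /home/youracct /home/youracct/public_html

# World-readable file, traversable directories:
chmod 644 /home/youracct/public_html/robots.txt
chmod 755 /home/youracct /home/youracct/public_html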

GodLikeLotus

Msg#: 220 posted 11:30 pm on Dec 17, 2003 (gmt 0)

The robots.txt has been deleted, so we should get a 404. Instead, we are getting 403 Forbidden for a file that is not there.

I did try exactly what you said earlier today but still got the 403.
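
For what it's worth, here is the comparison that has me baffled (oursite.com again standing in for the real domain):

# The deleted robots.txt versus a file that never existed:
curl -sI http://oursite.com/robots.txt        | head -1   # still 403 Forbidden
curl -sI http://oursite.com/no-such-file.html | head -1   # 404 Not Found, as expected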

panic

Msg#: 220 posted 11:33 pm on Dec 17, 2003 (gmt 0)

Hrrrm... are other files (like the index, for instance) getting 403's as well, or is it just the robots.txt?

GodLikeLotus

Msg#: 220 posted 11:34 pm on Dec 17, 2003 (gmt 0)

just robots.txt

panic

Msg#: 220 posted 11:39 pm on Dec 17, 2003 (gmt 0)

I've checked other sites on the same server you sticky'ed me with, just to see if this was a server-wide setting giving 403s for robots.txt on all domains. It isn't giving 403s for the other domains, so that leads me to believe it's either a permissions problem or a .htaccess problem.

Also try deleting robots.txt altogether and see if it returns a 404 when you try to access it.
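
The spot check was just a loop over the other domains on the box, roughly like this (the domains below are placeholders; substitute the real ones):

# Request robots.txt from each domain on the same server and
# print the status line:
for d in example-one.com example-two.com; do
    printf '%s: ' "$d"
    curl -sI "http://$d/robots.txt" | head -1
done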

GodLikeLotus

Msg#: 220 posted 11:49 pm on Dec 17, 2003 (gmt 0)

Panic,

Thanks for looking at my problem. I have forwarded what you said and am waiting to hear from my so-called host.

The file is deleted, so why we are getting a 403 instead of a 404 is just crazy.

jdMorgan

Msg#: 220 posted 12:51 am on Dec 18, 2003 (gmt 0)

A 403-Forbidden response can be specified for any requested resource, whether it exists or not.

On Apache servers, this is typically done in one of two files, or possibly in both. The files are httpd.conf - the server configuration file, and .htaccess - a user-level configuration file that can exist in any or all of your directories.

In those files, the Deny from or RewriteRule directives can be used to block access to various files based on requestor IP address, remote hostname, http_referer and other parameters. You should check to see if you have an .htaccess file that is unintentionally blocking these accesses.
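
For illustration, rules along these lines would produce exactly this symptom, a 403 for robots.txt whether or not the file exists (hypothetical examples, not necessarily what is in your file):

# A <Files> section denying the name outright...
<Files "robots.txt">
    Order allow,deny
    Deny from all
</Files>

# ...or a rewrite rule forcing a Forbidden response:
RewriteEngine On
RewriteRule ^robots\.txt$ - [F]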

On MS servers, similar functionality is available using ISAPI filters and the control panel.

Jim

panic

Msg#: 220 posted 1:14 am on Dec 18, 2003 (gmt 0)

"On Apache servers, this is typically done in one of two files, or possibly in both. The files are httpd.conf - the server configuration file, and .htaccess - a user-level configuration file that can exist in any or all of your directories."

The domain in question is being hosted by a rather large host. I checked other domains that are hosted on that server, and their robots.txt isn't giving a 403. Is it safe to strike the httpd.conf as a possibility?

GodLikeLotus

Msg#: 220 posted 1:16 am on Dec 18, 2003 (gmt 0)

All seems to be working fine now; just about to upload a new robots.txt.

jdMorgan

Msg#: 220 posted 4:59 am on Dec 18, 2003 (gmt 0)

panic,
Yeah, just being thorough... :)
But let's ask...

GLL,
If it's working now, what changed?

Jim
