
Sitemaps, Meta Data, and robots.txt Forum

    
Does a search engine care if a robots.txt file returns 403 instead of 404?
kahuna




msg:1526590
 11:53 am on Mar 27, 2004 (gmt 0)

I recently moved a site to a new host. I don't use a robots.txt file (no need), and leaving it out lets me easily see every search engine that visits the site, using a CGI error-reporting program I run.

But the server is returning 403 errors for everything that should be a 404, particularly the robots.txt file...

Do search engines care if a nonexistent robots.txt file
returns a 403 error instead of the correct 404 error?

Thanks group.

 

kahuna




msg:1526591
 3:10 pm on Mar 27, 2004 (gmt 0)

I got this back from somebody else here:
"There was (maybe is) an Apache-derived server returning 403 instead of 404.
This was a big enough problem for Google to switch their behaviour and crawl domains where /robots.txt returned 403.
That was some time ago, but if it's still the case then 403 for /robots.txt should be no problem."

I apologize for posting this question in more than one forum, but I was really worried that the site might not get listed because of the 403 errors. I uploaded a blank robots.txt file just in case.
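
For anyone wondering what a blank robots.txt means to a crawler, here is a minimal sketch using the `urllib.robotparser` module from Python's standard library. It feeds the parser an empty file and asks whether a URL may be fetched. The helper name `allowed_by_blank_robots` and the example URL are only illustrative, and this shows how one parser behaves, not how any particular search engine is implemented.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_blank_robots(user_agent, url):
    """Ask Python's robots.txt parser whether `url` is allowed when the file is empty.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse([])  # an empty robots.txt has no lines and therefore no rules
    return parser.can_fetch(user_agent, url)
```

With no rules to apply, the parser allows the request, which is why a blank file is a harmless way to get a clean 200 response for /robots.txt.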

jdMorgan




msg:1526592
 3:45 pm on Mar 27, 2004 (gmt 0)

kahuna,

> But the server is returning 403.

... And that response is incorrect, and you should find out why. Just because Google has a work-around doesn't mean that others do. And maybe their work-around will be accidentally removed some day... Unlikely, but do you want to bet your business?

I strongly suggest you find out why your server is incorrectly returning a 403 server response and fix it.

If your site does not meet HTTP specifications, it is a disaster waiting to happen. 403 has a meaning and 404 has a meaning, and they are definitely not interchangeable! Also, put up at least a blank robots.txt file and avoid all those 404s in your logs!

Jim
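
To make the "not interchangeable" point concrete, here is a rough sketch of how a crawler might turn the status code of /robots.txt into a crawl decision. The 403 and 404 branches mirror the rule that, as far as I know, Python's standard-library `urllib.robotparser` applies (401 and 403 mean "stay out", other 4xx codes mean "no restrictions"). The function name `crawl_decision` and the returned labels are made up for illustration, and real search engines may well decide differently.

```python
def crawl_decision(status):
    """Map the HTTP status returned for /robots.txt to a crawl decision.

    Illustrative policy only; the labels are hypothetical.
    """
    if status in (401, 403):
        # Access denied: a cautious crawler treats the whole site as off limits.
        return "disallow_all"
    if 400 <= status < 500:
        # Not found (or similar): no robots.txt, so nothing is restricted.
        return "allow_all"
    if 200 <= status < 300:
        # The file exists: read it and follow its rules.
        return "parse_rules"
    # Redirects and server errors are left undecided in this sketch.
    return "undecided"
```

Under this policy a 403 keeps the crawler away from the entire site, while a 404 lets it crawl everything, so a server that answers 403 for a missing file can produce exactly the opposite of what the site owner wants.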

kahuna




msg:1526593
 6:20 pm on Mar 27, 2004 (gmt 0)

Thanks again, JD. I did put up a blank robots.txt file early this morning when I started to question what was happening. This is a site I moved off my main servers to another host to satisfy the search engines. It is a virtual host, but I have "tested" many hosting services and never saw this before.
The hosting company told me the following, where the root folder www.mydomain.com/ is the one they call "control":

"This is because the folder is control. If you look for the same file in an uncontrolled folder you will get the 404 error. www.mydomain.com/images/robots.txt "
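
A quick way to check the host's explanation is to request the same missing file from both folders and compare the status codes. The sketch below uses `http.client` from Python's standard library; the helper names `fetch_status` and `compare_paths` are hypothetical, and it assumes a plain HTTP server on port 80.

```python
import http.client


def fetch_status(host, path, port=80, timeout=10):
    """Return the HTTP status code the server sends for a GET of `path`."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def compare_paths(host, paths, fetch=fetch_status):
    """Fetch each path and return a {path: status} mapping.

    `fetch` can be swapped out, for example to try the logic without a network.
    """
    return {path: fetch(host, path) for path in paths}
```

Calling `compare_paths("www.mydomain.com", ["/robots.txt", "/images/robots.txt"])` with neither file in place should, if the host's description is accurate, report 403 for the first path and 404 for the second.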

And I completely agree that an apple is an apple and an orange is an orange when it comes to the configuration of the server and the errors it generates.

The operating system is: Linux 2.2.19-6.2.11
The web server is: Apache/1.3.9 (Unix) (Red Hat/Linux) PHP/4.3.1

Thanks
K.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved