Sitemaps, Meta Data, and robots.txt Forum

    
What was wrong with this robots.txt if anything
montenegro




msg:1528502
 10:58 am on Apr 15, 2005 (gmt 0)

I am just about to start screaming. I have just discovered by accident that my robots.txt file had a huge problem. I checked it before uploading, and the Search Engine World Robots.txt Validator did not report any problems then and still doesn't. This evening I used a META TAG ANALYZER (available on many sites on the web) and got an "Error: 403 Forbidden by robots.txt" message for all pages, including mydomain.com/. The META TAG ANALYZER returned a correct result only for mydomain.com (without the trailing forward slash).
Here is the robots.txt that I used for the last 3 years:
User-agent: *
Disallow: /private
Disallow: /cgi-bin
Disallow: /scgi-bin
Disallow: /cpanel-upgrade
User-Agent: sitecheck.internetseer.com
Disallow: /
User-Agent: NPBot-1/2.0
Disallow: /
User-agent: linksmanager
Disallow: /
User-agent: Cyveillance
Disallow: /

After I changed the code to the following, every page became visible to the tool (no "Error: 403 Forbidden by robots.txt" returned):
User-agent: *
Disallow: /private
Disallow: /cgi-bin
Disallow: /scgi-bin
Disallow: /cpanel-upgrade
User-Agent: sitecheck.internetseer.com
Disallow: /
User-Agent: NPBot-1/2.0
Disallow: /
User-agent: linksmanager
Disallow: /
User-agent: Cyveillance
Disallow: /

My questions are:
1. Is it possible that because of this my rankings have been affected for the last three years?
2. Is this a bug in the actual tool? All pages on my site are indexed.
ANY COMMENTS?

 

Reid




msg:1528503
 6:32 am on Apr 16, 2005 (gmt 0)

I'd try a few different META analyzer tools; maybe this one is using one of the robots you are blocking.

montenegro




msg:1528504
 8:36 am on Apr 16, 2005 (gmt 0)

An important correction to my post. The original post above did not reflect the changes in my robots.txt file.

robots.txt file that I used for 3 years:

User-agent: *
Disallow: /private
Disallow: /cgi-bin
Disallow: /scgi-bin
Disallow: /cpanel-upgrade
User-Agent: sitecheck.internetseer.com
Disallow: /
User-Agent: NPBot-1/2.0
Disallow: /
User-agent: linksmanager
Disallow: /
User-agent: Cyveillance
Disallow: /

After I changed the code to the following, every page became visible to the tool (no "Error: 403 Forbidden by robots.txt" returned):

User-agent: *
Disallow: /private
Disallow: /cgi-bin
Disallow: /scgi-bin
Disallow: /cpanel-upgrade

User-Agent: sitecheck.internetseer.com
Disallow: /
User-Agent: NPBot-1/2.0
Disallow: /
User-agent: linksmanager
Disallow: /
User-agent: Cyveillance
Disallow: /

The only change is a blank line inserted after "Disallow: /cpanel-upgrade"

My questions are:
1. Is it possible that because of this my rankings have been affected for the last three years?
2. Is this a bug in the actual tool? All pages on my site are indexed.
ANY COMMENTS?
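
One possible explanation, not confirmed anywhere in this thread: the original robots.txt standard describes records as separated by blank lines, so a tool that reads the file strictly could fold the whole no-blank-line version into a single record, where "User-agent: *" ends up sharing the later "Disallow: /" lines. The Python sketch below is a hypothetical strict reader written only to illustrate that idea. The names parse_strict and is_allowed and the agent name "ExampleChecker" are made up for this example, and nothing here is known to match how the META TAG ANALYZER actually works.

```python
ORIGINAL = """\
User-agent: *
Disallow: /private
Disallow: /cgi-bin
Disallow: /scgi-bin
Disallow: /cpanel-upgrade
User-Agent: sitecheck.internetseer.com
Disallow: /
User-Agent: NPBot-1/2.0
Disallow: /
User-agent: linksmanager
Disallow: /
User-agent: Cyveillance
Disallow: /
"""

# Same file with one blank line after "Disallow: /cpanel-upgrade".
FIXED = ORIGINAL.replace("Disallow: /cpanel-upgrade\n",
                         "Disallow: /cpanel-upgrade\n\n")


def parse_strict(text):
    """Split robots.txt into (agents, rules) records on blank lines ONLY.

    Assumption: this mimics a very literal reading of the original
    standard; many real parsers are more lenient than this.
    """
    records, agents, rules = [], [], []
    for raw in text.splitlines() + [""]:  # trailing "" flushes the last record
        if not raw.strip():
            if agents:
                records.append((agents, rules))
            agents, rules = [], []
            continue
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # comment-only line, not a record boundary
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            agents.append(value.lower())
        elif field == "disallow" and value:
            rules.append(value)
    return records


def is_allowed(records, agent, path):
    """Prefix-match path against the record chosen for this agent."""
    agent = agent.lower()
    chosen = None
    for agents, rules in records:  # a record naming the agent wins
        if any(a != "*" and a in agent for a in agents):
            chosen = rules
            break
    if chosen is None:  # otherwise fall back to the "*" record
        for agents, rules in records:
            if "*" in agents:
                chosen = rules
                break
    if chosen is None:
        return True
    return not any(path.startswith(rule) for rule in chosen)


if __name__ == "__main__":
    for label, text in (("no blank line", ORIGINAL), ("blank line", FIXED)):
        recs = parse_strict(text)
        print(label, "records:", len(recs),
              "/index.html allowed:",
              is_allowed(recs, "ExampleChecker", "/index.html"))
```

Under this strict reading the version without the blank line is one record, so every path beginning with "/" is refused for every agent, while the version with the blank line is two records and only the four listed paths are refused for ordinary agents. A lenient parser that starts a new record at each "User-agent" line after a "Disallow" would treat both files the same, which may be why the validator reported no problem.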

Span




msg:1528505
 9:25 am on Apr 16, 2005 (gmt 0)

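The blank line makes no difference to most crawlers, as far as I know, although the original robots.txt standard does say that records are separated by blank lines, so a strict tool may care.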
You should, however, if you want to exclude directories from being spidered, use a trailing slash:

Disallow: /private/

- without the slash at the end the rule is matched as a prefix, so it also blocks files such as /private.html; with the slash only the contents of the directory are excluded.
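
To see how one particular parser treats the two forms, here is a small check using Python's standard urllib.robotparser module. It is only a sketch: the helper name allowed, the agent name "ExampleBot" and the example.com URLs are placeholders, and other crawlers are not guaranteed to behave the same way as this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(rule, path):
    """Return True if a generic bot may fetch `path` under `Disallow: rule`."""
    rp = RobotFileParser()
    rp.parse(["User-agent: *", f"Disallow: {rule}"])
    # "ExampleBot" is a made-up agent name; it falls under the "*" record.
    return rp.can_fetch("ExampleBot", "http://example.com" + path)


if __name__ == "__main__":
    for rule in ("/private", "/private/"):
        for path in ("/private/page.html", "/private.html", "/public/index.html"):
            print(f"Disallow: {rule:<10} {path:<22} allowed={allowed(rule, path)}")
```

With this parser, "Disallow: /private" refuses both /private/page.html and /private.html, while "Disallow: /private/" refuses only paths inside the directory.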

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved