
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
GWT URL Error (on my Blogger.com site)
"URL restricted by robots.txt" - without a robots file
adfree




msg:3468728
 6:57 am on Oct 4, 2007 (gmt 0)

I'm getting quite a few URL restriction alerts in GWT saying that my robots.txt file restricts those URLs.

Well, I don't have a robots.txt file (it's a blogger.com site and I haven't figured out yet how to put one in the root, hints welcome).

What's the issue here?

Thanks!

 

Drew_Black




msg:3469078
 2:53 pm on Oct 4, 2007 (gmt 0)

I think there's a bug in GWT. I reported it in one of their forums. I've noticed that if there's a 301 redirect from Page A to Page B, and Page A is disallowed in robots.txt, then Page B will be reported as "Restricted by robots.txt" in GWT.

silverbytes




msg:3473194
 10:02 pm on Oct 9, 2007 (gmt 0)

I have exactly the same issue and sadly don't know how to fix it. All my other sites are OK, but my Blogger blog has 19 errors of this kind:

http://mysite.blogspot.com/search/label/alojamiento URL restricted by robots.txt [?] Sep 30, 2007

I think I have no robots.txt there... how do we fix that?

[edited by: tedster at 6:10 am (utc) on Oct. 10, 2007]
[edit reason] delink [/edit]

Drew_Black




msg:3473409
 3:40 am on Oct 10, 2007 (gmt 0)

I don't think you can fix it if someone is doing a 301 redirect to your site from a site that you don't control. This is the nature of the bug. The destination page is appearing as disallowed by robots.txt when it's the source page that did the 301 that should really be listed.

Example:

example.com has an outbound traffic tracking script that records outbound clicks using a page like http://example.com/click.php. Click.php is listed under Disallow: in example.com's robots.txt for Googlebot (or for *). When a user clicks the link to http://yoursite.com/yourpage.html, the click.php page records the click and redirects the user to your site with an HTTP 301. For some reason GWT is reporting the destination URL as blocked by robots.txt.
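To make the scenario above concrete, here is a minimal Python sketch that checks each hop of the redirect against the robots.txt of its own host. The robots.txt contents in `ROBOTS_BY_HOST` are assumptions made up for illustration (the thread only says that click.php is disallowed), and `is_blocked` is a hypothetical helper, not anything GWT exposes.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents per host (hypothetical, for illustration only).
ROBOTS_BY_HOST = {
    "example.com": "User-agent: *\nDisallow: /click.php\n",
    "yoursite.com": "",  # assumed: nothing disallowed on the destination site
}

def is_blocked(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the URL's own host disallows it for user_agent."""
    parser = RobotFileParser()
    rules = ROBOTS_BY_HOST.get(urlparse(url).netloc, "")
    parser.parse(rules.splitlines())
    return not parser.can_fetch(user_agent, url)

# The chain from the example: click.php 301-redirects to the destination page.
redirect_chain = [
    "http://example.com/click.php",
    "http://yoursite.com/yourpage.html",
]

# Only the hops that are disallowed by their own host's robots.txt.
blocked_hops = [url for url in redirect_chain if is_blocked(url)]
print(blocked_hops)
```

Under these assumptions only the click.php hop is disallowed, so an error attached to the destination URL would not come from the destination site's own robots.txt.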

I have this happening on many hundreds of links.

[edited by: tedster at 6:11 am (utc) on Oct. 10, 2007]
[edit reason] de-link [/edit]

adfree




msg:3473553
 7:30 am on Oct 10, 2007 (gmt 0)

Thanks Drew, know of any negative impact?

Susan Moskwa




msg:3474290
 11:10 pm on Oct 10, 2007 (gmt 0)

Hi folks--
I work with the Google webmaster tools team and can clarify some of these issues for you:

All Blogger blogs have a robots.txt file automatically created for them (add "/robots.txt" to the end of your blog's URL and you'll see yours). These files all disallow the /search directory, which is part of the path when you're viewing all of your blog posts that have a particular label. Disallowing crawlers from this section of your blog basically keeps them from crawling and indexing the same blog post in multiple places (on its permalink URL *and* under each of its labels), which reduces potential problems with duplicate content.
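As a rough illustration of the rule described above, the following Python sketch feeds a simplified robots.txt to the standard library's `urllib.robotparser` and tests two URLs against it. The file contents are an assumption based only on the statement that `/search` is disallowed (the real file is visible at your blog's URL plus `/robots.txt`), and the permalink URL is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed, simplified robots.txt; the real Blogger file may contain more rules.
BLOGGER_STYLE_ROBOTS = """\
User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(BLOGGER_STYLE_ROBOTS.splitlines())

# Label URL taken from the error report earlier in this thread.
label_url = "http://mysite.blogspot.com/search/label/alojamiento"
# Hypothetical permalink URL for a single post.
permalink_url = "http://mysite.blogspot.com/2007/10/some-post.html"

label_allowed = parser.can_fetch("Googlebot", label_url)
permalink_allowed = parser.can_fetch("Googlebot", permalink_url)

print("label page crawlable:", label_allowed)
print("permalink crawlable:", permalink_allowed)
```

With this assumed file the label page falls under `/search` and is disallowed, while the permalink remains crawlable, which matches the "URL restricted by robots.txt" entries reported for label URLs.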

adfree and silverbytes, it sounds like this is the cause of the errors you're seeing in webmaster tools. There's no way to get rid of them, since Blogger doesn't let you edit your robots.txt file, but they're not something you need to worry about (since those URLs were disallowed deliberately).

Drew_Black, it sounds like the 301 redirect issue you're talking about may be unrelated to Blogger? If there's a 301 redirect from page A to page B and page *B* is blocked by a robots.txt file, then page *A* will show a "Restricted by robots.txt" error in webmaster tools (there's a blog post from September '06 on the Google Webmaster Central blog with more details about this). But the opposite shouldn't be true (page A is blocked but page B shows the error). I believe I've found your thread in our Webmaster Help Group (a search for [chaosunlimited destination] in our Help Group returns your question, right?); could you post an example URL there so that we can look into the issue further? Thanks!
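The expected behavior described above can be sketched as a toy model in Python. This is only an interpretation of the post, not how webmaster tools is actually implemented: the function name `reported_as_restricted`, the `redirects` mapping, and the `blocked` set are all hypothetical.

```python
def reported_as_restricted(url, redirects, blocked):
    """Toy model: a URL is reported if it is blocked itself, or if the
    page it redirects to (directly or through further hops) is blocked."""
    seen = set()
    while url not in seen:
        if url in blocked:
            return True
        seen.add(url)
        if url not in redirects:
            return False  # end of the chain, nothing blocked along the way
        url = redirects[url]
    return False  # redirect loop with no blocked page in it

# Page A 301-redirects to page B.
redirects = {"A": "B"}

# Case described as expected: B is blocked, so A shows the error.
a_when_b_blocked = reported_as_restricted("A", redirects, blocked={"B"})

# Case described as unexpected: A is blocked, B should not show the error.
b_when_a_blocked = reported_as_restricted("B", redirects, blocked={"A"})

print(a_when_b_blocked, b_when_a_blocked)
```

In this model the error only propagates backwards along the redirect, from the blocked target to the page that points at it, which is why an error on page B when only page A is blocked would look like a bug.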

tedster




msg:3474301
 11:24 pm on Oct 10, 2007 (gmt 0)

Hello Susan. Welcome to the forums and thanks for pitching in on these questions.

I especially appreciate the insight into the Blogger robots.txt issues. Good to know that Google has preventative steps in place to avoid those nasty duplicate URL issues.

Also thanks for respecting our policies about links and taking the example URL discussion with Drew_Black over to Google's Webmaster Help Group.

adfree




msg:3475648
 7:28 am on Oct 12, 2007 (gmt 0)

Neat Susan, this is helpful!
Thanks.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved