Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Google cached my robots.txt file
atlrus

10+ Year Member



 
Msg#: 3234479 posted 1:44 am on Jan 28, 2007 (gmt 0)

This just killed me - I don't know what to think, so I am putting it out there just to see what you guys think of it:

When I do a site: search, one of the pages Google has looks like this:


User-agent: * Disallow: /file/ (this is the title)
User-agent: * Disallow: /file/ (this is the description)
www.webmasterworld.com/robots.txt - 1k - Supplemental Result

Am I missing something here? Seems like there must be some very simple explanation for why G cached my robots.txt, making up its own title and all. The original file is fine, and it's on two lines, unlike the cached version:

User-agent: *
Disallow: /file/

Maybe I made a mistake, or misspelled a word?
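
For anyone who wants to rule out a typo in the file itself, here is a minimal sketch using Python's standard-library `urllib.robotparser` to read the two-line file quoted above. The sample paths (`/file/report.html`, `/index.html`) and the `build_parser` helper are made up for illustration, and this only shows how one standards-style parser reads the rules; it says nothing about how Googlebot treats an indexed robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /file/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical paths, chosen only to exercise the Disallow rule.
for path in ("/file/report.html", "/index.html"):
    verdict = "allowed" if parser.can_fetch("Googlebot", path) else "disallowed"
    print(f"{path}: {verdict}")
```

If the file parses this way, the rules are at least well-formed, and the one-line version in the search result is most likely just how the snippet collapses whitespace.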

 

g1smd

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3234479 posted 10:10 pm on Jan 29, 2007 (gmt 0)

As far as I know, it only gets indexed if someone somewhere links to it.

Google does index text files you know.

atlrus

10+ Year Member



 
Msg#: 3234479 posted 12:10 am on Jan 30, 2007 (gmt 0)

Google does index text files you know.

I didn't think Google looks at robots.txt as just a text file - Google always looks for robots.txt, and if what you say is true, everyone's robots.txt would have been cached.

MThiessen

10+ Year Member



 
Msg#: 3234479 posted 2:29 am on Jan 30, 2007 (gmt 0)

Maybe not, atlrus. I think there has to be at least one link to it somewhere for it to show up in the search.

atlrus

10+ Year Member



 
Msg#: 3234479 posted 2:44 am on Jan 30, 2007 (gmt 0)

Not true. I can have a page indexed without any links to it.

Quadrille

WebmasterWorld Senior Member quadrille is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3234479 posted 2:54 am on Jan 30, 2007 (gmt 0)

Sure, if you continually submit it to Google.

But barring pushing the envelope, only URLs with links stay in the index. That's how Google works; and why no site ever needs submitting; just link to it, and Google will, er, follow the links :)

There's never a need to link to a robots.txt file. Google will find that if the domain is indexed.

On the other hand, having that in Google's index really will do no harm (and no good). Best just to be normal, however - too much experimenting can be harmful to your income ;)

[edited by: Quadrille at 2:55 am (utc) on Jan. 30, 2007]

atlrus

10+ Year Member



 
Msg#: 3234479 posted 5:03 am on Jan 30, 2007 (gmt 0)

On the other hand, having that in Google's index really will do no harm (and no good). Best just to be normal, however - too much experimenting can be harmful to your income ;)

Yeah, but if Google has indexed a robots.txt, doesn't this mean that it does not look at it as a robots.txt but as just a simple text file? Will it obey the disallow?

P.S. or I can submit a page through the sitemaps :)

grandpa

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3234479 posted 8:24 am on Jan 30, 2007 (gmt 0)

I doubt that GoogleBot will disregard your robots.txt as a result of also having it indexed. A quick search for the phrase turns up 3 of the largest web sites (popularity-wise) with an indexed robots.txt file - WW, Whitehouse and Google's own. Brett or an administrator could confirm if robots.txt is being disregarded for this site. I will speculate and say 'it ain't so'.

Is your robots.txt listed in your sitemap file? It seems to me that *might* be considered a link.
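
A quick way to check that is to scan the sitemap for any entry pointing at robots.txt. The sketch below assumes a standard sitemaps.org XML sitemap; the `SAMPLE_SITEMAP` content, the example.com URLs, and the `robots_urls_in_sitemap` helper are all invented for illustration, so substitute the contents of your own sitemap file.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by sitemaps.org XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, for illustration only.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/robots.txt</loc></url>
  <url><loc>https://www.example.com/file/report.html</loc></url>
</urlset>
"""


def robots_urls_in_sitemap(sitemap_xml: str) -> list[str]:
    """Return every <loc> in the sitemap whose path is /robots.txt."""
    root = ET.fromstring(sitemap_xml)
    locs = (
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    )
    return [url for url in locs if urlparse(url).path == "/robots.txt"]


print(robots_urls_in_sitemap(SAMPLE_SITEMAP))
```

An empty list means the sitemap is not what is pointing Google at the file, so the link (if there is one) is coming from somewhere else.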

MHes

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3234479 posted 8:38 am on Jan 30, 2007 (gmt 0)

We have robots.txt indexed as well, and it happened 5 days before we got hit by the 950 penalty (22 Dec). At the time, it was listed as our number 1 page on a site: search, though this may have been meaningless. Since then it went supplemental. I'm not sure if Google stopped obeying it during that time, but previously (and for many years) we had disallowed pages listed as URLs only when doing a site: search; then they just disappeared and have only just returned to normal.

I'm beginning to wonder if all this was and is related to the 950 penalty. Many members reported having their supplementals listed above their main pages when first hit by the 950 penalty... I wonder how many have a robots.txt indexed as well?

Quadrille

WebmasterWorld Senior Member quadrille is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3234479 posted 11:31 am on Jan 30, 2007 (gmt 0)

P.S. or I can submit a page through the sitemaps

You surely can; you surely can. But it remains a pointless exercise. The way to effectively have a page indexed is to link to that page. Period. "Forcing" Google, repeatedly, to include a file that shouldn't be there does not sound like an appropriate use of sitemaps or your time; indeed, getting a robots.txt indexed does not sound particularly useful, either.

Whether Google cares either way, I couldn't know. But if they do care, you can bet that's not 'care' as in 'fond affection'.

If you 'care' about your site, I'd suggest you stop playing games with it - sooner or later, the dragon will stir.

Never forget the Hogwarts motto: "Draco dormiens nunquam titillandus," which means "Never tickle a sleeping dragon." ;)

[edited by: Quadrille at 11:32 am (utc) on Jan. 30, 2007]

g1smd

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3234479 posted 4:54 pm on Jan 30, 2007 (gmt 0)

I wonder what Disallow: /robots.txt does?

Quadrille

WebmasterWorld Senior Member quadrille is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3234479 posted 5:22 pm on Jan 30, 2007 (gmt 0)

I don't know.

But I'll bet it's not pretty. ;)

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved