
Google SEO News and Discussion Forum

    
Google indexing robots.txt file
kunwarbs




msg:3109338
 11:19 am on Oct 5, 2006 (gmt 0)

Interesting to see that Google has indexed and cached the robots.txt files of reputable websites like nytimes, BBC, and Google itself...

[google.com...]

 

Brett_Tabke




msg:3109347
 11:38 am on Oct 5, 2006 (gmt 0)

Yeah, it's been mentioned a lot in the 4 years they've been doing it.

NedProf




msg:3109372
 12:02 pm on Oct 5, 2006 (gmt 0)

Is that because of the text/html MIME type instead of the text/plain that it should be?
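For anyone who wants to check which type a given robots.txt is actually served with, here is a minimal Python sketch. The helper names (`media_type`, `is_plain_text`, `fetch_content_type`) are made up for illustration, and it only inspects the Content-Type header; it says nothing about how Google decides what to index.

```python
from urllib.request import urlopen


def media_type(content_type_header: str) -> str:
    """Return the bare media type, e.g. 'text/plain', without parameters."""
    # Drop parameters such as '; charset=UTF-8' and normalise the case.
    return content_type_header.split(";", 1)[0].strip().lower()


def is_plain_text(content_type_header: str) -> bool:
    """True when the header declares text/plain, the type expected for robots.txt."""
    return media_type(content_type_header) == "text/plain"


def fetch_content_type(url: str) -> str:
    """Fetch the URL and return its raw Content-Type header ('' if absent)."""
    with urlopen(url, timeout=10) as response:
        return response.headers.get("Content-Type", "")
```

Usage would be something like `is_plain_text(fetch_content_type("http://www.example.com/robots.txt"))`, where the example.com URL is only a placeholder.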

g1smd




msg:3110011
 7:47 pm on Oct 5, 2006 (gmt 0)

It's because someone somewhere links to that file, so they treat it as content as well as its true purpose.

Jordo needs a drink




msg:3110444
 2:08 am on Oct 6, 2006 (gmt 0)

It's because someone somewhere links to that file, so they treat it as content as well as its true purpose.

The best example is in the search results you posted. #1 is Wikipedia explaining robots.txt. #2 is the White House robots.txt itself.

Look again at the Wiki article and you'll see they link to the White House robots.txt.

etgsgroup




msg:3110501
 3:23 am on Oct 6, 2006 (gmt 0)

Why does Google's database show robots.txt files?

GaryK




msg:3110505
 3:38 am on Oct 6, 2006 (gmt 0)

Why does Google's database show robots.txt files?

That sure seems like an obvious question to me too. It seems like it would be simple enough for Google to leave these files out. Is anyone really interested in seeing the contents of a robots.txt file in their SE results?

Tastatura




msg:3110513
 4:00 am on Oct 6, 2006 (gmt 0)

Number 4 is BT's robots.txt blog :)
Number 5 is Google's own robots.txt file.


Webmasterworld: Robots.txt
Brett Tabke experiments with writing a weblog in a text file usually read only by robots. Trenchant commentary on the world of search engine marketing.
www.webmasterworld.com/robots.txt - 2k - Cached - Similar pages

google's robots txt - [ Translate this page ]
User-agent: * Allow: /searchhistory/ Disallow: /news?output=xhtml& Allow: /news?output=xhtml Disallow: /search Disallow: /groups Disallow: /images Disallow: ...
www.google.com/robots.txt - 3k - Cached - Similar pages
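As a rough illustration, the rules visible in that snippet can be fed to Python's standard-library `urllib.robotparser`. This is only a sketch: the snippet is truncated, so the list below holds just the visible lines, and the helper names (`build_parser`, `allowed`) are invented here. The standard-library parser applies the first matching rule, which may not be how Google's own crawler resolves Allow/Disallow conflicts.

```python
from urllib.robotparser import RobotFileParser

# Only the rules visible in the search snippet; the rest of the file is cut off.
VISIBLE_RULES = [
    "User-agent: *",
    "Allow: /searchhistory/",
    "Disallow: /news?output=xhtml&",
    "Allow: /news?output=xhtml",
    "Disallow: /search",
    "Disallow: /groups",
    "Disallow: /images",
]


def build_parser(lines):
    """Parse robots.txt lines held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


def allowed(parser, path, host="http://www.google.com"):
    """Ask whether a generic crawler ('*') may fetch the given path."""
    return parser.can_fetch("*", host + path)


parser = build_parser(VISIBLE_RULES)
```

Under these visible rules, `allowed(parser, "/search")` is False while `allowed(parser, "/robots.txt")` is True: nothing shown in the snippet blocks the robots.txt file itself, so a crawler that follows a link to it is free to fetch it.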


smells so good




msg:3110524
 4:23 am on Oct 6, 2006 (gmt 0)

It's one of the few ways that Brett will have his blog found.

