
Forum Moderators: Robert Charlton & aakk9999 & andy langton & goodroi


Google is displaying my robots.txt page as a search result

     
5:00 pm on Sep 1, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member sgt_kickaxe is a WebmasterWorld Top Contributor of All Time 5+ Year Member

joined:Apr 14, 2010
posts:3169
votes: 0


I began noticing traffic to my robots.txt page, and by looking at Google I can see that they've indexed it. It has no added text and is fairly short, yet there it is.

I'm sure this has happened before, but what I found worth posting here is that the TITLE Google has assigned to my robots.txt page is the actual content of the file, cut off with ... once it reaches the maximum number of characters.

The description is, you guessed it, the contents again. Fubar.
5:35 pm on Sept 1, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


This annoys the **** out of me when it happens.
7:39 pm on Sept 1, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13218
votes: 348


"If it is an URL, we will index it."

Is there a thread that explains how to attach a "noindex" directive to something that isn't html? robots.txt and sitemap.xml will do for starters.

:: quick detour to check obvious corollary question ::

SO FAR, an image search for "favicon" does not bring up a slew of actual sites' actual favicons. But give them time; I'm sure it is merely an oversight.
2:22 am on Sept 2, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


how to attach a "noindex" directive to something that isn't html

There is a technology called an x-robots-tag that allows a noindex directive to be placed in the http header that's sent by the server. It's very handy for non-html document types, such as video files, pdf files, etc.

For details, see this page from the Google developers site: Robots meta tag and X-Robots-Tag HTTP header specifications [developers.google.com]
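
As a rough illustration of the header approach, here is a minimal sketch in Python, assuming the site is served through a WSGI application. The names NOINDEX_EXTENSIONS, wants_noindex and NoIndexMiddleware are made up for this example, and the extension list is only a guess at what you might want covered. It is not taken from the Google page, just one way the header could be attached to non-html responses.

```python
# Hypothetical list of extensions to keep out of the index; adjust to taste.
NOINDEX_EXTENSIONS = (".txt", ".xml", ".pdf")


def wants_noindex(path):
    """Return True when the request path ends in one of the listed extensions."""
    return path.lower().endswith(NOINDEX_EXTENSIONS)


class NoIndexMiddleware:
    """WSGI middleware that appends 'X-Robots-Tag: noindex' to matching responses."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")

        def patched_start_response(status, headers, exc_info=None):
            # Only touch the headers for paths we want kept out of the index.
            if wants_noindex(path):
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)
```

Wrapping an existing app would look like `application = NoIndexMiddleware(application)`. On a typical shared host the same thing is usually done in the server configuration rather than in application code; the sketch only shows the logic.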
5:58 am on Sept 2, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 5+ Year Member

joined:Mar 9, 2010
posts:1806
votes: 9


robots.txt is such a unique name that it should be easy for them to exclude it from their index without any directive. But these days they are so focused on user ex. you know....

The interesting part is what it ranks for to get the traffic. Is it some file or folder name that is unique and that you won't easily find elsewhere on the web, or is it a keyword that actually drives some traffic to sites?
8:16 am on Sept 2, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13218
votes: 348


For details, see this page

... which someone, quite possibly yourself, has already pointed me to in the recent past. I think I even looked at it.

Um. Ahem.

Oh well. I did manage to get chummy with mod_expires yesterday. Only took about seven tries-- and NO pleas for help-- to hit on the right wording for what I wanted to do.
12:19 pm on Sept 2, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10563
votes: 15


someone

(cough)
http://www.webmasterworld.com/robots_txt/4478700.htm [webmasterworld.com]
1:40 pm on Sept 2, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member netmeg is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2005
posts:12919
votes: 195


There is a technology called an x-robots-tag that allows a noindex directive to be placed in the http header that's sent by the server.


This is what I did for all .txt files (and some others) after some of them started showing up in the serps. Never saw a robots.txt though; that's beyond ridiculous.