Google is displaying my robots.txt page as a search result

     

Sgt_Kickaxe

5:00 pm on Sep 1, 2012 (gmt 0)

WebmasterWorld Senior Member sgt_kickaxe is a WebmasterWorld Top Contributor of All Time 5+ Year Member



I began noticing traffic to my robots.txt page, and by looking at Google I can see that they've indexed it. It has no added text and is fairly short, yet there it is.

I'm sure this has happened before, but what I found worthy of posting here is that the TITLE Google has assigned my robots.txt page is the actual content of the file, up to the maximum number of characters, after which it shows ...

The description is, you guessed it, the contents again. Fubar.

g1smd

5:35 pm on Sep 1, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



This annoys the **** out of me when it happens.

lucy24

7:39 pm on Sep 1, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



"If it is an URL, we will index it."

Is there a thread that explains how to attach a "noindex" directive to something that isn't html? robots.txt and sitemap.xml will do for starters.

:: quick detour to check obvious corollary question ::

SO FAR, an image search for "favicon" does not bring up a slew of actual sites' actual favicons. But give them time; I'm sure it is merely an oversight.

tedster

2:22 am on Sep 2, 2012 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



how to attach a "noindex" directive to something that isn't html

There is an HTTP response header called X-Robots-Tag that allows a noindex directive to be placed in the HTTP headers sent by the server. It's very handy for non-HTML document types, such as video files, PDF files, etc.

For details, see this page from the Google developers site: Robots meta tag and X-Robots-Tag HTTP header specifications [developers.google.com]
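As a rough illustration of the header approach described above, here is a minimal Python sketch using the standard library's http.server. It is not how any particular site in this thread is configured; the names x_robots_tag_for, NoindexHandler, serve and the extension list are all made up for the example, and a production site would normally set the header in its web server configuration instead.

```python
import posixpath
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import urlsplit

# Assumption for this sketch: file types that should never appear in search results.
NOINDEX_EXTENSIONS = {".txt", ".xml", ".pdf"}


def x_robots_tag_for(path):
    """Return the X-Robots-Tag value for a request path, or None if none applies."""
    clean_path = urlsplit(path).path  # drop any query string or fragment
    extension = posixpath.splitext(clean_path)[1].lower()
    return "noindex" if extension in NOINDEX_EXTENSIONS else None


class NoindexHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory, adding X-Robots-Tag where needed."""

    def end_headers(self):
        # self.path may be missing if the request line could not be parsed.
        value = x_robots_tag_for(getattr(self, "path", ""))
        if value is not None:
            self.send_header("X-Robots-Tag", value)
        super().end_headers()


def serve(port=8000):
    """Start the demo server (hypothetical port); blocks until interrupted."""
    ThreadingHTTPServer(("", port), NoindexHandler).serve_forever()
```

Calling serve() from a directory containing robots.txt and then requesting /robots.txt should show "X-Robots-Tag: noindex" among the response headers. Note that the file must stay crawlable for the header to be seen at all.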

indyank

5:58 am on Sep 2, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



robots.txt is such a unique name that it should be easy for them to exclude it from their index without any directive. But these days they are so focused on user experience, you know....

The interesting part is what does it rank for to get the traffic? Is it some file or folder name that is unique and which you won't find easily elsewhere on the web or is it a keyword that does drive some traffic to sites?

lucy24

8:16 am on Sep 2, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



For details, see this page

... which someone, quite possibly yourself, has already pointed me to in the recent past. I think I even looked at it.

Um. Ahem.

Oh well. I did manage to get chummy with mod_expires yesterday. Only took about seven tries-- and NO pleas for help-- to hit on the right wording for what I wanted to do.

phranque

12:19 pm on Sep 2, 2012 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



someone

(cough)
http://www.webmasterworld.com/robots_txt/4478700.htm [webmasterworld.com]

netmeg

1:40 pm on Sep 2, 2012 (gmt 0)

WebmasterWorld Senior Member netmeg is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



There is a technology called an x-robots-tag that allows a noindex directive to be placed in the http header that's sent by the server.


This is what I did for all .txt files (and some others) after some of them started showing up in the SERPs. Never saw a robots.txt though; that's beyond ridiculous.
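For anyone wanting to confirm that the header is actually being sent for their .txt files, a small check along these lines might help. It is only a sketch: has_noindex and fetch_x_robots_tags are hypothetical helper names, the parsing is deliberately simplified (it ignores which crawler a "googlebot: noindex" style value is scoped to), and it treats "none" as implying noindex, as Google's documentation describes.

```python
from urllib.request import Request, urlopen


def has_noindex(header_value):
    """Return True if an X-Robots-Tag header value contains noindex (or none)."""
    for token in header_value.split(","):
        # Strip an optional "useragent:" prefix, e.g. "googlebot: noindex".
        directive = token.rsplit(":", 1)[-1].strip().lower()
        if directive in ("noindex", "none"):
            return True
    return False


def fetch_x_robots_tags(url):
    """Send a HEAD request and return all X-Robots-Tag header values (may be empty)."""
    request = Request(url, method="HEAD")
    with urlopen(request) as response:
        return response.headers.get_all("X-Robots-Tag") or []


def url_is_noindexed(url):
    """True if any X-Robots-Tag header on the response carries noindex."""
    return any(has_noindex(value) for value in fetch_x_robots_tags(url))
```

Running url_is_noindexed against your own robots.txt URL after changing the server configuration gives a quick yes or no without waiting for a recrawl.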
 
