Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

New Thought to an Old Thread
Before you reply, read the entire post including the thread link


Msg#: 4506639 posted 9:51 pm on Oct 10, 2012 (gmt 0)

I would like to amend my original post [webmasterworld.com] with the following proposal:

Rather than disavowing sites through each search engine's webmaster tools, why not put the information in a single location, such as the robots.txt file? I know this will be like moving mountains, but read on.

Suppose there were a robots.txt rule that a bot could read in the following form:

User-agent: (Robot name)
Disallow: www.example.com
Disallow: www.example.com/page1.htm

When the bot sees this, it would note the domain and the URL as disavowed by the domain that hosts the robots.txt file. I will leave it up to the search engines to decide how they want to handle the recorded data.
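
To make the idea concrete, here is a rough sketch of how a bot might pick out such entries. It is only an illustration of the proposal: the existing robots.txt convention treats Disallow values as paths on the host site, so the "domain" and "url" reading below is an assumption, and the function name parse_disavow_entries and the bot name ExampleBot are hypothetical.

```python
def parse_disavow_entries(robots_txt, bot_name):
    """Collect hypothetical disavow entries that apply to bot_name.

    Assumption (not part of any robots.txt standard): a Disallow value
    that does not start with "/" names an external domain or URL that
    the host site wants to disavow.
    """
    entries = []
    group_applies = False
    prev_was_agent = False
    for raw_line in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        field = field.strip().lower()
        value = value.strip()
        if field == "user-agent":
            matches = value == "*" or value.lower() == bot_name.lower()
            # Consecutive User-agent lines share one group of rules.
            group_applies = (group_applies or matches) if prev_was_agent else matches
            prev_was_agent = True
            continue
        prev_was_agent = False
        if field == "disallow" and group_applies and value and not value.startswith("/"):
            # A value containing "/" is treated as a single URL, otherwise as a whole domain.
            kind = "url" if "/" in value else "domain"
            entries.append((kind, value))
    return entries


sample = """User-agent: ExampleBot
Disallow: /private/
Disallow: www.example.com
Disallow: www.example.com/page1.htm
"""

for kind, target in parse_disavow_entries(sample, "ExampleBot"):
    print(kind, target)
```

Ordinary path rules such as /private/ are skipped here, so they would keep their usual crawl-exclusion meaning; what a search engine then does with the collected entries is left open, as in the post.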

The robots.txt file is advisory information. Shouldn't we be able to advise the bots further when they "read" the robots.txt file?
We advise them on which pages/directories NOT to follow on our site, so why not also let us advise them about the links from other domains to ours?

Realizing that getting the robots standard changed may be like moving mountains, I wanted to bring this idea to this forum for discussion.

This would be well worth the effort for the search engines' bots, and it would give webmasters a single place to go to make this happen.



WebmasterWorld Administrator phranque

Msg#: 4506639 posted 1:34 am on Oct 11, 2012 (gmt 0)

the process of "crawling the url database" is completely divorced from "assigning link equity".
robots.txt is about excluding crawlers.
excluding link equity would require a separate protocol.
