Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
What's with the robots.txt of this site?
CWebguy




msg:3866948
 5:01 am on Mar 10, 2009 (gmt 0)

A little amusing.

Also somehow managed to get 2,500,000 pages indexed despite having

User-agent: *
Disallow: /

[edited by: CWebguy at 5:03 am (utc) on Mar. 10, 2009]
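For reference, the two directives quoted above ask every compliant crawler to stay away from every path. A minimal sketch using Python's standard-library robots.txt parser shows how a well-behaved bot would read them; the example.com URL and the helper name is_allowed are made up for illustration and are not anything from this site.

```python
from urllib.robotparser import RobotFileParser

# The directives quoted in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder URL, not a real page on this site.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/forum/page.htm"))
```

An empty "Disallow:" line under the same user-agent would have the opposite meaning and allow everything.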

 

goodroi




msg:3867899
 11:52 am on Mar 11, 2009 (gmt 0)

Brett uses the robots.txt file for "multiple" purposes. The real reason WebmasterWorld has over 2.5 million pages indexed is that it is heavily used and well linked.

Receptional Andy




msg:3867979
 1:16 pm on Mar 11, 2009 (gmt 0)

The notes explain the situation, and make the code available: [webmasterworld.com...]

CWebguy




msg:3868181
 5:25 pm on Mar 11, 2009 (gmt 0)

Even with the disallow it still gets indexed? So

if (pagerank>5){bots do whatever they want}?
;)

[edited by: CWebguy at 5:28 pm (utc) on Mar. 11, 2009]

jdMorgan




msg:3868244
 7:00 pm on Mar 11, 2009 (gmt 0)

I think you're missing the point: robots.txt is generated on the fly by a script here, and different user-agents see different robots.txt directives. If you are a genuine Googlebot from a valid IP address range, you don't see the "Disallow: /" at all.

If you are a browser, you get the "bot blog page."

Jim
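The idea Jim describes can be sketched roughly as follows. This is not the script actually used here, only a minimal illustration under stated assumptions: the function name choose_robots_response, the three response labels, and the IP range are all hypothetical (the range is the RFC 5737 documentation block, not a real crawler range), and the browser check is deliberately crude.

```python
import ipaddress

# ASSUMPTION: placeholder range from RFC 5737, NOT a real Googlebot range.
TRUSTED_BOT_RANGES = {
    "googlebot": [ipaddress.ip_network("192.0.2.0/24")],
}

ALLOW_ALL = "User-agent: *\nDisallow:\n"   # served to verified crawlers
DENY_ALL = "User-agent: *\nDisallow: /\n"  # served to everything else
BLOG_PAGE = "bot blog page"                # stand-in for the page browsers see


def choose_robots_response(user_agent: str, remote_ip: str) -> str:
    """Pick a robots.txt body based on who is asking (hypothetical helper)."""
    ua = user_agent.lower()
    addr = ipaddress.ip_address(remote_ip)

    # Check crawler tokens first: crawler user-agents often contain "Mozilla" too.
    for token, networks in TRUSTED_BOT_RANGES.items():
        if token in ua:
            if any(addr in net for net in networks):
                return ALLOW_ALL   # genuine bot from a trusted range
            return DENY_ALL        # claims to be the bot, but the IP does not match

    # Crude browser detection, for illustration only.
    if ua.startswith("mozilla/"):
        return BLOG_PAGE

    return DENY_ALL                # unknown agents get the blanket disallow


if __name__ == "__main__":
    print(choose_robots_response("Mozilla/5.0 (compatible; Googlebot/2.1)", "192.0.2.10"))
```

The point of checking the IP as well as the user-agent string is that the string alone is trivial to fake.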

CWebguy




msg:3868245
 7:02 pm on Mar 11, 2009 (gmt 0)

Gotcha ;)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved