Forum Moderators: goodroi

What's with the robots.txt of this site?

   
5:01 am on Mar 10, 2009 (gmt 0)

5+ Year Member



A little amusing.

The site has also somehow managed to get 2,500,000 pages indexed despite having:

User-agent: *
Disallow: /

[edited by: CWebguy at 5:03 am (utc) on Mar. 10, 2009]

11:52 am on Mar 11, 2009 (gmt 0)

WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time, a 10+ Year Member, and a Top Contributor of the Month



Brett uses the robots.txt file for "multiple" purposes. The real secret to why WebmasterWorld has over 2.5 million pages indexed is that it is heavily used and well-linked.

1:16 pm on Mar 11, 2009 (gmt 0)



The notes explain the situation and make the code available: [webmasterworld.com...]
5:25 pm on Mar 11, 2009 (gmt 0)

5+ Year Member



Even with the disallow, it still gets indexed? So

if (pagerank>5){bots do whatever they want}?
;)

[edited by: CWebguy at 5:28 pm (utc) on Mar. 11, 2009]

7:00 pm on Mar 11, 2009 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I think you're missing the point: robots.txt is generated on the fly by a script here, and different user-agents see different robots.txt directives. If you are a genuine Googlebot from a valid IP address range, you don't see the "Disallow: /" at all.

If you are a browser, you get the "bot blog page."

Jim
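
For anyone curious how that might work, here is a minimal sketch in Python of serving different robots.txt directives by requester. The function names and the exact rules returned are assumptions for illustration, not WebmasterWorld's actual script; the Googlebot check is the reverse-DNS-plus-forward-confirm procedure Google documents for verifying its crawlers.

import socket

# Hypothetical sketch; not WebmasterWorld's actual code.
def is_verified_googlebot(ip: str) -> bool:
    """Reverse-DNS the IP, check the hostname, then confirm it resolves back."""
    try:
        host = socket.gethostbyaddr(ip)[0]  # e.g. crawl-66-249-66-1.googlebot.com
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # Forward-confirm: the hostname must resolve back to the same IP,
        # or the reverse record could be spoofed.
        return ip in socket.gethostbyname_ex(host)[2]
    except OSError:
        return False

def robots_txt(remote_ip: str, user_agent: str) -> str:
    if "Googlebot" in user_agent and is_verified_googlebot(remote_ip):
        # A verified crawler gets permissive directives.
        return "User-agent: Googlebot\nDisallow:\n"
    # Everyone else (unverified bots, spoofers, browsers) gets the blanket block;
    # a real script might send browsers to the "bot blog page" instead.
    return "User-agent: *\nDisallow: /\n"

The reverse-plus-forward lookup is the important part: the User-Agent header alone is trivially spoofed, so the IP has to be tied back to a Google hostname before the permissive file is served.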

7:02 pm on Mar 11, 2009 (gmt 0)

5+ Year Member



Gotcha ;)
 
