
Forum Moderators: goodroi


the wrong way to do robots.txt

kinda funny, whitehouse.gov robots file

     
6:57 am on Oct 28, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 16, 2002
posts:2010
votes: 0


Not touching the political side of this, but check out the whitehouse.gov robots.txt file [whitehouse.gov]
Don't even know where to begin telling them how incorrectly that is done.
I think they need a WebmasterWorld membership ;)
8:10 am on Oct 28, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:July 18, 2003
posts:167
votes: 0


yup, that's funny alright.
What's got me curious is what you were doing looking at that. What motivated you to say, "hey, I wonder what the whitehouse robots.txt looks like"?
:)
8:16 am on Oct 28, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 16, 2002
posts:2010
votes: 0


Actually I was wondering what CMS (content management) they used, and while Googling for it I found a blog comment about the robots.txt ;) so nothing evil I swear (hey what are those men in the dark suits doing at my door?)
12:39 pm on Oct 28, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 28, 2002
posts:1324
votes: 0


Do you think the dark suits look at the logs? If they do, can you imagine them all sitting around wondering what's going on?
12:47 pm on Oct 28, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 27, 2002
posts:959
votes: 0


All of you, please look right into the red light. What you saw was a digital weather balloon.
12:53 pm on Oct 28, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 2, 2002
posts:1167
votes: 0


I will resist the obvious comment about the Disallow parameters being deliberately scooched way over to the right ... or maybe I won't, hehehehe
1:02 pm on Oct 28, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 21, 2002
posts:160
votes: 0


I like the irony in this entry:
Disallow:/sitemap.html
12:25 pm on Nov 1, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 22, 2002
posts:1749
votes: 0


Definitely the funniest robots.txt I've ever seen!

Disallow: /firstlady/recipes/iraq
;)
3:43 am on Nov 3, 2003 (gmt 0)

New User

5+ Year Member

joined:July 26, 2008
posts:32
votes: 0


So many interesting URLs. However, every path I tried to explore turned out 404. Did anyone else experience this?

Daniel Odulo

9:28 pm on Nov 8, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:May 14, 2003
posts:376
votes: 0


whoa... hahahahahaha... seems that they are trying to prevent bots from accessing everything on the site... guess they don't know or understand that many bots can be configured to ignore robots.txt and even to present a real-looking UA string... too funny... it appears that they update it regularly, judging from the section that contains the blocks for the new releases section... it's got the current year and month...
7:31 pm on Nov 10, 2003 (gmt 0)

Senior Member

joined:Mar 8, 2002
posts:2897
votes: 0


We've had clients who want to NOT get found on search engines.

It happens.

Of course, robots.txt is a voluntary protocol.
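
To make the "voluntary" part concrete, here is a minimal sketch of what a well-behaved crawler does before fetching a page, using Python's standard `urllib.robotparser`. The rules are a hypothetical excerpt built only from the two paths quoted earlier in this thread, not the actual file, and `ExampleBot` and `is_allowed` are made-up names for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt: only the two paths quoted in this thread,
# NOT the real whitehouse.gov robots.txt.
RULES = """\
User-agent: *
Disallow: /sitemap.html
Disallow: /firstlady/recipes/iraq
"""


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if the rules above permit user_agent to fetch path."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(RULES.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/sitemap.html", "/index.html"):
        print(path, is_allowed(path))
```

Nothing here is enforced by the server: the check only tells the crawler what the site owner asked for. A bot that never runs it can still request any of those URLs.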

I note they are No. 1 on Yahoo for "Whitehouse", just above a sex site at whitehouse.com.

7:41 pm on Nov 10, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 21, 2003
posts:2355
votes: 0


Don't forget the hilarious .org parody site right behind them in the results.
7:42 pm on Nov 10, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 16, 2003
posts:1298
votes: 0


LOL.
Thanks bcolflesh ;)