
Home / Forums Index / Search Engines / Sitemaps, Meta Data, and robots.txt
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

How to make a page *inaccessible* to searches without using password protection?
halfandhalf
msg:1527953
5:15 pm on Nov 5, 2003 (gmt 0)

Hi,
I need to create a page... the URL will be sent to a select few, so we don't want people to come across it by searching. Is this even possible or does the page have to be password protected?

Thanks.
h+h

 

Nick_W
msg:1527954
5:20 pm on Nov 5, 2003 (gmt 0)

Sure, just add a robots.txt [webmasterworld.com] file to your http document root (where you keep your pages).

The file should look like this:


User-agent: *
Dissallow /

That'll stop all bots.

Nick

bcolflesh
msg:1527955
5:23 pm on Nov 5, 2003 (gmt 0)

That'll stop all bots.

That'll stop all bots that obey/read the robots.txt file - don't count on this method, and don't put data in a public place that you don't want to be "found".
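
For a single page there is also the robots meta tag, which keeps the URL out of the (publicly readable) robots.txt file entirely - a minimal sketch, and again it only binds well-behaved crawlers:

<meta name="robots" content="noindex, nofollow">

That line goes in the page's <head> section.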

halfandhalf
msg:1527956
5:25 pm on Nov 5, 2003 (gmt 0)

wow, thanks for that. can i do that in a separate directory, so that only the page in that directory is inaccessible to the bots?

Nick_W
msg:1527957
5:30 pm on Nov 5, 2003 (gmt 0)


Disallow: /dir/page.html

I *think* that'd do it...

Nick

halfandhalf
msg:1527958
5:33 pm on Nov 5, 2003 (gmt 0)

And does the robots.txt page go within that directory or still at the root?
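
Crawlers only ever request /robots.txt from the site root, so the file always stays at the document root; the rule's path does the pointing into the subdirectory - something like:

User-agent: *
Disallow: /dir/page.html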

heini
msg:1527959
5:35 pm on Nov 5, 2003 (gmt 0)

I'd say password protection is by far the better method in this case. One of the reasons rogue bots are called rogue is that they don't obey robots.txt.

jomaxx
msg:1527960
6:19 pm on Nov 5, 2003 (gmt 0)

Don't use Nick_W's first suggestion unless you want to ban ALL good bots from your site, which I don't think is your intention. His second suggestion would also work, but has the downside of telling the whole world the name of this secret page.

I suggest protecting a subdirectory using the sample below, then putting the page there and giving it a more-or-less unguessable name (i.e. not index.html). Also make sure that if someone nosy tries to access that directory, your server doesn't cough up a list of the files in it. That should be sufficient unless the data is really sensitive, in which case you should definitely password-protect it.

User-agent: *
Dissallow /some_private_dir/
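
Whether the server coughs up a directory listing is a server setting, not a robots.txt one. On Apache, for instance, a one-line .htaccess file in that directory turns listings off (assuming AllowOverride permits it):

Options -Indexes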

halfandhalf
msg:1527961
6:53 pm on Nov 5, 2003 (gmt 0)

thanks for your help!

jomaxx
msg:1527962
10:41 pm on Nov 5, 2003 (gmt 0)

**** TYPO ALERT: "Disallow" has just one "s" in it. I cut-and-pasted part of that code from an earlier post without proofreading it. Use this instead:

User-agent: *
Disallow: /some_private_dir/

Mohamed_E
msg:1527963
11:22 pm on Nov 5, 2003 (gmt 0)

If all you want to do is make sure the page is not advertised to the whole world in the major search engines, then I suppose a suitable robots.txt will suffice.

If you want to be sure that nobody unauthorized can read it, there is simply no alternative to a password-protection scheme.
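
On Apache, for instance, password protection can be a four-line .htaccess in the directory (the .htpasswd path is a placeholder; the htpasswd utility creates that file):

AuthType Basic
AuthName "Private"
AuthUserFile /path/to/.htpasswd
Require valid-user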

rasslin russ
msg:1527964
8:04 pm on Nov 10, 2003 (gmt 0)

One trick I've heard is to link to the new URL using a button onClick handler, or a nested form with a submit button, to go to the page in question. Most robots (and please correct me if I'm wrong) don't follow these types of links.
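
The form variant of that trick looks something like this (paths here are illustrative) - and as the poster notes, it only deters crawlers that don't submit forms:

<form action="/some_private_dir/page.html" method="get">
<input type="submit" value="Continue">
</form>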

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved