Need to double check my Entry

Q - regarding ( / ) (Forward Slash) in Entry

         

Tutti

4:32 am on Oct 2, 2005 (gmt 0)



Regarding ( / ) Forward Slash in a Disallow Entry, I need to double check a couple of things.

1-If a folder (store) has a subfolder (admin), which in turn has 2 or 3 subfolders of its own, one being (admin/importexport), plus several files such as (admin/datasource.inc),
and the intent is to Disallow all files, folders and subfolders in the (admin) folder:

Which Entry would be best or correct?

Disallow: /store/admin

Or

Disallow: /store/admin/

2-If, for example, the entry Disallow: /store/admin turns out to be the correct one and is used,
would it be correct to assume that all the other files in the (store) folder will still be indexed?

My intention is to allow indexing of most of the files in the (store) folder, while Disallowing some of the folders and files.

Research note: I have looked at similar cases in the Forum that recommend moving all Disallowed files and folders into a new folder.
However, in my case that would be an impossible task, requiring hours of program rewriting and page editing.

Thanks,

Dari

Lord Majestic

3:59 pm on Oct 2, 2005 (gmt 0)




Disallow: /store/admin

Or

Disallow: /store/admin/

#1 is certainly best: /store/admin (without the trailing slash) is a perfectly valid URL for the directory, and unless the Disallow entry states /store/admin, that URL won't be disallowed.

Some web servers redirect to the proper directory URL with a / at the end, and good bots that re-check robots.txt after a redirect will still be blocked by the 2nd version. But some web servers don't redirect, and not all bots re-check robots.txt on a redirect.

Thus #1 is the best choice.
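
To illustrate the difference between the two entries, here is a rough sketch using Python's standard urllib.robotparser module. The host example.com, the bot name ExampleBot and the path /store/administrator.html are placeholders made up for the test, and this parser's simple prefix matching is only one implementation, so real crawlers may differ in the details.

```python
from urllib.robotparser import RobotFileParser

BASE = "http://example.com"  # placeholder host, not from the original post


def build_parser(disallow_path):
    """Return a parser holding a single Disallow rule that applies to all bots."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_path}"])
    return parser


def blocked(parser, path):
    """True if the hypothetical bot 'ExampleBot' may NOT fetch the path."""
    return not parser.can_fetch("ExampleBot", BASE + path)


no_slash = build_parser("/store/admin")     # entry #1
with_slash = build_parser("/store/admin/")  # entry #2

paths = [
    "/store/admin",                 # the directory URL without a trailing slash
    "/store/admin/",
    "/store/admin/importexport/",
    "/store/admin/datasource.inc",
    "/store/administrator.html",    # hypothetical: shares the prefix, outside admin/
    "/store/index.html",
]

for path in paths:
    print(f"{path:32} #1 blocked: {blocked(no_slash, path)!s:6}"
          f" #2 blocked: {blocked(with_slash, path)}")
```

Under this parser, entry #1 blocks /store/admin itself as well as everything beneath it, while entry #2 leaves the slashless /store/admin URL fetchable. The flip side of the prefix match is that entry #1 also catches any other path that merely begins with /store/admin, such as the made-up /store/administrator.html above.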

--

Your second problem, allowing some files but not others, will require either moving the disallowed ones into a separate folder or writing a very detailed robots.txt, which may be hard to maintain.
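
As a sketch of what the "very detailed" route might look like, the snippet below feeds a multi-line robots.txt to Python's standard urllib.robotparser and checks a few URLs. Only /store/admin comes from the question; /store/includes/, /store/checkout.php, the other file names, example.com and ExampleBot are all invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every path except /store/admin is made up.
ROBOTS_TXT = """
User-agent: *
Disallow: /store/admin
Disallow: /store/includes/
Disallow: /store/checkout.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path):
    """True if the hypothetical bot 'ExampleBot' may fetch the path."""
    return parser.can_fetch("ExampleBot", "http://example.com" + path)


for path in [
    "/store/catalog.html",           # not listed, so it stays crawlable
    "/store/admin/datasource.inc",   # under a disallowed prefix
    "/store/includes/header.inc",    # under a disallowed prefix
    "/store/checkout.php",           # listed individually
]:
    print(f"{path:32} allowed: {allowed(path)}")
```

Anything in the (store) folder that no Disallow line matches stays open to crawling, so only the exceptions need to be listed. The maintenance cost is that each new private file or folder needs its own line.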

Of course you can use robots META tags to keep pages from being indexed, but those pages will still be crawled.