The home page, in the main directory, is named index.html. I haven't done this with any of the sub-directories because even if someone wanted to just go see the directory contents, I wouldn't care.
I've assured the owner that I will be happy to name an index.html for each directory - but he seems intent on being unhappy. (but hasn't given me any specific reason for it! I don't even know how he came to find out.)
So, for my own benefit, is there a standard I should be following? Are there pros and cons to this? If he's being unreasonable in his demand, he'll have to pay (-me ducking-), but if I'm allowing some sort of breach, which is the implication, then I'll need to do something for him beyond just setting it right.
Your hosting may have a facility to do it automatically, but in addition to adding a blank index page if necessary, there's a way to configure so that the subdirectory roots, if empty, return a code of "forbidden" when someone backs up to view what's in them.
It isn't a good idea to leave them exposed and visible.
I don't understand the breach though. How can someone actually get in and do damage to my files?
---there's a way to configure so that the subdirectory roots, if empty, return a code of "forbidden"---
I'm not familiar with this. Is there a 'standard' method? If it's too complex, I probably won't go with it as a first choice, but am interested in how that would work.
<Directory /www/html>
Options -Indexes
</Directory>
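in your httpd.conf file, where /www/html is your root directory
(in .htaccess, use only the Options -Indexes line without the <Directory> wrapper, which is not allowed there; this works if AllowOverride is set to All, or at least Options)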
if you wanted a nicely formatted page, you could set up a custom error page for the HTTP 403 responses that would be generated whenever directory browsing was attempted
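A minimal sketch of that custom 403 page, assuming Apache and a page you have created yourself. The path /errors/403.html is hypothetical; in .htaccess this directive also needs AllowOverride to permit FileInfo.

```apache
# Serve a friendly page instead of the default "Forbidden" response.
# /errors/403.html is a hypothetical path, relative to the document root;
# replace it with wherever your own error page lives.
ErrorDocument 403 /errors/403.html
```

With Options -Indexes in place, a request for a directory that has no index file should then show this page rather than a file listing.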
There's nothing concrete, and a lot of people do /widgets-directory/widgets.html for the page linked to as the main one for the section. But it's just as easy to make the main page of the section that's linked to /widgets/ and it doesn't leave gaps. The only problem with that is in editing pages locally, making sure the right index.htm is uploaded to the right directory online because of several having the same filename - index.htm.
htt*://www.widgets.com/bigwidgets/
instead of
htt*://www.widgets.com/bigwidgets/index.html
It also makes for fewer characters if someone wants to externally link to that particular page.
For the same reason, I omit "www" when possible if linking to other sites, though many servers aren't set up properly to serve pages when the "www." is omitted.
I don't understand the breach though. How can someone actually get in and do damage to my files?
For this question we need to define breach. Presently, anyone can see exactly what files are in each subdirectory and type in the URL to view the files' contents. If any of those files contain sensitive info (passwords, credit card numbers, personal names and contact info, etc.), I'd consider that a major security breach.
If you are cloaking, for example, and one or more subdirectories contain files intended for SE spiders only, your competitors can easily see them, which is a less serious breach.
Bottom line, there is a breach if someone can see information they are not intended to see, regardless of whether they can modify or delete the info.
I'll still have to handle it, because the owner isn't one who discusses a decision that's been made. It looks like my conscience can be clear, but I will probably head off the whole thing going forward, and specify an index.html for new directories.
However I have found one disadvantage to naming lots of files index.*. I have several thousands of index.html files on my hard drive (ex temp internet files) which make for a very large portion of my web pages. Obviously this is a potential nightmare should I have to sort through them. Naming pages after what they are about does make file management easier. Thousands of pages called index.html makes it quite hard.
The only problem with that is in editing pages locally, making sure the right index.htm is uploaded to the right directory online because of several having the same filename - index.htm.
This is where discipline in organization plays a key role in maintaining a site with many index pages. It's also a tie-in to the many recent and past topics on using a WYSIWYG tool like FP (FrontPage) or DW (Dreamweaver). Among the many features of these tools is the ability to deal with hundreds or thousands of sub-directories, each with its own index.htm or whatever the extension may be.
I've read those academic papers that Marcia refers to and am a strong advocate of using a sub-directory structure when building web sites. For example...
10 Products
10 Sub-Directories
10 Index Pages
This allows me to provide focus within the site structure. Within each sub-directory are all supporting pages, navigation, css, javascript, etc. In FP (FrontPage), I can create what are called subwebs and edit each one as an individual site. 'Tis a very powerful program.
For the same reason, I omit "www" when possible if linking to other sites.
There are many of us here who would suggest that you not omit the www. from your external links. This is due to the fact that www. and non www. are different entities.
There are many of us here who would suggest that you not omit the www. from your external links. This is due to the fact that www. and non www. are different entities.
Tried to do a quick search on WW for previous discussions on the topic and turned up little. Many (most) sites resolve to the same website whether the "www." is present or not.
Could you explain how this is bad form/improper, or reference a thread I could peruse to educate myself? Thanks!
Therefore, if you wish to provide a correct link you should encourage your partners to link to the full URI of your resource. The same would apply to your own outbound linking practices.
When linking to root level pages, use the shortest URI...
http://www.example.com/
When linking to index pages in a sub-directory, use the shortest URI...
http://www.example.com/sub/
Why not link to the full URI including the index.htm page? Because that index.htm page could change in the future (the underlying technology, that is), and a link to the directory URI keeps working when the file name behind it changes.
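On Apache, which file answers a directory URI is decided by the DirectoryIndex directive, so the file can be swapped without touching any inbound links. A small sketch; the file names listed are only examples, and index.php assumes PHP is actually configured on the server.

```apache
# Files tried, in order, when a directory URI such as /sub/ is requested.
# The names below are illustrative; list whatever index files your site uses.
DirectoryIndex index.html index.htm index.php
```

If /sub/index.htm is later replaced by /sub/index.php, a link to http://www.example.com/sub/ should still resolve, while a link to the full index.htm URI would break.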
Here's one reference where jdMorgan states (message #6) that www.example.com is a subdomain of example.com...
www. vs. non-www. [webmasterworld.com]
Note: The non www. version should be redirected to the www. version via a 301 permanent redirect so that every request resolves to the correct URI.
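One common way to set up that redirect is with mod_rewrite. This is a sketch only, assuming Apache with mod_rewrite enabled, the rules placed in the .htaccess file in the document root, and example.com standing in for your own domain.

```apache
# Assumes this lives in .htaccess in the document root.
# (In httpd.conf the matched path starts with a slash, so the
# rule pattern and substitution would need adjusting.)
RewriteEngine On

# Match requests whose Host header is the bare domain, ignoring case.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]

# Send a 301 permanent redirect to the same path on the www. host.
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

A request for http://example.com/sub/ should then get a 301 response pointing to http://www.example.com/sub/.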
This short thread [webmasterworld.com], after much reading of the thread(s) you pointed me to, has set me on the path to proper URI usage.
I've even used subdomains before, so I don't know why it didn't occur to me that "www." was in fact a subdomain and could be construed as different from the 'pure' (e.g. example.com) URL.
Thanks for your guidance, and sorry for hijacking the thread into .htaccess discussion.