Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

Directory and page naming for restructure of site
Restructuring Dilemma

 1:34 am on Aug 28, 2007 (gmt 0)

I'm currently in a dilemma as to which method of naming my directories and files will serve me best in my ongoing restructure of my site. I keep flitting from one mechanism to another then back again.

My site holds great rankings for my main keywords but not so good for the longer tail. I'll be moving from around 30 pages to 100 of unique handwritten targeted content. I'm not too worried about losing the existing structure and will utilise 301s to cover for any renamed pages.

Mechanism 1:
Keeping all pages in the top level directory:
E.g. www.domain.com/keyword1.htm

Mechanism 2:
Naming sub directories appropriately and using index.html:
E.g. www.domain.com/keyword1/index.html

Mechanism 3:
Naming sub directories and pages appropriately:
E.g. www.domain.com/keyword1/keyword2.htm

Opinions please on which of the above might be best, or which I should avoid.



 3:29 am on Aug 28, 2007 (gmt 0)

You can structure your URLs differently from your files -- They're not the same thing, and you can use internal rewriting to 'map' URLs to files as long as there is a consistent naming system in effect that allows this.

> my ongoing restructure of my site.

I hope that you mean that you expect it to take a while, and not that you expect to have constant URL-churn, yet expect to do well. Plan, consider, consider some more, then implement. Aim for a URL-structure that will support your needs for a decade or beyond [w3.org].

> for the longer tail

Then I'll take two choices not on the list:

You can use mod_rewrite or AcceptPathInfo on Apache, or ISAPI rewrite on IIS to map these URLs to appropriate directories, filenames, and filetypes as needed. URLs structured to support usability and search ranking, directories and filenames structured to support maintenance, who-has-access considerations, and caching and robots.txt attributes.



 4:00 am on Aug 28, 2007 (gmt 0)

Naming sub directories appropriately and using index.html:
E.g. www.domain.com/keyword1/index.html

I like that alternative best because it lets you use elegant, concise URLs for all your internal crosslinking, like this:



 4:06 am on Aug 28, 2007 (gmt 0)

Thanks Jim, for a useful reply.

I think it may have served to enlighten me just how much I don't know, despite being a full time webmaster for 3 years, and 2 years part time before that!

I'm not sure my knowledge/ability extends to server manipulation (for want of a better word) for the mapping you suggest and I'll probably have to stick with static pages and standard urls apart from htaccess tweaks for 301 redirects.
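For reference, an .htaccess 301 of the kind mentioned here can be quite short. This is a minimal sketch, assuming Apache with mod_rewrite enabled; old-page.htm is a hypothetical old filename, and www.domain.com/keyword1.htm is the placeholder URL from the first post:

```apache
# Minimal sketch: permanently redirect one renamed page.
# "old-page.htm" is a hypothetical name, not a file from the real site.
RewriteEngine on
RewriteRule ^old-page\.htm$ http://www.domain.com/keyword1.htm [R=301,L]
```

One rule per renamed page is enough for a 30-page site. If many pages are renamed to a consistent pattern, a single pattern-based rule may cover them all.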

Indeed, the redesign is taking a while. None of it is live at the moment except to the extent that it resides in an unknown directory I can review on the web but that is not accessible to site users. I'll switch the whole thing over once I'm happy.

The link you provided is helpful. Ideally, I would keep the same urls had they been optimised at the time. However, at the time I made the present site (5 years ago) my knowledge was scant at best. I've got to the point where not only does the site look antiquated, and I'm loath to use that old design for new pages, but it's also difficult to integrate another 70-odd pages within the existing navigation.

If there are any links you can throw out there that explain more about "mapping" I'd be pleased to read them.

From your other two suggestions re longer tail, am I to take it that you believe index.html in a keyword-named sub directory is possibly not the way to go for the best results, and that a specified page name would serve better?

I kind of thought that an index page of an appropriately named directory may be seen as being of some weight?


 4:09 am on Aug 28, 2007 (gmt 0)

Buckworks, I like that too.

In fact that is how I am set up at the moment, just wondering how the SE's (principally Google) weight that compared to page names.


 4:40 am on Aug 28, 2007 (gmt 0)

> I kind of thought that an index page of an appropriately named directory may be seen as being of some weight?
Not if you have 300 of them...

There's a bit of pie-in-the-sky in that linked article; Some practices that are 'nice' but probably too difficult for individual Webmasters (all learning as we go) to really stick to. The examples are from large academic-type collections of articles, not from ever-changing e-commerce sites.

But at the same time, there often exists a core set of pages which is relatively unchanging, and at least the URLs for those pages should be nailed down so that they can have long, useful lives. And the idea in that document serves even the most callous Webmaster's interests as well: Changing a URL often disrupts the ranking of pages in search, and that often disrupts traffic and revenue.

I'm 'in it' myself, a working Webmaster like you, and I don't know of any books on the subject of "URL mapping for fun and profit," but just to demonstrate the possibilities, here's what is needed to map clean, extensionless URLs to .htm files on an Apache server, given the lowest level of server configuration access, the lowly .htaccess file:

# Start the engine
Options +FollowSymLinks
RewriteEngine on
# If the requested URL has no extension
RewriteCond $1 !\.[a-z0-9]+$
# but exists as a physical file when ".htm" is added to it
RewriteCond %{REQUEST_FILENAME}.htm -f
# then rewrite the requested URL to the same-named file with a .htm extension added
RewriteRule (.*) /$1.htm [L]

That will take any and all requested URLs which have no "filetype" extension, and rewrite them so that content is returned from a file with the same name, but having a .htm extension. This rewriting will take place as long as the requested URL has no extension, and as long as a .htm file of that name exists. Placed in the top-level .htaccess file, this rule would apply site-wide, regardless of "directory" depth.

What this demonstrates is the power of just a few lines of code in making your URLs 'clean' by removing the technology-dependent portion -- the part that is at greatest risk of changing over time, for example, as the use of static .html pages declines, and the use of dynamically-generated .php and .asp pages increases.

I should note that making a few exceptions to the above code is possible, but I wanted to keep the example and the explanation as simple as possible. The same is true for enhancements; You could have the code check for existing files as .htm, .html, .php, and .asp, etc. in any order, and use whichever existing file it finds first to serve content for the requested URL... Since this is a form of software, the possibilities are seemingly endless.
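The "check several extensions in order" enhancement described above might look something like the following. This is only a sketch, assuming the file lives in the top-level .htaccess and that mod_rewrite is enabled; the order .htm, .html, .php is an arbitrary choice for illustration:

```apache
# Sketch only: serve extensionless URLs from the first matching file.
Options +FollowSymLinks
RewriteEngine on

# Extensionless URL and a same-named .htm file exists: serve it
RewriteCond $1 !\.[a-z0-9]+$ [NC]
RewriteCond %{REQUEST_FILENAME}.htm -f
RewriteRule ^(.*)$ /$1.htm [L]

# Otherwise, try a same-named .html file
RewriteCond $1 !\.[a-z0-9]+$ [NC]
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.*)$ /$1.html [L]

# Otherwise, try a same-named .php file
RewriteCond $1 !\.[a-z0-9]+$ [NC]
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.*)$ /$1.php [L]
```

Because each rewritten URL ends in an extension, the first condition fails on the next pass, so the rules should not loop. A request that matches no existing file falls through and gets the normal 404 response.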



 8:58 pm on Aug 28, 2007 (gmt 0)

So from an SEO perspective this:


will outperform this:


and this:


will outperform this:



Interesting point on removing the .html when laying out a new structure. May as well I guess.


 11:30 pm on Aug 28, 2007 (gmt 0)

Indexing of a site is more about the click path from the root page to the target content, rather than the folder-structure-like-ness of the URLs. Breadcrumb navigation can be very useful.

I use index files in folders a lot. If you do too, make sure that a request for the index file by name always serves a 301 redirect back to the "bare" folder name ending in "/" type of URL.

Additionally, never mention the actual index file filename in any link within the site. End the link URL with the trailing "/" as before.
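One way the index-file redirect described above could be done in .htaccess is sketched below. It assumes Apache with mod_rewrite, an index file named index.html (or index.htm), and uses www.domain.com as the placeholder hostname from earlier in the thread:

```apache
# Sketch only: 301 any direct client request for index.htm(l) to the bare folder URL.
RewriteEngine on

# Test THE_REQUEST (the client's original request line) rather than the
# current URL, so that Apache's own internal lookup of the index file
# for a "/" request does not trigger the redirect and cause a loop.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html?(\?[^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html?$ http://www.domain.com/$1 [R=301,L]
```

With this in place, a request for www.domain.com/keyword1/index.html should be redirected to www.domain.com/keyword1/, while a request for www.domain.com/keyword1/ is served directly.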


 2:30 pm on Aug 29, 2007 (gmt 0)

As far as ranking goes, though: I understand that keyword-in-URL is relatively minor for ranking, but would you say the demo of mine above is true for ranking? Maybe it doesn't matter at all?


 5:14 pm on Aug 29, 2007 (gmt 0)

Keyword-in-url is a minor but real ranking factor. But I don't think there's any ranking difference based on the trailing slash, or on the presence/absence of a file extension. That makes no sense for an information retrieval algorithm that I can see, because it is such an arbitrary element of the url. In other words, it's not related to any relevance signal.

There are good technical reasons for choosing url structures that do not "expose" the underlying server technology, but on their own such urls would not affect ranking.


 8:38 pm on Aug 29, 2007 (gmt 0)

Please also consider usability, not only technical and SEO issues. For a site with 100 pages, I strongly recommend that one level of navigation map to one level in the URL (and one level in the breadcrumbs).

I know that the world is full of exceptions, and that no set of web pages exactly matches a hierarchy, but you have to do it (not only for users, but also to have unique URLs).

Keywords in the URL should also respect the labels of menu items. To respect does not mean to be the same, but to be as short as possible while still clearly pointing to one and only one menu item.

I consider developing such a content, menu and URL structure much more important, and also harder, than the technical stuff.

Robert Charlton

 6:24 am on Aug 30, 2007 (gmt 0)

Regarding the question of trailing slashes if you have no file extensions, I'd pursued the topic in several other discussions, and it got resolved (by Jim Morgan) in this one....

Blog URL Structure for SEO

The file extension does not matter.

The question for me is whether to have the trailing slashes... and how to handle all cases of not having them.


 9:32 am on Aug 30, 2007 (gmt 0)

Isn't it an advantage to have files in directories with a final forward slash, so that if the back-end technology changes down the road (say from .asp to .php) there's no noticeable difference, with little or no changing of filenames?


 11:48 am on Aug 30, 2007 (gmt 0)

Yes. Portability/transparency of URL when changing internal technology is a very good thing to plan ahead for.


 3:09 pm on Aug 30, 2007 (gmt 0)

Ok, it sounds like the ideal for me would be:


with no file extensions or directory slashes. I don't see needing directories in the URL with my ecommerce site. The above seems simpler than:



 3:40 pm on Aug 30, 2007 (gmt 0)

The advantage of using slashes instead of hyphens is that it looks less spammy. If the keywords are in a structured least-specific/more-specific/most-specific order -- for example, /product-category/product-type/specific-product, then that's even better.

This order may actually exist as the directory structure of a well-organized static-page site. Or if the site is dynamic, it might not; The magic of Apache mod_rewrite and AcceptPathInfo (or the IIS equivalents) might be at work, but the client has no way to know that as long as the implementation is correct.
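As an illustration of the dynamic case, a three-level URL can be mapped to a single script with one rule. This is a sketch under stated assumptions: product.php and its category, type and item parameters are hypothetical names invented for the example, and the character class assumes lowercase, hyphenated URL parts:

```apache
# Sketch only: map /product-category/product-type/specific-product
# to a hypothetical script. "product.php" and its parameter names
# are assumptions for illustration, not part of any real site.
RewriteEngine on
RewriteRule ^([a-z0-9-]+)/([a-z0-9-]+)/([a-z0-9-]+)$ /product.php?category=$1&type=$2&item=$3 [L]
```

The rewrite is internal, so the visitor and the search engine only ever see the three-level URL. The script name could later change without any URL changing.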

I don't use trailing slashes because they make the URL one character longer while providing no benefit -- That is, unless no rewriting is used, the final URL-path-parts are the same as the actual filepaths, and each page is stored as the index page of a unique directory (which makes a site very hard to maintain, IMHO).
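If a site settles on URLs without trailing slashes, the slashed variants can be redirected to the canonical form so that each page has a single URL. This is a minimal sketch, assuming Apache with mod_rewrite and the placeholder hostname www.domain.com; real directories are skipped, because Apache normally expects a trailing slash on those:

```apache
# Sketch only: 301 "/some/page/" to "/some/page" unless it is a real directory.
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ http://www.domain.com/$1 [R=301,L]
```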



 3:26 pm on Aug 31, 2007 (gmt 0)

Well, for a Tall Blue Widget, my page would be:


instead of:




What I'm saying is the keywords that show up in my more specific page names always seem to include the keywords that would be in the less specific page names.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved