HTML or directory structure for mod_rewrite


fahad direct

8:09 am on Apr 15, 2010 (gmt 0)


Hi, which is the more effective technique for getting pages crawled better by Google, Yahoo, or MSN: making all the pages HTML-based, like ...anyurl.html, or using a directory structure like anyurl/param1/param2/.../?

jdMorgan

1:13 pm on Apr 15, 2010 (gmt 0)


There is no good reason to have your URLs weighed down with an extra ".html" on the end.
As for parameter format, that also makes little difference. You can separate parameters with slashes or hyphens, or even commas or periods if you like.

Slashes and hyphens are used most frequently for usability reasons (readability, both when reading the URL on a Web page and when reading it out loud -- for example, on the radio). Underscores are to be strictly avoided for both readability and search indexing reasons: they 'hide' under the URL underline and look like spaces, and they *are not* treated as word separators. Thus, a URL of "blue_widgets" will only rank for searches on exactly "blue_widgets" -- the search term must contain a literal underscore for this URL to match and be displayed in the results.
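As a rough illustration of keeping underscores out of URLs, here is a minimal Python sketch that turns a page title into a hyphen-separated URL segment. The function name slugify and the ASCII-only rule are assumptions made for this example, not part of any particular server or CMS.

```python
import re

def slugify(title: str) -> str:
    """Turn a human-readable title into a lowercase, hyphen-separated URL segment."""
    # Replace every run of characters that is not a lowercase letter or digit
    # (spaces, underscores, punctuation) with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Drop any hyphen left over at the start or end.
    return slug.strip("-")

print(slugify("Blue_Widgets (Large)"))  # prints: blue-widgets-large
```

Non-ASCII letters are simply dropped by this pattern; a real site with such titles would need a transliteration step first.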

Another issue is the taxonomy of the 'friendly' URL. It must be designed as a "system" that all URLs conform to, so that in every case, when you look at the URL, you know it is, for example, /action/category/product-type/product-brand/product-model/product-color/product-size. If this is not possible, then some method of identifying the meaning of each parameter in each position must be used. For example, if not all "products" will or can have all parameters and have them in that fixed order, then each parameter will need to be "tagged" with a unique letter (or word, or number) to identify its meaning: /a-<action>/c-<category>/b-<brand>/col-<color>
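To make the "tagged" variant concrete, here is a minimal Python sketch of how a script behind the rewrite might decode such a URL. The tag table and the function name parse_tagged_path are hypothetical; the one firm requirement is that each tag be unique, so color gets its own tag here ("col") rather than sharing "c" with category.

```python
# Hypothetical tag-to-meaning table; every tag must be unique.
TAGS = {"a": "action", "c": "category", "b": "brand", "col": "color"}

def parse_tagged_path(path: str) -> dict:
    """Decode a URL path made of tag-value segments, in any order."""
    params = {}
    for segment in path.strip("/").split("/"):
        # Split at the first hyphen only, so values may contain hyphens.
        tag, sep, value = segment.partition("-")
        if not sep or not value or tag not in TAGS:
            raise ValueError(f"unrecognized URL segment: {segment!r}")
        params[TAGS[tag]] = value
    return params

print(parse_tagged_path("/a-view/c-widgets/col-blue"))
# prints: {'action': 'view', 'category': 'widgets', 'color': 'blue'}
```

Because each segment carries its own tag, a missing brand or a different segment order does not change how the remaining parameters are read.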

This is a fairly complex subject, but to avoid trouble, it's important to realize that your server code (mod_rewrite, script, etc.) must be able to tell what to do with the various "parts" of the requested URL based only on the characters in that URL. The code has no idea what the words and numbers in the URL actually mean and cannot, for example, "know" that the word "blue" is a color and the word "basket" is a product. So it's important that products and colors always appear in the same position in the URL, or that they be explicitly tagged.
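A minimal Python sketch of the fixed-position approach shows the point: the code assigns meaning purely by where a segment sits in the path. The field names and the function name parse_positional_path are assumptions for illustration; in practice the same mapping could be done in mod_rewrite or in whatever script handles the request.

```python
# Assumed fixed order of URL segments; position alone gives each value its meaning.
FIELDS = ("action", "category", "product_type", "brand", "model", "color", "size")

def parse_positional_path(path: str) -> dict:
    """Map each path segment to a field name based only on its position."""
    segments = [s for s in path.strip("/").split("/") if s]
    if len(segments) > len(FIELDS):
        raise ValueError("URL has more segments than the taxonomy defines")
    # Trailing fields may be left off, but nothing in the middle may be skipped.
    return dict(zip(FIELDS, segments))

print(parse_positional_path("/view/widgets/basket"))
# prints: {'action': 'view', 'category': 'widgets', 'product_type': 'basket'}
```

Note that a request for /view/blue/basket would happily be read as category "blue": the code cannot tell that a value is in the wrong slot, which is why the order has to be consistent across the whole site.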

As for being "crawled better," I suspect you mean "ranked better." In that regard, it makes little difference which method you choose: keyword-in-URL is only a small factor in ranking, and while it is an important 'eyeball-grabber' when highlighted in the search results, the page title and description are just as important, and possibly more so.

Personally, I prefer the slash separator only because it is an unshifted key on most keyboards.

Jim