Mod rewrite - mapping product names -> ids

One rewrite per product? Database call?

         

arran

12:47 pm on Sep 13, 2005 (gmt 0)


I'm working on a new database-driven site and would like to rewrite the URLs before launching. I've read many threads on this topic but would appreciate some final advice before taking the plunge, as I have no experience in this area.

My urls (simplified for this example) look like:

http://www.example.com/category.php?cat=3&subcat=4
(category listing)
http://www.example.com/product.php?id=6299
(single product)

I want them to look like:

http://www.example.com/category-name/subcategory-name.html

http://www.example.com/product-name.html

Here's my understanding of what needs to happen:

1) I need to change the urls in my html (either using preg_replace or by altering my php to use product/category names rather than ids).

2) Create a rewrite rule which translates my new urls back to the old ones.

Step 1) is fairly straightforward, but I need some advice regarding step 2). The problem is mapping product/category names back to ids. At the moment, I can see 3 solutions:

1) Generate a mod_rewrite rule for every product/category using information in the database. This seems to be a nasty/inefficient solution and would involve keeping the database and .htaccess in sync.

2) Include both ids and product names in my new urls (e.g. http://www.example.com/product-name-6299.html), allowing me to do the mapping using 2 rewrite rules. Although the urls wouldn't be perfect, this would be quite a simple solution.

3) Somehow make a database call when a page is requested to dynamically map product/category names to ids. I imagine this would involve calling a php script from the rewrite (not even sure if this is possible). However, one more database call per page request would impact performance.

I'm thinking solution 2) could be the way to go but would be interested to hear opinions from those who have implemented something similar.

Thanks,
arran.

GregMR

1:24 pm on Sep 13, 2005 (gmt 0)


Not sure this is the best way but here's how I do it.

My categories and subcategories are names, like Accessories for example. I have a query that lists the categories and displays each one by name, e.g. "Accessories". Without mod_rewrite the url would look like this:
www.example.com/category.php?PriCat=Accessories

I have a mod rewrite rule like this:
RewriteRule ^category-(.*)\.html$ category.php?PriCat=$1

My link to the category links to the html version:
<a href="category-<?php echo $PriCat; ?>.html"> so my final url is
www.example.com/category-Accessories.html

There are probably better ways to do this but it works for me. I hope this makes sense.

jdMorgan

2:15 pm on Sep 13, 2005 (gmt 0)


arran,

It's apparent that you've researched this problem, and I think you've got a pretty good grasp on the issues.

The easiest way to do this is to change your html so that links appearing on your pages contain both the product (or category) id and a very short product description -- one or two keywords. Having the product id (or category id) in the URL eliminates the need to do a database lookup when that URL is subsequently requested from your server. One or two mod_rewrite rules can then be used to strip the category or product name, and pass only the category or product id to your php script.

You will need to incorporate three pieces of information in each 'friendly' URL:

1) Category or product name (short description/keywords)
2) Category or product id
3) Tag to identify whether URL contains category or product

This last part -the tag- can either be explicit, or based on the 'form' of the URL. For example, two explicit methods would be:

example.com/prod/blue-widget/1234/
example.com/cat/small-widgets/17/
-or-
example.com/blue-widget-p1234/
example.com/small-widgets-c17/

Basing the 'tag' on the URL 'form' is possible, but can lead to ugly URLs and other problems. But as an example, you could make the rule that product URLs contain hyphens and category URLs contain periods or something similar. I prefer the second method shown above, though.
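For illustration, the second explicit method above could be handled by two rules along these lines. This is only a sketch for a per-directory .htaccess file in the site root. The script names (product.php, category.php) and parameters (id, cat) are taken from arran's example URLs; the lowercase-letters-digits-hyphens character class and the [L] flag are assumptions, and a subcategory id would need its own element in the URL.

```apache
# Sketch only: assumes an .htaccess file in the document root with mod_rewrite available.
RewriteEngine On

# example.com/blue-widget-p1234/ -> /product.php?id=1234
# The descriptive name is matched but discarded; only the digits after "-p" are passed on.
RewriteRule ^[a-z0-9-]+-p([0-9]+)/$ /product.php?id=$1 [L]

# example.com/small-widgets-c17/ -> /category.php?cat=17
# Same idea, with "-c" tagging the URL as a category.
RewriteRule ^[a-z0-9-]+-c([0-9]+)/$ /category.php?cat=$1 [L]
```

Because the name portion is ignored by the rules, the keywords in a URL can be changed later without touching .htaccess, as long as the id and tag stay the same.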

Another way to do this is to pass all category-and-product-by-name requests to your main script, and let it do the database lookups to translate the category and product names in the URL to the ids that are used to generate pages. This would involve expanding your database to include a 'short description' for use in URLs for each category and product. You'd also have to be careful to ensure that each of these short descriptions is unique. In the end, with an eye toward a much larger product selection in the future, you are probably just as well off including the id in the URL and taking the simple approach above, since you'd likely end up putting the product ids back in the URLs in order to keep them unique.
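As a rough sketch of the pass-everything-to-a-script alternative, a single rule can hand the name from the URL to a script, which then does the lookup. The script name lookup.php and the parameter name are hypothetical, not anything from this thread, and the two conditions are an assumption added so that real files (images, stylesheets, existing scripts) are still served directly.

```apache
# Sketch only: assumes an .htaccess file in the document root with mod_rewrite available.
RewriteEngine On

# Skip the rule when the request maps to an existing file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# example.com/blue-widget/ -> /lookup.php?name=blue-widget
# lookup.php (hypothetical) would query the database for the id that belongs
# to this unique short description, then build the page or return a 404.
RewriteRule ^([a-z0-9-]+)/$ /lookup.php?name=$1 [L]
```

The cost is the one extra database query per request that arran mentioned, plus the need to keep every short description unique.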

Hyphens are by far the safest word-separator to use in URLs. Search engines treat them as spaces, as opposed to underscores, which are often treated as 'letters' and can destroy the benefits of keyword-in-URL unless the searcher actually types an underscore between his/her search terms (unlikely).

Jim

arran

5:11 pm on Sep 13, 2005 (gmt 0)


Greg,

Your solution makes perfect sense but my 'unfriendly' urls are slightly different from yours (i.e. they don't include the product name).

Jim,

Thanks for breaking it down for me - you have a great way of explaining potentially confusing concepts.

I plan to use the [product + id + tag] method you proposed.

Out of interest, is there any reason why you rewrite urls to appear as directories rather than html pages (e.g. example.com/blue-widget-p1234/ and not example.com/blue-widget-p1234.html)?

On a slightly different note, when viewing the products in a particular category, users can sort products in the usual ways (price, brand etc.). My 'unfriendly' urls differ slightly based on how the data is sorted e.g.

example.com/category.php?cat=4&subcat=14&sorted=1
-- sorted by price
example.com/category.php?cat=4&subcat=14&sorted=2
-- sorted by brand

As the data is effectively identical, would it be wise to rewrite both urls to the same 'friendly' url in order to avoid presenting duplicate content to search engines?

arran.

jdMorgan

3:22 am on Sep 14, 2005 (gmt 0)


"Out of interest, is there any reason why you rewrite urls to appear as directories rather than html pages (e.g. example.com/blue-widget-p1234/ and not example.com/blue-widget-p1234.html)?"

No, none that I feel strongly about. But why waste characters? -- Neither search engines nor users care what 'technology' your site is based upon (html vs php, for example).

"On a slightly different note, when viewing the products in a particular category, users can sort products in the usual ways (price, brand etc.). My 'unfriendly' urls differ slightly based on how the data is sorted e.g.

example.com/category.php?cat=4&subcat=14&sorted=1 -- sorted by price
example.com/category.php?cat=4&subcat=14&sorted=2 -- sorted by brand

As the data is effectively identical, would it be wise to rewrite both urls to the same 'friendly' url in order to avoid presenting duplicate content to search engines?"

No. You rewrite friendly URLs to unfriendly ones, so that when a friendly URL is requested from your server, you can activate your script properly. The 're-writing' of unfriendly to friendly URLs is something you might do manually on your pages, or by the use of preg_replace or str_replace. But the friendly URLs originate on your pages, are clicked on by users or followed by search engines, and then, once requested from your server, are rewritten into the query form needed by your script to produce the *next* page.

A bit of pondering of the above will show that it won't be possible to display a properly-sorted page unless the 'sort-by' information is present in the 'friendly' URL. Bear in mind that mod_rewrite cannot create information; it can only take the information present in the friendly URL, re-arrange it, stick it in a query string, and pass it to your script.

However, the information need not be explicit or obvious. You could, for example, have two variants of 'cat' -- plain-old 'cat' for sorted by brand, and 'cats' for sorted by (s)ale price. Or you could use 'catb' and 'catp' -- it doesn't matter as long as those path elements are unique. As long as you use some consistent and scalable 'tagging' convention, then mod_rewrite pattern-matching can be used to regenerate the correct 'unfriendly' URL to call the script.
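To make the tagging idea concrete, here is one possible sketch using 'catp' and 'catb' as leading path elements. The URL layout (tag, name, category id, subcategory id) is an assumption chosen for illustration; the script name, the parameter names, and the meaning of sorted=1 and sorted=2 come from arran's example URLs.

```apache
# Sketch only: assumes an .htaccess file in the document root with mod_rewrite available.
RewriteEngine On

# example.com/catp/small-widgets/4/14/ -> category 4, subcategory 14, sorted by price
RewriteRule ^catp/[a-z0-9-]+/([0-9]+)/([0-9]+)/$ /category.php?cat=$1&subcat=$2&sorted=1 [L]

# example.com/catb/small-widgets/4/14/ -> same category, sorted by brand
RewriteRule ^catb/[a-z0-9-]+/([0-9]+)/([0-9]+)/$ /category.php?cat=$1&subcat=$2&sorted=2 [L]
```

The sort order never appears as a query parameter in the friendly URL; it is carried entirely by which tag the link uses, and each rule supplies the matching 'sorted' value.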

Jim

arran

8:20 am on Sep 14, 2005 (gmt 0)


Thanks Jim - makes sense.

Now off to implement it...