|Some URL rewriting questions|
| 1:21 pm on Apr 15, 2012 (gmt 0)|
My current site has classic ASP URLs similar to /Category.asp?CID=42. I have set up ISAPI Rewrite and got /category/some_friendly_text/42 working.
Question 1: However, I guess I need to 301 the old URLs onto the new ones (which will rewrite back to the old!). How do people handle such things? Will I need to replace my existing pages with a 301 and then rewrite onto new versions, as the URLs of these will be effectively hidden anyway?
The site I am working on deals with technical products. I am generating the friendly URLs from the product and category names, so the source text can contain all sorts of special characters.
Things like brackets, quotes, full stops etc. would not be uncommon at all.
I realise I will need to strip these down, and have been replacing spaces, slashes etc., but how far should I go? For example, they have a category called:
Fibre Leads Multimode 62.5125µm (OM1) 2mm
That has a full stop, brackets and a µ character in it.
Question 2: Is there a best practice for this sort of thing with regard to which characters are allowed?
| 2:09 am on Apr 16, 2012 (gmt 0)|
1. URLs are used on the web. Files are used inside the server. URLs and files are not at all the same thing. Your rewrite configuration links a URL request to an internal file found inside the server to deliver the content. Your redirect configuration tells anything externally requesting a particular URL to instead make a new request for a different URL. Redirects and rewrites have different effects. A redirect maps a URL to a URL. A rewrite maps a URL to a file.
2. This is often adequate:
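$slug = strtolower($id['name']);
$slug = preg_replace('/\ ?&\ ?/', '-', $slug);       // " & " becomes a hyphen
$slug = preg_replace('/[\']+/', '', $slug);          // drop apostrophes
$slug = preg_replace('/[^a-z0-9-]+/', '-', $slug);   // anything else becomes a hyphen
$this->productSlug = preg_replace('/[^a-z0-9]+$/', '', $slug);   // trim trailing hyphens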
but you seem to have a bit more of a challenge than most. Round brackets are technically allowed in a URL path, but the micro sign would have to be % encoded to be valid in a URL, so simply change brackets to hyphens and change micro to lower case u.
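For a name like the fibre-lead category quoted in the question, an extended version of the snippet above might look like the following. This is only a sketch: the function name `makeSlug` is hypothetical, and the choices it makes (micro sign to `u`, every other run of non-alphanumeric characters to a single hyphen, hyphens trimmed from both ends) are assumptions rather than any agreed standard.

```php
<?php
// Hypothetical helper: turn a product or category name into URL-safe text.
function makeSlug(string $name): string
{
    // Both the micro sign (U+00B5) and the Greek letter mu (U+03BC) become a plain "u".
    $slug = str_replace(["\u{00B5}", "\u{03BC}"], 'u', $name);
    $slug = strtolower($slug);
    // Drop apostrophes so "tom's" becomes "toms" rather than "tom-s".
    $slug = str_replace("'", '', $slug);
    // Collapse each run of other characters (spaces, dots, brackets, slashes) into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    // Remove any hyphens left at either end.
    return trim($slug, '-');
}

echo makeSlug('Fibre Leads Multimode 62.5125µm (OM1) 2mm'), "\n";
// fibre-leads-multimode-62-5125um-om1-2mm
```

Because every run of unwanted characters collapses to one hyphen, a name such as "Cables & Leads." gives `cables-leads` with no doubled or trailing hyphens.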
| 7:21 am on Apr 16, 2012 (gmt 0)|
Thanks for that. Didn't think of it in those terms, but of course it makes sense.
As my old-style URLs are well established and indexed, I would want to redirect these with a 301 status to my new-style ones, though. Am I correct in thinking that?
As it's a dynamic site with a couple of thousand pages, I was figuring I should use the existing script pages and just get them to look up what the new URL should be from the database and issue the 301.
Then I just rename my existing pages and have the rewriting handle the resolution. Effectively they could be called anything, now that the rewriting is masking what the pages are called?
Does that theory sound OK? It's the first time I have really done this on an established site.
| 9:48 am on Apr 16, 2012 (gmt 0)|
Yep. That's exactly it. Completely and in a nutshell.
Link to new URLs from the pages of your site. Rewrite those requests to the old horrible format, which is now used only inside the server. Redirect external requests for horrible URLs to nice new pretty ones.
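The split between the two jobs can be sketched as a small decision function. This is an illustration of the logic only, not a replacement for the rewrite rules themselves: the function name `resolveRequest`, the renamed internal script `/CategoryPage.asp`, and the `$slugs` array (standing in for the database lookup) are all assumptions made for the example.

```php
<?php
// Decide what to do with a requested URL. Returns [action, target]:
//   'redirect' => answer with a 301 and target as the Location header
//   'rewrite'  => serve target internally; the visitor never sees it
//   'notfound' => answer with a 404
// $slugs maps category id => friendly text (stand-in for a database lookup).
function resolveRequest(string $path, array $query, array $slugs): array
{
    // Old public URL: tell the client to ask again for the new friendly URL.
    if ($path === '/Category.asp' && isset($query['CID'])) {
        $id = (int) $query['CID'];
        if (!isset($slugs[$id])) {
            return ['notfound', ''];
        }
        return ['redirect', '/category/' . $slugs[$id] . '/' . $id];
    }
    // New friendly URL: map it to the renamed internal script.
    if (preg_match('#^/category/[^/]+/(\d+)$#', $path, $m)) {
        return ['rewrite', '/CategoryPage.asp?CID=' . $m[1]];
    }
    return ['notfound', ''];
}

$slugs = [42 => 'some_friendly_text'];
print_r(resolveRequest('/Category.asp', ['CID' => '42'], $slugs));
print_r(resolveRequest('/category/some_friendly_text/42', [], $slugs));
```

Because the internal script has a different name from the old public URL, the rewrite target never matches the redirect rule, which avoids the redirect loop raised in Question 1.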
That short bit of (PHP) code in my earlier reply generates the text part of the URL for each page, based on the name field in the database for that particular page id number.
The full URL of the page looks like this:
example.com/p<number>-<name> for products
example.com/r<number>-<name> for product reviews
example.com/s<number>-<name> for product spec sheets, and so on.
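A URL in this shape can be taken apart with a single pattern. The sketch below assumes the three type letters listed above; the function name `parsePagePath` and the keys of the returned array are made up for the example.

```php
<?php
// Split a path such as "/p123-blue-widget" into page type, id number and name.
// Returns null when the path is not one of the recognised page URLs.
function parsePagePath(string $path): ?array
{
    // Type letter, then the number, then a hyphen, then whatever name was requested.
    if (!preg_match('#^/([prs])(\d+)-([^/]*)$#', $path, $m)) {
        return null;
    }
    $types = ['p' => 'product', 'r' => 'review', 's' => 'spec'];
    return ['type' => $types[$m[1]], 'id' => (int) $m[2], 'name' => $m[3]];
}

print_r(parsePagePath('/p123-blue-widget'));
```

The name part is captured but deliberately not validated here, so that a request with an outdated or misspelled name can still be identified by its number.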
The PHP (in my case, other systems are available :) ) script that generates the HTML content page first checks that the exact right name has been requested for the current page number and redirects to the correct URL if not.
This automated closed-loop system ensures that visitors and bots cannot stray. It also allows the page name to be changed at a later date, with all old links to that page redirected automatically and no additional work.
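The name check described above might be sketched like this for product pages. The function name `canonicalRedirect` is hypothetical, and the simple lowercase-and-hyphenate rule used to build the text part is an assumption; a real script would reuse whatever routine generates its links.

```php
<?php
// Return the URL to 301-redirect to, or null when the requested path is already correct.
// $currentName is the name stored in the database for this page id.
function canonicalRedirect(string $requestedPath, int $id, string $currentName): ?string
{
    $slug = trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($currentName)), '-');
    $canonical = '/p' . $id . '-' . $slug;
    return $requestedPath === $canonical ? null : $canonical;
}

// In the page script a non-null result would be followed by something like:
//   header('Location: ' . $target, true, 301); exit;
var_dump(canonicalRedirect('/p123-blue-widget', 123, 'Blue Widget')); // null: serve the page
var_dump(canonicalRedirect('/p123-old-name', 123, 'Blue Widget'));    // "/p123-blue-widget"
```

Since the lookup is keyed on the number alone, renaming a product changes the canonical URL and every old link starts redirecting without further work.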
You have the page number on the end of the URL. For more efficient RegEx patterns I prefer it at the beginning, as above.
Oh. And if the page numbers are truly unique site-wide, you don't need to specify /category/ in the URL. The benefits of that decision will become utterly apparent should you ever implement any form of multi-faceted navigation system.
If you do wish to retain /category/ in the URL, you should get the script to check it and redirect if it is not the right category requested for this page. That is a big source of duplicate content problems.
You don't want to be able to request a page as if it were in another category and have content directly returned. The request should either redirect to the correct URL or should directly fail as 404.
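If /category/ is kept in the URL, the check could look roughly like the sketch below. It assumes URLs of the form /category/<category-text>/<id>; the function name `checkCategory` and the `$pageCategories` array (page id to category text, standing in for a database lookup) are hypothetical.

```php
<?php
// Check a "/category/<category-text>/<id>" request against the page's real category.
// Returns [status, location]: 200 to serve the page, 301 to redirect, 404 otherwise.
function checkCategory(string $path, array $pageCategories): array
{
    if (!preg_match('#^/category/([^/]+)/(\d+)$#', $path, $m)) {
        return [404, null];
    }
    $id = (int) $m[2];
    if (!isset($pageCategories[$id])) {
        return [404, null]; // unknown page number
    }
    $correct = '/category/' . $pageCategories[$id] . '/' . $id;
    // A request under the wrong category never gets content returned directly.
    return $path === $correct ? [200, null] : [301, $correct];
}

$pageCategories = [42 => 'fibre-leads'];
print_r(checkCategory('/category/fibre-leads/42', $pageCategories));
print_r(checkCategory('/category/copper-leads/42', $pageCategories));
```

This version redirects a wrong-category request; returning 404 for that case instead is the other option mentioned above, and would be a one-line change.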