RewriteEngine On
RewriteRule ^([A-Za-z0-9-]+)/([A-Za-z0-9-]+)?$ http://www.example.co.uk/index.php?lmenu=$2&brand=$1 [L]
This works fine for links like
<a href="catimini/15"></a>
to
http://www.example.co.uk/index.php?lmenu=15&brand=catimini
It fails, though, if the brand has a space in its path. For example,
<a href="rip curl/16"></a>
doesn't convert to
http://www.example.co.uk/index.php?lmenu=16&brand=rip curl
I have read quite a bit, but I must admit this is taking longer than usual for the penny to drop! Can anyone point me in the right direction (in as plain English as possible, please) to handle the spaces in the query strings?
Also just a question on link paths, is it better to use absolute paths or relative paths in links to internal pages?
[edited by: jdMorgan at 5:00 pm (utc) on Mar. 5, 2009]
[edit reason] example.co.uk [/edit]
Neither of these is likely what you wanted.
You can save space in your pattern, and speed up the rule execution by 25% or so, simply by using the [NC] flag on the rule, to make the pattern-matching case-insensitive.
In order to match a space, you must include the space in your character group (the bracketed set of alternate characters). And in order to avoid the mod_rewrite parser throwing an error, you must escape that space by preceding it with a backslash.
Finally, in order to prevent future problems due to server upgrades and differences in the regex libraries provided by various servers' operating systems, it's a good idea to also escape literal hyphens in regex patterns to distinguish them from the character range operator (e.g. the hyphen in a-z):
RewriteEngine on
#
RewriteRule ^([a-z0-9\-\ ]+)/([a-z0-9\-\ ]+)?$ /index.php?lmenu=$2&brand=$1 [NC,L]
There are three basic forms of linking: Page-relative, server-relative, and canonical (loosely-termed "absolute"). Which of these you use depends on several factors: Test environment, server-configuration canonicalization support, and personal preference.
If you are testing on a PC and don't have a test server running on that PC, then using relative links preserves your ability to test your site without a server.
If you do full subdomain, domain, FQDN, port number, URL-path, and URL-fragment (named anchor) canonicalization in the server configuration (e.g. in httpd.conf or .htaccess), then there is no search-related need to include longer server-relative or canonical links on your pages. If not, then you should consider using these longer forms to prevent duplicate content from arising, for example, if a search engine indexes your whole site starting with "example.co.uk" instead of the canonical "www.example.co.uk". Lacking forced canonicalization in your server config, this would result in two "copies" of your site, one at example.co.uk and one at www.example.co.uk, both with the exact same content. The pages would essentially compete with each other for ranking, you'd end up with links to the non-canonical domain, and over time, your canonical pages would lose ranking power to the non-canonical ones.
Major search engines have begun to do back-end processing and have recently added an HTML element to address this problem, but the fact remains that using these band-aid approaches introduces an external dependency of your site on outside parties to "get it right." When such problems can be corrected --and in fact prevented entirely-- by proper server configuration, there's little reason to rely on the kindness of strangers for your site's success...
So, you might want to look into the various threads here in the forum and in our Apache Forum Library to see the code snippets needed to do a thorough job of forcing canonical URLs, so that any given page on your site can be reached with one and only one unique URL, and all other valid-but-non-canonical requests result in a 301-Moved Permanently redirect to the canonical URL.
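As a rough illustration of the hostname part of that job only, here is a minimal sketch. It assumes the code lives in the .htaccess file in your document root, that www.example.co.uk is the canonical hostname, and that the site is served over plain http; it must come before any internal rewrite rules. It does not attempt the other canonicalization steps mentioned above.

```apache
RewriteEngine on
#
# Assumption: www.example.co.uk is the one canonical hostname.
# Redirect if the requested hostname is anything else (including a
# hostname with an appended port number)...
RewriteCond %{HTTP_HOST} !^www\.example\.co\.uk$ [NC]
# ...but leave requests that sent no Host header at all alone.
RewriteCond %{HTTP_HOST} !^$
# In .htaccess the matched path has no leading slash, so add one back.
RewriteRule ^(.*)$ http://www.example.co.uk/$1 [R=301,L]
```

A request for http://example.co.uk/catimini/15 should then receive a 301 redirect to http://www.example.co.uk/catimini/15 before any other rule sees it.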
One other factor is involved: That of content-scrapers who copy your whole site (usually to slap ads all over it and to steal your traffic). Using canonical links can help here, as they will have to edit all your pages or scripts to change those links to work on their domain. However, sometimes a compromise is appropriate: You can use relative links to everything except for the home page, and use a canonical link for that. This allows you to test most links on a PC without running a server, while still saving lots of bytes in your other links. The degree to which you use canonical links for this application also depends on how well you 'defend' your site against content-scrapers; if you have good battlements around your site and armor on your pages, canonical links as a defense against scrapers may not be needed at all.
All that said, it still comes down to a matter of personal preference.
Jim
To be honest, I am not 100% sure why I have to do this, or to what level. I believe this makes the site rank better in Google by making better paths?
I will continue to explore this black magic ;o) but if there is a simple fix for those paths I would appreciate the heads up.
Cheers again
Steve
If you are now having problems with CSS and other URL-paths containing spaces, then you either need to explicitly exclude those paths from being rewritten by the rule, link to those resources using server-relative or canonical URLs (as opposed to page-relative URLs), or further refine your requirements. The code does exactly what you write it to do, not necessarily what you want...
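A minimal sketch of the first option, assuming your stylesheets, images, and scripts live under /css/, /images/, and /js/ (those directory names are hypothetical; substitute your own) and that the code is in the .htaccess file in your document root:

```apache
RewriteEngine on
#
# Assumption: static resources live under /css/, /images/ or /js/.
# Do not rewrite any request whose URL-path starts with one of those.
RewriteCond %{REQUEST_URI} !^/(css|images|js)/ [NC]
RewriteRule ^([a-z0-9\-\ ]+)/([a-z0-9\-\ ]+)?$ /index.php?lmenu=$2&brand=$1 [NC,L]
```

The second option needs no server code at all: link to the stylesheet as href="/css/style.css" (server-relative) rather than href="css/style.css" (page-relative), so the browser resolves it from the site root no matter what the current page's URL looks like.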
Jim
What I didn't/don't get is why, when I had the full path, it only affected first-level links, but when I used your modification without the full path, it then affected second- and third-level links, breaking the stylesheet. I do not understand how such a small change caused that.
I will continue to explore, but I do not find this very intuitive at all.
Cheers
Steve
I'll read on :o/
Thanks guys, I can see there are a lot of similar questions on this forum and you are working hard to explain things, which reinforces how cryptic this really is for newbies. Reading them is like reading a foreign-language dictionary, and I can code freely in XHTML, CSS, JavaScript, PHP and MySQL!
Is there a simple, example-driven site that covers this topic effectively?
Steve
The basic fact is that it's not simple. It starts off with regular expressions --which alone have had hundreds of books devoted to them -- then carries on through the various syntactical constructs for RewriteRule and RewriteCond, the server variables they can reference, the flags that can be used to modify condition/rule behaviour, and ends up with internal rewrites, external redirects, and proxy through-puts.
And this doesn't even address the fact that there are potential server performance and search-ranking side-effects to every rule; you simply can't hide from the fact that you are adjusting the server configuration, and that is never a "simple" matter. Neither is it particularly safe -- for those who are not detail-oriented.
There are some tutorial and example threads in our Apache Forum Library. There are links to reference material in our Forum Charter. See the links at the top of this page.
Jim