I've been doing many things using mod_rewrite, but this one is proving to be a real challenge for me.
To get the best possible SEO, I want to offer Google links that have the following structure:
Instead of
www.mysite.com/productsearch/hairspray
which is a pretty good SEO-friendly URL,
I want
www.mysite.com/hairspray
The problem is that I have the legitimate subdirs at the webroot, and I don't want them to end up at the search script. So I am facing a negated group problem.
I tried this, but it didn't work:
RewriteCond %{REQUEST_URI} !^/(blog|cgi\-bin|mod_perl)
RewriteCond %{REQUEST_URI} ^(.+)$
RewriteRule ^([!\.]+) $ /home/mysite/www/mod_perl/busca.cgi?searchby=productname&keyword=$1 [L]
And putting the negated group pattern in the RewriteRule directive seems to be impossible, by the very nature of the directive, which is a positive match. So the solution seemed to lie somewhere in the direction above. But this still doesn't work.
Any ideas?
Thanks a lot fellows
Mark
Also note that your "negated-group" syntax was incorrect: Use "^" at the beginning of a [group] to indicate that the grouped characters should be rejected.
If you wish to exclude "legitimate subdirectories" and all URL-paths containing periods (e.g. images, CSS and JS files, robots.txt, sitemap.xml, etc.) from being rewritten, then you don't even need a RewriteCond. Just reject URL-paths containing periods or slashes in the RewriteRule pattern itself:
RewriteRule ^([^/.]+)$ /home/mysite/www/mod_perl/busca.cgi?searchby=productname&keyword=$1 [L]
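As a rough sketch of how this could sit in context: the block below assumes the rule lives in an .htaccess file in the document root (in server-config context the pattern would need a leading slash). The RewriteCond is not required by the pattern above; it is shown only as an optional extra guard that names the real directories explicitly, using the solid pipe character for alternation.

```apache
# Sketch only; assumes an .htaccess file in the document root.
RewriteEngine On

# Optional guard: skip the real top-level directories by name.
# The trailing (/|$) keeps a product name such as "blogging" searchable.
RewriteCond %{REQUEST_URI} !^/(blog|cgi-bin|mod_perl)(/|$)
# Single-segment URL-paths with no slash and no period go to the search script.
RewriteRule ^([^/.]+)$ /home/mysite/www/mod_perl/busca.cgi?searchby=productname&keyword=$1 [L]
```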
Jim
Thanks for clearing things up, it was really messy.
But, hey, shouldn't I add a backslash before the period, so it is not interpreted as any-character?
RewriteRule ^([^/\.]+)
And, also, shouldn't I negate it as well?
RewriteRule ^([^/^\.]+)
Also, besides that, there's a problem with the directories issue. I do have a 'blog' directory which can be seen by just typing 'www.mysite.com/blog'. Shouldn't that be a problem? I mean, the trailing slash does get added by Apache, but I am not sure whether mod_rewrite will capture the request before this happens.
Thanks a lot for the special attention Jim
Mark
BTW: Great tip on the DocumentRoot, I'll definitely do that.
If mod_dir does not 'fix-up' the missing-slash-subdirectory requests before your new busca.cgi rule executes, then you should add another rule to fix-up those slashless URLs, correcting that problem with an external redirect before attempting to execute the internal busca.cgi rewrite.
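A minimal sketch of such a fix-up rule, assuming an .htaccess file in the document root and that the rule is placed ahead of the busca.cgi rule. The 301 status is an assumption; a 302 may be safer while testing.

```apache
# If the slashless request maps to a real directory, redirect externally
# to the slash-terminated URL before any internal rewrite runs.
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^([^/.]+)$ /$1/ [R=301,L]
```

Because the redirect is external, the browser re-requests the URL with the trailing slash, and that request no longer matches the slash-rejecting busca.cgi pattern.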
Also, review the regular-expressions 'escaping' rules; they are quite different inside and outside [groups].
Inside [groups], only "]", " " (space), "^" used as the first character, and "-" used as anything but the first character need to be escaped. No other 'regex function tokens' such as "." or "+" or "*" are recognized within groups, because those functions wouldn't make sense to use inside a group.
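To illustrate, here is a small sketch. The second and third patterns are hypothetical variations, shown commented out, and are not something your site needs.

```apache
# The period is already literal inside [...], so no backslash is needed.
RewriteRule ^([^/.]+)$ /home/mysite/www/mod_perl/busca.cgi?searchby=productname&keyword=$1 [L]

# Same meaning; the backslash before the period is harmless but redundant.
# RewriteRule ^([^/\.]+)$ /home/mysite/www/mod_perl/busca.cgi?searchby=productname&keyword=$1 [L]

# Hypothetical positive group: lowercase letters, digits, and a literal hyphen.
# The hyphen is escaped here because it is not the first character.
# RewriteRule ^([a-z0-9\-]+)$ /home/mysite/www/mod_perl/busca.cgi?searchby=productname&keyword=$1 [L]
```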
Jim
Thanks for the reply.
Perfect, it worked!
The trailing slash wasn't a problem at all.
(I just had to add a negation to the period, which was missing in your pattern and - contrary to the other ones - wasn't a product of my average-joe knowledge of regexp. Well, at least it worked this way - not sure it would have worked your way too.)
RewriteRule ^([^/^.]+)$
Thanks a lot! All working just fine now!
Mark
Again, you need to review regex escaping requirements. Your pattern ^([^/^.]+)$ means "Match any URL-path that consists entirely of one or more characters which are NOT a slash, a caret, or a period." If used as the first character, the caret ("^") means, "Negate this entire group." A caret appearing in any but the first position within the group is taken as a literal character to be matched as part of the group.
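Side by side, as a sketch (the sample URL-path "hair^spray" is made up purely for illustration):

```apache
# Negated group rejecting "/" and "." only.
# "hairspray" matches; "hair^spray" also matches; "blog/" and "robots.txt" do not.
RewriteRule ^([^/.]+)$ /home/mysite/www/mod_perl/busca.cgi?searchby=productname&keyword=$1 [L]

# Here the second "^" is a literal caret, so the group rejects "/", "^" and ".".
# "hairspray" still matches, which is why this variant also appears to work;
# the only difference is that "hair^spray" is no longer rewritten.
# RewriteRule ^([^/^.]+)$ /home/mysite/www/mod_perl/busca.cgi?searchby=productname&keyword=$1 [L]
```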
If my regex group pattern failed the first time you tried it, the likely reason is that you forgot to completely flush (delete) your browser cache before testing after changing your server-side code. If you don't delete your browser cache, then your browser will show you previously-cached pages and server responses, and it will not send a request to your server (unless the requested URL was previously marked as non-cacheable by code on your server).
Jim