On my site I have a section where I publish various business listings.
The URL for a specific business looks like this:
www.mysite.com/business/category/subcategory/location/companyname.html
example in htaccess:
RewriteRule ^([business]*)/([^/]*)/([^/]*)/([^/]*)/([^/]*)\.html$ /viewclient.php?companyname=$5 [L]
That is all pretty, but I also wish to give my clients the option of a short URL which they can put on their business cards, like so:
www.mysite.com/companyname/
To do so I made this in .htaccess:
RewriteRule ^([john_mechanics]*)\/$ /viewclient.php?companyname=$1 [L]
BTW, I can't use
RewriteRule ^([^/]*)\/$ /viewclient.php?companyname=$1 [L]
because I already use it for something else : )
But now we obviously have duplicate content on our hands : o
Couple of questions that puzzle me:
1) Will almighty Google punish me for this? (probably..)
2) Is there a better way to solve this?
3) Uhm, I'm an htaccess beginner really, is my code OK? : /
Thanks in advance ! : )
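In a regular expression, square brackets define a character class, so [john_mechanics]* matches any run of the characters j, o, h, n, _, m, e, c, a, i and s in any order, not the literal word "john_mechanics".
That is most likely not what you want, and I would think that simply using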
^(john_mechanics)\/$
is what you intended.
There is a short regular-expressions tutorial cited in our forum charter, and an even shorter one in the mod_rewrite documentation itself.
Jim
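To see the difference between the character class and the literal group, the two patterns can be tried outside Apache. This is only a minimal sketch in Python, assuming its re module treats these simple patterns the same way as the regex engine behind mod_rewrite. The sample paths other than john_mechanics/ are made up for illustration.

```python
import re

# Original rule: square brackets make a character class, so any mix of
# the letters in "john_mechanics" (plus "_") is accepted.
char_class = re.compile(r"^([john_mechanics]*)/$")

# Suggested rule: parentheses only group, so the word must match literally.
literal = re.compile(r"^(john_mechanics)/$")

# Hypothetical request paths as a rule in .htaccess would see them
# (no leading slash).
paths = ["john_mechanics/", "jim/", "nice_machines/", "bobs_bakery/"]

for path in paths:
    by_class = char_class.match(path) is not None
    by_literal = literal.match(path) is not None
    print(path, "class:", by_class, "literal:", by_literal)
```

Under that assumption, the character-class version also accepts jim/ and nice_machines/, because every character in them happens to appear inside the brackets, while the literal version accepts only john_mechanics/.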
Therefore, this rule
RewriteRule ^(business*)/([^/]*)/([^/]*)/([^/]*)/([^/]*)\.html$ /viewclient.php?companyname=$5 [L]
becomes either
RewriteRule ^business/[^/]+/[^/]+/[^/]+/([^/]+)\.html$ /viewclient.php?companyname=$1 [L]
or
RewriteRule ^business[^/]*/[^/]+/[^/]+/[^/]+/([^/]+)\.html$ /viewclient.php?companyname=$1 [L]
depending on what you're trying to do with "business": either match it exactly or allow for zero or more trailing characters.
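The three patterns above can be compared side by side. The following is a rough sketch in Python, again assuming its re module behaves like Apache's regex engine for patterns this simple. The category, subcategory and location values are invented for the example.

```python
import re

# The three patterns from the rules above, in the same order.
original = re.compile(r"^(business*)/([^/]*)/([^/]*)/([^/]*)/([^/]*)\.html$")
exact = re.compile(r"^business/[^/]+/[^/]+/[^/]+/([^/]+)\.html$")
prefix = re.compile(r"^business[^/]*/[^/]+/[^/]+/[^/]+/([^/]+)\.html$")

# Hypothetical request paths (no leading slash, as seen in .htaccess).
paths = [
    "business/auto/repair/springfield/john_mechanics.html",
    "busines/auto/repair/springfield/john_mechanics.html",     # one "s" short
    "businesses/auto/repair/springfield/john_mechanics.html",  # trailing "es"
    "business/auto/repair/springfield/.html",                  # empty company name
]

for path in paths:
    results = {
        "original": original.match(path) is not None,
        "exact": exact.match(path) is not None,
        "prefix": prefix.match(path) is not None,
    }
    print(path, results)
```

In this sketch "business*" means "busines" followed by zero or more "s", so the original pattern accepts the misspelled busines/ but not businesses/. It also accepts an empty company name because of the * quantifiers, which the + quantifiers in the two replacements rule out.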
Be aware that since the URL-path-parts matching your old $1 through $4 are irrelevant to the rewrite, it appears that you've created an opportunity here for duplicate-content problems. Each unique "page" on the Web should be accessible with one and only one URL. The code above seems to allow any "business" page to be accessed at
example.com/business/<anything-at-all>/<anything-at-all>/<anything-at-all>/<business-name>.html
and that makes your site vulnerable to PageRank/Link-popularity dilution through malicious linking to "random" URLs. If someone points enough links at, for example,
example.com/business/that/cheats/customers/Acme.html
then the search engines may 'pick' that as the preferred search results listing URL for the real Acme business page on your site.
To avoid this, your rule or your viewclient.php script should probably be modified to check the validity of the entire URL-path.
Jim
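The kind of whole-path check suggested above could look roughly like this. It is only a sketch of the logic, written in Python rather than in the PHP of viewclient.php. The lookup table, the function name resolve, and the sample category/location values are all hypothetical stand-ins for whatever the CMS actually stores. Answering a wrong path with a 301 redirect instead of a 404 is one possible design choice, not something stated in the thread.

```python
# Hypothetical table mapping each company to its single canonical URL-path.
# In a real site this would come from the CMS database.
CANONICAL_PATHS = {
    "john_mechanics": "/business/auto/repair/springfield/john_mechanics.html",
}


def resolve(request_path, companyname):
    """Decide how to answer a request that was rewritten to the viewer script.

    Returns a (status, location) pair: 200 to serve the page, 301 to
    redirect to the canonical URL-path, 404 for an unknown company.
    """
    canonical = CANONICAL_PATHS.get(companyname)
    if canonical is None:
        return 404, None
    if request_path != canonical:
        # Wrong category/subcategory/location, or the short URL:
        # send visitors and search engines to the one canonical URL.
        return 301, canonical
    return 200, None


print(resolve("/business/auto/repair/springfield/john_mechanics.html", "john_mechanics"))
print(resolve("/business/that/cheats/customers/john_mechanics.html", "john_mechanics"))
print(resolve("/john_mechanics/", "john_mechanics"))
```

With a check like this, the short business-card URL and any made-up category path would both end up at the same canonical address, so only one URL per business would remain for search engines to index.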
Once again you have spotted another flaw that I have here : )
I am aware that someone might put anything at all as the category and still reach the specific business.
I thought nobody would do such a thing because there is no actual clickable link that leads there, but as you pointed out Jim, there are a lot of bad people out there : (
I set things up this way to be compatible with my home-brew CMS. I guess I could have it dynamically add the full links to htaccess for every company.
I'll get to it right away : )
Thanks again Jim !
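Generating the full links for every company, as proposed in the last post, might be sketched like this. The company record, its field names and the function rules_for are hypothetical, and Python 3.7 or later is assumed. Redirecting the short URL with R=301 is one way to deal with the duplicate-content worry from the first post, not something the thread prescribes.

```python
import re

# Hypothetical records; in practice these would come from the CMS database.
COMPANIES = [
    {
        "name": "john_mechanics",
        "category": "auto",
        "subcategory": "repair",
        "location": "springfield",
    },
]


def rules_for(company):
    """Build the two .htaccess lines for one company."""
    name = company["name"]
    full = "/".join(
        ["business", company["category"], company["subcategory"],
         company["location"], name]
    )
    return [
        # Short business-card URL: permanent redirect to the canonical URL.
        f"RewriteRule ^{re.escape(name)}/$ /{full}.html [R=301,L]",
        # Canonical URL, matched in full: internal rewrite to the viewer script.
        f"RewriteRule ^{re.escape(full)}\\.html$ /viewclient.php?companyname={name} [L]",
    ]


for company in COMPANIES:
    for line in rules_for(company):
        print(line)
```

Because every rule names the complete path, a request with a made-up category no longer matches anything, and each business stays reachable at one URL only.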