http://www.example.com/category/3/10/keyword-description-of-the-image.html
The title/description is pulled from a database and is entirely superfluous; it's only there for SEO. All the images could be accessed as www.example.com/category/3/10/ as well.
RewriteRule ^category/?$ category.php?catid=1&prodid=0 [L]
RewriteRule ^category/([0-9]+)/([0-9]+)/?([^/]+)?/?$ category.php?catid=$1&prodid=$2 [L]
I haven't written it more specifically as ^category/([0-9]+)/([0-9]+)/?([^/.]+\.html)?/?$ because I may change the structure to http://www.example.com/category/3/10/keyword-description-of-the-image/ etc.
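For reference, category.php never sees the description at all; a minimal sketch (the real script's DB work is omitted, and the variable handling here is just illustrative):

<?php
// category.php (sketch): the rewrite passes only the numeric IDs,
// so the keyword description never reaches the query string.
$catid  = isset($_GET['catid'])  ? (int) $_GET['catid']  : 0;
$prodid = isset($_GET['prodid']) ? (int) $_GET['prodid'] : 0;
// ...look up the image and its title/description by these IDs...
?>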
I don't have access to RewriteMap, and besides, I will change the keyword descriptions from time to time. My questions are:
Is this a bad format? Since the description is just "thrown away" in the rewrite, could people maliciously link to http://www.example.com/category/3/10/any-old-thing-they-want.html, affecting my PR and raising duplicate-content flags?
If I change the keyword descriptions, and thus the "filenames" I am linking to, should I add 301 redirects for all the old URLs that have been indexed, even though the rewrite already handles them? (Since the only info that is really needed is the catid and prodid, the .html "filename" is junk.)
I'm not keyword-spamming the URLs; I'm just adding a little extra SEO, but I don't want to shoot myself in the foot. I also don't intend to change the keywords constantly, but right now a lot of them need refinement, and it will be an ongoing process for a while.
Feel free to slap me and tell me what is the best practice for what I need to do.
RewriteRule ^category/([0-9]+)/([0-9]+)(/[^/]+)?/?$ category.php?catid=$1&prodid=$2 [L]
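(The fix here is grouping the slash with the description: the whole "/keyword-description" segment is now optional as a single unit, so a malformed request like /category/3/10keyword no longer matches.)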
Checking the higher-level URL-path-parts and rejecting those that don't actually exist may allow you to throw out some spurious URLs as invalid -- For example, if "wickets" is not among your real categories, a request for a URL starting with "wickets" can be rejected.
Some sites' URLs can be organized/categorized/classified easily, and some can't -- If the list of acceptable URL-parts gets too long, then it becomes hard to maintain, and checking it takes too much time. So again, balance the utility of this approach against the actual level of 'problematic linking' in your market sector. If you find that you must validate URLs, then either get a VPS hosting account that supports RewriteMaps, or do the URL validation in the script itself -- You could always output a 301 or 403-Forbidden header from your script, as long as you can customize it; the "junk" part of the URL is still available in PATH_INFO.
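A minimal sketch of that in-script approach, assuming a hypothetical get_canonical_slug() DB lookup and reading REQUEST_URI (which still holds the path as originally requested) rather than PATH_INFO -- the helper, hostname, and exact responses are illustrative, not a drop-in implementation:

<?php
// category.php (sketch) -- validate the requested URL, and 301 to the
// canonical one if the keyword description is stale or made up.
$catid  = isset($_GET['catid'])  ? (int) $_GET['catid']  : 0;
$prodid = isset($_GET['prodid']) ? (int) $_GET['prodid'] : 0;

// Hypothetical lookup: returns the product's current keyword
// description, or null if catid/prodid don't exist.
$slug = get_canonical_slug($catid, $prodid);

if ($slug === null) {
    header('HTTP/1.1 404 Not Found');   // unknown IDs: reject outright
    exit;
}

$canonical = "/category/$catid/$prodid/$slug.html";

// REQUEST_URI is the path the visitor (or a spurious link) asked for,
// e.g. /category/3/10/any-old-thing-they-want.html
if ($_SERVER['REQUEST_URI'] !== $canonical) {
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: http://www.example.com' . $canonical);
    exit;
}

// ...render the page normally...
?>

One comparison answers both of your questions at once: made-up descriptions never return a 200, and renamed keywords 301 to the current URL without your maintaining a redirect list for every old filename.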
Also, consider that SEs don't care whether you use "category" and "product" or "cat" and "prod," and the latter are shorter... less dilution of the keywords that follow in the URL.
Jim