First of all, I've seen some comparable problems in this forum, but because I'm an absolute beginner I couldn't apply the solutions to the following problems. Hope you guys can help me step by step :D.
I got the following working with some help:
Code .htaccess:
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
RewriteRule (.*)/(.*)/(.*)/ product.php?categorie=$1&subcategorie=$2&slug=$3
When I now link to:
mysite.com/hat/red hat/super hat/
The right page and info turn up dynamically (Hooray!)
This still leaves me with 4 problems:
1. Images won't show up anymore on the specific product.php page. What should I add or change in the .htaccess for this?
2. The URL still contains %20 (spaces), which I want to convert to dashes. When I manually link with dashes it does work, but there is no way to do this automatically now because the variables in my database do not contain hyphens.
I now link to each page using this URL:
<a href="http://www.mysite.com/<?php echo $row_Recordset1['categorie'] . "/" . $row_Recordset1['subcategorie'] . "/" . $row_Recordset1['slug'] . "/"; ?>">
So how do I rewrite the URL with dashes?
3. How do I rewrite the above URL so that it automatically replaces spaces with hyphens (even when I use a link that does not contain hyphens)? I know a 301 is probably involved... but that's about it :).
4. I will create the categorie and subcategorie pages manually. E.g:
hat.php
hat/black-hat.php
How can I rewrite ".php" to "/"? This is something that should be done for all categories except for one dynamic page for user searches where I use a "get", and it should not clash with the rewritten product.php page.
Hope you can help!
[edited by: jdMorgan at 3:57 pm (utc) on Oct. 17, 2008]
[edit reason] Please use example.com only. [/edit]
1. The solution is to use server-relative or canonical links to your images, CSS, etc. Instead of using <img src="logo.gif"> you must use <img src="/logo.gif"> or <img src="http://www.example.com/logo.gif">.
2. Modify your database or the script that generates your pages to use hyphens instead of spaces. No good/efficient .htaccess solution exists, because the URL that appears on your page *defines* the URL, and there is nothing that .htaccess can do to actually "change" the URL as seen by the Web.
You will find that replacing spaces with hyphens is fairly easy in PHP -- see the preg_replace [php.net] function.
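A minimal sketch of that replacement, assuming the value comes straight from the database and only whitespace needs converting. The helper name hyphenate() is hypothetical, not something from this thread:

```php
<?php
// Hypothetical helper: turn a database value into a URL segment.
function hyphenate(string $value): string
{
    // Collapse each run of whitespace into a single hyphen.
    return preg_replace('/\s+/', '-', trim($value));
}

echo hyphenate('red hat'); // prints: red-hat
```

For a plain single space, str_replace(' ', '-', $value) would do the same job; the regular expression only helps when there might be several spaces in a row.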
Your last RewriteRule is coded very inefficiently, and will require dozens (perhaps hundreds) of "trial fits" in order to find a match. It will therefore run very slowly.
It is also subject to unexpected operation if, for example, a URL such as "example.com/hat/red hat/super hat/extra-stuff/more-extra-stuff" is requested. This incorrect URL may "work" and the result will be duplicate content -- The very same content available at more than one single URL.
I'd suggest:
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/$ product.php?categorie=$1&subcategorie=$2&slug=$3 [L]
Also, remove the [NC] from the first RewriteCond above.
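The difference between the two patterns can be checked outside Apache. The sketch below uses PHP's preg_match() as a stand-in for mod_rewrite (both use PCRE-style patterns) and assumes the path has had its leading slash removed, as happens in a root .htaccess file:

```php
<?php
// Paths as the rule would see them in a root .htaccess file.
$good = 'hat/red hat/super hat/';
$bad  = 'hat/red hat/super hat/extra-stuff/more-extra-stuff';

$loose  = '#(.*)/(.*)/(.*)/#';             // original, unanchored pattern
$strict = '#^([^/]+)/([^/]+)/([^/]+)/$#';  // suggested, anchored pattern

var_dump(preg_match($loose, $bad));   // int(1): the malformed URL is accepted
var_dump(preg_match($strict, $bad));  // int(0): the malformed URL is rejected
var_dump(preg_match($strict, $good)); // int(1): the intended URL still matches
```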
Jim
You were spot on with solution #1. Also, solution #2 works. I don't know if it made everything faster because the site is shielded and I'm the only one using it (it already went pretty fast :) ). Maybe you saved me a lot of trouble for the future!
Still leaves me with:
1. small issue of rewriting .php to "/".
2. larger issue of "preg_replace". --> I looked at this, but this goes way beyond my skills. Could you maybe point me in the right direction on the mechanics (how exactly is this supposed to work) and maybe some code?
I currently have a SQL database hooked up with a table called "product"; the variables needed for the URL are "categorie", "subcategorie" and "slug" (a manually rewritten version of the product name, with hyphens). Or should I simply use the variable "name" (product name) instead? Another expert on this subject told me that "name" would probably be too susceptible to error.
I hope to hear from you! Kind regards, Nick.
2. You should ask this question in our PHP-specific forum if you want competent assistance -- I'm no PHP expert, and this forum is specific to Apache configuration, not scripting.
Jim
What I meant by rewriting to "/" is rewriting, for instance, mysite.com/whatever.php to mysite.com/whatever/ .
It doesn't need to be "rewritten", just as long as I can link to whatever/ instead of whatever.php and it still executes the actual page whatever.php.
Hope you can help!
Nick.
I would caution against adding a trailing / to the end, and instead go for extensionless URLs. A trailing / usually implies a physical folder on the server.
I'm surprised to hear this because I know a few sites which have this structure and are doing extremely well SEO-wise. So you're saying that linking to www.mysite.com/category1/ is not a good idea because this won't be indexed correctly?
I thought this would work because my most important keywords are after the first slash:
www.mysite.com/category-x/
Then come the next most important:
www.mysite.com/category-x/subcategory-x/
And then the products:
www.mysite.com/category-x/subcategory-x/product-x/
Because they all fall under a clear structure I thought that this would help boost the keyword in "category-x", then to "subcategory-x" and so on.
I asked this question here: [forums.#*$!.com...] as well to verify, but I don't believe they truly get what I mean.
Example (this is not suitable for work(!)): www dot klara dot nl --> they linked this way with their main keywords which are the ones listed top left and they score extremely high on every one of them, plus they are indexed correctly when you look in Google. How did they do this?
[edited by: Joppiesaus at 1:17 pm (utc) on Oct. 18, 2008]
I saw, for instance, that Google indexed the URL of a specific page as www.mysite.com/category-x/ .
Because they see it as a very relevant physical folder, maybe that's why it gets such a high score? Also because it is very clear that this is the highest level and all underlying pages give this one more relevance? Just thinking out loud :).
But you guys have the experience that indexing often goes wrong and that such pages do not score good positions?
(Google has a 96% market share in my country, so that's the only one worthwhile to look at)
Basically, the search engines don't care, and /category-x, /category-x/, category-x.html-or-whatever, or category-x.html-or-whatever/ are likely to rank so closely as to be indistinguishable. Now which one is shortest and easiest to type? If you put a hundred of each type of link on two pages, which page will be smaller and load faster? Which one will have better link-text keyword density?
Putting a trailing slash on a "page" URL is incorrect in HTTP-speak, and it's also just plain silly; URLs with trailing slashes indicate directories, and pages are not directories. The use of trailing slashes on pages comes from a very few programmers writing popular blog and forum packages who added the slash for no good reason, and the resulting "herd mentality" that causes people to copy them. Although they might indeed have written really nice and widely-used forum and blog packages, it's a good bet they didn't spend any time reading the RFCs concerned with HTTP URL construction and syntax.
Jim
I have a little follow-up question concerning this point. I currently use this PHP to build my URL:
<a href="/<?php echo str_replace(array('%20', ' '), '-', $row_Recordset1['categorie']) . "/" . str_replace(array('%20', ' '), '-', $row_Recordset1['subcategorie']) . "/" . str_replace(array('%20', ' '), '-', $row_Recordset1['slug']) . ".html"; ?>">
This works perfectly; however, I need to write a "slug" variable for every product I enter. The slug is simply the name where spaces and such are replaced by "-" so no errors occur when the GET is done on the product page. No ID number is used because the slug is better for SEO: it contains the product name...
This takes too much time, however. I really want to use the "name" (product name) variable instead. Someone wrote a nifty piece of code for this:
/*
* function to automatically convert text string to URL string
*
* @param string $str text string to convert
* @param int $optimize - set to max number of words in URL
* @return string $str
*/
function writeUrl($str,$optimize = false) {
//global $config;
//uses('sanitize');
$str = strtolower(trim($str));
$str = html_entity_decode($str, ENT_QUOTES);
$patterns[0] = '/ - | \/ /'; // existing hyphen with space, or existing forward slash with spaces (' - ', or ' / ')
$patterns[1] = '/  | /'; // space and double space
$patterns[2] = '/\//'; // forward slash
$replacements[0] = '-';
$replacements[1] = '-';
$replacements[2] = '-';
$str = preg_replace($patterns, $replacements, $str);
//$sanitize = new Sanitize;
//$str = $sanitize->paranoid($str,array('-'));
// shorten and optimize url if required
if ($optimize) {
$stopWords = array('all',
'and',
'its',
'the',
);
$acceptableWords = array(
'le', // eg: le mans
);
$strArray = explode('-',$str);
// remove empty values
$strArray = array_filter($strArray);
// remove stop words and short words
foreach ($strArray as $key => $value) {
if (in_array($value,$stopWords) ||
( strlen($value) <= 2 && !in_array($value,$acceptableWords) ) ) {
unset($strArray[$key]);
}
}
// remove duplicate values
$strArray = array_unique($strArray);
// slice array to max number words, resetting array keys in the process
$strArray = array_slice($strArray, 0, $optimize);
// if last word is numeric, remove it so as not to confuse it with page number
for($i=0;$i<count($strArray);$i++) {
$lastKey = count($strArray)-1;
if (is_numeric($strArray[$lastKey])) {
unset($strArray[$lastKey]);
}
}
// construct complete url string
$str = implode('-',$strArray);
}
return $str;
}
The following code outputs text where all spaces and such are replaced by "-":
$text = 'The quick brown fox jumps over the lazy dog';
echo writeUrl($text, 6);
// result:
// quick-brown-fox-jumps-over-lazy
I think it should be possible to use this to create the URL, something like:
<?php
$text = '". $row_Recordset1["categorie"] ."/". $row_Recordset1["subcategorie"] ."/". $row_Recordset1["name"] .".html';
echo writeUrl($text, 6);
?>
This doesn't seem to be written properly. Can you guys tell me if this is a good way, and how exactly do I need to formulate the PHP input?
Thanks!
The code works so that isn't the problem:
$text = 'The quick brown fox jumps over the lazy dog';
echo writeUrl($text, 6);
// result:
// quick-brown-fox-jumps-over-lazy
I just want to have it so that instead of 'The quick brown fox jumps over the lazy dog' I can automatically insert my categorie/subcategorie/name.html variables...
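One way this could be approached is to convert each part separately and only then join the parts with "/", so the separators themselves are never turned into hyphens. The sketch below is self-contained: slugify() is a simplified, hypothetical stand-in for writeUrl(), and $row is made-up data shaped like $row_Recordset1.

```php
<?php
// Simplified stand-in for writeUrl(); hypothetical, for illustration only.
function slugify(string $text): string
{
    $text = strtolower(trim($text));
    // Replace every run of characters other than a-z and 0-9 with one hyphen.
    $text = preg_replace('/[^a-z0-9]+/', '-', $text);
    // Drop any hyphen left at the start or end.
    return trim($text, '-');
}

// Hypothetical row, shaped like $row_Recordset1 in the thread.
$row = ['categorie' => 'Hat', 'subcategorie' => 'Red Hat', 'name' => 'Super Hat 1000'];

// Convert each part on its own, then join the parts with slashes.
$url = '/' . slugify($row['categorie'])
     . '/' . slugify($row['subcategorie'])
     . '/' . slugify($row['name']) . '.html';

echo $url; // prints: /hat/red-hat/super-hat-1000.html
```

In a page template the result would typically be echoed inside the href attribute, for example through htmlspecialchars($url).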
Thanks for your advice. I am absolutely committed to taking the advice that's given here; that's why I come here, and you guys have way more experience than me :).
Might it be a good idea to take it step by step? From what I understand from you guys, /red/red-stuff/red-hat.html will also be shown using /blue/blue-stuff/red-hat.html because the GET only uses the "red hat" name variable? Maybe it's smart to first get the URL with the rewritten name variable working at all and then get the consistency stuff fixed like g1smd pointed out?
Hope to hear from you guys on how to proceed next because this task goes way beyond the very basic PHP skills I picked up over the last few weeks.
/blog/a-really-interesting-post-about-some-stuff-48228 where the 48228 is used to pull the right database record and the rest of the URL is ignored. In that case, an incoming link for
/blog/this-site-blows-and-the-owner-is-a-criminal-48228 will return exactly the same page of content, and that alternative URL will be indexed by search engines. That is, unless the PHP script also checks that those words are exactly right. How that is done, is that those words are stored as another entry in the database, recorded as being the ones for post 48228. If they don't match, either issue a 301 redirect (from within the PHP script!) to the right URL -- and you already have everything handy to construct that URL, the post number that was requested, and the "words" you just looked up in the database -- or else just issue a "404 Not Found" error from within the script.
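The check described above might look roughly like the sketch below. It assumes the script has already split the URL into the words and the number and has looked up the stored words for that number. The function name decideResponse() and the /blog/ path layout are illustrative assumptions, not code from any particular package.

```php
<?php
// Decide how to answer a request such as /blog/some-words-48228.
// $canonicalSlug is the text stored in the database for that record,
// or null when no record with that number exists.
function decideResponse(string $requestedSlug, int $id, ?string $canonicalSlug): array
{
    if ($canonicalSlug === null) {
        // Unknown record: answer "404 Not Found".
        return ['status' => 404, 'location' => null];
    }
    if ($requestedSlug !== $canonicalSlug) {
        // Right record, wrong words: redirect to the one correct URL.
        return ['status' => 301, 'location' => '/blog/' . $canonicalSlug . '-' . $id];
    }
    // Words and number agree: serve the page normally.
    return ['status' => 200, 'location' => null];
}

$result = decideResponse(
    'this-site-blows-and-the-owner-is-a-criminal',
    48228,
    'a-really-interesting-post-about-some-stuff'
);
echo $result['status'] . ' ' . $result['location'];
// prints: 301 /blog/a-really-interesting-post-about-some-stuff-48228
```

For a 301 result the script would then send the redirect with header('Location: http://www.example.com' . $result['location'], true, 301) and stop; for a 404 result it would send a 404 status and an error page instead.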
<a href="/<?php echo str_replace(array('%20', ' '), '-', $row_Recordset1['categorie']) . "/" . str_replace(array('%20', ' '), '-', $row_Recordset1['subcategorie']) . "/" . str_replace(array('%20', ' '), '-', $row_Recordset1['slug']) . "/"; ?>">
It gets 3 variables to construct the URL, but it only needs one (the variable "slug") to show the right product. Internally this is not a problem (there won't be any duplicate content etc.); externally, in other words, people can absolutely mess around with the URL like you said. I am committed to implementing this 301 rule later on for when a wrong categorie or subcategorie is used, but the first step (which I can't seem to get done) is to use the "name" variable (= product name) instead of the "slug" variable (= hand-written product name where spaces etc. are replaced with "-").
Could you guys please tell me how to use the script above to construct a working internal product URL with the name variable first? Once this is done I'll focus on consistency because I REALLY need to take it one step at a time ;).
E.g.: Something like <a href="/~w3613561/<?php echo str_replace(array('%20', ' '), '-', $row_Recordset1['categorie']) . "/" . str_replace(array('%20', ' '), '-', $row_Recordset1['subcategorie']) . "/" . str_replace(array('%20', ' '), '-', $row_Recordset1['name']) . "/"; ?>"> where the GET on the product page doesn't crash because the name in the URL contains "-" etc. and the database version does not...
However, you've said that the item number (48228) is all the script currently cares about, so why is the product page 'crashing' in the first place?
Jim
My URL is like this: /category/subcategorie/slug.html --> and want to make it category/subcategory/name.html
What you are saying about reversing the process is exactly what I need. This is the code I use now: <a href="/<?php echo str_replace(array('%20', ' '), '-', $row_Recordset1['categorie']) . "/" . str_replace(array('%20', ' '), '-', $row_Recordset1['subcategorie']) . "/" . str_replace(array('%20', ' '), '-', $row_Recordset1['name']) . "/"; ?>"> (of course it doesn't work with the name variable, only when I input slug)
This code only replaces the spaces with hyphens, so the reverse process should be doable I guess.
However (and this is the reason I'm here :) ), I'm a complete newbie at this stuff. Would I need to create special SQL code to convert it back when it GETs the name? What would it look like?
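Turning hyphens back into spaces is lossy whenever a product name itself contains a hyphen or other punctuation, so one alternative worth considering is to generate the slug once, store it next to the name, and look the product up by that stored slug. The sketch below uses an in-memory array in place of the "product" table; the rows and the function name findBySlug() are hypothetical.

```php
<?php
// Hypothetical product rows; on the real site these would come from the "product" table.
$products = [
    ['name' => 'Super Hat 1000', 'slug' => 'super-hat-1000'],
    ['name' => 'Red Cap',        'slug' => 'red-cap'],
];

// Find the product whose stored slug equals the slug taken from the URL.
function findBySlug(array $products, string $slug): ?array
{
    foreach ($products as $product) {
        if ($product['slug'] === $slug) {
            return $product;
        }
    }
    return null; // no match: the page should answer 404
}

$found = findBySlug($products, 'super-hat-1000');
echo $found['name']; // prints: Super Hat 1000
```

With a real database the loop would be replaced by a query that selects the row whose slug column equals the requested value.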
A few other questions about rewriting. I need to redirect several URLs (I think with 301 redirects) to my new pages, but I'm not completely sure how:
Example.com/Page1.php --> Example.com/Page1/
Example.com/Page1-subject2.php --> Example.com/Page1/Subject-2/
Example.com/Product.php?ID=#*$!&Category=YYY&Name=ZZZ --> Example.com/YYY/AAA/BBB/
This last one is probably not possible because I used to build that URL with two different variables, of which the ID variable was used to "GET" the product. Now this is done by a new variable. To avoid duplicate content I guess it needs to be redirected anyway. What would be best? I don't know anything about Apache code, so the exact lines would be EXTREMELY useful.
RewriteRule ^the-old-url http://www.example.com/the-new-url [R=301,L]
For those with query strings you will need a RewriteCond to examine the %{QUERY_STRING} in some way:
RewriteCond %{QUERY_STRING} &?something=value&?
If there are multiple values that need to be captured and re-used, the code is a bit more complex.
The dynamic ones are the ones that seem hard because many different variables are used.
Old URL: Bestellen.php?...
Old variables: "ID" "Categorie" and "Name" --> where only the "ID" variable was used for the GET.
URL example: http://www.example.com/Bestellen.php?ID=1000&Name=red%20hat%201000&Cat=Hat
New URL: Product.php?...
New variables: "Slug" (this one is used for the get), Categorie, Subcategorie
Original URL: Example: http://www.example.com/Product.php?Categorie=Hat&Subcategorie=YYY&Slug=ZZZ
Rewritten URL Example: http://www.example.com/Hat/YYY/ZZZ/
All the information for both URLs is in the product row, but I have absolutely no idea if and how it is possible to rewrite everything to the new situation.
Hope you can give some insight on this one..
[edited by: Joppiesaus at 2:58 pm (utc) on Nov. 3, 2008]
RewriteRule ^the-old-url http://www.example.com/the-new-url [R=301,L]
I get an error saying it goes into an infinite loop. My rewrite looks like this:
RewriteRule ^Old-URL.php http://www.example.com/Old-URL/ [R=301,L]
I think it has something to do with the / at the end. How do I work around this?
Also, I want all .php variants turned into their new "/" variants. Do you guys know the code for this? Turning .php into .html is easy; this seems different...
If so, you do have a loop, because your code redirects /OLD-URL.php to /OLD-URL/ and then DirectoryIndex will rewrite that back to /OLD-URL.php, and then you redirect it again, DirectoryIndex rewrites it again, and so on.
So, if /OLD-URL/ is the DirectoryIndex page, you need to take one more step in your Rule: Check %{THE_REQUEST} to be sure it was the client which directly requested /OLD-URL.php before you redirect it. If not, then the URL was internally rewritten by DirectoryIndex, and should not be redirected.
Jim
This is not the index page. I tried a lot of stuff and found out that:
RewriteRule ^Contact.html http://www.example.com/Contact/ [R=301,L] does work (but the .html does not actually exist on the server), and when I use:
RewriteRule ^Contact.php http://www.example.com/Contact/ [R=301,L] it doesn't work.
How can I get it so that a simple random page.php is rewritten to page/ ?
Also, can I write a rule that does this automatically for all .php extensions?
Thanks.
... and when I use:
RewriteRule ^Contact.php http://www.example.com/Contact/ [R=301,L]
it doesn't work.
Do you have another rule that rewrites or redirects /Contact/ to /Contact.php, or that rewrites or redirects /<anything>/ to /<anything>.php?
If so then the two rules are clashing, and a loop is to be expected.
How can I get it so that a simple random page.php is rewritten to page/ ?
Also, can I write a rule that does this automatically for all .php extensions?
That is simple. But first, you have to get the single-URL "Contact.php" case working. Trying to make the code more complex before fixing a basic problem is a waste of time (yours and ours).
Jim
Seems like a good plan. The .htaccess does contain some code for rewriting .php files (but it doesn't redirect yet):
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.*[^/])/?$ /$1.php [QSA,L]
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/$ Product.php?categorie=$1&subcategorie=$2&slug=$3 [L]
Do you know a way of combining these? FYI, I am not experienced enough to write my own Apache code... I just steal a lot and pray it works. If you have some code, that would be superb.
Otherwise, you will be at the mercy of others --some competent and others not-- to keep your server working properly and preserve your search engine rankings... You will also be asking others to do work for you that you could probably do for yourself.
You will find it helpful to write accurate, descriptive comments for your code, and to leave those comments in the code (forever). If you return to a large .htaccess file after several years, you will find that the comments will be very helpful, and may prevent you from making mistakes.
Your first rule posted above and the new redirect are in conflict and will work against each other, causing a loop. I already posted the solution above: You must check %{THE_REQUEST} before redirecting requests for /x.php to /x/ to prevent this loop.
Here is the new redirect combined with your code posted just above, plus another new redirect that you would probably be asking for next... :)
Please note that all changes are intentional and significant, so copy all of this:
# Allow symlink following (mod_rewrite requires it in .htaccess context)
Options +FollowSymLinks
# Enable rewrite engine
RewriteEngine on
#
# Externally redirect direct client requests for Product.php URLs to extensionless static URLs
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /Product\.php\?categorie=([^&]+)&subcategorie=([^&]+)&slug=([^\ ]+)\ HTTP/
RewriteRule ^Product\.php$ http://www.example.com/%1/%2/%3/? [R=301,L]
#
# Externally redirect direct client requests for root-directory .php URLs to extensionless static URLs
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^/.]+\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^([^/.]+)\.php$ http://www.example.com/$1/? [R=301,L]
#
# Externally redirect non-canonical hostname requests to canonical URL
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
#
# If requested extensionless URL resolves to an existing .php file
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
# internally rewrite the requested URL to the .php filepath
RewriteRule ^(([^/]+/)*[^./]+)/?$ /$1.php [L]
#
# Rewrite static product page URL requests to Product.php script filepath
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/$ /Product.php?categorie=$1&subcategorie=$2&slug=$3 [L]