SEF URLs, .htaccess and GoDaddy

         

leeparts

8:04 pm on Jun 27, 2007 (gmt 0)

10+ Year Member



Hello All,
I have seen numerous posts relating to search-engine-friendly URLs, GoDaddy, and .htaccess files. For days now I have been trying to find my answer, and I cannot.

Here is my situation and my stumbling blocks:

We have a very large e-commerce website currently hosted on a shared server at GoDaddy. The current site is database-driven, but we are using PHP code to generate static HTML files (it was a good idea at the time). Now that the site is performing better in the search engines and seeing more traffic, it can take up to 2 hours to regenerate the site when needed.

We have purchased a virtual server and are currently building a new version of the site. After waiting months and months for Google to pick up our pages, we do not want to start over with a new URL structure. It needs to remain the same. I have toyed with the .htaccess file and I cannot get everything to work 100%. Either it all works but the images do not display, or the URL structure works but then I can't get any other PHP files in the root folder to run (it just runs the index.php file).

Here is my current .htaccess file:
RewriteEngine on
RewriteBase /
RewriteRule ^(.*)/(.*)/(.*)/(.*) /index.php?rootcatname=$1&catname=$2&lastcatname=$3&itemurl=$4/

The live e-commerce site is www.example.com and the current test site is www.example.info

Here are the steps to navigate the site:
(1) www.example.info/
(2) http://www.example.info/chrysler_300/
(3) http://www.example.info/chrysler_300/accessories/audio/
(4) http://www.example.info/chrysler_300/accessories/audio/sirius_satellite_radio_system.html

With the htaccess file above, I can make all the steps except #2 work. It will also allow other php files in the root folder to run.

The last issue is there needs to be a trailing slash at the end of each step except step#4 for the urls to work with godaddy.

Can anyone help me finish this up?

[edited by: jdMorgan at 8:36 pm (utc) on June 27, 2007]
[edit reason] No URLs, please. See TOS. [/edit]

jdMorgan

8:51 pm on Jun 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'd recommend a case-by-case approach to avoid unexpected results:

RewriteRule ^([^/]+)/([^/]+)/([^/]+)/([^/]+)/?$ /index.php?rootcatname=$1&catname=$2&lastcatname=$3&itemurl=$4 [L]
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/?$ /index.php?rootcatname=$1&catname=$2&lastcatname=$3 [L]
RewriteRule ^([^/]+)/([^/]+)/?$ /index.php?rootcatname=$1&catname=$2 [L]
RewriteRule ^([^/.]+)/?$ /index.php?rootcatname=$1 [L]

The pattern in the last rule is slightly different to prevent rewriting index.php itself, which would cause an 'endless' loop.
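
The dot exclusion in the last pattern is what keeps a request for index.php from matching. A more explicit guard is sometimes used instead. The sketch below is an illustration and not part of the rules above; it assumes the script is named index.php and sits in the document root, as in this thread, and that real files on disk should never be handed to the script.

```apache
RewriteEngine On
RewriteBase /

# A request that is already for the script needs no rewriting.
# "-" means "no substitution"; [L] ends this pass through the rules.
RewriteRule ^index\.php$ - [L]

# Assumption: real files (images, stylesheets, other PHP scripts) should be
# served as-is rather than rewritten to index.php.
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^ - [L]
```

If used, these would go above the four rewrite rules so that they are checked first.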

Using negative-match subpatterns ("match until you find a slash") allows complex patterns such as these to be evaluated in a single left-to-right pass, as opposed to the repeated back-off-and-retry needed with multiple ambiguous ".*" subpatterns. You may actually notice that your server runs faster under load.
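
To make the difference concrete, here is how the two styles of pattern treat the item URL from step 4. The comments trace the match by hand; the URL and the query-string names are the ones used earlier in this thread.

```apache
# Requested path as seen in .htaccess (leading slash already removed):
#   chrysler_300/accessories/audio/sirius_satellite_radio_system.html

# Ambiguous: each (.*) may also match slashes, so the first group initially
# swallows the whole path and the regex engine has to back off repeatedly
# until the three literal slashes line up.
# RewriteRule ^(.*)/(.*)/(.*)/(.*) /index.php?rootcatname=$1&catname=$2&lastcatname=$3&itemurl=$4/

# Unambiguous: [^/]+ stops at the next slash, so each group is settled as
# soon as it has been read:
#   $1 = chrysler_300
#   $2 = accessories
#   $3 = audio
#   $4 = sirius_satellite_radio_system.html
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/([^/]+)/?$ /index.php?rootcatname=$1&catname=$2&lastcatname=$3&itemurl=$4 [L]
```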

You may put these rules in order from most-likely-to-be-requested to least-likely for the sake of efficiency, as long as the [L] flag is present on each rule.

Jim

leeparts

9:01 pm on Jun 27, 2007 (gmt 0)

10+ Year Member



Thank you, Jim, that works perfectly. I do have one more question. Because the site is on GoDaddy, it needs to have a trailing slash. I have the code to do that, but I was wondering if it is possible to not have the slash appear at the last step, where the file ends in .html?

Here is what I have:

RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L,NC]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://www.example.com/$1/ [L,R=301]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} (/|\.php|\.html|/[^.]*)$ [NC]
RewriteRule ^(content/|component/) index.php

jdMorgan

9:16 pm on Jun 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I wasn't sure what you meant.

That is weird, and does not sound like something necessarily imposed by GD. For example, a URL of
/foo/bar/index.html/ refers not to a "page" named index.html in the /foo/bar/ subdirectory, but rather to the DirectoryIndex-defined index document in a directory named /foo/bar/index.html/

Rather than saddling your site with a kludgey work-around, I would have a word with their tech support crew if I were you...

If you're on Apache 2.x, it may be as simple as asking them to disable AcceptPathInfo
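
If the host allows it, the directive can also be tried from .htaccess. This is only a sketch: it assumes Apache 2.x and that FileInfo overrides are permitted for the account, which may not be the case on a shared plan.

```apache
# With path info disabled, a request such as /foo/bar/index.html/ is no
# longer accepted as the file /foo/bar/index.html followed by trailing
# path info "/"; Apache answers 404 Not Found instead.
AcceptPathInfo Off
```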

Jim

leeparts

9:45 pm on Jun 27, 2007 (gmt 0)

10+ Year Member



Basically my thought was to stay as true to the original URL structure as possible, but if I use the URLs from the current site, it automatically adds the slash and displays the correct page. I am just happy to have the code you supplied. Thanks again.

jdMorgan

3:13 am on Jun 28, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, that will happen as a result of the second of your existing RewriteRules. Comment that out and then see if things work without adding a slash to everything.

Alternatively, change that rule to read:


# If requested URI doesn't end in slash
RewriteCond %{REQUEST_URI} !/$
# and does not exist as a real file
RewriteCond %{REQUEST_FILENAME} !-f
# but does exist as a directory
RewriteCond %{REQUEST_FILENAME} -d
# then add a slash and force a redirect
RewriteRule (.*) http://www.example.com/$1/ [R=301,L]

The new directory-exists check will stop this from interfering with your URLs/pages that are actually handled/generated by your script. Other tweaks were made for the sake of efficiency.

Jim

leeparts

12:08 pm on Jun 28, 2007 (gmt 0)

10+ Year Member



That did the trick. Now my URLs will all transfer seamlessly. Thanks again, you have saved me hours of work.

leeparts

12:54 pm on Jun 28, 2007 (gmt 0)

10+ Year Member



I stand corrected, it is not quite there yet.
That adds a trailing slash to step 1 and leaves it off of step 4, but it needs to add it to step 2 and step 3 also.

jdMorgan

4:47 pm on Jun 28, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK. Well, I'm looking at this from the outside-in, and can't claim to understand your whole 'system.' All I can say is that you should put the most-specific external redirect code sections first, then less-specific redirects, then the most-specific internal rewrites, and finally the least-specific internal rewrites.

The rules will be applied in order, so their order should 'make sense' to you in the context of your overall needs. In this case, your first and second redirect should be reversed, with the domain canonicalization redirect being your last redirect.
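
Put together, that ordering might look like the sketch below. It mostly rearranges rules already posted in this thread and assumes www.example.com is the canonical hostname. The one rule that does not come from the thread is marked as hypothetical, and none of this has been tested against the actual site.

```apache
RewriteEngine On
RewriteBase /

# --- External redirects, most specific first ---

# Add the missing trailing slash to requests for real directories.
RewriteCond %{REQUEST_URI} !/$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule (.*) http://www.example.com/$1/ [R=301,L]

# Hypothetical addition, not from the thread: also add the slash to
# script-generated category URLs (steps 2 and 3). Assumes such URLs never
# contain a dot, while item pages (step 4) always end in .html.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^.]*[^./])$ http://www.example.com/$1/ [R=301,L]

# Domain canonicalization comes last among the redirects.
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# --- Internal rewrites, most specific first ---
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/([^/]+)/?$ /index.php?rootcatname=$1&catname=$2&lastcatname=$3&itemurl=$4 [L]
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/?$ /index.php?rootcatname=$1&catname=$2&lastcatname=$3 [L]
RewriteRule ^([^/]+)/([^/]+)/?$ /index.php?rootcatname=$1&catname=$2 [L]
RewriteRule ^([^/.]+)/?$ /index.php?rootcatname=$1 [L]
```

Because every redirect targets www.example.com directly, a request that needs both a slash and a hostname fix is corrected in a single 301 rather than two.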

Other than that, you'll have to figure out what the code means in the context of your overall 'system,' and re-arrange the code to suit, or you'll have to provide a lot more details to make this clear for us. Helpful information would be several problem-example descriptions including:

  • Example requested URL
  • Expected redirected/rewritten URL
  • Actual redirected/rewritten URL if applicable
  • Actual result (non-result) if no redirect/rewrite took place
  • Comments on expected versus actual results

To avoid misleading results, be sure to completely flush your browser cache after making any change to your .htaccess file(s).

In addition, questions that may help you to understand the code or code-order problems and address them yourself may be more productive, as the contributors here have practical limits to the time they can spend on individual 'projects.'

Jim
