help with rewrite please

asmith20002

12:12 pm on Oct 3, 2008 (gmt 0)

Hi,

I have three kinds of links:

type1.example.com/type2/
type1.example.com
www.example.com

My current rules are:

RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
RewriteRule ^([^/]+)/$ index.php?type1=%1&type2=$1 [L]

It works for the first kind, but this:

type1.example.com/type2 (without the trailing /) doesn't work.
I added ? after the slash (meaning 0 or 1 occurrences):
RewriteRule ^([^/]+)/?$ index.php?type1=%1&type2=$1 [L]

but then it no longer gives me $_GET['type2'].
For the second kind, I tried:
RewriteRule ^([^/]+)?/?$ index.php?type1=%1&type2=$1 [L]

and everything got messed up.
Please help :/

Thanks

g1smd

12:55 pm on Oct 3, 2008 (gmt 0)

Firstly, you should never have a situation where two URLs can be rewritten to the same internal server path.

Before the rewrite kicks in, there should be a series of redirects that correct the URL that the visitor sees.

As well as the redirects that strip named index files from the URL and that fix non-www URLs to include the www, you should also have one that settles whether or not the trailing / belongs on the URL, and redirects the other form.

Once that is done, the rewrite is much simpler and does not expose your site to Duplicate Content indexing.

asmith20002

1:47 pm on Oct 3, 2008 (gmt 0)

I can make the non-www paths redirect to the correct one.

> Firstly, you should never have a situation where two URLs can be rewritten to the same internal server path.

Can I ask to show me some examples ?

jdMorgan

2:10 pm on Oct 3, 2008 (gmt 0)

> RewriteRule ^([^/]+)/?$ index.php?type1=%1&type2=$1 [L]

That should have worked. Did you completely flush your browser cache after uploading this new rule, before trying to test it?

Note: I completely agree with g1smd that no more than one URL should ever be allowed to resolve to the same content; any "variations" of that URL should be 301-redirected to the "correct" URL to avoid duplicate-content problems in search engines. So pick no-trailing-slash URLs (or trailing-slash URLs, if you absolutely must) and redirect the non-preferred-format URL requests to remove or add the slash as required.

Jim

g1smd

2:23 pm on Oct 3, 2008 (gmt 0)

*** Can I ask to show me some examples ? ***

Sure.

You discussed above having both of these working:

type1.example.com/type2/
type1.example.com/type2

One should serve content, and the other should redirect before serving content.

Your choice which way round you want that to work.

asmith20002

3:14 pm on Oct 3, 2008 (gmt 0)

So I shouldn't have this:
> RewriteRule ^([^/]+)/?$ index.php?type1=%1&type2=$1 [L]

Do I have to write this instead?
RewriteRule ^(.*)/$ http://www.example.com/$1 [R=301,L]

And do I repeat that for the other URLs too?

For example, for type1.example.com (without /type2/),
do I write a similar rule again?

jdMorgan

3:29 pm on Oct 3, 2008 (gmt 0)

I want to inject a warning here that it is a really bad idea to put trailing slashes on things that are not directories. There's an example [webmasterworld.com] just today of how it can cause problems... In today's case, forcing a Webmaster to choose between either a high-maintenance or a very-inefficient solution.

If "type2" is a page, and not a directory, then do not link to /type2/, link to /type2

Jim
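
As a rough sketch of the slash-removing redirect discussed above, something like the following might sit ahead of the internal rewrites in the .htaccess file. It assumes the no-trailing-slash form is the preferred one, that the subdomain should be kept in the redirect target, and that plain http is in use; none of these choices are confirmed in the thread.

```apache
# Sketch only: 301-redirect type1.example.com/type2/ to type1.example.com/type2

# Leave real directories alone, since they need their trailing slash.
RewriteCond %{REQUEST_FILENAME} !-d
# Apply to subdomains only, and capture the subdomain as %1.
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
# $1 is the single path segment without its trailing slash.
RewriteRule ^([^/]+)/$ http://%1.example.com/$1 [R=301,L]
```

The directory test is placed first so that %1 still refers to the subdomain captured by the last host condition.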

asmith20002

3:49 pm on Oct 3, 2008 (gmt 0)

OK, I'll remove the slashes, and I'll take that as a "yes" to writing a similar rule (my last question) for all my URLs.

P.S. The vBulletin forums with pretty URLs, why they all put slash after their "pages" ? (which are not directories)

jdMorgan

4:06 pm on Oct 3, 2008 (gmt 0)

> why they all put slash after their "pages" ?

Because they think it doesn't matter, and they think there is some "mystical advantage" to adding a slash... And they're wrong on both counts.

The problem is that these forum packages and CMSes are not written by people familiar with either servers or SEO -- or even the HTTP protocol, for that matter... :(

As I stated above, your rule should have worked. If it does not, please post your test URLs, the result of the test with each URL, and how those results differed from your expectations. By making it all very clear, we can avoid errors and wasted time here.

If you remove the slashes from the URLs published on your pages, then the rule becomes:


RewriteCond $1 !^(([^/]+/)*[^.]*\.[^.]+)?$
RewriteRule ^([^/]*)$ index.php?type1=%1&type2=$1 [L]

The RewriteCond prevents any URL containing a period in the final path-part from being rewritten. This prevents an "infinite loop" of rewriting index.php to index.php, and it also prevents requests for your robots.txt file, images, media files, css files, external JavaScripts, etc. from being rewritten to index.php.

Jim
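
To make the effect of that RewriteCond more concrete, here is how the exclusion pattern treats a few sample values of $1. The sample paths are illustrative and not taken from the thread, and the host conditions are left out for brevity, as in the snippet above.

```apache
# Sample values of $1 (in .htaccess, the requested path without its leading slash):
#   type2          no period: the pattern does not match, so the negated
#                  condition passes and the request is rewritten to index.php
#   index.php      period in the final part: the pattern matches, no rewrite,
#                  so index.php is not rewritten to itself
#   robots.txt     the pattern matches, no rewrite
#   (empty)        the optional group matches the empty string, no rewrite
#   jquery.min.js  two periods: [^.]+ cannot span the second one, so the
#                  pattern does NOT match and the request would be rewritten
RewriteCond $1 !^(([^/]+/)*[^.]*\.[^.]+)?$
RewriteRule ^([^/]*)$ index.php?type1=%1&type2=$1 [L]
```

If the site serves files with more than one period in the name, the condition would probably need adjusting to cover them.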

[edited by: jdMorgan at 4:21 pm (utc) on Oct. 3, 2008]

asmith20002

5:05 pm on Oct 3, 2008 (gmt 0)

For type1.example.com, is writing this enough?

RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
RewriteRule ^$ index.php?type1=%1 [L]

Oh, another thing:
I have another URL, which is more complicated :(

It looks like this:
type1.example.com/type2/info/1323-the-info-title

It should go to:
www.example.com/files?type1=%1&type2=$1&number=1323

How do i write this?

[edited by: jdMorgan at 5:28 pm (utc) on Oct. 3, 2008]
[edit reason] Use example.com please [/edit]

jdMorgan

5:32 pm on Oct 3, 2008 (gmt 0)

> How do i write this?

Based on the rules you already have, how do you think you should write it? (Please see our Forum Charter [webmasterworld.com] for more information about this forum and links to useful resources).

We will be glad to help you correct your code, after you have written it and tested it.

Jim

asmith20002

7:07 pm on Oct 3, 2008 (gmt 0)

ok...

This is what I wrote so far, and it works. Please check how I'm doing; I'm a little afraid of creating duplicate content:

RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
RewriteCond $1 !^(([^/]+/)*[^.]*\.[^.]+)?$
RewriteRule ^([^/]*)$ index.php?type1=%1&type2=$1 [L]

RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
RewriteRule ^$ index.php?type1=%1 [L]

RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
RewriteCond $1 !^(([^/]+/)*[^.]*\.[^.]+)?$
RewriteRule ^([^/]*)/info/([0-9]*)-(.*)$ files.php?type1=%1&type2=$1&number=$2 [L]

(Notice I'm not using $3 for the page title words; I have doubts about that.)

Thanks

g1smd

7:40 pm on Oct 3, 2008 (gmt 0)

The .* bit could probably be coded more efficiently, but it probably isn't a major issue.

Now that you have the rewrites sorted, you need a bunch of canonicalisation redirects to go before these rewrites.

That's all the stuff to force www, drop index filenames, and so on. There are a number of threads with ideas for those.

You should also set things up such that if someone starts requesting URLs with parameters, they are redirected to the new format URLs.

There are several other fixes that might also be nice to do, so it is likely that you'll have at least half a dozen different redirects to code up.

These all have to go before the rewrites.
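
As a hedged illustration of two of those redirects, the sketch below forces www on the bare domain and drops a directly requested index.php. It assumes that only example.com itself (not the type1-style subdomains) should gain the www, and that plain http is in use; both are assumptions rather than details given in the thread.

```apache
# Sketch only: canonicalisation redirects, placed before the internal rewrites.

# Bare domain -> www. Subdomains such as type1.example.com are left alone.
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Drop a directly requested index.php from the visible URL.
# THE_REQUEST holds the original request line from the client, so the
# internal rewrites to index.php do not trigger this redirect.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/index\.php[?\s]
RewriteRule ^index\.php$ http://%{HTTP_HOST}/ [R=301,L]
```

Any query string on the original request is carried over to the redirect target unchanged, so redirecting old parameter-style URLs to the new format would still need its own rules.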

[edited by: g1smd at 7:44 pm (utc) on Oct. 3, 2008]

jdMorgan

7:44 pm on Oct 3, 2008 (gmt 0)

Good -- It looks like you understand the process pretty well, but I'd suggest a few changes:

RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
RewriteCond %{REQUEST_URI} !^/(([^/]+/)*[^.]*\.[^.]+)?$
RewriteRule ^([^/]+)/info/([0-9]+)-(.+)$ files.php?type1=%1&type2=$1&number=$2 [L]
#
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
RewriteCond $1 !^(([^/]+/)*[^.]*\.[^.]+)?$
RewriteRule ^([^/]+)$ index.php?type1=%1&type2=$1 [L]
#
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
RewriteRule ^$ index.php?type1=%1 [L]

I put the rules in order from most-specific to least-specific, which is recommended to prevent unexpected interaction between rules, though not strictly required in this case, since the URL patterns are all mutually-exclusive.

I also changed the "*" quantifiers, meaning "zero or more," to "+" quantifiers, meaning "one or more" -- or equivalently, "at least one."

I also changed "$1" in the now-third RewriteCond to %{REQUEST_URI} and added a leading slash to the pattern (as required when matching this server variable) so that this RewriteCond now examines the entire requested URL-path (as it already does in the second rule, because $1 in that rule contains the entire requested URL path).

Jim

asmith20002

7:55 pm on Oct 3, 2008 (gmt 0)

Thanks a bunch, guys.
I'll just keep focusing on it.

Hmm, one last thing, about 301-redirecting the old URLs to the new ones.
The new ones contain page title words. When I want to redirect the old ones, the system has no idea what keywords the new URL has, so it can't work out the new URL pattern.
What do you do in these cases?

jdMorgan

8:52 pm on Oct 3, 2008 (gmt 0)

Three ways to do it are to use a RewriteMap to call a small script (usually Perl) to access your vBulletin database and get the new URL, or to modify your index.php file to do this lookup and generate a redirect, or to redirect each page in .htaccess one at a time, with one redirect rule per page. RewriteMaps can only be defined at the server configuration level, and so are not usually available on shared hosting.

Jim

asmith20002

9:03 pm on Oct 3, 2008 (gmt 0)

Is there any problem if I redirect using index.php? I mean, redirecting with a PHP script:

header("Location: newpage.html");

Does it have the same effect as a 301 redirect for search engines too?

g1smd

10:29 pm on Oct 3, 2008 (gmt 0)

That header returns a 302 redirect.

You will need to add another line there to specify that the response should be 301 instead.

header("HTTP/1.0 301 Moved Permanently");

jdMorgan

10:34 pm on Oct 3, 2008 (gmt 0)

No, you must specify the server status as 301-Moved Permanently by using the "status" directive in PHP as well. Otherwise, you will have very serious problems with search engines. See our PHP forum (and its library) for threads on this subject.

If you provide both the Location and 301 status response headers, then nobody (visitor or search engine robot) knows whether the redirect was done in .htaccess, PHP, or anywhere else... They cannot tell.

Jim

[edited by: jdMorgan at 10:45 pm (utc) on Oct. 3, 2008]