I recently redesigned my site and moved it to a new server. The new site does not have the same page names as the old site, so I am using 301 redirects.
Here is the line in my .htaccess file:
redirect 301 /contact.php http://example.com/index.php?option=com_content&view=article&id=1&Itemid=5
The problem comes in here:
The site is a gallery of art sites, so there are many subdomains and add-on domains under the main site. Many of these also have contact pages named contact.php. But even when their pages are accessed using their own domain names (e.g. www.example2.com/contact.php), the request is redirected to the main site's contact page. This is NOT what I want.
What is the proper way to set this up in the .htaccess file so viewers are directed to the correct contact page (or about.php page)? Can this be done in the main .htaccess file? Do I need an .htaccess file in each individual artist's folder? Either way, what would the code be?
Additionally, each artist already has two lines in the .htaccess file, as the subfolder structure is different on the new site.
old structure: example.com/anyartist/
new structure: example2.com/artists/anyartist/
So, to simplify their URLs, I created a subdomain for each artist, as many do not have their own domain name: anyartist.example.com
Here is the code I'm using to redirect from the old site urls to their new urls:
redirect 301 /anyartist/ http://anyartist.example.com/
## this did not catch urls without trailing slash so added:
redirect 301 /anyartist http://anyartist.example.com/
## some artists now have domain names so their line(s) would be:
redirect 301 /anyartist/ http://example2.com/
redirect 301 /anyartist http://example2.com/
So my main question is: how do I write the redirects so they go to the correct contact page (for example), instead of being redirected to the main contact page? Any advice on making the other 301 redirects better would also be appreciated.
Thank you!
KM
[edited by: jdMorgan at 7:48 pm (utc) on Aug. 23, 2009]
[edit reason] example.com [/edit]
Use a RewriteCond to test %{HTTP_HOST} and inhibit the rule and/or capture the subdomain(s) for re-use in the external-redirect rules.
For "artist" subdomains, use the internal rewrite syntax instead of the external redirect syntax; this will preserve the subdomain-format URL instead of changing the address bar to show the main domain and the /artists/anyartist subdirectories. This will strengthen the 'branding' of the artist subdomain, and avoid constantly confusing the search engines by linking to a subdomain but then redirecting to a subfolder of the main domain whenever that subdomain is requested (potentially very bad for your pages' rankings).
e.g. Requested URL anyartist.example.com/<anything> --rewrite-to-internal-server-filepath--> /artists/anyartist/<anything>
If search engines have already picked up on the /artists/anyartist subdirectories, add additional rules to externally redirect only direct client requests for example.com/artists/anyartist/<anything> back to anyartist.example.com/<anything>
e.g. Requested URL example.com/artists/anyartist/<anything> --301-redirect-to-URL--> anyartist.example.com/<anything>
Use a RewriteCond to examine %{THE_REQUEST} to be sure that the /artists/anyartist/ URL-path is being requested directly by the client, rather than as a result of your internal subdomain-to-subdirectory rewrite.
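A rough sketch of how those two pieces might fit together in the main .htaccess file. It assumes that every artist subdomain is served from the same document root as example.com and that the artist files live under /artists/<name>/. The character class used for subdomain names is also an assumption, and artists with their own domain names would need separate rules.

```apache
RewriteEngine On

# 1. External redirect (sketch): only when the client asked the main domain
#    directly for /artists/<name>/<anything>. THE_REQUEST holds the original
#    request line, so an internally rewritten request cannot trigger this rule.
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com [NC]
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /artists/
RewriteRule ^artists/([^/]+)/(.*)$ http://$1.example.com/$2 [R=301,L]

# 2. Internal rewrite (sketch): <name>.example.com/<anything> is served from
#    /artists/<name>/<anything>; the address bar keeps the subdomain URL.
#    The REQUEST_URI test stops the rule from re-applying to its own output.
#    The capturing condition is placed last so that %1 holds the subdomain.
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{REQUEST_URI} !^/artists/
RewriteCond %{HTTP_HOST} ^([a-z0-9-]+)\.example\.com
RewriteRule ^(.*)$ /artists/%1/$1 [L]
```

The redirect is listed before the rewrite so that a direct request for the subdirectory URL is answered with a single 301 and the internal filepath is never exposed as a URL.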
Read carefully; there are lots of details in the above. Note especially the distinction between an internal rewrite and an external redirect -- not at all the same thing, and you need some of each. :)
Jim
Thank You,
KM
Try the WebmasterWorld search facility, searching on "rewrite" and "redirect" and include the "RewriteCond" and variable names mentioned above to focus the searches. That plus the words "subdomain" and "subdirectory" should turn up several very useful examples -- and even complete solutions, perhaps.
Specific questions are always welcome -- I answered generally because that seemed most appropriate to your original questions.
Jim
I answered generally...
Okay, for a start (baby steps), I've come up with the following for the contact page to replace the 301 redirect, but it seems to have no effect. I get a 404 error, so must have something wrong.
RewriteCond %{HTTP_HOST} !^www\.example\.com/contact\.php
RewriteRule (.*) http://example.com/index.php?option=com_content&view=article&id=5&Itemid=5 [R=301,L]
i.e., if a request for http://example.com/contact.php or http://www.example.com/contact.php comes in, it will go to the new page, but the rule will not affect any other contact pages on the site (such as http://anyartist.com/contact.php).
Yes, I've been looking through the forum like crazy and getting lots of tips. I want to reduce the redundancy in my .htaccess file and optimize for SE, but those will be separate posts. :-)
Thank You!
KM
[edited by: jdMorgan at 12:08 am (utc) on Aug. 24, 2009]
[edit reason] example.com [/edit]
If the "www." in the pattern is optional, then it must be so designated. See the regular-expressions "?" quantifier (there's a regex tutorial cited in our Apache forum charter).
HTTP_HOST contains only the client-requested hostname, and no part of the URL-path (/contact.php in this example). It may, however, have a period and/or port number appended, so those must be either handled explicitly, or matched by omitting the end-anchor on the pattern.
The variable that a RewriteRule matches to its pattern is always the local URL-path (localized to the current .htaccess file's directory), so make use of that here.
Taking all of that into account, you get:
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com
RewriteRule ^contact\.php$ http://example.com/index.php?option=com_content&view=article&id=5&Itemid=5 [R=301,L]
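A second example, a catch-all rule that canonicalizes the hostname:
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com [NC]
RewriteCond %{HTTP_HOST} !=example.com
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]
This redirects requests that arrive with non-canonical hostname variants such as:
example.com.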
example.com:80
example.com.:80
www.example.com
www.example.com.
www.example.com:80
www.example.com.:80
Example.Com
Example.Com.
WWW.EXAMPLE.COM
WWW.EXAMPLE.COM.:80
etc.
Note that the second rewritecond in this catch-all rule uses an exact (negated) string match, and does not require character-escaping, as does the first rewritecond using regular expressions matching.
As noted in other threads, put all of your external redirects first, in order from most-specific patterns/conditions to least-specific, followed by all of your internal rewrites, again in order from most-specific to least specific. This avoids chained (multiple) redirects and avoids exposing your internally-rewritten filepaths to clients as URLs. This is a rule of thumb, not a law. Occasional exceptions may arise.
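As an illustration of that ordering, a skeleton might look like the following. The contact and hostname rules are the ones from this post; the final internal rewrite (gallery/<number> to gallery.php) is purely hypothetical and only there to show where such rules would go.

```apache
RewriteEngine On

# --- External redirects (R=301), most specific first ---

# One specific page on the main domain only
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com [NC]
RewriteRule ^contact\.php$ http://example.com/index.php?option=com_content&view=article&id=5&Itemid=5 [R=301,L]

# Least specific: catch-all hostname canonicalization
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com [NC]
RewriteCond %{HTTP_HOST} !=example.com
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

# --- Internal rewrites (no R flag, no hostname in the target) ---

# Hypothetical example: serve /gallery/123 from gallery.php?id=123
RewriteRule ^gallery/([0-9]+)$ /gallery.php?id=$1 [L]
```

With this order, a request for www.example.com/contact.php is answered by one redirect straight to the new page. If the catch-all came first, the same request would be redirected twice: once to fix the hostname and once more for the page.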
Jim
I understand about the ! that I used (I know some PHP), but am confused about why it is used in your example:
RewriteCond %{HTTP_HOST} !=example.com
You said the catch-all rule uses an exact (negated) string match. Is that referring to this line? What does this line do?
I also understand about the optional www. (I read some posts about being consistent for SEO so will eventually rewrite all to www., if it is not a subdomain, after I solve the current stuff).
I'm also still confused about the order as I am not sure what is external and what is internal. Is the RewriteCond external and the Rewrite Rule internal? What about 301 redirects? You can see my current order below.
I'm not new to web development, but have put off learning more about Apache (not to mention regular expressions) for much too long. I feel like a complete newbie.
I used your example and decided to go with the second example.
RewriteCond %{HTTP_HOST} ^(www.)?\example\.com [NC]
RewriteCond %{HTTP_HOST} !=example.com
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
RewriteRule ^contact\.php$ http://www.example.com/index.php?option=com_content&view=article&id=5&Itemid=5 [R=301,L]
One positive is that the sub-domain and addon domain contact pages are no longer affected.
Here is some of the rest of my .htaccess file if that is helpful:
Options All -Indexes
Options +FollowSymLinks
#
<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
AuthName example.com
ErrorDocument 404 /errors/404.php
#
########## Begin - Rewrite rules to block out some common exploits for Joomla
## I didn't include these here to conserve space
#
RewriteCond %{HTTP_HOST} ^(www.)?\example\.com [NC]
RewriteCond %{HTTP_HOST} !=example.com
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
RewriteRule ^contact\.php$ http://www.example.com/index.php?option=com_content&view=article&id=1&Itemid=5 [R=301,L]
## Do I need to escape anything above? I guess I might as well start adding the www. now.
## Here is an abbreviated list of the redirects I have been using
#
redirect 301 /anyartist1/ http://www.addondomain1.com/
redirect 301 /anyartist2/ http://anyartist2.example.com/
redirect 301 /anyartist1 http://www.addondomain1.com/
redirect 301 /anyartist2 http://anyartist2.example.com/
##redirect 301 /about.php http://www.example.com/index.php?option=com_content&view=article&id=4&Itemid=6
## the above line, as well as the one for contact.php are currently disabled so they don't affect the same pages in subdomains and addon domains.
redirect 301 /support.php http://www.example.com/index.php?option=com_content&view=article&id=5&Itemid=12
Thank You!
KM
[edited by: jdMorgan at 3:06 pm (utc) on Aug. 24, 2009]
[edit reason] example.com, de-linked [/edit]
Correcting it and adding some comments to clarify:
# If the requested hostname is example.com or www.example.com, or any upper/lowercase
# variation of either, or is in FQDN format, or has a port number appended
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com [NC]
# and is NOT *exactly* *precisely* "www.example.com"
RewriteCond %{HTTP_HOST} !=www.example.com
# then redirect the request to www.example.com, retaining the URL-path
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
I will have to go change all of those linked domains names to example.com to comply with our Terms of Service, so that's all for now. More later.
Jim
An external redirect accepts a client-requested URL and 'maps' it to another URL. When the client (e.g. a browser or search engine robot) requests the 'old' URL, your server code detects it and immediately sends a response back to the requesting client that says, "The resource you have requested has moved. Please ask for it again at this new URL." This terminates the current HTTP transaction, and it is up to the client to begin a new HTTP transaction by requesting the new URL provided in the server's redirect response. Technically, this is optional for the client, but most browsers and SE 'bots will do this immediately (or fairly soon). Note that the result is that in order to get the desired content, the client has to make two HTTP requests, and the server has to handle both of them. So, this slows down the user experience, and your server will log both requests. And whenever the client receives a redirect response, it will update its address bar before issuing the new request, making the new URL visible to the user.
On the other hand, an internal rewrite accepts a client-requested URL and 'maps' it to a different filepath inside the server than would normally be used (in the absence of any rewriting code). So, an internal rewrite might say, "If the client requests the URL-path 'foo.html', serve the file 'bar.php' instead of the 'foo.html' file." So this is a URL-to-filepath translation that occurs only inside the server, and the client is not informed that the request is being served from a non-default filepath. This takes place "behind the scenes," and entirely within the context of the client's original HTTP request.
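In mod_rewrite terms, the difference shows up in the substitution and the flags. A minimal sketch, where old-page.html and new-page.html are made-up names and foo.html/bar.php are the names used above:

```apache
RewriteEngine On

# External redirect: full URL in the target plus the R=301 flag.
# The client receives a 301 response, requests the new URL itself,
# and its address bar changes to /new-page.html.
RewriteRule ^old-page\.html$ http://www.example.com/new-page.html [R=301,L]

# Internal rewrite: local path in the target and no R flag.
# The client still sees /foo.html in its address bar; the server
# quietly answers the same request with the output of bar.php.
RewriteRule ^foo\.html$ /bar.php [L]
```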
Jim
You're correct about the no "RewriteEngine on", actually it was there, at the top of some other code I'm not using, but I didn't see that it had a # in front of it. So, I've fixed that, corrected the code and it's working.
Oops, sorry about using mysite.com instead of example.com.
The Redirect 301 statements can now be rewritten. I've started doing that and they are working except for a double slash before file names - but I will deal with that in a separate post. I'm trying it on my own first.
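For reference, one possible shape for such a rewritten rule, as a sketch only. It uses the anyartist2 placeholder from the earlier list and assumes the old URLs were all requested on the main domain. A single pattern covers the URL with and without the trailing slash, and the slash after the hostname is written only once in the target.

```apache
RewriteEngine On

# Old: example.com/anyartist2, /anyartist2/ or /anyartist2/<anything>
# New: anyartist2.example.com/<anything>
# $2 is whatever followed "anyartist2/" (empty if nothing did).
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com [NC]
RewriteRule ^anyartist2(/(.*))?$ http://anyartist2.example.com/$2 [R=301,L]
```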
Thanks again for your detailed explanations and information. I am incredibly grateful! I reread several other related posts in the forum today and understand them much better.
KM
Options All -Indexes
Options +FollowSymLinks
#
<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
The block above can be simplified to:
Options All -Indexes
#
Order deny,allow
#
<Limit PUT DELETE>
Deny from all
</Limit>
Jim