Modrewrite: Adding an R=301 Changes URL

         

chez17

7:32 pm on Apr 16, 2009 (gmt 0)

10+ Year Member



Hello, thanks to Jim's generous help I have a rewrite rule working perfectly. My issue is that now that the site is up, search engines aren't picking it up. I think it's because I don't have the rule marked as a 301 redirect. Here is the current rule:

RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www\.)?newdomain\.com\.?(:[0-9]+)?$
RewriteCond $1 !^folder/
RewriteRule ^(.*)$ /folder/$1 [L]

This works great. It keeps newdomain.com in the URL bar, which is very important. When I insert something like this:

RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www\.)?newdomain\.com\.?(:[0-9]+)?$
RewriteCond $1 !^folder/
RewriteRule ^(.*)$ /folder/$1 [R=301,L]

The redirect still works, but it shows the olddomain.com/folder in the address bar as opposed to newdomain.com. Why would making it permanent change the URL? Is there a better way to do this so search engines can track this? Any help is most appreciated.

g1smd

11:56 pm on Apr 16, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You haven't given any example old and new URLs, so it's hard to tell exactly what isn't happening.

However, I'm not sure if you are using the right method here. A redirect makes the browser request a different URL with a new request. A rewrite connects a URL request to an internal filepath - without revealing what that filepath actually is.


Your first example is for a rewrite.

Your second example is for a redirect, and if the domain name is being 'changed' then it is likely that UseCanonicalName is set to On and the default server name is the old domain name.

Two things: You should always specify the correct domain name in the target URL of a redirect (that is, if the rule has R=301 in it, then it should also have the domain name in it), and you might want to look at your usage of UseCanonicalName On and/or change the default server name (ServerName) to be the new domain name for the site.
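
To illustrate only the first point, here is a rough sketch of the second rule from the original post with the hostname written into the target. This is an assumption about what that would look like, not a recommendation: it would still expose /folder/ in the address bar, which is not what the original poster wants.

```apache
RewriteEngine on
# Same conditions as the rule in the original post
RewriteCond %{HTTP_HOST} ^(www\.)?newdomain\.com\.?(:[0-9]+)?$
RewriteCond $1 !^folder/
# The target names the host explicitly, so the Location header no longer
# depends on UseCanonicalName or the server's default name
RewriteRule ^(.*)$ http://newdomain.com/folder/$1 [R=301,L]
```

With the hostname in the target, the browser would at least stay on newdomain.com after the redirect.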

jdMorgan

12:12 am on Apr 17, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The number one point here is that if you are trying to get newdomain.com pages spidered and listed in search, then the original rule you posted is perfectly fine, and is NOT the cause of any problems.

Properly implemented, that rule is totally invisible to search engines, which will not know (or care) that your content is being served from the /folder subdirectory. That fact is completely irrelevant and unimportant to search engines anyway; they care about URLs on the Web, not file structures inside servers.

Look elsewhere for the problem.

Jim

chez17

3:00 am on Apr 17, 2009 (gmt 0)

10+ Year Member



Jim,

thanks again. You guys should really put a donate button up. I found this site doing work for a client that already had hosting that I am not used to. I may not be comfortable subscribing $90 for six months but I would love to donate something to the site. Just a thought.

Dave

chez17

10:15 pm on Apr 20, 2009 (gmt 0)

10+ Year Member



Another quick question. It seems Google is indexing the site, but not as newdomain.com. It's indexing it as olddomain.com/folder, which is where the rewrite is pointing to. If I put a robots.txt file in olddomain.com telling Google not to index folder/, will it affect the indexing of newdomain.com?

g1smd

10:25 pm on Apr 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A rewrite does not 'point to a URL'. A rewrite accepts a URL as its input and then fetches content from some place inside the server as defined by the Rewrite Rule.

On the other hand, a redirect tells a browser when it asks for URL A that it needs to make a new request, and in that new request that it needs to ask for URL B.
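
As a rough side-by-side sketch of the difference (the paths /a, /b, /c, /d and the host example.com are hypothetical, chosen only for illustration):

```apache
RewriteEngine on
# Internal rewrite: the client asks for /a and keeps seeing /a in the
# address bar; the server serves the file at /b without saying so.
RewriteRule ^a$ /b [L]

# External redirect: the client asks for /c, receives a 301 response
# pointing at http://example.com/d, and then makes a second request for /d.
RewriteRule ^c$ http://example.com/d [R=301,L]
```

Only the second rule changes what the browser, or a search engine, sees as the URL.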

The robots.txt can be very easily tested. It affects the domain that it was accessed from as seen in the browser address bar. Upload the file for thisdomain.com and check you can read it when you ask for thisdomain.com/robots.txt and then ask for thatdomain.com/robots.txt - if you can see the same file contents then it will also apply to URLs as accessed via thatdomain.com.

jdMorgan

11:34 pm on Apr 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The reason Google is indexing the site as olddomain.com/folder/ is that you told it to -- when you changed the rule from a rewrite to a redirect, you said, "newdomain.com/xyz has been permanently replaced. Please ask for that page again at olddomain.com/folder/xyz." So Google did just that.

So you've got rather a mess here, and the solution is to add a different kind of redirect. In order to prevent further disasters, I'll include the original rewrite as well, and both of these rules must be used.


RewriteEngine on
#
# If the client directly requests a URL-path starting with /folder/ from either olddomain or newdomain
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /folder/
# externally redirect the request to newdomain.com, and remove /folder/ from the URL-path
RewriteRule ^folder/(.*)$ http://newdomain.com/$1 [R=301,L]
#
# Internally rewrite requests for newdomain.com/<path> and www.newdomain.com/<path> URLs
# to /folder/<path> files:
# If requested hostname is www.newdomain.com or newdomain.com, with optional FQDN or port number
RewriteCond %{HTTP_HOST} ^(www\.)?newdomain\.com\.?(:[0-9]+)?$
# and we haven't already rewritten this request to /folder
RewriteCond $1 !^folder/
# Then internally rewrite this request to add "folder" to the filepath
RewriteRule ^(.*)$ /folder/$1 [L]

I've also commented the code, so that I can provide some friendly advice: Do not put any configuration code on your server that you have not analyzed and that you do not completely understand. This means understanding the code, what it is intended to do, and what effect it will have on visitors and search engines. If necessary, take the code apart character-by-character, referring to the mod_rewrite documentation, and not proceeding until every character's role is fully understood. References are cited in our Apache Forum Charter.

The complex hostname pattern and description in the second rule points out another shortcoming: You should add one or more rules after the first one shown here in this post to canonicalize all requested hostnames to either www- or non-www domains. You can choose either, but you should choose -- and you should be utterly consistent in linking to the canonical domains only from your own sites.
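
A minimal sketch of such a canonicalization rule, assuming the non-www form newdomain.com is chosen as canonical (choosing www instead would mean swapping the hostnames). It would go after the /folder/ redirect and before the internal rewrite:

```apache
# If the requested hostname is www.newdomain.com, with optional trailing dot or port number
RewriteCond %{HTTP_HOST} ^www\.newdomain\.com\.?(:[0-9]+)?$ [NC]
# externally redirect to the same URL-path on the canonical hostname
RewriteRule ^(.*)$ http://newdomain.com/$1 [R=301,L]
```

This covers only the www variant; other non-canonical variants, such as the trailing-dot or port forms of newdomain.com, would need further conditions.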

Jim