Setting up a subdomain

Ask first, no worry later

         

grandpa

7:28 am on Jan 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I thought I would ask before I do something really bad
(stupid). We set up a sub-domain, and created a new
sub-directory for the sub domain. (I want to preserve
the original sub-directory for historical purposes)

Before the sub-domain, my set up looks like this:

h*tp://www.example.com/subdirectory/ where
all the files exist in subdirectory /public_html/abc/

The sub-domain will look like this:
h*tp://subdomain.example.com/
and all the files exist in the sub-directory
/public_html/def/

What I think will work is

RedirectMatch 301 /abc/(.*)$ htt*://subdomain.example.com/$1

I would place this in the .htaccess file in my
directory /public_html/abc/? Or in the root directory?

Am I on the right track here, or is this train gonna wreck?

What concerns me a bit is all the talk I've read about
redirects and index.html. It's all confusing, but I'm using
index.html?somecode to track PPC's. Should I be concerned
about that with this type of redirect?

Thanks,
grandpa

<edited for example.com>

jdMorgan

10:10 am on Jan 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



grandpa,

Very wise to ask first... :)

Let's talk about domains and subdomains first, and worry about subdirectories later. The main question is, do you want to preserve both the (sub)domains represented by the abc and def subdirectories? Or do you want to redirect all traffic for the www/abc domain to the subdomain/def domain?

One thing that may help is to think of redirects as applying to domains and subdomains, and forget about directory structure for now. You can map your URLs to arbitrary filepaths with internal rewrites, and those need not enter into the discussion of what to do with the (sub)domains (yet). You want to get the URLs (which "show" on the Web) in order first, and then map them to the appropriate resources in your file structure. Breaking the problem between those two functions will help keep it simple.

You've got that added layer of PPC tracking complexity, too. So question number two is, do you currently 301-redirect the index.html?somecode requests to index.html in order to avoid "duplicate page" problems? If so, you'll probably want to integrate the subdomain redirect with those PPC redirects and do it all with one redirect to avoid SE confusion and server load issues.

Jim

grandpa

11:08 am on Jan 29, 2004 (gmt 0)




> You want to get the URLs (which "show" on the Web)
> in order first, and then map them to the appropriate
> resources in your file structure.

That was done yesterday with my host. The DNS records
haven't propagated yet (last time I checked). There's
no mail service, so I assume they only set up an A record.
I should be able to start testing in the next day or so.

I don't plan to go live with this sub domain until
everything is checked, but by far the most daunting task
I face is the redirect. One minute it looks so easy, the
next it's gibberish.

> Or do you want to redirect all traffic for the
> www/abc domain to the subdomain/def domain?

I intend to re-direct all the traffic from www.domain/abc
to subdomain.domain.com - which is to be pointed to def.

I know this is really basic, but given that I have backlinks
to the existing sub-directory (abc) the .htaccess needs to
be modified in the abc directory? If that's not right
some of my basic understanding is shot...

> You've got that added layer of PPC tracking complexity, too.

I've seen the discussions... doesn't mean I fully understood
them, however. I have planned to modify the URL in my ads
*after* everything is transitioned over. Bummer! I wasn't
even aware that I might have duplicate page problems as a
result of PPC tracking. Guess I've been too busy trying to
learn how to manage them.

To answer question number two - I don't 301-redirect
the index.html?somecode requests to index.html in order to
avoid "duplicate page" problems. (I suppose if I had been,
I'd be a little farther ahead of this situation.)

I have two questions... because I do want to learn this
and be able to understand it - will the code that I wrote
even come close to doing this job? Or should I really
spend another night, or two, reading these valuable forums?

<edit added>After reading, re-reading and reading again, I
think I see how the PPC tracking can cause a duplicate
page problem. I don't understand to what extent it could
be hurting me, but it could be a factor in why I can't climb out
of this post-Fl rut that I'm in... I'm off to look deeper
into this.

Guess that answers my #2.
I found this of interest
[webmasterworld.com...]
</edit>

<edit 2>
I didn't find the answer yet, but I did find
a reason to develop another 100 unique pages
for domain.com. This place is like a gold mine
studded with diamonds.
</edit>

jdMorgan

5:50 pm on Jan 29, 2004 (gmt 0)




So, to be absolutely clear, did the search engines list your original "www.example.com" site in their results as "www.example.com/" or as "www.example.com/abc/" (or as both)?

Jim

grandpa

2:48 am on Jan 30, 2004 (gmt 0)




> So, to be absolutely clear, did the search engines list your original "www.example.com" site in their results as "www.example.com/" or as "www.example.com/abc/" (or as both)?

I have SERP's showing both.

jdMorgan

6:28 am on Jan 30, 2004 (gmt 0)




:(

Well, the good news is that fixing the redirects to only use one domain name will very likely improve your ranking then. :)

Jim

grandpa

6:30 am on Jan 31, 2004 (gmt 0)




Well this is good news. I ran a little test, now that
the subdomain is active and has been propagated through
the DNS servers.

The .htaccess file (just below) went into the /abc
subdirectory. I created a pair of html documents, both
named test.html but with different content. One went into
the "abc" directory and the other into the "def" directory
- the reference directory for the subdomain.

Options +FollowSymLinks
RewriteEngine on
RewriteBase /abc
RewriteRule test\.html$ htt*://subdomain.domain.com/test.html

The rewrite works as advertised.
So to expand this for every file in "abc"?

Options +FollowSymLinks
RewriteEngine on
RewriteBase /abc
RewriteRule \.*$ [subdomain.domain.com...]

OK, it almost works. I'm basically just posting my results
here so I can find them later... But what I see happening
is a redirect to the subdomain root. If I try to navigate
to a specific page I'm dropped off at the index page, not
the page I requested.

The log activity reveals:
123.45.67.89 - - [30/Jan/2004:22:11:28 -0800] "GET /subdomain/somedoc.html HTTP/1.1" 302 301 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Q312461; Hotbar 4.3.5.0; .NET CLR 1.1.4322)"

Not sure why I see a 302 and a 301. But let me venture
a guess. The 302 was returned from the original request
and the 301 from the second request? I'm pretty sure
I need to expand the left side so any request is passed
to the right side. Another night of research should get
this part working...

This question is probably beyond the scope of this
discussion, (and tags me a real newbie) but is the
difference between a GET and HEAD or PUT going to affect
how I approach the redirection process?

grandpa

<add>
In the process of getting the subdomain set up
I need to consider any of these requests:
htt*://domain.com/subdir
htt*://domain.com/subdir/
htt*://domain.com/subdir/file.any
htt*://domain.com/subdir/file.any?PPC

<enlightenment> Just sitting here looking at the 4 URL's
above, I think I might understand part of the problem with
PPC tracking. The ?PPC has to be stripped out to make the
URI valid for the second request?
</enlightenment>

</add>

jdMorgan

8:10 am on Jan 31, 2004 (gmt 0)




grandpa,

This line is malformed:


RewriteRule \.*$ http://subdomain.domain.com/$1

The $1 in the substitution is a back-reference. Unfortunately, it is undefined because it references the first parenthesized subexpression in the pattern, and there isn't one. (See backreferences in the mod_rewrite documentation.)

In addition, you probably want to use a 301-Moved Permanently redirect (I could be wrong). If so, then you need to add the flag [R=301] to the end of the rule. If not, then you should specify a local path only in the substitution, and omit the http://subdomain.domain.com part. Since in this case that would make no sense, I presume you want an external redirect.

And finally, in 99% of all cases, once an external redirect-type rule is applied, you'll want to stop any further rewriting, and go do the external redirect immediately. In that case, an [L] flag is called for. (Actually, this applies to 99% of *all* simple rewrite rules; Once you do a particular rewrite, there is usually no need for mod_rewrite to keep looking at the following RewriteRules for further matches. I always suggest including [L] unless you know of a good reason not to.)

So, the final result would be:


RewriteRule (.*) http://subdomain.domain.com/$1 [R=301,L]

---

Your log is showing you a 302-Moved Temporarily response, which is the default mod_rewrite external redirect (because you did not add the R=301 flag). It also happens that the length of the 302 response is 301 bytes. That's a remarkable coincidence, but that's what it says. :)

Remember that an external redirect response to an initial request causes a new and separate HTTP request from the client. Therefore, you should see a second line in your log with a 200-OK response.

---

I doubt that you'd want to do anything different for GET, HEAD, or PUT, or even OPTIONS or TRACE for that matter. But if you do decide you need to redirect them differently, mod_rewrite can handle it... Trust me. ;)
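
As a rough illustration only, a method-specific rule could look something like the sketch below. The hostname subdomain.example.com is a placeholder, and limiting the redirect to GET and HEAD is just an assumed example of "redirecting them differently", not a recommendation for your site.

```apache
RewriteEngine on

# Hypothetical: apply the redirect only when the request method is GET or HEAD.
# Any other method (POST, PUT, ...) falls through to whatever rules follow.
RewriteCond %{REQUEST_METHOD} ^(GET|HEAD)$
RewriteRule ^(.*)$ http://subdomain.example.com/$1 [R=301,L]
```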

---

> The ?PPC has to be stripped out to make the URI valid for the second request?

Not sure I understand the question, but generally, you want to 301-redirect the PPC requests in order to get rid of the tracking code for two reasons: First, it's already been logged, and you probably don't want the visitor to see it in his/her browser address bar (they're ugly and make the URL "unmemorable"). And secondly, you don't want the visitor to bookmark the URL with the tracking code in place, because if he and 100 friends all use tracking-code-bearing URL bookmarks 3 times a day in the weeks before Christmas, it might throw a monkey wrench into your PPC analysis!

If possible, redirect to the correct domain and strip the tracking code using a single 301 redirect - spiders and visitors with slow modems will appreciate it.
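
For illustration, here is a minimal sketch of what such a combined rule might look like in the .htaccess of the old abc directory. It assumes the tracking parameter is named ppc and uses subdomain.example.com as a placeholder hostname; neither is taken from your real setup, so adjust both. Because this is a per-directory .htaccess, the pattern is matched against the path with the abc/ prefix already removed.

```apache
# Hypothetical /public_html/abc/.htaccess (placeholder hostname and parameter name)
Options +FollowSymLinks
RewriteEngine on

# Tracked PPC request: move to the subdomain and drop the query string in one hop.
# The trailing "?" in the substitution discards the original query string.
RewriteCond %{QUERY_STRING} ^ppc= [NC]
RewriteRule ^(.*)$ http://subdomain.example.com/$1? [R=301,L]

# Everything else: move to the subdomain, passing any query string through unchanged.
RewriteRule ^(.*)$ http://subdomain.example.com/$1 [R=301,L]
```

With rules along these lines, a request for /abc/file.html?ppc=something should get a single 301 to /file.html on the subdomain, rather than one redirect for the domain and a second one for the tracking code.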

Jim

grandpa

1:02 pm on Feb 2, 2004 (gmt 0)




Thanks Jim. I went live with the subdomain yesterday,
and spent most of the day tweaking links, forms, email,
etc.

But, I've also been reading some of the references I found
in this forum... Steve Ramsay's Guide, Learning to Use
Regular Expressions, and of course the Apache pages.
Having some C experience in my background is helping...

While this is going on, I'm giving some thought to how
simple, or complex, my rulesets need to be. As mentioned
previously, the tracking codes need to be addressed. I've
also managed to define a couple of specific requests that
I might want to process. For example, www.domain.com/abc
(without the trailing slash) could be handled one of 2 ways,
I think. Ideally, I would want to redirect that request
to a logon script for the subdomain:
(www.domain.com/cgi-bin/login.cgi),
but it could also redirect straight to the subdomain:
(subdomain.domain.com/). In any event, right now it
returns a 404 since abc is not a valid filename.
<aside>
That address used to resolve to domain.com/abc/index.html,
which always bothered me because that page had no PR.
</aside>

The point is, I think it's important to address the rules
I need to apply in the current environments, and with some
thought given to future environments. For instance, I'm
committed (maybe I should be) to the development of
domain.com and www.domain.com as 2 separate entities -
perhaps to provide unique services. It's not gonna happen
tomorrow, or even next week - but now is probably the
time to avoid shooting myself in the foot later.

grandpa

grandpa

7:54 pm on Feb 2, 2004 (gmt 0)




I was working with this today and it leaves me perplexed.

In my file structure I have a directory called abc.
Here's the .htaccess in that directory.

Options +FollowSymLinks
RewriteEngine on
RewriteBase /abc
RewriteRule (.*) [def.domain.com...] [R=301,L]

What I'm trying to do is send a request for
www.domain.com/abc to /domain/cgi-bin/that.cgi with
this rule:

RewriteRule ^abc$ /public_html/cgi-bin/that.cgi [R=301,L]

Instead I still get a 404 Not Found.

OK... so I tried this on another server, with another
domain.. one that does not have the directory abc.
The redirect worked like a charm. So I went back to
the first server, and changed abc to test, and it works.
A request for www.domain.com/test gets rewritten to /domain/cgi-bin/that.cgi.

Is the problem related to having a directory named
abc, and everything in abc is being redirected?
I also tried the rewrite in the root .htaccess, with the
same results, a 404 Not Found.

grandpa

12:20 am on Feb 3, 2004 (gmt 0)




Too late to edit my previous post.

I got the last problem solved, but not without raising
something else. I can definitely see the power of
mod_rewrite at work :-)

I used this rewrite in my root .htaccess
RewriteRule (abc) /cgi-bin/that.cgi [R=301,L]

and I'm given the cgi script. But it *is* appearing to
be from a different location. The browser address
comes back as htt*://abc.domain.com/cgi-bin/that.cgi

I'd like the browser to show the correct address at
htt*://www.domain.com/cgi-bin/that.cgi...
or even better htt*://www.domain.com/abc - which would
match the original request to begin with.

<off to read some more...>

<added>
I got the right URL in my browser now... that was easy.
</added>

<add2>
RewriteRule (abc) /cgi-bin/that.cgi [R=301,L]

This rule was modified slightly to give me the results
I actually wanted.

RewriteRule (abc$) htt*://www.domain.com/that.cgi [R=301,L]

I don't think the 301 is really necessary, but it probably
won't hurt either. But doing this trial and error method
can cause problems I might not catch right away. So I have
a question about syntax. I added the $ to end a pattern
within parenthesis. I'm sure as I dig deeper I'll discover
why I can use a $ without a ^ in front of it. My question
is what document(s) explain the use of (), [], $, ^ and
when they should or should not be used. Or do I just have
to dig it out of the definition for any given directive?

Thanks

</add2>

I think I was looking for this:
htt*://httpd.apache.org/docs/mod/directive-dict.html

If you folks get tired of watching me talk to myself,
feel free to jump in.... :-)

grandpa

4:02 pm on Feb 3, 2004 (gmt 0)




Well, here's an update. Things seem to be running
smoothly.

I've .htaccess files set up in 3 locations,
from the filesystem view they are at
/users/domain/public_html/
/users/domain/public_html/abc/
/users/domain/public_html/def/

This root copy is here.
AuthName domain.com
IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order allow,deny
deny from 65.168.224.
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
<Files ~ "^.*$">
order allow,deny
allow from all
deny from env=ban
</Files>
ErrorDocument 403 /host/errdocs/403.html
ErrorDocument 401 /host/errdocs/403.html
ErrorDocument 401 /host/errdocs/401.html
ErrorDocument 500 /host/errdocs/500.html
ErrorDocument 400 [domain.com...]
ErrorDocument 404 [domain.com...]

RewriteEngine on

#I found this FormMail trap in my research, so I thought
#"why not?" It's already banned 3 people in the first
#few hours. Also, when I initially set it up, I had a
#problem serving up my Forbidden Document, since the domain
#had just been banned. I switched back to the hosts
#Forbidden Document and it works fine now.

RewriteCond %{REQUEST_URI} ^/FormMail [NC,OR]
RewriteCond %{REQUEST_URI} ^/FormMail\.(cgi|pl|php) [NC,OR]
RewriteCond %{REQUEST_URI} ^/cgi(\-local|\-bin)/FormMail [NC,OR]
RewriteCond %{REQUEST_URI} ^/cgi(\-local|\-bin)/FormMail\.(cgi|pl|php) [NC,OR]
RewriteCond %{REQUEST_URI} (mail.?form|form|form.?mail|mail|mailto)\.(cgi|exe|pl)$ [NC]
RewriteRule .* /cgi-bin/trap.pl [L]

RewriteRule (abc$) [domain.com...] [R=301,L]

#Then I use these to capture my PPC tracking and strip the
#Query String out of the request. This handles everything
#requested from the default domain.

RewriteCond %{QUERY_STRING} ^ppc=string1 [NC,OR]
RewriteCond %{QUERY_STRING} ^ppc=string2 [NC,OR]
RewriteCond %{QUERY_STRING} ^ppc=string4\+string5 [NC,OR]
RewriteCond %{QUERY_STRING} ^ppc=string6 [NC,OR]
RewriteCond %{QUERY_STRING} ^ppc=string7 [NC,OR]
RewriteCond %{QUERY_STRING} ^ppc=string8 [NC,OR]
RewriteCond %{QUERY_STRING} ^ppc=string9 [NC,OR]
RewriteCond %{QUERY_STRING} ^ppc=stringA [NC,OR]
RewriteCond %{QUERY_STRING} ^ppc=stringB [NC]
RewriteCond %{HTTP_HOST} ^(.*)$ [NC]
RewriteRule ^(.*)$ [%1...] [R=302,L]

Then in my directory abc I placed this simpler version:

Options +FollowSymLinks
RewriteEngine on
RewriteBase /abc

#The first rule redirects traffic to the new subdomain.
RewriteRule (.*) [def.domain.com...] [R=301,L]

#This rule redirects a request to
#www.domain.com/def - which used to be a valid request
#but started returning a 404 after I set up the subdomain.
RewriteRule ^def$ /public_html/cgi-bin/login.cgi [R=301,L]

And finally, in my directory def I placed this version:
Options +FollowSymLinks
RewriteEngine on
RewriteBase /def

#I needed this to catch any PPC tracking to the
#new subdomain.

RewriteCond %{QUERY_STRING} ^ppc=string3 [NC]
RewriteCond %{HTTP_HOST} ^(.*)$ [NC]
RewriteRule ^(.*)$ [%1...] [R=302,L]

*-----------------------------------------*

Amazingly, I almost understand all of this - but I'm still
practicing writing it.

I'd appreciate a once over by a trained eye, there could
well be something obviously wrong that I'd never see.
I don't think there is, I've been on top of my logs for
the last 24 hours.

Some of the other questions I have forming will most
likely be answered when the time is right... after I
figure out what to ask. I'll be hanging out in this
forum for a while. Just the same, any heads-up would be
appreciated.

Grandpa

jdMorgan

9:40 pm on Feb 3, 2004 (gmt 0)




Some of the comments in this thread [webmasterworld.com] on internal vs. external redirection and compressing FormMail rules might be of interest to you.

Jim

grandpa

12:49 pm on Feb 4, 2004 (gmt 0)




Thanks Jim. I'll read them again. Most anything I have
searched for brings me back to this forum. Don't know
why I bother to leave...

I have a few errors in my log that I need to address,
nothing overwhelming that I can see. But there is one
situation still beyond my grasp. At the risk of finding
this thread in another form I'll say the problem is with
my Overture PPC requests. I append a string, which I can
find with my Query-String conditions. But OV appends their
own string which fails my condition check. Further, their
string appears to be somewhat variable, depending on the
original search request. So, I need to write a condition
that matches a rule which is based on a condition of
a Query-String - actually the first part of a Query-String.

The web view looks like this:
htt*://www.domain.com/file.html?ppc=string?OVR=VARstring

So far I've only figured out this much of it:
htt*://www.domain.com/file.html?ppc=string

And, I suppose I also would want a rule that generally
catches *any* search request with a string attached, and
strips out the search string. I can see where this rule
will need to be placed below the valid ppc ruleset.

I'm not sure of its relevance to the subdomain redirect
ruleset. Would it be better to process any subdomain
search requests in the .htaccess file for the subdomain?
<I ask this, and the answer seems obvious. But I ask
because right now every subdomain request is already being
redirected with a 301 status - because the new subdomain
address hasn't been propagated through the SE's.>

grandpa

jdMorgan

5:12 pm on Feb 4, 2004 (gmt 0)




> So far I've only figured out this much of it: htt*://www.domain.com/file.html?ppc=string

You might want to post the code and the full query string, so we can discuss specifics.

As far as the order of the subdomain versus query string rewrites, that depends. If you are rewriting subdomains to subdirectories, that is usually done as a 'silent' internal redirect, whereas removing PPC query strings that are not used by a script is usually done with a 301 external redirect so that the browser is redirected and the URL therefore appears 'clean' to the PPC visitor.
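
To make the distinction concrete, a rough sketch for a root .htaccess follows. The hostname subdomain.example.com, the /def/ directory mapping, and the ppc parameter name are all assumptions for illustration. Many hosts map subdomains to directories at the server level, in which case the internal rewrite is not needed at all.

```apache
RewriteEngine on

# External redirect first: strip an unused PPC query string with a 301 so the
# browser re-requests the clean URL. %{HTTP_HOST} keeps whatever host was asked for,
# and the trailing "?" discards the original query string.
RewriteCond %{QUERY_STRING} ^ppc= [NC]
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1? [R=301,L]

# Internal rewrite second: silently serve subdomain requests from /def/.
# There is no R flag and no hostname in the substitution, so the visible URL stays put.
# The REQUEST_URI test keeps already-rewritten requests from looping.
RewriteCond %{HTTP_HOST} ^subdomain\.example\.com$ [NC]
RewriteCond %{REQUEST_URI} !^/def/
RewriteRule ^(.*)$ /def/$1 [L]
```

Putting the external redirect ahead of the internal rewrite is a common ordering, since it keeps the internal /def/ path from leaking into a redirect target.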

Jim

grandpa

6:32 pm on Feb 6, 2004 (gmt 0)




RewriteRule \?ppc=string(\?.+)$ [domain.com...] [R=301,L]

If I read this correctly, it says take any request
that contains the value of "?ppc=string" and
optionally anything after that string that begins
with "?" and rewrite it to my domain index file,
showing the URL [domain.com...]
in the browser navigation bar.

I've been up all night, not going to test this right
away, but let's see if I'm getting any better...

grandpa
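
One caveat on the rule above: a RewriteRule pattern is matched only against the URL-path. The query string (everything after the first "?") is never part of what the pattern sees, so a pattern beginning with \?ppc= cannot match. The query string has to be tested in a RewriteCond instead. A minimal sketch, assuming www.example.com as a placeholder host and ppc=string as the tracking prefix:

```apache
RewriteEngine on

# Test the query string in a condition; the rule pattern only sees the path.
# There is no "$" anchor, so anything appended after ppc=string (such as a
# second "?OVR=..." segment added by the ad network) still matches.
RewriteCond %{QUERY_STRING} ^ppc=string [NC]
# The trailing "?" drops the whole query string from the redirect target.
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]
```

This sketch keeps the requested path in the redirect. To send every tracked request to a single page instead, the $1 in the substitution would be replaced with that page's path.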