

redirecting directory to subdomain issues

are there any disadvantages and howto

         

malina

6:40 pm on Nov 27, 2008 (gmt 0)

10+ Year Member



Hi!

What I want to do is have a website where users can register and get their own address:

www.user.website.com
and probably
[user.website.com...]

I did some searching and found this:
[webmasterworld.com...]

What I want as a result is that when a user types "www.user.website.com" in the address bar, a page is displayed. This page will be stored in www.website.com/letter/user/. I also want the addresses "www.user.website.com" and "www.user.website.com/user-content/" to be shown in the address bar instead of "www.website.com/letter/user/" or "www.website.com/letter/user/user-content/".

I can not create so many subdomains due to my hosting account restrictions. Also I cannot control DNS settings.

Will mod_rewrite redirection be enough to achieve this effect? If so, what kind of url do I have to use in links? (www.user.website.com/user-content/ ?)

How will it be physically done?

I would greatly appreciate any help and/or links for further studying..for a newbie...

ps. I ran phpinfo() on my website and it says that mod_rewrite is on, but it does not mention mod_vhost_alias...

Can it be done then?

phranque

11:15 am on Nov 29, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



welcome to WebmasterWorld [webmasterworld.com], malina!

it is not possible to control browser behaviour as you described.
if you want the browser to show a different url you have to suggest that the browser make a request for a new resource at that url with a 301 or 302 response.
after the browser requests the new url the domain name won't resolve unless you have the dns set up properly.

jdMorgan

9:22 pm on Dec 3, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Maybe I missed something here, but this seems to be a simple internal rewrite application.

*If* you have wildcard subdomains defined in DNS (likely only if you have a unique IP address for your server), you can do this.

> when user types "www.user.website.com" in an address bar a page will be displayed. This page will be stored in www.website.com/letter/user/

I assume that "letter" is the first letter of the username, and that the intent is to avoid having many, many users' subdirectories included in your top-level directory. Let's also make detecting user subdirectories a bit more foolproof and further reduce the clutter in your top-level directory by putting them all under /users/<letter>/<username>/


# If we have not already rewritten this request to a user subdirectory
RewriteCond $1 !^users/[a-z]/
# and if requested hostname does not start with "www.example.com"
RewriteCond %{HTTP_HOST} !^www\.example\.com
# and if requested hostname starts with <letter><letters, numbers, or hyphens><letter or number>.example.com -- with or without the leading "www."
RewriteCond %{HTTP_HOST} ^(www\.)?(([a-z])[a-z0-9\-]+[a-z0-9])\.example\.com
# then rewrite URL-path /<anything> to /users/<first letter of username>/<whole username>/<anything>
RewriteRule (.*) /users/%3/%2/$1 [L]

(Count left parentheses to resolve the back-reference assignments)

The RewriteCond that gets the username from the subdomain is a bit complex because, in addition to allowing the leading "www." to be optional and separately extracting the initial letter of the username, it also enforces some restrictions on which characters can appear in which positions, in order to comply with Web standards. Note that as coded, the username must consist of at least 3 characters, must start with a letter, and must end with a letter or a number. Letters, numbers, and hyphens are allowed between the start and end characters.
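
To make the back-reference counting concrete, here is a minimal sketch of the same rule with the optional "www." group removed, which shifts every %N down by one. The username "alice" and the request paths are hypothetical, chosen only to trace the substitution, and the sketch assumes the code lives in the .htaccess file in the document root.

```apache
# $N comes from the RewriteRule pattern; %N comes from the last matched RewriteCond.
#
# With the (www\.)? group (as above), http://www.alice.example.com/photos/ gives:
#   %1 = "www."   %2 = "alice"   %3 = "a"   $1 = "photos/"
#   /users/%3/%2/$1  ->  /users/a/alice/photos/
#
# Without that group (below), http://alice.example.com/photos/ gives:
#   %1 = "alice"  %2 = "a"       $1 = "photos/"
#   /users/%2/%1/$1  ->  /users/a/alice/photos/
RewriteCond $1 !^users/[a-z]/
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^(([a-z])[a-z0-9\-]+[a-z0-9])\.example\.com
RewriteRule (.*) /users/%2/%1/$1 [L]
```

This variant no longer accepts the "www." prefix on user subdomains, so it only suits a setup where that form is not needed.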

So, I think the main issue is whether your server has a unique IP address and has wild-card subdomains enabled in DNS.

Jim

malina

9:55 pm on Dec 3, 2008 (gmt 0)

10+ Year Member



Thank You very, very much! Your answer is more than I was expecting to get.

Yes, I have wildcarded domain.

I've been doing some research on this issue for some time, and right now I am a bit confused about how this topic (which You explained in detail) relates to the friendly-urls issue.

What I mean is I found some free scripts which work like this:
when a guest enters user.domain.com in the address bar he is taken (using the wildcarded domain) to the domain root, where index.php "catches" the user part of the address and redirects the guest to some specific address.

I presume that when developing a website like this I would rather go for Your solution (mod_rewrite redirection in general)..?

Moreover what I want to do is to have friendly (clean) urls and from what I know this can be achieved in two ways:

(Again) The first one would use only a very simple .htaccess redirection to point all requests to the root index.php file, and this single index.php file would handle all requests by checking the url, splitting it, and pointing to the appropriate files.

The second technique would use only .htaccess and mod_rewrite redirections to handle all the requests, not involving any php coding to determine the content the user requested.

QUESTION: Would it be right to say that Your solution plus some more redirections will give me what I want ("virtual" users' subdomains and clean urls)?

Please excuse me for not making this very clear, but as I said I'm a bit confused.

jdMorgan

2:00 am on Dec 4, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



First touch-points: Learn the difference between an internal rewrite and an external redirect, and between a URL and a server filepath. Errors in using these terms or thinking about these concepts destroy many Web sites' chances to rank.

There are some threads in our Apache Forum Library about the redirect-versus-rewrite subject, and we discuss the URL-versus-filepath subject every day here... :(

As I stated in my first reply, this is an internal rewrite application, and has nothing to do with external redirects at all.

Another thing that can be helpful: Base your "process analysis" on the correct URL as typed by a user or as linked-to on one of your pages. Then work through the request arriving at your server, being processed by mod_rewrite and all the other Apache modules, being delivered to your script file(s), the script generating a new page, and that page being sent back to the browser. If you start with a correct link or typed-in URL as the basis, then all of this gets a lot easier to think about.

Only if the typed-in URL is wrong or if a URL indexed by search engines has changed do you need a redirect. If a link on your page is wrong, you just fix the page itself and no redirect is needed. If you've changed the URL and you want to speed up replacement of the old URL with the new URL in search engines, then you might also need a redirect.

But again, what you are describing here calls for a simple rewrite: "User requests this URL from our server, we rewrite that to this other filepath to generate and deliver content."
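
A minimal sketch of the difference, assuming an .htaccess file in the document root; the paths /contact, /old-contact, and /pages/contact.php are hypothetical and only serve to contrast the two rule types.

```apache
RewriteEngine On

# Internal rewrite: the client requests the URL /contact and the server
# serves the filepath /pages/contact.php. No response is sent telling the
# client to go elsewhere, so the address bar still shows /contact.
RewriteRule ^contact$ /pages/contact.php [L]

# External redirect: the client requests /old-contact and receives a 301
# response naming a new URL. The client then makes a second request for
# that URL, and the address bar changes to it.
RewriteRule ^old-contact$ http://www.example.com/contact [R=301,L]
```

The visible distinction is the full URL plus the R flag in the second rule: with them the client is told about the new location, and without them the mapping stays inside the server.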

Jim

malina

2:16 pm on Dec 4, 2008 (gmt 0)

10+ Year Member



I have been editing this post several times, because it wasn't working, but now I can say it works like a charm :D

I also did some studying on the keywords You mentioned - internal rewrite and external redirection - I hope I get it right now.

Last question (for now, if I may still ask for Your precious help in the future): how does a search engine bot behave when it comes to my website? I mean, it will index the main page, but will it also index users' pages the way I want, as "user.domain.com"?

Thank You very much!

jdMorgan

3:57 pm on Dec 4, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Search engines will index any URL they find in a link anywhere. They don't index sites or pages, they index URLs.

Now that you have the first part working, you need to make sure that URLs of the form
http://www.example.com/users/<first letter of username>/<whole username>/<anything>
cannot be indexed.

To do that, you need an external redirect. It's a bit complicated, because it must be coded so as not to interfere with your internal rewrite:


RewriteCond %{THE_REQUEST} ^[A-Z]+\ /users/[a-z]/[a-z][a-z0-9\-]+[a-z0-9]/[^\ ]*\ HTTP/
RewriteRule ^users/[a-z]/([a-z][a-z0-9\-]+[a-z0-9])/(.*)$ http://$1.example.com/$2 [R=301,L]

This redirect goes above the rewrite, and before any more-general rules such as domain redirection.
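
Put together, the ordering could look roughly like this. It is a sketch that assumes a single .htaccess file in the document root and uses example.com as a placeholder hostname; the position marked for more-general redirects between the two blocks is an assumption based on the ordering described above.

```apache
RewriteEngine On

# 1. External redirect first: a direct client request for
#    /users/<letter>/<username>/<anything> is sent back to the subdomain URL.
#    THE_REQUEST holds the original request line, so URLs produced by the
#    internal rewrite below do not trigger this rule.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /users/[a-z]/[a-z][a-z0-9\-]+[a-z0-9]/[^\ ]*\ HTTP/
RewriteRule ^users/[a-z]/([a-z][a-z0-9\-]+[a-z0-9])/(.*)$ http://$1.example.com/$2 [R=301,L]

# (More-general redirects, such as domain canonicalization, would go here.)

# 2. Internal rewrite last: map the subdomain request onto the user's subdirectory.
RewriteCond $1 !^users/[a-z]/
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^(www\.)?(([a-z])[a-z0-9\-]+[a-z0-9])\.example\.com
RewriteRule (.*) /users/%3/%2/$1 [L]
```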

Jim

malina

4:32 pm on Dec 4, 2008 (gmt 0)

10+ Year Member



Every question You answer makes me ask another one :D

Is it for search engines only? Or does it handle urls on the website?

As I said, in every user directory I test with there is an index.php which tells me whose site I am on (depending on the "subdomain" part) and displays the $_SERVER variable. Now when I type test.localhost/home the url in the address bar remains unchanged, but $_SERVER["REDIRECT_URL"]="home" and $_SERVER["REQUEST_URI"]="home"; those two variables change depending on the part after "/".

Is this enough for using friendly urls on the website, now that I can check (in the user's index.php, via the $_SERVER variable) what was requested and show the specific content?

Can I now use the "<a href="user.domain.com/contact">Contact</a>" on user's page for example? Or do I have to look for some more htaccess techniques?

jdMorgan

5:20 pm on Dec 4, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Is it for search engines only? Or does it handle urls on the website?

Its purpose is to prevent search engines from indexing both the "friendly" URLs and the "unfriendly" ones. It prevents the duplicate-content problems that this would cause. To be clear, for any page on the Web, there should be one and only one URL that can be used to reach that page. All variations in domain name (e.g. www.example.com versus example.com), all variations in uppercase/lowercase, query strings -- in short, any change to any characters in the URL at all, should result in either a different page being served, a 404-Not Found, or a redirect to the correct "canonical" URL. Otherwise, you will have problems ranging from minor nuisances to major search engine ranking disasters... It is best to get all of this right from the outset.
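
As a sketch of what hostname canonicalization could look like in this setup: it assumes you decide that www.example.com is the canonical main hostname and that user subdomains are canonical without the "www." prefix. Both choices are assumptions, not something settled in this thread. These rules would sit with the other external redirects, above the internal rewrite.

```apache
# Bare domain -> www.example.com
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# www.<username>.example.com -> <username>.example.com
# %1 is the username captured from the hostname; $1 is the requested URL-path.
RewriteCond %{HTTP_HOST} ^www\.([a-z][a-z0-9\-]+[a-z0-9])\.example\.com [NC]
RewriteRule (.*) http://%1.example.com/$1 [R=301,L]
```

With the second rule in place, the internal rewrite would only ever see the non-www form of a user subdomain, so each user page has a single hostname.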

> Can I now use the "<a href="user.domain.com/contact">Contact</a>" on user's page for example? Or do I have to look for some more htaccess techniques?

I don't understand the question, but all of the links must now be in the form of <a href="http://user.domain.com/contact">, and a request for that URL will be served from the server filepath /users/u/user/contact

I suggest that you test it to find out if it's working as you expect.

Jim

malina

5:46 pm on Dec 4, 2008 (gmt 0)

10+ Year Member



Thank You once again for Your patience and will to help. I can almost start coding the website now.

malina

8:00 pm on Dec 4, 2008 (gmt 0)

10+ Year Member



I have to post once more, because I don't think it works the way You described and I don't want to mess things up with search engines.

When I type:

test.localhost/home
or
test.localhost/home/

it shows me content of the index.php file stored in /users/t/test/
and not from users/t/test/home/ (although this one exists with different content). As I said it only sets the two $_SERVER variables.

My .htaccess so far:

Options FollowSymLinks
RewriteEngine On

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /users/[a-z]/[a-z][a-z0-9\-]+[a-z0-9]/[^\ ]*\ HTTP/
RewriteRule ^users/[a-z]/([a-z][a-z0-9\-]+[a-z0-9])/(.*)$ $1.localhost/$2 [R=301,L]

RewriteCond $1 !^users/[a-z]/
RewriteCond %{HTTP_HOST} !^localhost
RewriteCond %{HTTP_HOST} ^(([a-z])[a-z0-9\-]+[a-z0-9])\.localhost$
RewriteRule (.*) /users/%2/%1/ [L]

Stays the same when I extend the last rewrite rule to
RewriteRule (.*) /users/%3/%2/%1/ [L]

jdMorgan

9:36 pm on Dec 4, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You copied the code I posted incorrectly. The "%1" in the now-second RewriteRule must be "$1".
# then rewrite URL-path /<anything> to /users/<first letter of username>/<whole username>/<anything>
RewriteRule (.*) /users/%3/%2/$1 [L]

Please copy the code I posted exactly, and make only the changes needed to correspond with your "real" hostnames and filepaths ("localhost", "/users", etc.).

With mod_rewrite, every single character must be correct, or it will not work.

Completely flush your browser cache after changing any code on your server.

Jim

[edited by: jdMorgan at 9:37 pm (utc) on Dec. 4, 2008]

malina

9:57 pm on Dec 4, 2008 (gmt 0)

10+ Year Member



I have changed it on purpose - with %1 it works like I described above (shows me test user's site when typing test.localhost, but doesn't show "home" index.php file contents when typing test.localhost/home).

With $1 parameter it gives me an error:
Name Error: The domain name does not exist. It worked previously with %1, though. I have both "localhost" and "test.localhost" defined in my hosts file pointing to 127.0.0.1, which seems to be correct.

jdMorgan

2:48 am on Dec 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK. But it won't work with %1 there. Sorry.

Using %1, the correct comment for the rule would be:
# then rewrite URL-path /<anything> to /users/<first letter of username>/<whole username>/<subdomain-name>

Jim

malina

7:31 am on Dec 5, 2008 (gmt 0)

10+ Year Member



Actually it works fine now. The problem was:

RewriteRule (.*) /users/%3/%2/$1 [L]

should be

RewriteRule (.*) /users/%2/%1/$1 [L]

1. The first rewrite rule, in the external redirect:

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /users/[a-z]/[a-z][a-z0-9\-]+[a-z0-9]/[^\ ]*\ HTTP/
RewriteRule ^users/[a-z]/([a-z][a-z0-9\-]+[a-z0-9])/(.*)$ $1.localhost/$2 [R=301,L]

Should there be a $1 or %1? How can I check that this rule works?

2. I would like to modify the internal rewrite rule itself, from
RewriteRule (.*) /users/%2/%1/$1 [L]
to
RewriteRule (.*) /users/%2/%1/index.php?param=$1

Do I have to modify the first Search Engines redirection to somehow match the rewrite?

jdMorgan

3:05 pm on Dec 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, the two rules should be perfect mirror images of each other's function.

The rewrite "connects" the subdomain requests to the correct subdirectory, so that content can be delivered.

The redirect prevents direct client access to that subdirectory, redirecting the client back to the subdomain "version" of the URL. The intent is to prevent search engines from indexing the same content under two URLs, and to prevent users from posting links that point directly to the subdirectory.

Comment out the redirect, and get the rewrite working perfectly first. After you have tested for several days and are completely happy with the way this new URL set-up is working, then modify the redirect to prevent direct access to the subdirectories.
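
For the index.php variant asked about above, the rewrite alone might be sketched as follows. It assumes the three-group hostname pattern from earlier in the thread and example.com as a placeholder; "param" is the query parameter name from the question.

```apache
# Internal rewrite to a per-user index.php, passing the requested URL-path as "param".
# The first condition stops the rule from matching again once the request
# has already been rewritten to /users/<letter>/<username>/index.php.
RewriteCond $1 !^users/[a-z]/
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^(www\.)?(([a-z])[a-z0-9\-]+[a-z0-9])\.example\.com
RewriteRule (.*) /users/%3/%2/index.php?param=$1 [L]
```

Because the substitution contains its own query string, any query string on the original request is dropped unless the QSA flag is added. Every request to the subdomain, including images and stylesheets, also ends up at index.php, so the script has to account for that.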

Jim