
Apache Web Server Forum

    
I would appreciate a sanity check on a htaccess
newbie asking help to check htaccess
globus999




msg:4584905
 1:00 pm on Jun 17, 2013 (gmt 0)

I am a total newbie who has been perusing other people's labor in order to try to solve a basic problem. I found a solution that seems to work, but, having no experience whatsoever in this area, I would *really* appreciate a sanity check to make sure I am not screwing up or leaving something open to hackers.

Scenario:
Typical shared-hosted website using Joomla 3.1.1.

full path to joomla installation: home/example/public_html/s1/j1/

That is, if I want to access Joomla's front-end page I need to type:

www.example.com/s1/j1 to get to index.php.

I would like to permanently hide (redirect) the extra path /s1/j1/ so that I only need to type www.example.com

Nothing new, I know, but it's new for me.

The htaccess file I am using which *seems* to work is:

==================

Options -Indexes

AuthName example.com

RewriteEngine on

RewriteCond %{HTTP_HOST} ^example\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.example\.com$
RewriteCond %{REQUEST_URI} !^/s1/j1/
RewriteRule (.*) /s1/j1/$1 [L]


=================

Questions:

1 - Is this OK or am I missing something.
2 - Do I need an [NC] somewhere? I tried accessing the website with all caps and it works ok. I guess the server automatically enabled [NC]?
3 - Do I need an R=301? The website is brand new. From what I understand I don't need it, but, if it is missing, will it have a negative SEO impact because it is "temporary" as opposed to "permanent"?

Any feedback would be greatly appreciated.

 

lucy24




msg:4584983
 4:15 pm on Jun 17, 2013 (gmt 0)

Is this OK or am I missing something.

Yes, you are missing a lot of things. Ouch.

First things first: What else is on your site? Are the top two layers of directories empty, so a request for www.example.com by itself leads nowhere? Otherwise, how can the site tell whether a request for
example.com/pagename.html
is really a request for
example.com/s1/j1/pagename.html
or for
example.com/pagename.html
?

Domain names are case-insensitive, because this part happens before the request ever reaches your site. You can force a particular casing if you want. I don't know whether search engines care about casing; they definitely care whether www. is present or absent, so always redirect to one form.

Your rule as written is not any kind of redirect, whether temporary or permanent. It is a rewrite. This happens to be what you want-- but you MUST precede it with the other part of the rule, the redirect from /s1/j1/ to the form without /s1/j1/. Don't allow users to use both forms, unless you truly don't give a hoot about Duplicate Content.

When you get to the rewrite, the hostname is irrelevant unless you have subdomains to exclude. By the time you get there, all other aspects of the hostname have already been standardized.

globus999




msg:4585022
 6:33 pm on Jun 17, 2013 (gmt 0)

Lucy24, thank you kindly. A few clarifications.

First things first: What else is on your site? Are the top two layers of directories empty, so a request for www.example.com by itself leads nowhere? Otherwise, how can the site tell whether a request for
example.com/pagename.html
is really a request for
example.com/s1/j1/pagename.html
or for
example.com/pagename.html
?


There is currently nothing else in the site. The reason for the two layers of directories is to isolate other domains and apps for future use.

For example, Domain#1 would use the directory S1 and all its apps will be located in S1/J1, S1/J2, S1/J3, etc. One app per J# directory, J1 = Joomla, J2 = Wordpress, J3 = Forum, etc.

Domain #2 would use the directory S2 and all its apps will be located in S2/J1, S2/J2, S2/J3, etc.

Joomla (and all the future apps) will be installed in the different J# directories. Currently, if I look at S1/J1/ I get the entire Joomla tree, where at its root I can find index.php. That is, S1/J1/index.php.

I just want to have all the domains and apps segregated in different directories. Otherwise it will be a total mess.

Without the htaccess file scripted as above, a request for www.example.com returns the index listing at the level of public_html; that is, I can see the directories cgi-bin, images and S1. So yes, a simple request for www.example.com leads nowhere.



Domain names are case-insensitive, because this part happens before the request ever reaches your site. You can force a particular casing if you want. I don't know whether search engines care about casing; they definitely care whether www. is present or absent, so always redirect to one form.


OK, so the code:

RewriteCond %{HTTP_HOST} ^example\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.example\.com$


is OK and covers both scenarios. Correct? In addition, I just disregard case sensitivity.

Your rule as written is not any kind of redirect, whether temporary or permanent. It is a rewrite. This happens to be what you want--


OK, this is good, I guess? :-)

but you MUST precede it with the other part of the rule, the redirect from /s1/j1/ to the form without /s1/j1/. Don't allow users to use both forms, unless you truly don't give a hoot about Duplicate Content.


I would prefer not to have duplicates.
Not sure about this step. Do you mean something like:

RewriteCond %{HTTP_HOST} ^(www.)?example.com$
RewriteCond %{REQUEST_URI} !^/s1/j1/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /s1/j1/$1 [L]



When you get to the rewrite, the hostname is irrelevant unless you have subdomains to exclude. By the time you get there, all other aspects of the hostname have already been standardized.


I was planning to use subdomains and point them to the different S1/J#'s.

I tried to use the following code, but it did not work.


RewriteEngine on

RewriteCond %{HTTP_HOST} ^(www.)?example.com$ [NC]

RewriteCond %{REQUEST_URI} !^/s1/j1/

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

RewriteRule ^(.*)$ /s1/j1/$1 [L]

RewriteCond %{HTTP_HOST} ^(www.)?example.com$ [NC]
RewriteRule ^(/)?$ /s1/j1/index.php [L]


I was getting a page not found, if I remember correctly.

globus999




msg:4585060
 8:16 pm on Jun 17, 2013 (gmt 0)

OK, so the following code should work OK, I mean rewritten and with no duplicates, correct?

Options -Indexes

AuthName example.com

RewriteEngine on

RewriteCond %{HTTP_HOST} !^www.example.com$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^example\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.example\.com$
RewriteCond %{REQUEST_URI} !^/s1/j1/
RewriteRule (.*) /s1/j1/$1 [L]

globus999




msg:4585067
 8:26 pm on Jun 17, 2013 (gmt 0)

Now, with the first rewrite, I don't need the [OR] option there, which means I can clean up the code to look like:

Options -Indexes

AuthName example.com

RewriteEngine on

RewriteCond %{HTTP_HOST} !^www.example.com$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^www.example.com$
RewriteCond %{REQUEST_URI} !^/s1/j1/
RewriteRule (.*) /s1/j1/$1 [L]


This code should now be clean and OK. Correct?

lucy24




msg:4585091
 9:33 pm on Jun 17, 2013 (gmt 0)

RewriteCond %{HTTP_HOST} !^www.example.com$ [NC]
The optimal wording is
!^(www\.example\.com)?$
making two changes:

#1 Escape literal periods. In the case of a hostname the non-escaping is not likely to cause problems, because requests for wwwwexamplezcom will never reach your domain in the first place. But stick with the habit.

#2 The ()? part is to allow for requests using http/1.0. Every time someone thinks they can safely exclude 1.0 because it's only used by robots, another legitimate proxy shows up. It may also be used by some satellite-internet systems, but proxies are the most visible.
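Dropped into a complete rule, the suggested condition could look like the minimal sketch below. It assumes www.example.com is the canonical host and reuses the redirect target already posted in this thread; it is a fragment that expects RewriteEngine on earlier in the same file.

```apache
# Redirect every host that is not exactly www.example.com to the
# canonical host, keeping the requested path.
# Periods are escaped so they match only a literal ".".
# The trailing ()? also lets an empty Host header (HTTP/1.0 requests)
# through, so those requests are served instead of being redirected.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

Without the ()? an empty hostname would count as "not www.example.com" and be redirected as well.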

OK, now I understand the "host" condition. You've got multiple domains passing through the same htaccess and you need to keep them separated. If you currently only have one domain, you can comment-out this line. But keep it for future use.

You still need the /s1/j1/ redirect. It comes before all your existing rules. That means before the domain-name redirect, and also before the 'index.html' redirect, which you haven't got but is widely quoted throughout this forum, so you can look it up. It will look like this:

RewriteCond %{THE_REQUEST} /s1/j1/
RewriteRule ^s1/j1/(.*) http://www.example.com/$1 [R=301,L]

Here the RewriteCond is for insurance. In case something goes haywire and the request passes through mod_rewrite a second time, you need to be sure the /s1/j1/ part originated with the human user, not your own RewriteRule. The same goes for index.html redirects and anything else that involves putting something into a pretty form while serving content from an unpretty location.
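To see why this works as insurance, it helps to know what THE_REQUEST contains. The sketch below is a tighter variant of the same redirect, not the wording given above: anchoring the condition on the request method is an assumption added here for illustration. It is a fragment that expects RewriteEngine on earlier in the same file.

```apache
# THE_REQUEST holds the raw request line exactly as the client sent it,
# for example:  GET /s1/j1/page.html HTTP/1.1
# An internal rewrite never changes it, so this condition is true only
# when the client itself asked for the long path.
# Assumption: anchoring on the method and the start of the path is
# stricter than the unanchored /s1/j1/ pattern, which would also match
# a query string that merely contains that text.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /s1/j1/
RewriteRule ^s1/j1/(.*) http://www.example.com/$1 [R=301,L]
```

The backslash before the space is needed because an unquoted space would otherwise end the pattern.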

globus999




msg:4585148
 1:39 am on Jun 18, 2013 (gmt 0)

okidoki, tx! for the pointers.
How about this:

Options -Indexes

AuthName example.com

RewriteEngine on

RewriteCond %{THE_REQUEST} /s1/j1/
RewriteRule ^s1/j1/(.*) http://www.example.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} !^(www\.example\.com)$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^www.example.com$
RewriteCond %{REQUEST_URI} !^/s1/j1/
RewriteRule (.*) /s1/j1/$1 [L]

RewriteCond %{HTTP_HOST} ^(www.)?example.com$ [NC]
RewriteRule ^(/)?$ /s1/j1/index.php [L]

globus999




msg:4585150
 2:07 am on Jun 18, 2013 (gmt 0)

OK, assuming that the last code is more-or-less acceptable, I would like to move on to re-directing two domains: example1 and example2 to s1/j1 and s2/j1 respectively. How about this code:

Options -Indexes

RewriteEngine on

# ========= Domain example1

RewriteCond %{THE_REQUEST} /s1/j1/
RewriteRule ^s1/j1/(.*) http://www.example1.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^(*\.example1\.*)$ [NC]
RewriteCond %{HTTP_HOST} !^(www\.example1\.com)$ [NC]
RewriteRule (.*) http://www.example1.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^www.example1.com$
RewriteCond %{REQUEST_URI} !^/s1/j1/
RewriteRule (.*) /s1/j1/$1 [L]

RewriteCond %{HTTP_HOST} ^(www.)?example1.com$ [NC]
RewriteRule ^(/)?$ /s1/j1/index.php [L]


# ========= Domain example2

RewriteCond %{THE_REQUEST} /s2/j1/
RewriteRule ^s2/j1/(.*) http://www.example2.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^(*\.example2\.*)$ [NC]
RewriteCond %{HTTP_HOST} !^(www\.example2\.com)$ [NC]
RewriteRule (.*) http://www.example2.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^www.example2.com$
RewriteCond %{REQUEST_URI} !^/s2/j1/
RewriteRule (.*) /s2/j1/$1 [L]

RewriteCond %{HTTP_HOST} ^(www.)?example2.com$ [NC]
RewriteRule ^(/)?$ /s2/j1/index.php [L]



The following code is the key modification:

RewriteCond %{HTTP_HOST} ^(*\.example1\.*)$ [NC]
RewriteCond %{HTTP_HOST} !^(www\.example1\.com)$ [NC]
RewriteRule (.*) http://www.example1.com/$1 [R=301,L]


it is meant to do:

If the browser request contains the string "example1" somewhere in the domain name but it is not properly formatted, then rewrite permanently to www.example1.com. Same with example2.
This is to differentiate from !^(www\.example1\.com)$ which would trigger equally for example1 and example2.

lucy24




msg:4585158
 3:53 am on Jun 18, 2013 (gmt 0)

RewriteCond %{HTTP_HOST} ^(www.)?example.com$ [NC]
RewriteRule ^(/)?$ /s1/j1/index.php [L]


You don't need this. The null request is covered by the general rewrite, which uses .* ("anything or nothing"). Since index.php is a physical file belonging to a physical directory, mod_dir will handle it silently by issuing a rewrite of its own.

Other stuff later, because I gotta go clean the rat cage :(

globus999




msg:4585358
 1:25 pm on Jun 18, 2013 (gmt 0)

tx!, removed the offending lines:

Options -Indexes

RewriteEngine on

# ========= Domain example1

RewriteCond %{THE_REQUEST} /s1/j1/
RewriteRule ^s1/j1/(.*) http://www.example1.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^(*\.example1\.*)$ [NC]
RewriteCond %{HTTP_HOST} !^(www\.example1\.com)$ [NC]
RewriteRule (.*) http://www.example1.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^www.example1.com$
RewriteCond %{REQUEST_URI} !^/s1/j1/
RewriteRule (.*) /s1/j1/$1 [L]


# ========= Domain example2

RewriteCond %{THE_REQUEST} /s2/j1/
RewriteRule ^s2/j1/(.*) http://www.example2.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^(*\.example2\.*)$ [NC]
RewriteCond %{HTTP_HOST} !^(www\.example2\.com)$ [NC]
RewriteRule (.*) http://www.example2.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^www.example2.com$
RewriteCond %{REQUEST_URI} !^/s2/j1/
RewriteRule (.*) /s2/j1/$1 [L]



Other stuff later, because I gotta go clean the rat cage :(


Know the feeling... two cats here... just cleaned up a huge mess... x2... ;-(

lucy24




msg:4585436
 6:26 pm on Jun 18, 2013 (gmt 0)

Before we continue...

You've got assorted domains in assorted subdirectories. Isn't there some way you can get the DNS to point directly to where each domain lives?

Different hosts have different setups. Mine for example goes:

/userspace/
/userspace/domain1/
/userspace/domain2/
/userspace/domain3/

Each domain has its own independent htaccess. You can also choose to put an htaccess in the userspace, where it will be seen by all requests. (Requests don't teleport straight to the target. They have to move through all the layers of the directory structure, and obey any rules they meet along the way.) I use this for things like IP-based access control that is the same for all domains. There is NO mod_rewrite in the outer htaccess.
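As an illustration of that kind of shared outer file, here is a minimal sketch assuming Apache 2.2-style access directives (Apache 2.4 uses Require directives instead). The address range is a documentation-only placeholder, not anything from this thread.

```apache
# Outer .htaccess (for example /userspace/.htaccess), seen by every
# request on its way to any of the domains below it.
# Access control only; deliberately no mod_rewrite directives here.
# Assumption: 192.0.2.0/24 stands in for whatever range you want to block.
Order Allow,Deny
Allow from all
Deny from 192.0.2.0/24
```

With Order Allow,Deny, a request that matches both an Allow and a Deny line is denied, so the block applies to every domain while everyone else gets through.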

In another common pattern, you get

/domain/
/domain/otherdomain1/
/domain/otherdomain2/

Here, all requests have to pass through the first /domain/ htaccess. In addition, the "add-on" domains can have independent htaccess files that aren't seen by any other domain.

globus999




msg:4585555
 3:09 am on Jun 19, 2013 (gmt 0)

You got me thinking, but the bottom line is that I think I can't because I am using the cheapest option which is shared-hosting where most of this stuff is pre-determined and I cannot change it.

Let's start from the beginning:

You've got assorted domains in assorted subdirectories. Isn't there some way you can get the DNS to point directly to where each domain lives?


I don't think I can. All I have is cPanel which... writes to the .htaccess file. I also have DNS Zone Editor, which does not allow me to specify directories or paths.

Setup:

/userspace/
/userspace/domain1/
/userspace/domain2/
/userspace/domain3/


Nice, but no. I have more-or-less the "classic":

/domain/
/domain/otherdomain1/
/domain/otherdomain2/


which I changed to:

home/example1/public_html/s1/j1 <-- Main example1
home/example1/public_html/s1/j2 <-- subdomain1.example1
home/example1/public_html/s1/j3 <-- subdomain2.example1
home/example1/public_html/s2/j1 <-- Main example2
home/example1/public_html/s2/j2 <-- subdomain1.example2
home/example1/public_html/s2/j3 <-- subdomain2.example2

the "home/example1/public_html/" part was given to me and I cannot change it. The rest of the structure, I concocted and can be changed. However, since each j# directory will have a different app, I would prefer to keep them segregated from each other.

The main .htaccess file is located in "home/example1/public_html/.htaccess"

I am assuming I can place independent .htaccess files in each j# directory, but I am not 100% sure it will work or if it can solve anything because I need to rewrite into j# anyways.

Here, all requests have to pass through the first /domain/ htaccess.


I believe this to be my case.

Wrt add-on domains, I was planning to use this functionality for the remaining domains. You mentioned that:

In addition, the "add-on" domains can have independent htaccess files that aren't seen by any other domain.


This means, for example, that I could do through cPanel:

home/example1/public_html/s2/j1 <-- Main example2

and place example2 .htaccess as:

home/example1/public_html/s2/j1/.htaccess

However, the first rewrite (home/example1/public_html/s2/j1 <-- Main example2 ) happens in the main .htaccess file, which is where cPanel writes. This file won't have all the checks and balances that the main .htaccess file has for the domain example1.
If I try to add the missing elements (remove duplication, ensure proper format, etc. for example2) in the secondary .htaccess, wouldn't it be too late?

That's why I was thinking of bypassing cPanel altogether, because it is too limited and it writes into the main .htaccess anyways.

Am I making any sense... probably not? It's quite late and I am tired...

Many tx!

lucy24




msg:4585575
 5:30 am on Jun 19, 2013 (gmt 0)

Can't help you on cPanel, because I've got one of those mega-hosts who are so big, they made their own control panel. And it should be a separate issue from DNS anyway. Domain names have to resolve to something-- unless, of course, you've forgotten to put a file called "index.html" in the appropriate directory. Your host may pop in a placeholder whenever you add a domain. I know mine does. (If they don't, users get that mournful browser message that says "It seems legit, but I just can't find it anywhere!")

If your structure is {primary domain plus any number of addon domains living in their separate directories}, then frankly it might be easier not to use the primary domain at all. Then you don't have to worry about conflicting RewriteRules between the outer htaccess and the inner ones.

This is a little tricky if, like most people in the world, you didn't think of making a second domain until long after the first one was established. :(

globus999




msg:4585747
 2:49 pm on Jun 19, 2013 (gmt 0)

Can't help you on cPanel, because I've got one of those mega-hosts who are so big, they made their own control panel.


That's OK. I wasn't planning on using cPanel. It is only capable of writing basic commands into .htaccess anyways. That's why I was planning on bypassing it altogether.

And it should be a separate issue from DNS anyway. Domain names have to resolve to something-- unless, of course, you've forgotten to put a file called "index.html" in the appropriate directory. Your host may pop in a placeholder whenever you add a domain. I know mine does. (If they don't, users get that mournful browser message that says "It seems legit, but I just can't find it anywhere!")


Correct. My main domain resolves to home/example/public_html/index.html, which is the dummy file that the hoster placed there. I cannot change this resolution, nor is my hoster willing to do it for me (I asked ;-( ).

I have no use for that index.html since I don't have static pages (I removed the page). In Joomla (and most other script-based apps) it's always index.php.

If your structure is {primary domain plus any number of addon domains living in their separate directories}, then frankly it might be easier not to use the primary domain at all. Then you don't have to worry about conflicting RewriteRules between the outer htaccess and the inner ones.


I see your point and it would be nice, but unfortunately, that's how pretty much all shared hosting is set up nowadays. The other option is to go VPS, which is 10x more expensive and hopelessly beyond my budget.

This is a little tricky if, like most people in the world, you didn't think of making a second domain until long after the first one was established. :(


Believe it or not I did... but I never thought that rewriting would be *so* friggin complicated. I guess what I am trying to say is that I am stuck.

The good news is that at least I can try out the different scripts in a test box I just hacked out of a junked PC. It has the same setup as my hoster and I can cheat on nameservers using local ones (through cPanel - no idea how to use BIND or anything similar) and re-directing IPs through the hosts file in my PC (because I am obviously using non-routable IPs on my LAN). So that when I type www.example.com in my browser it hits the hosts file which re-directs to 192.168.1.101 which is the IP assigned by the server to the domain example.com and so the request resolves to the index.php in the Joomla directory in my "server". This seems to work OK purely for testing purposes.

lucy24




msg:4585834
 6:41 pm on Jun 19, 2013 (gmt 0)

I have no use for that index.html since I don't have static pages (I removed the page). In Joomla (and most other script-based apps) it's always index.php.

It doesn't have to be .html, it just has to be index.something. In fact if you go wild on your DirectoryIndex specifications, it could be called "main.asp" or "default.jsp" or anything you like, so long as the server has been told to look for it. The Apache default is still "index.html" and nothing else, but you have to assume that all hosts add-- at a minimum-- .php and .htm
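For reference, a minimal sketch of such a specification. The list of filenames is illustrative only, and setting it in .htaccess assumes the host allows this directive to be overridden there.

```apache
# Tell mod_dir which files to try, in this order, when a request names
# a directory rather than a file.
# Assumption: this particular list is an example; a host's own default
# list may be longer or in a different order.
DirectoryIndex index.php index.html index.htm
```

The first file in the list that exists in the requested directory is the one served.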

globus999




msg:4585941
 1:21 am on Jun 20, 2013 (gmt 0)

It doesn't have to be .html, it just has to be index.something.


You were absolutely right. In my case it is called "index.hoster" (where hoster is the name of my hoster). Its contents are just regular HTML code stating that there is no content yet, with a few links to my hoster's help files and knowledge base.

g1smd




msg:4585987
 7:38 am on Jun 20, 2013 (gmt 0)

Get into the habit of escaping literal periods in patterns. Always.

Comment each line of code in plain English, so that it is clear what it is supposed to do.

RewriteCond %{HTTP_HOST} ^(*\.example1\.*)$ [NC] is not a valid piece of code. There are multiple errors.

Be sure you know the difference between redirecting a request and rewriting a request. You've said "rewrite" several times when the code you are describing is for a "redirect". A RewriteRule can be configured to redirect or to rewrite. That's what makes it so powerful and useful.
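The points above can be combined in one sketch of the example1 block: periods escaped, each rule commented in plain English, and each rule labelled as a redirect or a rewrite. This is one possible reading of what the earlier code was meant to do, not a confirmed final configuration. The first host condition is a guess at the intent behind ^(*\.example1\.*)$, where the * has nothing in front of it to repeat.

```apache
RewriteEngine on

# REDIRECT (external, 301): the client itself asked for the long path,
# so tell it to ask again for the short URL.
RewriteCond %{THE_REQUEST} /s1/j1/
RewriteRule ^s1/j1/(.*) http://www.example1.com/$1 [R=301,L]

# REDIRECT (external, 301): the host mentions example1 but is not
# exactly www.example1.com, so send the client to the canonical host.
# Assumption: this also catches any subdomains of example1, which
# would need their own exclusions if they are added later.
RewriteCond %{HTTP_HOST} example1\. [NC]
RewriteCond %{HTTP_HOST} !^www\.example1\.com$
RewriteRule (.*) http://www.example1.com/$1 [R=301,L]

# REWRITE (internal): for the canonical host, quietly serve content
# from /s1/j1/ unless the path already points there. The address the
# visitor sees does not change.
RewriteCond %{HTTP_HOST} ^www\.example1\.com$
RewriteCond %{REQUEST_URI} !^/s1/j1/
RewriteRule (.*) /s1/j1/$1 [L]
```

The two rules with a full URL target and R=301 are redirects; the last rule, with a local path and no R flag, is the rewrite.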

globus999




msg:4586430
 2:14 pm on Jun 21, 2013 (gmt 0)

g1smd, thank you kindly for your reply. I just found a couple of problems with how cPanel works with .htaccess. It sometimes writes into the file and sometimes it does not. I am still trying to clarify what's going on. Will continue posting as soon as I get a definitive answer. tx! again.
