MOD REWRITE Quickie

MOD_REWRITE quick question

         

tymbow

11:07 am on Jan 29, 2009 (gmt 0)

10+ Year Member



We have a client who has a hosted document repository service (not hosted by them). They want to be able to access it by a different host name and path. Where it gets complicated is that the document repository service provider can't (won't, I suspect) do any of the redirection for them. The situation is as follows (cccc is the company's name):

Service is hosted as (a.b.c.d is just an IP address - no host name):

[a.b.c.d...]

Client wants to access it as:

[cccc.property.com.au...] or [cccc.property.com.au...]

We can do this with MOD_REWRITE through a different hosting provider where we can configure things but the client also wants the redirect to be silent. To my understanding this can't work as the redirect will result in the address changing in the client browser because it is an external redirection. As I understand things the only way to do this silently is to use the proxy [P] option with MOD_REWRITE.

Is this correct? If so, I'd appreciate some comments on the correct MOD_REWRITE configuration.

I've knocked up something I think is getting there but I'm no expert with Apache. Effectively what I want is:

1. Client requests either [cccc.property.com.au...] or [cccc.property.com.au...]
2. Client is silently redirected to [a.b.c.d...]

Note that there are query strings involved.


RewriteEngine on
RewriteCond %{HTTP_HOST} ^www\.cccc\.property\.com\.au$ [NC]
RewriteRule ^/doclibrary(.*)$ http://a.b.c.d/cccc$1 [NC, P, L]
RewriteRule ^(.*)$ http://a.b.c.d/cccc$1 [NC, P]

Thanks for any help.

jdMorgan

5:22 pm on Jan 29, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



  • RewriteConds apply only to the single RewriteRule that follows them.
  • No spaces are permitted, except where required by the mod_rewrite documentation.
  • Regular expressions can be used to avoid redundant rules.
  • The [L] flag used with [P] is redundant (see docs).

    Try:


    RewriteEngine on
    #
    RewriteCond %{HTTP_HOST} ^www\.cccc\.property\.com\.au [NC]
    RewriteRule ^/(doclibrary/)?(.*)$ http://a.b.c.d/cccc/$2 [NC,P]

    This is neither an internal rewrite nor an external redirect; It is a proxy through-put.

    Assuming that this code goes into httpd.conf or other server-configuration-level file, you could also use mod_proxy to define this reverse-proxy function; Use of mod_rewrite is not required.

    If using mod_proxy, only a reverse-proxy should be enabled (use ProxyPass only); Do not enable a forward proxy for this application, as doing so is not necessary and has potential security implications.
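
    As a rough illustration of the mod_proxy alternative, a reverse-proxy-only setup might look like the sketch below. It assumes the directives sit in the virtual host for www.cccc.property.com.au on the front-end server, that mod_proxy and mod_proxy_http are loaded, and that /doclibrary/ should map to /cccc/ on the back-end; adjust the paths to suit.

    ```apache
    # Keep the forward proxy disabled; only the reverse proxy below is active.
    ProxyRequests Off
    #
    # Map /doclibrary/... on the front-end to /cccc/... on the back-end (assumed mapping).
    ProxyPass        /doclibrary/ http://a.b.c.d/cccc/
    #
    # Rewrite Location headers in back-end redirects so they point at the front-end.
    ProxyPassReverse /doclibrary/ http://a.b.c.d/cccc/
    ```

    ProxyPassReverse is not strictly part of the proxying itself, but without it any redirect issued by the back-end would expose the a.b.c.d address to the browser.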

    Be aware that the back-end "service" server will see (and log) all requests as coming from your front-end server, rather than from the originating client. If client information logging is needed on the back-end, configure the X-Forwarded-For HTTP header on the front-end server, and modify the back-end server log configuration (using mod_log_config) to log the X-Forwarded-For header rather than Remote_Addr.
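
    If the back-end happens to be Apache (an assumption; the thread does not say), the logging change could be sketched as follows. Apache 2.x mod_proxy normally adds the X-Forwarded-For header to proxied requests by itself, so the front-end may need no extra configuration. The format nickname "proxied" and the log file path are placeholders.

    ```apache
    # Back-end server: log the client address passed in X-Forwarded-For
    # in place of %h (the remote host, which would always be the front-end).
    LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b" proxied
    CustomLog logs/access_log proxied
    ```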

    As a general recommendation, you should configure your server to redirect all non-canonical URL requests to the canonical URL, rather than allowing multiple URLs to resolve to the same content. In this case I am referring to the /doclibrary/abc.xyz and /abc.xyz URLs, both of which will be proxied to a.b.c.d/cccc/abc.xyz on the back-end server. To avoid duplicate-content and resultant search ranking problems, one URL or the other should be chosen, and the non-preferred URL should be 301-redirected to the preferred one before invoking the reverse proxy.
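
    One possible way to arrange this in a server-configuration-level file is sketched below. It assumes the /doclibrary/ form is chosen as the canonical URL (the choice is yours) and that only that form is then proxied. Query strings are passed through unchanged by both rules.

    ```apache
    RewriteEngine on
    #
    # Externally redirect any path outside /doclibrary to the /doclibrary/ form (assumed canonical).
    RewriteCond %{HTTP_HOST} ^www\.cccc\.property\.com\.au [NC]
    RewriteCond %{REQUEST_URI} !^/doclibrary(/|$) [NC]
    RewriteRule ^/(.*)$ http://www.cccc.property.com.au/doclibrary/$1 [R=301,L]
    #
    # Proxy only the canonical form through to the back-end.
    RewriteCond %{HTTP_HOST} ^www\.cccc\.property\.com\.au [NC]
    RewriteRule ^/doclibrary(/.*)?$ http://a.b.c.d/cccc$1 [NC,P]
    ```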

    Similarly, issues such as www- versus non-www hostnames and hostnames with FQDNs or appended port numbers should be corrected before invoking the reverse proxy. We have previously discussed the many possible duplicate-content factors here, and I recommend a search of WebmasterWorld for more information on this subject.
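
    A minimal sketch of such a hostname correction, assuming the rules live in a virtual host dedicated to this one site and that www.cccc.property.com.au is the preferred hostname, placed ahead of the proxy rule:

    ```apache
    RewriteEngine on
    #
    # Skip requests that carry no Host header at all (HTTP/1.0 clients) to avoid a redirect loop.
    RewriteCond %{HTTP_HOST} .
    # Anything other than the exact canonical hostname (non-www, trailing dot,
    # appended port number) is redirected to the canonical hostname.
    RewriteCond %{HTTP_HOST} !^www\.cccc\.property\.com\.au$ [NC]
    RewriteRule ^/(.*)$ http://www.cccc.property.com.au/$1 [R=301,L]
    ```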

    Jim

    tymbow

    8:44 pm on Jan 29, 2009 (gmt 0)

    10+ Year Member



    I don't yet know if we are using httpd.conf or .htaccess - I'm waiting for a response from the other provider.

    I'm very aware of the log issue. I've already told the client that this will occur and that, given the document library provider's reluctance to help with this sort of thing, it will be a problem for them.

    Indexing/page ranking is not an issue for the client. The "www.hostname" vs "www.hostname/doclibrary" issue is temporary as "www.hostname" will eventually have its own content.

    Personally I'm not happy with what they have been told (by someone else) they can do because I think they are complicating things for themselves.

    Thanks for the help.

    jdMorgan

    9:21 pm on Jan 29, 2009 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    If you need to use this code in .htaccess, then remove the leading slash from the RewriteRule pattern; All else should be unaffected.
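
    For reference, the .htaccess form might look like the sketch below. It assumes the file sits in the document root of the front-end site, that the host permits mod_rewrite and proxying from .htaccess (mod_proxy must be loaded for [P] to work), and that the path after the optional doclibrary/ prefix is what should be appended to /cccc/ on the back-end.

    ```apache
    RewriteEngine on
    #
    RewriteCond %{HTTP_HOST} ^www\.cccc\.property\.com\.au [NC]
    # In .htaccess the pattern is matched against the path without its leading slash.
    RewriteRule ^(doclibrary/)?(.*)$ http://a.b.c.d/cccc/$2 [NC,P]
    ```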

    Not sure if I was clear, but the code I posted goes on the front-end server. The only (possibly-desirable) change to the back-end "service" server would be the mod_log_config modification to allow for logging the X-Forwarded-For header in preference to Remote_Host or Remote_Addr.

    Jim