Apache 2 301 redirects / rewrite mod IP to www


martinmartin

4:41 pm on Mar 6, 2010 (gmt 0)

10+ Year Member



Hello, and thanks for reading.
I have seen your answer on this blog to the pesky issue of pages indexed under an IP address, and the solution of redirecting with .htaccess directives.

My issue is complex. My website was dropped from Google's index because of duplicate content: the IP address was being indexed. The server crashed and, on restart, was serving the site under its IP address. We fixed this with a canonical link and a permanent 301 redirect using mod_rewrite in the .htaccess file. However, pages are still being indexed under our IP address.

Also, and maybe related: Google will not index any URLs located in the folder example.com/artist/(name of artist). The /artist folder itself is not accessible, but all /artist/... URLs are crawlable. Unfortunately, they are now indexed in Google under our IP address.

Below is the mod_rewrite code we are using on Apache. I was told by SEOmoz that two 301 redirects happen on the way from our IP address to the www hostname.

Not sure how this is happening. Totally frustrated. Thanks and hope you can help.

RewriteEngine On

# Make sure people go to www.myapp.com, not myapp.com
RewriteCond %{HTTP_HOST} ^example.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com$1 [R=301,L]
# Yes, I've read no-www.com, but my site already has much Google-Fu on
# www.blah.com. Feel free to comment this out.

# Uncomment for rewrite debugging
#RewriteLog logs/myapp_rewrite_log
#RewriteLogLevel 9

# Check for maintenance file and redirect all requests
RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
RewriteCond %{SCRIPT_FILENAME} !maintenance.html
RewriteRule ^.*$ /system/maintenance.html [L]

# Rewrite index to check for static
RewriteRule ^/$ /index.html [QSA]

[edited by: jdMorgan at 5:50 am (utc) on Mar 7, 2010]
[edit reason] example.com [/edit]

g1smd

4:54 pm on Mar 6, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There's no code there to redirect requests for the IP address.

The code for domain canonicalisation contains several flaws. It does not redirect requests where the hostname has an appended port number and/or a trailing period, and it should redirect them.
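
As a rough sketch of what that could look like in a server config file, the condition below redirects every non-blank hostname that is not exactly www.example.com. That covers the bare domain, the IP address, and hostnames with a port number or trailing period appended. The hostname is the thread's placeholder, so adjust it to your own.

```apache
# Sketch only: server config / <VirtualHost> context assumed.
# Matches any non-blank Host header that is not exactly "www.example.com",
# so "example.com", "www.example.com:80", "www.example.com." and a bare IP
# address are all sent to the canonical hostname in a single redirect.
RewriteEngine On
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^/(.*)$ http://www.example.com/$1 [R=301,L]
```

Because one rule handles every non-canonical hostname, a request should never need more than one redirect to reach the www hostname.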

The above code will also only work in httpd.conf, not in an .htaccess file. In .htaccess, the leading / of the requested URL-path cannot be seen by mod_rewrite RewriteRule directives.
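
To illustrate the difference, here is the same redirect written for each context. These are alternatives, not rules to use together, and the condition shown is just an illustrative one for the bare domain.

```apache
# Alternative 1: httpd.conf or <VirtualHost>.
# The pattern is matched against the full URL-path, e.g. "/artist/name".
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^/(.*)$ http://www.example.com/$1 [R=301,L]

# Alternative 2: .htaccess in the document root.
# The leading slash is stripped before matching, e.g. "artist/name",
# so the slash has to be written explicitly in the substitution.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```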

[QSA] is the default action when the substitution does not add a query string of its own, so there is no need to mention it here. Instead, you do need the [L] flag on every rule.


A double redirect (a.k.a. a 'redirection chain') is very bad news. You must avoid that happening.

Use 'Live HTTP Headers' to examine the server requests and responses.


Your 'maintenance' rewrite presently slows down every single request that your server handles. It should be changed so that it only looks at requests for HTML pages, and does not look at requests for images, CSS files, JS files, and other non-HTML content. You also do not want to be sending requests for robots.txt and SE verification files to your 'maintenance' page.
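
One possible way to narrow the maintenance rule is sketched below. The list of file extensions and the google*.html verification filename pattern are assumptions, so adjust them to whatever your site actually serves. The cheap URL tests come first, which means the filesystem check only runs for requests that could really be sent to the maintenance page.

```apache
# Sketch only: server config / <VirtualHost> context assumed.
# Skip the maintenance page itself, robots.txt, verification files and static assets.
RewriteCond %{REQUEST_URI} !^/system/maintenance\.html$
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteCond %{REQUEST_URI} !^/google[^/]*\.html$
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|ico|css|js)$ [NC]
# Only now check whether the maintenance file exists on disk.
RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
RewriteRule ^ /system/maintenance.html [L]
```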

In any case, having a maintenance page is a dangerous way of doing this. You should instead return a '503' status code while the site is being updated. You should also change your processes so that you have the live site in one folder and a copy in another folder, and you just swap between the two to go live with a new version.
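
A minimal sketch of the 503 approach follows, assuming your Apache version accepts a non-3xx status code in the [R] flag and that the same /system/maintenance.html file is used as the error page body. Test it on your own server before relying on it.

```apache
# Sketch only: server config / <VirtualHost> context assumed.
# Serve the maintenance page as the body of every 503 response.
ErrorDocument 503 /system/maintenance.html

# While the maintenance file exists, answer with 503 Service Unavailable.
# The first condition lets the error document itself be served without looping.
RewriteCond %{REQUEST_URI} !^/system/maintenance\.html$
RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
RewriteRule ^ - [R=503,L]
```

The page visitors see is the same as before, but robots receive a 503 status rather than a 200, so the maintenance text should not be indexed in place of your real pages.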

martinmartin

5:13 pm on Mar 6, 2010 (gmt 0)

10+ Year Member



Thank you very much for this information.
>>The above code will also only work in httpd.conf, not in the .htaccess file. In .htaccess the leading / of URL requests cannot be seen by Mod_Rewrite RewriteRule directives. <<

This code IS in a conf file. I am sorry for making that mistake. It is named www.mysite.com.conf

jdMorgan

5:57 am on Mar 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In httpd.conf or other config files, two more ways to play:

# Externally redirect requests for all non-blank, non-canonical hostnames
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^(.*)$ http://www.example.com$1 [R=301,L]


# Externally redirect requests for non-www hostname or IP address as hostname
RewriteCond %{HTTP_HOST} ^example\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^[0-9]+(\.[0-9]+){3}$
RewriteRule ^(.*)$ http://www.example.com$1 [R=301,L]

Jim

martinmartin

9:17 pm on Mar 8, 2010 (gmt 0)

10+ Year Member



Hi jdMorgan,
Copied below is the change my developer made after I showed him your comments. Did he do it correctly? Will this work? Thanks so much!

DM

Here is the included file /etc/apache2/vhosts.d/www.mysite.com.conf

----------------------------------------------------------------------------------------------------------------------------------



<VirtualHost 208.75.151.227:80>
ServerName mysite.com
ServerAlias www.mysite.com
DocumentRoot /home/mysite/apps/production/mysite/current/public

<Directory "/home/mysite/apps/production/mysite/current/public">
Options FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
Deny from 189.188.67.73
</Directory>

ProxyPass / balancer://mysite_production_cluster/
ProxyPassReverse / balancer://mysite_production_cluster/

RewriteEngine On

# Make sure people go to www.myapp.com, not myapp.com
RewriteCond %{HTTP_HOST} ^mysite.com$ [NC]
RewriteRule ^(.*)$ [mysite.com$1...] [R=301,L]

RewriteCond %{HTTP_HOST} ^208.75.151.227$ [NC]
RewriteCond %{REQUEST_URI} !^/google.*.html$ [NC]
RewriteRule ^(.*)$ [mysite.com$1...] [R=301,L]

# Yes, I've read no-www.com, but my site already has much Google-Fu on
# www.blah.com. Feel free to comment this out.

# Uncomment for rewrite debugging
#RewriteLog logs/myapp_rewrite_log
#RewriteLogLevel 9

# Check for maintenance file and redirect all requests
RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
RewriteCond %{SCRIPT_FILENAME} !maintenance.html
RewriteRule ^.*$ /system/maintenance.html [L]

# Rewrite index to check for static
RewriteRule ^/$ /index.html [QSA]

# Rewrite to check for Rails cached page
RewriteRule ^([^.]+)$ $1.html [QSA]

# Redirect all non-static requests to cluster
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteRule ^/(.*)$ balancer://mysite_production_cluster%{REQUEST_URI} [P,QSA,L]

# Deflate
AddOutputFilterByType DEFLATE text/html text/plain text/xml application/xml application/xhtml+xml text/javascript text/css
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

# Uncomment for deflate debugging
#DeflateFilterNote Input input_info
#DeflateFilterNote Output output_info
#DeflateFilterNote Ratio ratio_info
#LogFormat '"%r" %{output_info}n/%{input_info}n (%{ratio_info}n%%)' deflate
#CustomLog logs/myapp_deflate_log deflate

ErrorLog /var/log/apache2/www.mysite.com_errors_log
CustomLog /var/log/apache2/www.mysite.com_log combined


</VirtualHost>

g1smd

9:37 pm on Mar 8, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A good many of the comments that both jdMorgan and I made were completely ignored.

martinmartin

1:12 am on Mar 9, 2010 (gmt 0)

10+ Year Member



Do these redirects work as they are written here for subdomains too? Ex: [images2.example.com...]

thx

martinmartin

2:31 pm on Mar 13, 2010 (gmt 0)

10+ Year Member



Jim said: In httpd.conf or other config files, two more ways to play:

# Externally redirect requests for all non-blank, non-canonical hostnames
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^(.*)$ http://www.example.com$1 [R=301,L]


# Externally redirect requests for non-www hostname or IP address as hostname
RewriteCond %{HTTP_HOST} ^example\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^[0-9]+(\.[0-9]+){3}$
RewriteRule ^(.*)$ http://www.example.com$1 [R=301,L]

Question: could/should we use both of these and replace with what we now have?

g1smd

8:12 pm on Mar 13, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Pick just one of the three options; each one is suited to a specific server configuration.
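
As one illustration of why the server configuration matters: the "everything that is not www" option also redirects any other hostname that this virtual host answers for. If a subdomain such as the images2 hostname mentioned earlier were served from the same virtual host (an assumption, since the posted config does not list it), it would need its own exclusion, roughly like this:

```apache
# Sketch only: server config / <VirtualHost> context assumed.
# Redirect every non-blank hostname except the canonical one and one exempted subdomain.
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com$
# Hypothetical exemption for a subdomain served by this same virtual host.
RewriteCond %{HTTP_HOST} !^images2\.example\.com$
RewriteRule ^(.*)$ http://www.example.com$1 [R=301,L]
```

The option that lists the hostnames to redirect (the bare domain and the IP address) needs no such exemption, because it leaves every hostname it does not name alone.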