Not able to avoid external redirect

     
4:48 pm on Jun 5, 2010 (gmt 0)

5+ Year Member



In my .htaccess file, I have this code:


# -FrontPage-

IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>

AuthName xyz.com
AuthUserFile /home/localtig/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/localtig/public_html/_vti_pvt/service.grp

Options +FollowSymLinks
RewriteEngine on
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
#RewriteCond %{REQUEST_URL} !=/favicon.ico
RewriteRule ^(.*)$ http://xyz.com/index.php?q=$1 [L,QSA]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.xyz.com/$1 [R=301,L]


This should redirect a url like

http://subdomain.xyz.com/sample-page


to

http://www.xyz.com/index.php?q=sample-page


...while retaining the URL in the address bar as

http://subdomain.xyz.com/sample-page


The problem is that this same .htaccess file acts differently on 2 machines.

One of them runs Apache 2.2.13, and the other runs 1.3.37. On the former, the rewrite becomes an external redirect (i.e. the URL in the address bar changes), while on the latter (1.3.37) it works just fine.

I do not even know if (or think) the version is a problem, since the 2.2.13 version, for another site hosted on it using the same httpd.conf file and with exactly the same directives, works just fine for all rewrites.

Does anyone have a clue what is going wrong? I'll be grateful for any help!
5:53 pm on Jun 5, 2010 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



This is a problem we warn about every single day in this forum.

The solution is very simple.

List the redirect block of code BEFORE the rewrite block of code.

Remove the domain name from the target of the rewrite.

Leave the domain name intact on the target of the redirect.


Also, "REQUEST_URL" should be "REQUEST_URI" I believe.
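
On why removing the domain name matters: according to the mod_rewrite documentation, when the substitution is an absolute URL, mod_rewrite compares its hostname with the current host. If they match, the scheme and hostname are stripped and the rewrite stays internal; otherwise an external redirect is issued. A request for subdomain.xyz.com that is rewritten to http://xyz.com/... can fall into the second case. A minimal sketch of the difference, reusing the rule from the first post (xyz.com is the placeholder domain used there):

```apache
# Sketch only, not a drop-in configuration.
Options +FollowSymLinks
RewriteEngine on

# Before: absolute URL in the substitution. When the requested host
# (e.g. subdomain.xyz.com) does not match the host in the target,
# mod_rewrite issues an external redirect and the address bar changes.
#RewriteRule ^(.*)$ http://xyz.com/index.php?q=$1 [L,QSA]

# After: server-relative path in the substitution. The rewrite stays
# internal regardless of which hostname was requested.
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]
```

A plausible but unconfirmed explanation for the two servers behaving differently is that their virtual hosts are configured differently, so the hostname comparison succeeds on one and fails on the other.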
7:39 pm on Jun 5, 2010 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



> Also, "REQUEST_URL" should be "REQUEST_URI" I believe.

Yes, it should be "REQUEST_URI". But that RewriteCond isn't needed at all, because any .ico filetype is already excluded by the first RewriteCond.

Jim
6:27 am on Jun 6, 2010 (gmt 0)

5+ Year Member



Thank you so much, g1smd. And I was thinking this would be one of *those* problems that will never have a simple solution. I just removed the domain name and it worked. That is why I love webmasterworld.com so much. The best of folks hang out here.

And jdMorgan, thanks for that suggestion. I changed that to REQUEST_URI.

I did not understand this line, however - it would be great if you could give an example:

"List the redirect block of code BEFORE the rewrite block of code."

Thank you so much again.
6:36 am on Jun 6, 2010 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



If you had added comments to your blocks of code, the first comment would have said:

# Internally rewrite incoming URL requests to the script file.


and the second comment would have said:

# Externally redirect URL requests with index.php to remove filepath from URL.


Now that you know what each block of code does, the instruction should be a little more clear.
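
As a minimal sketch of that ordering, here are the two blocks from the first post with those comments added, the redirect listed first, the domain kept on the redirect target, and the domain removed from the rewrite target. The www.xyz.com hostname is the placeholder from the first post, and the commented-out REQUEST_URL condition is left out:

```apache
# Sketch only; test on a non-production copy first.
Options +FollowSymLinks
RewriteEngine on

# Externally redirect URL requests with index.php to remove filepath from URL.
# The full hostname stays on the redirect target.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.xyz.com/$1 [R=301,L]

# Internally rewrite incoming URL requests to the script file.
# No hostname on the rewrite target, so the rewrite stays internal.
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]
```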
1:20 pm on Jun 6, 2010 (gmt 0)

5+ Year Member



Thanks again, g1smd. Have a good day.
4:40 pm on Jun 6, 2010 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Let's see the final code!
4:56 pm on Jun 6, 2010 (gmt 0)

5+ Year Member



Here goes:


# -FrontPage-

IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>

AuthName xyz.com
AuthUserFile /home/localtig/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/localtig/public_html/_vti_pvt/service.grp

Options +FollowSymLinks

RewriteEngine on
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]

#below to redirect www.xyz.com/folder1/folder2/.../foldern/index.php to www.xyz.com/folder1/folder2/../foldern/
#was written to redirect www.xyz.com/index.php to www.xyz.com/ for Google to index only the latter
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ /$1 [R=301,L]


Another small clarification. REQUEST_URI in Apache is just the $_SERVER['PHP_SELF'] of PHP, right? Since $_SERVER['REQUEST_URI'] of PHP includes the query string, too?
4:57 pm on Jun 6, 2010 (gmt 0)

5+ Year Member



And while you suggested I list the redirect part before the rewrite part, this was working, so I did not dare touch it :). I added the comments, though, so you know what I was trying to do.
3:11 pm on Jun 7, 2010 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



If you do not put the redirect first, then your code will 'expose' the internal /index.php filepath to search engines, and hurt your search engine rankings by creating duplicate content.

Do not ignore this advice. If you are hesitant to re-arrange the rules, then by all means, do so "temporarily" and then test the results thoroughly. But do not ignore the advice; you got it from someone with a lot of experience and knowledge...

Jim
4:13 am on Jun 8, 2010 (gmt 0)

5+ Year Member



Thanks, jdMorgan. I made the change suggested by g1smd and you, and luckily everything seems to work fine (so far):


# -FrontPage-

IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>

AuthName xyz.com
AuthUserFile /home/xyz/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/xyz/public_html/_vti_pvt/service.grp

Options +FollowSymLinks

#below to redirect www.xyz.com/folder1/folder2/.../foldern/index.php to www.xyz.com/folder1/folder2/../foldern/
#was written to redirect www.xyz.com/index.php to www.xyz.com/ for Google to index only the latter
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ /$1 [R=301,L]

RewriteEngine on
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]
7:24 am on Jun 8, 2010 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Add comments to the last block of code to describe the rewrite!
1:01 pm on Jun 9, 2010 (gmt 0)

5+ Year Member




# -FrontPage-

IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>

AuthName xyz.com
AuthUserFile /home/xyz/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/xyz/public_html/_vti_pvt/service.grp

Options +FollowSymLinks

#below to redirect www.xyz.com/folder1/folder2/.../foldern/index.php to www.xyz.com/folder1/folder2/../foldern/
#was written to redirect www.xyz.com/index.php to www.xyz.com/ for Google to index only the latter
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ /$1 [R=301,L]

#redirect *.xyz.com/abc-def to *.xyz.com/index.php?q=abc-def where abc-def is not index.php or an image/css/ico file or robots.txt
RewriteEngine on
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]


I think that is a little more explanatory. Thank you so much for all the tips!
2:44 pm on Jun 9, 2010 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Still needs a bit of clean-up... Several syntax problems, "non-optimal" coding, and directives out of order, likely to cause failures:

# Don't list FrontPage or .htaccess files in auto-generated directory index pages
IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*
#
# Access Control
Order deny,allow
#
<Limit GET POST>
Deny from all
Allow from all
</Limit>
#
<LimitExcept GET POST>
Deny from all
</LimitExcept>
#
# Authentication/authorization
AuthName xyz.com
AuthUserFile /home/xyz/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/xyz/public_html/_vti_pvt/service.grp
#
# Set required option to enable mod_rewrite
Options +FollowSymLinks
#
# Enable the rewriting engine
RewriteEngine on
#
# Externally redirect direct client requests for URL-path
# /<any subdirectories>/index.php<optional query and/or fragment> to URL
# www.example.com/<any subdirectories>/<optional query and/or fragment> so
# that Google indexes only the latter. The query string, if any, is passed
# through unchanged by default, so the target needs no %N back-reference
# (%1 here would be the last subdirectory, not the query string).
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1 [R=301,L]
#
# Internally rewrite requests for URL-path /abc-def to internal filepath /index.php?q=abc-def
# where abc-def does not resolve to a physically-existing file or directory, and excluding
# index.php, image/css/ico files, or robots.txt to avoid unnecessary file-exists checks
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [QSA,L]
#
# -end-

You should also consider adding a domain canonicalization rule. For example, redirect all requests for hostnames which are not exactly equal to "www.example.com" to hostname "www.example.com". This rule would follow the first rule above, as it is a less specific redirect than the first rule, and all external redirects must generally precede any internal rewrites.
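
A minimal sketch of such a canonicalization rule, assuming www.example.com is the single canonical hostname and that RewriteEngine is already on. It would sit between the index.php redirect and the internal rewrite:

```apache
# Sketch only; www.example.com is a placeholder hostname.
#
# Externally redirect requests for any non-blank hostname other than
# www.example.com to the same URL-path on www.example.com.
# The "?" lets a blank Host header (e.g. HTTP/1.0 clients) pass through,
# which avoids a redirect loop for clients that never send a hostname.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Note that this redirects every other hostname, including subdomains such as the subdomain.xyz.com mentioned in the first post. If those subdomains are meant to keep serving content under their own names, the condition would need to exclude them.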

Jim
6:09 am on Jun 10, 2010 (gmt 0)

5+ Year Member



Wow, that's totally professional work; it sets the benchmark for me from here on :).

I was wondering if I can do away with this chunk of lines at the top:


# -FrontPage-

IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>

AuthName xyz.com
AuthUserFile /home/xyz/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/xyz/public_html/_vti_pvt/service.grp

Options +FollowSymLinks


I did not put them there - they seem to have come by default through cPanel, using which I created the account.

Also, another small clarification. REQUEST_URI in Apache is just the $_SERVER['PHP_SELF'] of PHP, right (and not the $_SERVER['REQUEST_URI'] of PHP)? Since $_SERVER['REQUEST_URI'] of PHP includes the query string, too?
8:11 am on Jun 10, 2010 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



REQUEST_URI is the path part, without the query string, of the literal "GET /somepath?some-params HTTP/1.1" request sent by the browser.
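
A short sketch of how mod_rewrite splits that request line into variables. The rule itself is hypothetical and only illustrates which variable to test; note that PHP's $_SERVER['REQUEST_URI'] does include the query string, unlike the mod_rewrite variable of the same name:

```apache
# Sketch only. For the request line
#   GET /somepath?some-params HTTP/1.1
# mod_rewrite sees:
#   %{THE_REQUEST}  = GET /somepath?some-params HTTP/1.1
#   %{REQUEST_URI}  = /somepath
#   %{QUERY_STRING} = some-params
#
# Hypothetical rule: stop rewriting when /somepath is requested with
# exactly that query string. The path is tested via REQUEST_URI and
# the query via QUERY_STRING.
RewriteEngine on
RewriteCond %{REQUEST_URI} ^/somepath$
RewriteCond %{QUERY_STRING} ^some-params$
RewriteRule .* - [L]
```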

The "IndexIgnore" rule keeps your .htaccess file and other configuration files out of auto-generated directory listings. It does not by itself block direct requests for those files.

The "deny" rule for PUT and DELETE requests stops hackers messing with your site.

The _vti stuff is for uploading files directly from FrontPage. You can comment out just those two lines with a leading # to see if anything bad happens.

The "Options" line is required for mod_rewrite to work in .htaccess, unless FollowSymLinks is already enabled in the server configuration.
8:20 am on Jun 10, 2010 (gmt 0)

5+ Year Member



Thanks a lot, g1smd. This thread has cleared up a lot of things for me, and I am grateful you and jdMorgan chose to take time off so generously to share your knowledge.
 
