
Apache Web Server Forum

    
Not able to avoid external redirect
knkk




msg:4147399
 4:48 pm on Jun 5, 2010 (gmt 0)

In my .htaccess file, I have this code:


# -FrontPage-

IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>

AuthName xyz.com
AuthUserFile /home/localtig/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/localtig/public_html/_vti_pvt/service.grp

Options +FollowSymLinks
RewriteEngine on
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
#RewriteCond %{REQUEST_URL} !=/favicon.ico
RewriteRule ^(.*)$ http://xyz.com/index.php?q=$1 [L,QSA]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.xyz.com/$1 [R=301,L]


This should redirect a url like

http://subdomain.xyz.com/sample-page

to

http://www.xyz.com/index.php?q=sample-page

...while retaining the URL in the address bar as

http://subdomain.xyz.com/sample-page

The problem is that this same .htaccess file acts differently on 2 machines.

One of them runs Apache 2.2.13, and the other runs 1.3.37. On the former, the redirect becomes external (i.e., the URL in the address bar changes), while on the latter (1.3.37) it works just fine.

I do not even know if (or think) the version is a problem, since the 2.2.13 version, for another site hosted on it using the same httpd.conf file and with exactly the same directives, works just fine for all rewrites.

Does anyone have a clue what is going wrong? I'll be grateful for any help!

 

g1smd




msg:4147423
 5:53 pm on Jun 5, 2010 (gmt 0)

This is a problem we warn about every single day in this forum.

The solution is very simple.

List the redirect block of code BEFORE the rewrite block of code.

Remove the domain name from the target of the rewrite.

Leave the domain name intact on the target of the redirect.


Also, "REQUEST_URL" should be "REQUEST_URI" I believe.

jdMorgan




msg:4147468
 7:39 pm on Jun 5, 2010 (gmt 0)

> Also, "REQUEST_URL" should be "REQUEST_URI" I believe.

Yes, it should be "REQUEST_URI". But that RewriteCond isn't needed at all, because any .ico filetype is already excluded by the first rewritecond.

Jim

knkk




msg:4147606
 6:27 am on Jun 6, 2010 (gmt 0)

Thank you so much, g1smd. And I was thinking this would be one of *those* problems that will never have a simple solution. I just removed the domain name and it worked. That is why I love webmasterworld.com so much. The best of folks hang out here.

And jdMorgan, thanks for that suggestion. I changed that to REQUEST_URI.

I did not understand this line, however - it would be great if you could give an example:

"List the redirect block of code BEFORE the rewrite block of code."

Thank you so much again.

g1smd




msg:4147607
 6:36 am on Jun 6, 2010 (gmt 0)

If you had added comments to your blocks of code, the first comment would have said:

# Internally rewrite incoming URL requests to the script file.

and the second comment would have said:

# Externally redirect URL requests with index.php to remove filepath from URL.

Now that you know what each block of code does, the instruction should be a little more clear.
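With those two comments in place, the reordering can be sketched as follows. This is only an illustration assembled from the rules posted at the top of the thread, assuming www.xyz.com is the hostname to keep on the redirect target; the domain has been dropped from the rewrite target so that it stays internal.

```apache
RewriteEngine on

# Externally redirect URL requests with index.php to remove filepath from URL.
# (Redirect block listed first; domain name kept on the target.)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.xyz.com/$1 [R=301,L]

# Internally rewrite incoming URL requests to the script file.
# (Rewrite block listed second; no domain name on the target.)
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]
```

The redirect is conditioned on THE_REQUEST, which holds the original client request line, so it should fire only when a visitor asks for index.php directly and not after the internal rewrite has mapped a friendly URL onto it.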

knkk




msg:4147679
 1:20 pm on Jun 6, 2010 (gmt 0)

Thanks again, g1smd. Have a good day.

g1smd




msg:4147723
 4:40 pm on Jun 6, 2010 (gmt 0)

Let's see the final code!

knkk




msg:4147725
 4:56 pm on Jun 6, 2010 (gmt 0)

Here goes:


# -FrontPage-

IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>

AuthName xyz.com
AuthUserFile /home/localtig/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/localtig/public_html/_vti_pvt/service.grp

Options +FollowSymLinks

RewriteEngine on
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]

#below to redirect www.xyz.com/folder1/folder2/.../foldern/index.php to www.xyz.com/folder1/folder2/../foldern/
#was written to redirect www.xyz.com/index.php to www.xyz.com/ for Google to index only the latter
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ /$1 [R=301,L]


Another small clarification. REQUEST_URI in apache is just the $_SERVER['PHP_SELF'] of PHP, right? Since $_SERVER['REQUEST_URI'] of PHP includes the query string, too?

knkk




msg:4147727
 4:57 pm on Jun 6, 2010 (gmt 0)

And while you suggested I list the redirect part before the rewrite part, this was working, so I did not dare touch it :). I added the comments, though, so you know what I was trying to do.

jdMorgan




msg:4148177
 3:11 pm on Jun 7, 2010 (gmt 0)

If you do not put the redirect first, then your code will 'expose' the internal /index.php filepath to search engines, and hurt your search engine rankings by creating duplicate content.

Do not ignore this advice. If you are hesitant to re-arrange the rules, then by all means, do so "temporarily" and then test the results thoroughly. But do not ignore the advice; you got it from someone with a lot of experience and knowledge...

Jim

knkk




msg:4148593
 4:13 am on Jun 8, 2010 (gmt 0)

Thanks, jdMorgan. I made the change suggested by g1smd and you, and luckily everything seems to work fine (so far):


# -FrontPage-

IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>

AuthName xyz.com
AuthUserFile /home/xyz/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/xyz/public_html/_vti_pvt/service.grp

Options +FollowSymLinks

#below to redirect www.xyz.com/folder1/folder2/.../foldern/index.php to www.xyz.com/folder1/folder2/../foldern/
#was written to redirect www.xyz.com/index.php to www.xyz.com/ for Google to index only the latter
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ /$1 [R=301,L]

RewriteEngine on
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]

g1smd




msg:4148678
 7:24 am on Jun 8, 2010 (gmt 0)

Add comments to the last block of code to describe the rewrite!

knkk




msg:4149596
 1:01 pm on Jun 9, 2010 (gmt 0)


# -FrontPage-

IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>

AuthName xyz.com
AuthUserFile /home/xyz/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/xyz/public_html/_vti_pvt/service.grp

Options +FollowSymLinks

#below to redirect www.xyz.com/folder1/folder2/.../foldern/index.php to www.xyz.com/folder1/folder2/../foldern/
#was written to redirect www.xyz.com/index.php to www.xyz.com/ for Google to index only the latter
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ /$1 [R=301,L]

#redirect *.xyz.com/abc-def to *.xyz.com/index.php?q=abc-def where abc-def is not index.php or an image/css/ico file or robots.txt
RewriteEngine on
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]


I think that is a little more explanatory. Thank you so much for all the tips!

jdMorgan




msg:4149681
 2:44 pm on Jun 9, 2010 (gmt 0)

Still needs a bit of clean-up... Several syntax problems, "non-optimal" coding, and directives out of order, likely to cause failures:

# Don't list FrontPage or .htaccess files in auto-generated directory index pages
IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*
#
# Access Control
Order deny,allow
#
<Limit GET POST>
Deny from all
Allow from all
</Limit>
#
<LimitExcept GET POST>
Deny from all
</LimitExcept>
#
# Authentication/authorization
AuthName xyz.com
AuthUserFile /home/xyz/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/xyz/public_html/_vti_pvt/service.grp
#
# Set required option to enable mod_rewrite
Options +FollowSymLinks
#
# Enable the rewriting engine
RewriteEngine on
#
# Externally redirect direct client requests for URL-path
# /<any subdirectories>/index.php<optional query and/or fragment> to URL
# www.example.com/<any subdirectories>/<optional query and/or fragment> so
# that Google indexes only the latter
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1%2 [R=301,L]
#
# Internally rewrite requests for URL-path /abc-def to internal filepath /index.php?q=abc-def
# where abc-def does not resolve to a physically-existing file or directory, and excluding
# index.php, image/css/ico files, or robots.txt to avoid unnecessary file-exists checks
RewriteCond $1 !(^index\.php|\.(gif|jpe?g|ico|css)|^robots\.txt)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [QSA,L]
#
# -end-

You should also consider adding a domain canonicalization rule. For example, redirect all requests for hostnames which are not exactly equal to "www.example.com" to hostname "www.example.com". This rule would follow the first rule above, as it is a less specific redirect than the first rule, and all external redirects must generally precede any internal rewrites.
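A minimal sketch of such a canonicalization rule is shown here, assuming www.example.com (the same placeholder used in the code above) is the only hostname the site should answer on. Note that it would also redirect requests for any other subdomain, so it only fits a site where hostnames such as the subdomain.xyz.com mentioned at the start of the thread are not meant to be served under their own names.

```apache
# Externally redirect requests for any non-canonical hostname to the
# canonical hostname, preserving the requested URL-path.
# The pattern also accepts a blank Host header (e.g. from HTTP/1.0 clients),
# which would otherwise be redirected endlessly.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Placed between the index.php redirect and the internal rewrite, this keeps the ordering described above: most specific redirect first, less specific redirect next, internal rewrites last.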

Jim

knkk




msg:4150234
 6:09 am on Jun 10, 2010 (gmt 0)

Wow, that's totally professional work - sets the benchmarks for me hereon :).

I was wondering if I can do away with this chunk of lines at the top:


# -FrontPage-

IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>

AuthName xyz.com
AuthUserFile /home/xyz/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/xyz/public_html/_vti_pvt/service.grp

Options +FollowSymLinks


I did not put them there - they seem to have come by default through cPanel, using which I created the account.

Also, another small clarification. REQUEST_URI in apache is just the $_SERVER['PHP_SELF'] of PHP, right (and not the $_SERVER['REQUEST_URI'] of PHP)? Since $_SERVER['REQUEST_URI'] of PHP includes the query string, too?

g1smd




msg:4150285
 8:11 am on Jun 10, 2010 (gmt 0)

REQUEST_URI is the path part of the literal "GET /somepath?some-params HTTP/1.1" request sent by the browser.
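As a rough illustration of how mod_rewrite splits that request line, the sketch below reuses the example request. The /somepath and some-params values are placeholders, and the rule at the end is hypothetical, included only to show where each piece would be tested.

```apache
# Client request line:  GET /somepath?some-params HTTP/1.1
#
#   %{THE_REQUEST}   GET /somepath?some-params HTTP/1.1   (whole request line, as sent)
#   %{REQUEST_URI}   /somepath                            (URL-path only, no query string)
#   %{QUERY_STRING}  some-params                          (the part after the "?")
#
# The query string is held separately from the URL-path, so a rule that
# needs to test it uses a RewriteCond on QUERY_STRING rather than looking
# for "?" in the RewriteRule pattern.
RewriteCond %{QUERY_STRING} ^some-params$
RewriteRule ^somepath$ /index.php?q=somepath [L]
```

In other words, mod_rewrite's REQUEST_URI does not carry the query string, whereas PHP's $_SERVER['REQUEST_URI'] does.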

The "IndexIgnore" rule keeps your .htaccess file and other configuration files out of auto-generated directory listings on the web.

The "deny" rule for PUT and DELETE requests stops hackers messing with your site.

The _vti stuff is for uploading files directly from FrontPage. You can comment out just those two lines (put a # in front of each) to see if anything bad happens.
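For instance, the test could look like this, using the paths from the block quoted above. The lines are kept rather than deleted so they can be restored if FrontPage publishing turns out to be in use.

```apache
AuthName xyz.com
# Temporarily disabled to check whether anything depends on the FrontPage password files:
#AuthUserFile /home/xyz/public_html/_vti_pvt/service.pwd
#AuthGroupFile /home/xyz/public_html/_vti_pvt/service.grp
```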

The "Options" line is usually required for correct server operation.

knkk




msg:4150291
 8:20 am on Jun 10, 2010 (gmt 0)

Thanks a lot, g1smd. This thread has cleared up a lot of things for me, and I am grateful you and jdMorgan chose to take time off so generously to share your knowledge.
