E.g. www.mydomain.com/login.php is accessed by
www.mydomain.com/secure/login
etc.
This works fine - however, I'm a bit concerned about the index.php file - I currently have this rule:
RewriteRule ^$ index.php
So when the site is accessed via www.mydomain.com, it automatically uses this file and doesn't display www.mydomain.com/index.php in the address bar.
My concern is what will happen if www.mydomain.com/index.php gets indexed by search engines, or is this unlikely?
Also, is this a valid thing to do? Do search engines look for index.php by default, and with the above rule will they not index it?
Is there any way to prevent access to my site using /blahblah.php? I.e. throw a 404 if this is attempted?
Thanks for any advice.
Normally a search engine will index the default page at /, and also index.whatever-the-default-is if any links point to the full version of the default page. There are theories about PageRank splitting and other oddities because of this. (Not my forte, but this is what I've read.)
What you can do to stop this:
A. Rather than rewriting silently to index.php (i.e. the URL in the browser does not change), you can do an external, visible redirect by adding [R=301,L] to your rule like this:
RewriteRule ^$ index.php [R=301,L]
This way your site will always default to yoursite.com/index.php
B. You can 'catch' any direct, original request for index.php and redirect it back to /. To do this and still serve the page correctly, you will have to use a %{THE_REQUEST} condition before the rule, like this:
RewriteRule ^$ index.php [L]
RewriteCond %{THE_REQUEST} ^/index\.php\ HTTP/ [NC]
RewriteRule $index\.php$ [yoursite.com...] [R=301,L]
If you use this method, remember the order must be intact, or you may create an ugly loop.
Either of these should get you close to eliminating duplicates. Go with whatever way you think is better for your situation.
Hope this helps and gives you some ideas.
Justin
If you decide to use either and have trouble getting them to work, keep posting and I'll look closer at them.
I'm off to try your suggestions and will report back.
I don't want to use your first suggestion:
RewriteRule ^$ index.php [R=301,L]
Since my reason in using rewrites is to hide the php extensions.
Tried the second suggestion, and www.mysite.com/index.php is not being rewritten.
In this line:
RewriteCond %{THE_REQUEST} ^/index\.php\ HTTP/ [NC]
Can you explain further what this does exactly - is it supposed to check for a request coming in from somewhere external to my site?
This line:
RewriteRule $index\.php$ [yoursite.com...] [R=301,L]
Is the first "$" supposed to be a "^"?
In either case, it's not working.
I tried removing the RewriteCond line, and the rewrite works, but the URL in the address bar remains the same - is this normal? (plus I'm now getting looping in certain cases)
I was hoping that www.mysite.com/index.php would actually be changed to www.mysite.com in the address bar, or is this not the case?
I also have these lines in the file:
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{HTTP_HOST} !^www\.mysite\.com [NC]
RewriteRule ^(.*) [mysite.com...] [L,R=301]
to change www.mysite.com to [mysite.com,...] and this is reflected in the address bar - I have no idea why the other rewrites are not doing this.
Thanks
RewriteCond %{THE_REQUEST} ^/index\.php\ HTTP/
RewriteRule . [yoursite.com...] [R=301,L]
Yes, you were correct in thinking the $ was wrong.
The single . (dot) matches any non-blank request path.
%{THE_REQUEST} holds the original request line sent by the client and is not changed by rewrites, so by keeping this 'set' at the end, you should only match direct requests, e.g. a mouse click on a link or a typed URL.
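To make the ordering concrete, here is a minimal sketch of the whole set for a per-directory .htaccess file. It is an illustration under assumptions, not a tested drop-in: www.example.com is a placeholder host, and the pattern assumes %{THE_REQUEST} holds the complete request line (method, path, protocol), so it allows for the method in front of the path.

```apache
# Sketch only; www.example.com is a placeholder host, not the real site.
RewriteEngine On

# Internal rewrite: serve index.php when the site root is requested.
# The address shown in the browser stays at "/".
RewriteRule ^$ index.php [L]

# External redirect: fires only when the client itself asked for /index.php.
# THE_REQUEST is the raw request line, e.g. "GET /index.php HTTP/1.1",
# so the pattern skips over the method before matching the path.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.php\ HTTP/
RewriteRule ^index\.php$ http://www.example.com/ [R=301,L]
```

A direct request for /index.php should get a 301 to the root, while the internal rewrite from / to index.php leaves the request line untouched and so should not trigger the redirect again. A request with a query string (/index.php?x=1) would not match this pattern as written.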
This should work better for you.
Justin
You will also have to make sure you use the www version of yoursite.com in the rule.
RewriteCond %{THE_REQUEST} ^/index\.php\ HTTP/
RewriteRule . [yoursite.com...] [R=301,L]
I don't get a 404
Any more ideas as to what could be wrong?
And it "appears" to be working now, or did I just break something?
GET /mypage.php HTTP/1.1
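That is the shape of the request line that %{THE_REQUEST} is tested against: method, then path, then protocol. For the earlier question about rejecting direct requests for .php URLs, a rough sketch follows. It assumes an Apache version whose R flag accepts non-3xx status codes, and that every public URL is extensionless and rewritten internally; treat it as a starting point rather than a verified rule.

```apache
# Sketch only: answer 404 when a client directly asks for any .php URL.
RewriteEngine On

# Match the raw request line, e.g. "GET /mypage.php HTTP/1.1".
# [^?\ ]* consumes the path up to ".php"; [?\ ] requires the path to end
# there (followed by a query string or the protocol).
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^?\ ]*\.php[?\ ]
# "-" means no substitution; R=404 returns the status and stops rewriting.
RewriteRule \.php$ - [R=404,L]
```

Internally rewritten requests keep their original request line, so they should pass through untouched. If this is combined with a redirect for /index.php, that redirect would need to come first, otherwise /index.php gets the 404 instead.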
Jim