
Forum Moderators: Ocean10000 & incrediBILL & phranque

Is this htaccess file crashing my site?

     
12:41 pm on Oct 5, 2017 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Nov 29, 2015
posts:60
votes: 14


Hi All,

I've pieced this together, and I'm not sure if it is the reason my Apache keeps overloading and needing a restart...

# Don't list files or folders
Options -Indexes
#
# Don't show server details
ServerSignature Off
#
RewriteEngine On
RewriteBase /
#
# Add WWW
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*) http://www\.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
#
# add trailing slash if missing
RewriteRule ^(([a-z0-9\-]+/)*[a-z0-9\-]+)$ $1/ [R=301,L]
#
# Allow Access to real files or folders
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
#
# Everything else goes to loader
RewriteRule ^(.*)$ loader.php?t=$1 [L,QSA]
#
# CACHE EVERYTHING...
<IfModule mod_expires.c>
# Enable expirations
ExpiresActive On
# Default directive
ExpiresDefault "access plus 1 month"
# Html
ExpiresByType text/html "access plus 1 month"
# My favicon
ExpiresByType image/x-icon "access plus 1 year"
# Images
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType image/jpg "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
# CSS
ExpiresByType text/css "access plus 1 month"
# Javascript
ExpiresByType application/javascript "access plus 1 year"
# Music
ExpiresByType audio/mp3 "access plus 1 year"
</IfModule>


It's a really popular site (about 1 million page views a day).
Up to now it's been static HTML pages, and we are trying to move to a template system, so I don't know if the server just can't cope with everything being routed to loader.php using rewrite rules.


Thanks so much!
3:41 pm on Oct 17, 2017 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Nov 29, 2015
posts:60
votes: 14


For some reason, any request to an "index.php" file is now redirected to just the directory (e.g. /directory/index.php becomes /directory/).

I really need:
example.com/directory/
and
example.com/directory/index.php
to both be QSA loaded by the loader.php - which handles the page.


I've been trying to figure it out with the help of regex guides, and I think this is close to what I'm trying to do:
Options -Indexes
ServerSignature Off
RewriteEngine On

# Combined these...
RewriteRule ^(files|loader\.php) - [L]

RewriteRule ^(([a-z0-9\-]+/)*[a-z0-9\-]+)$ http://www.example-example.com/$1/ [R=301,L]

# Removed to stop redirect to directory only view
#RewriteCond %{REQUEST_URI} ^/((?:\w+/)*)index\.\w
#RewriteRule index\.(?:php|html)$ http://www.example-example.com/%1 [R=301,NS,L]

RewriteCond %{HTTP_HOST} !^(www\.example-example\.com)?$ [NC]
RewriteRule (.*) http://www.example-example.com/$1 [R=301,L]

# Added this rewrite rule to catch genuine files - it's the only way I could make the change below work - from ^([^.]*)$ to ^(.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
#RewriteRule ^([^.]*)$ /loader.php?t=$1 [L,QSA]
# Changed above rule - into the below to catch everything...
RewriteRule ^(.*)$ /loader.php?t=$1 [L,QSA]


But I think I know the problem I am having with the root index.php: none of the template engines I've looked at have a "loader.php". They use "index.php" as the loader, so "index.php" in the root of the site always exists. I think that is what I will have to do too.
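
A rough sketch of what that arrangement might look like, assuming the loader is moved to /index.php in the site root and still reads the requested path from the t parameter. Both are assumptions, and none of this has been tested on the live site:

```apache
RewriteEngine On

# Assumption: the loader now lives at /index.php instead of /loader.php.
# Let requests for the loader itself through untouched, so the catch-all
# below never rewrites the loader to itself.
RewriteRule ^index\.php$ - [L]

# Real files and real directories are served as they are.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Everything else is handed to the loader, keeping any query string.
RewriteRule ^(.*)$ /index.php?t=$1 [L,QSA]
```

Any external redirects (www, trailing slash) would still need to come before these rules.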
7:16 pm on Oct 17, 2017 (gmt 0)

Senior Member from US 

lucy24 (WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member, Top Contributors Of The Month)

joined:Apr 9, 2011
posts:14245
votes: 550


any request to an "index.php" file is now redirected to just the directory. (e.g. /directory/index.php becomes /directory/)

I really need:
example.com/directory/
and
example.com/directory/index.php
to both be QSA loaded by the loader.php - which handles the page.
No, you don't. What is happening now is what should be happening. For any given page, there should be only one URL to reach it. That's why every site has an index redirect. (Here "page" means "page that is seen by the visitor", whether or not it physically exists on the server.)

Why do you think /directory/ and /directory/index.php need to exist in parallel?
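
A minimal sketch of the kind of index redirect meant here, assuming the canonical host is www.example.com and that index files are named index.php or index.html (both are placeholders to adjust). It tests THE_REQUEST, the original request line sent by the browser, so an internal rewrite to an index file cannot trigger the redirect and start a loop:

```apache
# Redirect any externally requested /index.php or /dir/index.php
# (or index.html/index.htm) to the bare directory URL.
# %1 is the directory part captured from the request line;
# an existing query string is passed along unchanged.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/((?:[^/?\s]+/)*)index\.(?:php|html?)[?\s]
RewriteRule (?:^|/)index\.(?:php|html?)$ http://www.example.com/%1 [R=301,L]
```

It belongs with the other external redirects, ahead of the rule that rewrites everything to loader.php.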
7:40 pm on Oct 17, 2017 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Nov 29, 2015
posts: 60
votes: 14


It was just to mimic the way the site worked before I moved to the template system.

Originally - if a user went to:

example.com/sub-page/
They were loading index.php - but they didn’t see the index.php part.

If they went to example.com/sub-page/index.php
The page would stay as “index.php” so the user would see index.php

Both would be loading the same file.

I was trying to recreate the same outcome with htaccess, just to ensure there are no SEO changes.

I wanted to make this template system completely behind the scenes - so the users (and search engines) have no idea anything has changed.

So your index.php rewrite is probably the right way to do it — but just to keep the site working as before - I wanted users to be able to access both:

example.com/sub-folder
And
example.com/sub-folder/index.php


Thanks so much for the help again Lucy24 :-)

(I hate htaccess. More and more each day. Lol)
2:39 am on Oct 18, 2017 (gmt 0)

Senior Member from US 

lucy24 (WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member, Top Contributors Of The Month)

joined:Apr 9, 2011
posts:14245
votes: 550


I wanted users to be able to access both
For a human user, a redirect is transparent. So even if they've got the "wrong" form bookmarked, they'll get to the right URL and never even notice the difference.

Search engines would definitely be happier if they didn't have to keep requesting both.

:: detour to check something in logs ::

Going back to 2011-2012, before I had an index.html redirect in place, fully 1/4 of all Googlebot requests for directories--and over 1/3 of bingbot requests--were in the form /directory/index.html. That's a lot of overhead, a lot of duplicate requests.

The next year, after the redirect was in place (my record-keeping is slapdash, but it looks as if I started in September 2012), Googlebot requests for /directory/index.html plummeted to less than 1% of all directory requests. Predictably, Bing--which spends an inordinate proportion of its crawl budget requesting material that has been 301'd for years--dropped off far more slowly. But they, too, are now at less than 1% "index.html".

Well. That was interesting. I don't think I've ever done that particular lookup before. An unexpected headscratcher was that, after the redirect was well established, Bing and Google only requested /index.html in two specific directories, one of which has never had /index.html in its visible URL. Both happened to be second-order directories, /directory/subdir/. That means it was ultimately my fault for not adequately coding the redirect when I moved sites (end of 2013):
example.old/directory/index.html >> example.new/directory/
BUT
example.old/directory/subdir/index.html >> example.new/directory/subdir/index.html
(part of a catch-all redirect for all interior pages) leading search engines to think that subdir/index.html exists, even though it was promptly redirected (again) on the new site. Oops and phooey.
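
A sketch of how an old-site redirect could cover index.html at every depth before the catch-all runs. It uses the same placeholder hosts as above and is a hypothetical reconstruction, not the rules actually used in that move:

```apache
# In the old site's .htaccess (example.old). Hypothetical rules.
RewriteEngine On

# index.html at any directory depth goes to the bare directory URL,
# e.g. directory/subdir/index.html >> example.new/directory/subdir/
RewriteRule ^((?:[^/]+/)*)index\.html$ http://example.new/$1 [R=301,L]

# Catch-all for every other interior page.
RewriteRule ^(.*)$ http://example.new/$1 [R=301,L]
```

With the index rule first, the subdir case never reaches the catch-all, so search engines are not handed a /subdir/index.html URL on the new host.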

So you see, there are ramifications upon ramifications.