
Apache Web Server Forum

    
More Wordpress htaccess
Mark_J




msg:4128551
 2:28 am on May 7, 2010 (gmt 0)

jdMorgan, I've been researching some tweaks for htaccess on Wordpress and wanted to ask you for clarification, as well as to thank you for providing this back in January:
The end result of all these modifications is WP .htaccess code that looks like this:

# BEGIN wordpress
#
RewriteEngine on
#
# Unless you have set a different RewriteBase preceding this point,
# you may delete or comment-out the following RewriteBase directive
# RewriteBase /
#
# if this request is for "/" or has already been rewritten to WP
RewriteCond $1 ^(index\.php)?$ [OR]
# or if request is for image, css, or js file
RewriteCond $1 \.(gif|jpg|ico|css|js)$ [NC,OR]
# or if URL resolves to existing file
RewriteCond %{REQUEST_FILENAME} -f [OR]
# or if URL resolves to existing directory
RewriteCond %{REQUEST_FILENAME} -d
# then skip the rewrite to WP
RewriteRule ^(.*)$ - [S=1]
# else rewrite the request to WP
RewriteRule . /index.php [L]
#
# END wordpress


My question is this: I converted an HTML site to WordPress, and I've found that there are existing links elsewhere on the web to my old index.html file that now return 404. So I appended the following to your code:

DirectoryIndex index.php index.html index.shtml index.cgi index.php3 index.phtml index.htm home.html welcome.html
redirect 301 /index.html http://www.example.com/


Q: Is there a more efficient way to accomplish this?

Thanks!
Mark

[edited by: tedster at 2:52 am (utc) on May 7, 2010]
[edit reason] switch from my-site to example.com - it cannot be owned [/edit]

 

jdMorgan




msg:4128747
 1:33 pm on May 7, 2010 (gmt 0)

Good grief! :) Do you really have "index pages" with all of those names? If not, get rid of them. The server will search for each of those files whenever a requested URL-path ends with a slash and this may require a physical disk access (slow), so the fewer filenames in that list, the better.

I don't think this actually solves your problem though. All it does is allow the first of any of those files that exists to be served in response to any URL-path request that ends in a slash.

You might want to dig around in some of the current and previous threads (current example [webmasterworld.com]) to see rules that many Webmasters want on their sites as "standard kit." These include (in order): Anti-hotlinking, rejection of other abusive/malicious requests, old URL redirects, named-index-page redirects (your current question), malformed-URL-path redirects (301 to the correct URL), hostname canonicalization, as well as others.

When constructing your code, put the access controls first, followed by all external redirects in order from most-specific patterns and conditions (fewest URLs affected) to least-specific (more URLs affected), and finally internal rewrites, again in order from most- to least-specific.
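As a rough illustration of that ordering, a skeleton .htaccess might look like the sketch below. Every hostname, path, and file extension in it is a placeholder assumed for the example, and each section would need real rules and testing before use.

```apache
RewriteEngine on
#
# 1. Access controls first (hypothetical anti-hotlinking rule):
#    refuse image requests whose referrer is non-blank and is another site
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC]
RewriteRule \.(gif|jpe?g|png)$ - [NC,F]
#
# 2. External redirects, most-specific first
#    2a. a single old URL (placeholder paths)
RewriteRule ^old-page\.html$ http://www.example.com/new-page/ [R=301,L]
#    2b. named-index-page and malformed-URL-path redirects would go here
#    2c. hostname canonicalization (affects every URL, so it comes last)
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
#
# 3. Internal rewrites last (the stock WP rule, shown for position only)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
```

Keeping the hostname redirect after the more specific redirects means a request for an old URL on the wrong hostname is sent straight to its final destination in one hop rather than two.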

To see other examples of index page redirects, try a site search in this forum for "redirect index slash rewritecond" and similar phrases. The Apache Forum Library and the resources cited in our Forum Charter may also come in handy. Links to all of these are at the top of this page.

Jim

Mark_J




msg:4130748
 11:40 am on May 11, 2010 (gmt 0)

Good grief! :) Do you really have "index pages" with all of those names?


It's just like when I used to "tune up" my own car by changing the points and plugs... I get my hands under the hood and it could spell trouble. That line of code was actually installed by the hosting control panel when I selected "use index.php as the priority over index.html" in one of the configuration boxes.

However, it now reads:
DirectoryIndex index.php
redirect 301 /index.html http://www.example.com/

And that seems to have solved the issue of 404s on index.html

Thanks again for pointing me in the direction of the additional info, which I'm still studying.

Mark

g1smd




msg:4131029
 7:41 pm on May 11, 2010 (gmt 0)

You'll need to change that Redirect code to use a RewriteRule.

Never mix Redirect and RedirectMatch directives with RewriteRule directives in the same site.
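A minimal sketch of what that replacement could look like, assuming the canonical hostname is www.example.com and that only the root-level index.html needs redirecting. The THE_REQUEST condition is an added safeguard (an assumption, not something stated above) so that the rule fires only on what the client actually requested and never on an internally rewritten path. The rule would go before the WordPress block, since external redirects belong ahead of internal rewrites.

```apache
RewriteEngine on
#
# If the client itself asked for /index.html (with or without a query string)
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.html(\?[^\ ]*)?\ HTTP/
# then externally redirect to the site root
RewriteRule ^index\.html$ http://www.example.com/ [R=301,L]
```

With this in place, the "redirect 301 /index.html ..." line would be removed, so that mod_alias and mod_rewrite are not both handling redirects on the same site.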

Dude_S




msg:4132512
 6:49 am on May 14, 2010 (gmt 0)

I have Wordpress .htaccess questions too. And sorry, but I'm an htaccess noob. When I activate pretty permalinks in WP, it writes this to .htaccess:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>


Strangely, I have an older WP install that also has pretty permalinks but doesn't have this section in .htaccess. Permalinks still work. So could someone explain in simple terms what the above code actually does? I've searched the Wordpress Codex [codex.wordpress.org] and other places; the code is often quoted, but never explained.

I also saw this very interesting thread [webmasterworld.com] and I understand it makes the above code faster, but if permalinks work without that code, wouldn't that be even better?

To answer these questions, I tried to learn these rewrite rules and syntax so I could figure it out myself. I started reading the Apache doc [httpd.apache.org], but I gave up after half an hour or so. It seems quite involved - as you can also see from the many questions on this forum - and frankly, I don't want to become an expert in something I may need to deal with only once. But I do want to have a basic grip on what's going on on my server.

Bonus question: On my older blog, I have this code to remove the www. which I copied from somewhere:
# Redirect www to bare domain
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} !^example\.com
RewriteRule (.*) http://example.com/$1 [R=301,L]


For my new blogs, I'd like to have it the other way around - default to www.example.com. One hurdle though: I'm running two blogs on different domains from the same WP install, i.e. the domains map to the same directory, but run on different databases. The directory can only have one .htaccess, so I can't put the domain in there in plain text. Can the code be adjusted to work in this case?

Thanks!

jdMorgan




msg:4132641
 1:22 pm on May 14, 2010 (gmt 0)

Commented "factory-standard" WP code:

# Waste time by checking to see if mod_rewrite is installed.
# Fail silently if it is not (reduces WP support calls?)
<IfModule mod_rewrite.c>
#
# Turn on the mod_rewrite rewriting engine
RewriteEngine On
#
# Set the RewriteBase to the default value -- usually another time-waster
RewriteBase /
#
#
# Read the disk (using lots of server resources) and if the
# requested URL-path does not resolve to an existing file
RewriteCond %{REQUEST_FILENAME} !-f
#
# Read the disk again (using even more server resources), and if the
# requested URL-path does not resolve to an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
#
# then rewrite all non-blank URL-paths to WP's /index.php
RewriteRule . /index.php [L]
</IfModule>

I gave up after half an hour or so. It seems quite involved [...] and frankly, I don't want to become an expert in something I may need to deal with only once.

... thus leaving yourself dependent on others for the security and efficiency of your own server, and unprepared to handle security problems or any other trouble which may arise from changes made to your server configuration by your host or by hackers. Sorry, but that's a poor choice, in my opinion. If your server isn't configured correctly, all kinds of unfortunate side effects can result in both operation and search rankings, and potentially put you out of business.

To resolve your "some domains are www, and others not" dilemma:

Options +FollowSymLinks -Indexes -MultiViews
RewriteEngine on
#
# If any "example.com" subdomain (or domain) is requested (possibly
# mis-cased or with fully-qualified domain name or appended port number)
RewriteCond %{HTTP_HOST} ^([^.]+\.)*example\.com\.?(:[0-9]+)?$ [NC]
# and if the hostname is not *exactly* the "non-www" canonical hostname
RewriteCond %{HTTP_HOST} !=example.com
# Externally redirect to canonical hostname
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]
#
# If any "example2.com" subdomain (or domain) is requested
# (possibly mis-cased or with FQDN or appended port number)
RewriteCond %{HTTP_HOST} ^([^.]+\.)*example2\.com\.?(:[0-9]+)?$ [NC]
# and if the requested hostname is not *exactly* the "www" canonical hostname
RewriteCond %{HTTP_HOST} !=www.example2.com
# Externally redirect to canonical hostname
RewriteRule ^(.*)$ http://www.example2.com/$1 [R=301,L]

If you need to handle SSL requests, then the code above will need to be modified/expanded even more. But there are too many variables involving which pages are SSL and which are not to allow posting a general solution.

Jim

Dude_S




msg:4132969
 9:29 pm on May 14, 2010 (gmt 0)

leaving yourself dependent on others for the security and efficiency of your own server, and unprepared to handle security problems or any other trouble which may arise from changes made to your server configuration by your host or by hackers. Sorry, but that's a poor choice, in my opinion

First of all, thanks for your prompt reply. I noticed you're responding to many questions here and you're very helpful and knowledgeable. No doubt, you're an expert in this field.

But what are you going to do when your car breaks down? Are you going to learn about engines and transmissions yourself, or are you going to ask an experienced car mechanic? And if something is wrong with your brake system, aren't you "leaving yourself dependent on others for the security ... of your own [server] life"?

If you get sick or injured, are you going to study medicine to find a fix for yourself, or are you going to see an expert (doctor) and again "leaving yourself dependent on others"?

There are plenty more examples, and you see, that's what people do when they need help with something in an area they don't have to deal with often - they ask the experts. How long did it take you to reach your level of expertise with htaccess and mod_rewrite? I'm sure it's more than a few hours. You can't expect everyone who comes across htaccess once in a blue moon to put in that kind of time themselves. Sure, the experts you consult usually get paid, and you know what? After reading some of your posts, I'd be happy to throw you a "tip" if there was a donation button or something. Unfortunately, it seems I can only give you verbal props.

Back to my question: So basically, Wordpress just routes everything that doesn't exist through its own index.php and that's where the permalinks are being handled, correct? I'm still wondering, then, how on the older blog, where that code is missing in .htaccess, a URL like
http://blog.com/hello-world displays correctly. There is no physical file or directory "hello-world" on the host.

Now again about my "bonus question" - those are two plain-vanilla blogs. No SSL, nothing but standard port 80. So I don't need (I hope) to deal with all those special cases. In fact, if someone tried to request the site with an appended port number, I'd probably assume malicious intent. So what I want to do is:

Typed in URL:
example.com
Redirect and display in browser:
http://www.example.com/
accessed file:
~/public_html/index.php

Typed in URL:
example2.com
Redirect and display in browser:
http://www.example2.com/
accessed file:
~/public_html/index.php

Typed in URL:
www.example.com
Display in browser:
http://www.example.com/
accessed file:
~/public_html/index.php


There are some static files outside of the blog:
Typed in URL:
example.com/report.pdf
Redirect and display in browser:
http://www.example.com/report.pdf
accessed file:
~/public_html/report.pdf

Typed in URL:
www.example2.com/report.pdf
Display in browser:
http://www.example2.com/report.pdf
accessed file:
~/public_html/report.pdf


[I believe the following is handled by WP internally by turning on permalinks, so this may not have to be handled by .htaccess:
Typed in URL:
example.com/hello-world
Redirect and display in browser:
http://www.example.com/hello-world
accessed file:
~/public_html/index.php?p=1
]


So looking at your above code (thanks for the detailed comments, much appreciated!), the best solution is to put two separate blocks for the two domains? I'm just a bit confused because your first block deals with example.com and has no www, and the second block deals with example2.com and has all www. Also, does the dot in example.com not have to be escaped? Thanks again!

jdMorgan




msg:4132988
 10:28 pm on May 14, 2010 (gmt 0)

You need to know enough about brakes (or your health) to determine that you have a problem.

Or, looked at another way, you're working on your brakes based on advice from someone on the phone who cannot actually see your car or its brake system... And what if that person on the phone doesn't actually have the expertise he implies?

You'll need two rules only if one domain needs "www" and the other does not. Otherwise, the rule is simply "if requested hostname is not www dot <something> dot com, then externally redirect to add 'www' to the requested hostname."
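One way that single rule could be written without naming any domain is sketched below. It is an illustration only: it assumes every hostname mapped to the directory is a bare two-part domain, since any other subdomain (for example blog.example.com) would also get "www." prepended.

```apache
RewriteEngine on
#
# If the requested hostname does not already start with "www."
RewriteCond %{HTTP_HOST} !^www\. [NC]
# and a hostname is present (capture it into %1, minus any appended port number)
RewriteCond %{HTTP_HOST} ^([^:]+)
# then externally redirect to the same URL-path on the "www" hostname
RewriteRule ^(.*)$ http://www.%1/$1 [R=301,L]
```

Because the target is built from the captured hostname (%1) rather than a literal domain, the same block serves both example.com and example2.com from one .htaccess file.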

The old install may have used mod_negotiation, AcceptPathInfo, or any one of several other methods to deliver content if the client-requested URL did not resolve to a physically-existing file.

Jim

g1smd




msg:4132989
 10:30 pm on May 14, 2010 (gmt 0)

It looks like the code has example1 and example2 reversed from your explanation, so adjust to suit.

You can see one block of code uses www and the other does not.

Periods need to be escaped in regex patterns, unless the != "exact match" syntax is in use, as here.

And to expand your car analogy, you as the owner of the car need to know that when the fuel gauge dips it's time for a refill, and you need to keep an eye on the tyres and keep them pumped up, and a myriad of other things which if you let them slide will bring you to a halt, some dangerously.

Ah, jd beat me to it by a matter of seconds.

Dude_S




msg:4133122
 9:11 am on May 15, 2010 (gmt 0)

Thanks for the replies, guys. So I guess, to stick with the analogy, the question is whether learning htaccess and mod_rewrite rules is more akin to checking tyre pressure or to doing an engine job. I don't know... do you really believe everybody who wants to set up a self-hosted personal blog or website should learn regex and rewrite rules first? I mean, sure, in an ideal world... but realistically?

Look, at least I'm putting the effort in to understand what the code I'm using does, and not just blindly copying and pasting stuff I find all over the web and having everything turn into a mess. If this was some big commercial project, I'd hire a guy like you to do this, but in real life, I have to do this myself. This and a thousand other things throughout the day, some of which are WAY more important than anything computer-related. Like going to my daughter's soccer match because I already missed the last one. :-) I understand it's a bit of a balancing act to try to figure this out "over the phone" while refusing to make mod_rewrite the center of my life for the next few days. But I think it can be done.

About the missing code in the older blog; it's installed in a subdirectory and WP put its thing in that subdir. I had only looked at the main html directory. Sorry, my bad.

You'll need two rules only if one domain needs "www" and the other does not. Otherwise, the rule is simply "if requested hostname is not www dot <something> dot com, then externally redirect to add 'www' to the requested hostname."
Both domains should always have "www". So you're saying then I need only one rule block. But as described and shown in examples above, I have 2 domains mapping to 1 directory, hence being controlled by 1 htaccess file. Your example has a domain in plaintext in there, so I assume the rule would only apply to that 1 domain. What about the other one? Wouldn't I have to repeat the rule block for the two different domains? Or is there some way to work with placeholders? Or something like "whatever follows, make sure there's always www in the beginning of the line"?

I entered a redirection in my cPanel and it added this to my .htaccess
RewriteCond %{HTTP_HOST} ^example.com$
RewriteRule ^(.*)$ "http\:\/\/www\.example\.com$1" [R=301,L]

For the second domain, it adds a second block of those rules. It seems to work, but it looks quite different from yours. Is anything wrong with this?

[edited by: Dude_S at 9:20 am (utc) on May 15, 2010]

g1smd




msg:4133123
 9:19 am on May 15, 2010 (gmt 0)

Literal periods in patterns need to be escaped.

Literal periods, colons and slashes in the target URL do not need to be escaped.

You'll likely need to add a / before the $1 too.
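Applying those three corrections to the cPanel-generated block quoted above gives something like the following sketch. The [NC] flag is an extra assumption, added to tolerate mis-cased hostnames.

```apache
# Literal period in the pattern is escaped
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
# Target URL is left unescaped and unquoted, with a slash added before $1
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

The slash matters in .htaccess because the path matched by the rule there has its leading slash removed, so without it a request for /report.pdf would be redirected to http://www.example.comreport.pdf.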

That's twice today I've made exactly the same comments about some cPanel-generated redirection code. It's awful. The programmers seem to have no clue what they are doing. Where do I file a bug report?
