


htaccess confusion

selective 403s?

1:40 am on Jun 18, 2003 (gmt 0)

10+ Year Member

I have noticed several banned bots showing up in my logs, looking at my guestbook. (One of three guestbooks, actually, but the only one actually named 'guestbook', and the only one that ever gets harvest attempts [but it has no e-mail addresses -- ha!] -- let that be a lesson to ya!) I checked my htaccess file, and sure enough, there's 'Franklin Locator' and others. So I went to the great and wonderful WannaBrowser and tried an experiment. Spoofing Franklin Locator, I tried to request several pages of my site. 403 error each time. Then I tried the guestbook. It was delivered just fine. I tried the other two guestbooks. They were served, too. So I tried other pages and other agents; same results. The banned agents all get a 403 for every page I have tried, except that they get a 200 for the guestbooks.

By the way, I can't imagine it matters, but the guestbooks are in php. I tried another php page and got a 403 as expected.

I would assume it was some error on my part in the htaccess, but it obviously works correctly most of the time. I see the 403 errors in my logs often enough. Does anybody have any clue what this is about?

-- the very confused SeanL

2:33 am on Jun 18, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member


> some error on my part in the htaccess

You might consider posting that snippet of your code, along with the full local pathnames to all files involved in your test (after generalizing the domain name, of course).

Also, .htaccess will only protect files in the same directory or in directories below the one where it is located... Is your .php stuff in a different directory branch?


1:43 am on Jun 19, 2003 (gmt 0)

10+ Year Member

Here are a few sample lines from my htaccess:

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^.*CheckWeb.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*CherryPicker.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*China.Local.Browse.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Franklin.Locator.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Indie.Library.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Industry.Program.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Indy.Library.*$ [NC,OR]
RewriteRule ^.*$ - [F]

In addition to "www", there is a subdomain "tribe." The guestbooks are in this subdomain directory. The identical htaccess is in the root directory and the directory which serves as the root for "tribe." This should cover everything. I know that the htaccess in www does not have any effect on "tribe." And as I said, it worked for any other file I tried to access, just not for the guestbooks.

Still befuddled,

2:03 am on Jun 19, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

> Still befuddled...

That makes two, then... :) The only thing I see may just be a cut and paste error from posting here, and that is that the last RewriteCond must not have an [OR] flag on it.

The RewriteRules, while not optimally coded, should work. So there's probably something else going on.

OK, if I request tribe.yourdomain.com/anything, how do you redirect my request to the tribe subdirectory? Is it done with another set of RewriteRules, with an Alias directive, or how? (I'm looking for something that may interfere with your .htaccess files being processed as expected.)


10:42 am on Jun 19, 2003 (gmt 0)

10+ Year Member

yee HAH! I solved it. :)

I sort of accidentally discovered that there was an htaccess file in the directory which contains the guestbooks. All it said was

RewriteEngine on

I don't know what it was doing there. I suppose it was a case of accidentally saving something in progress, and saving it to the wrong directory to boot. I got rid of it, and now, using WannaBrowser, it seems to be working fine.

Somebody give me a cookie now. I deserve one. With macadamia nuts and chocolate chips. :)
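
A note on why that stray file mattered: by default, mod_rewrite settings in a subdirectory's .htaccess are not merged with those of the parent directory, so a file containing nothing but "RewriteEngine on" replaces the parent's rule set with an empty one for that directory. Deleting the file, as was done here, is the simplest fix. If a per-directory file is actually needed, a minimal sketch (assuming the server allows RewriteOptions in .htaccess, which normally requires AllowOverride FileInfo) would be:

```apache
# Hypothetical .htaccess for the guestbook directory.
# Assumption: the parent directory's .htaccess holds the user-agent ban rules.
RewriteEngine on
# Pull in the RewriteCond/RewriteRule set from the parent directory
# instead of silently replacing it with this (empty) one.
RewriteOptions inherit
```

Whichever route is taken, it is worth re-testing with a spoofed user-agent afterwards, since the order in which inherited rules run relative to local ones depends on the Apache version.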

10:44 am on Jun 19, 2003 (gmt 0)

10+ Year Member

By the way, yes, it was a cut-and-paste error. There are more lines of conditions after the last one I included, and the last one does not have OR.

But what's wrong with the conditions? They do work. But I like to be just right, so I don't have to change everything later.

8:34 pm on Jun 19, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member


^.*something.*$ is redundant, because the ".*" makes the start and end anchors unnecessary.

^Start means "any string that starts with 'Start'"
End$ means "any string that ends with 'End'"
. means "any single character" and
* means "any number of the preceding character, including zero".

So, ^.*something.*$ means "any string that starts with anything, followed by 'something', and ends with anything".
The same thing applies to the pattern in the RewriteRule.

Given those patterns, you might as well just leave the anchors off and go with:

RewriteCond %{HTTP_USER_AGENT} CheckWeb [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Indy.Library [NC]
RewriteRule .* - [F]

Anchoring and wildcards in regular-expression [etext.lib.virginia.edu] patterns are two things that are easily misunderstood, and there are lots of bad examples floating around. I like to encourage people to study regex enough to basically understand it, rather than just copying stuff off the boards - mod_rewrite can go catastrophically wrong with the slightest typo, so it's a good idea.
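
As a possible further simplification, not something the original list requires, the unanchored patterns can be folded into a single condition using regex alternation. This is only a sketch built from the seven agent names quoted earlier in the thread (the full list is longer), and it sidesteps the question of which line carries the [OR] flag because there is only one condition:

```apache
RewriteEngine on
# One condition instead of a chain of [OR]'d lines.
# "." stands in for the space in names such as "Franklin Locator",
# as in the original rules; [NC] makes the match case-insensitive.
RewriteCond %{HTTP_USER_AGENT} (CheckWeb|CherryPicker|China.Local.Browse|Franklin.Locator|Indie.Library|Industry.Program|Indy.Library) [NC]
# Any request from a matching agent gets 403 Forbidden.
RewriteRule .* - [F]
```

The trade-off is readability: one long line is harder to scan and edit than one agent per line, so for a large ban list the [OR] chain may still be the more maintainable choice.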

Well rats, I was gonna give you a cookie, but your browser blocks third-party cookies. ;) Glad you got it working, though!


6:36 am on Jun 20, 2003 (gmt 0)

10+ Year Member

Thanks, Jim. I was thinking the anchors were necessary. I thought I had read it somewhere. And thanks for the attempt at a cookie. You are right; I do block all third-party cookies. So this means you have already had two parties and haven't invited me to either? I would have accepted a cookie from either of those first two, just not the third. Now your next party, the fourth, send me a cookie.

-- Sean

