
Apache Web Server Forum

    
htaccess confusion
selective 403s?
SeanL

10+ Year Member



 
Msg#: 241 posted 1:40 am on Jun 18, 2003 (gmt 0)

I have noticed several banned bots showing up in my logs, looking at my guestbook. (One of three guestbooks, actually, but the only one actually named 'guestbook', and the only one that ever gets harvest attempts [but it has no e-mail addresses -- ha!] -- let that be a lesson to ya!) I checked my htaccess file, and sure enough, there's 'Franklin Locator' and others. So I went to the great and wonderful WannaBrowser and tried an experiment. Spoofing Franklin Locator, I tried to request several pages of my site. 403 error each time. Then I tried the guestbook. It was delivered just fine. I tried the other two guestbooks. They were served, too. So I tried other pages and other agents: same results. The banned agents all get a 403 for any page I have tried, except that they get a 200 for the guestbooks.

By the way, I can't imagine it matters, but the guestbooks are in php. I tried another php page and got a 403 as expected.

I would assume it was some error on my part in the htaccess, but it obviously works correctly most of the time. I see the 403 errors in my logs often enough. Does anybody have any clue what this is about?

-- the very confused SeanL

 

jdMorgan

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 241 posted 2:33 am on Jun 18, 2003 (gmt 0)

SeanL,

> some error on my part in the htaccess

You might consider posting that snippet of your code, along with the full local pathnames to all files involved in your test (after generalizing the domain name, of course).

Also, .htaccess will only protect files in the same directory or in directories below the one where it is located... Is your .php stuff in a different directory branch?

Jim

SeanL




 
Msg#: 241 posted 1:43 am on Jun 19, 2003 (gmt 0)

Here are a few sample lines from my htaccess:

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^.*CheckWeb.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*CherryPicker.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*China.Local.Browse.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Franklin.Locator.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Indie.Library.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Industry.Program.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Indy.Library.*$ [NC,OR]
RewriteRule ^.*$ - [F]

In addition to "www", there is a subdomain "tribe." The guestbooks are in this subdomain directory. The identical htaccess is in the root directory and the directory which serves as the root for "tribe." This should cover everything. I know that the htaccess in www does not have any effect on "tribe." And as I said, it worked for any other file I tried to access, just not for the guestbooks.

Still befuddled,
SeanL

jdMorgan




 
Msg#: 241 posted 2:03 am on Jun 19, 2003 (gmt 0)

> Still befuddled...

That makes two, then... :) The only thing I see, which may just be a cut-and-paste error from posting here, is that the last RewriteCond must not have an [OR] flag on it.
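
A minimal sketch of what a correctly terminated condition list could look like, using two of the user agents from the sample above. The only point being illustrated is the flags: every condition except the last carries [OR]. A final condition that still has [OR] is commonly reported to make the rule fire for every request, so it is worth checking the real file rather than the pasted excerpt.

```apache
RewriteEngine on
# Every condition except the last is chained to the next one with [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Franklin.Locator.*$ [NC,OR]
# Last condition in the chain: [NC] only, no [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Indy.Library.*$ [NC]
# If any condition above matched, answer with 403 Forbidden
RewriteRule ^.*$ - [F]
```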

The RewriteRules, while not optimally coded, should work. So there's probably something else going on.

OK, if I request tribe.yourdomain.com/anything, how do you redirect my request to the tribe subdirectory? Is it done with another set of RewriteRules, with an Alias directive, or how? (I'm looking for something that may interfere with your .htaccess files being processed as expected.)
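
For illustration only, one common way a subdomain gets mapped onto a subdirectory from the main site's .htaccess is sketched below. The host name tribe.example.com and the /tribe/ directory are hypothetical stand-ins, not SeanL's actual setup; a separate virtual host with its own DocumentRoot, or an Alias directive in the server config, would be other possibilities.

```apache
RewriteEngine on
# Hypothetical host name: only act on requests for the subdomain
RewriteCond %{HTTP_HOST} ^tribe\.example\.com$ [NC]
# Skip requests that have already been rewritten into the subdirectory
RewriteCond %{REQUEST_URI} !^/tribe/
# Internally rewrite to /tribe/<original path> (hypothetical directory name)
RewriteRule ^(.*)$ /tribe/$1 [L]
```

Rules like these run before the .htaccess file in the target directory is consulted, which is why the mapping method matters when working out which rule set actually applies to a request.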

Jim

SeanL




 
Msg#: 241 posted 10:42 am on Jun 19, 2003 (gmt 0)

yee HAH! I solved it. :)

I sort of accidentally discovered that there was an htaccess file in the directory which contains the guestbooks. All it said was

RewriteEngine on

I don't know what it was doing there. I suppose it was a case of accidentally saving something in progress, and saving it to the wrong directory to boot. I got rid of it, and now, using WannaBrowser, it seems to be working fine.
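
A likely explanation, offered as an assumption rather than something confirmed in this thread: mod_rewrite rules in a parent directory's .htaccess are not inherited by default once a subdirectory's .htaccess contains its own mod_rewrite directives, so the lone RewriteEngine on would have left the guestbook directory with an empty rule set. Deleting the stray file, as done here, is the simplest fix. If a subdirectory does need its own .htaccess with rewrite directives, a sketch of the alternative is:

```apache
# .htaccess in the guestbook directory (hypothetical alternative to deleting the file)
RewriteEngine on
# Also apply the conditions and rules from the parent directory's .htaccess
RewriteOptions inherit
```

Because the ban rule in the parent file matches any path, it should behave the same when inherited into the subdirectory, but it is worth re-testing with a spoofed user agent after any change.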

Somebody give me a cookie now. I deserve one. With macadamia nuts and chocolate chips. :)

SeanL




 
Msg#: 241 posted 10:44 am on Jun 19, 2003 (gmt 0)

By the way, yes, it was a cut-and-paste error. There are more lines of conditions after the last one I included, and the last one does not have OR.

But what's wrong with the conditions? They do work. But I like to be just right, so I don't have to change everything later.

jdMorgan




 
Msg#: 241 posted 8:34 pm on Jun 19, 2003 (gmt 0)

SeanL,

^.*something.*$ is redundant, because the ".*" makes the start and end anchors unnecessary.

^Start means "any string that starts with 'Start'"
End$ means "any string that ends with 'End'"
. means "any single character" and
* means "any number of the preceding character, including zero".

So, ^.*something.*$ means "any string that starts with anything, followed by 'something', and ends with anything."
The same thing applies to the pattern in the RewriteRule.

Given those patterns, you might as well just leave the anchors off and go with:

RewriteCond %{HTTP_USER_AGENT} CheckWeb [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Indy.Library [NC]
RewriteRule .* - [F]
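
Applying the same simplification to all seven sample conditions posted earlier in the thread might look like the sketch below. It assumes these are the only conditions in the list; in the real file, which has more lines, only the very last condition would drop the [OR].

```apache
RewriteEngine on
# Unanchored patterns: each matches its text anywhere in the User-Agent string.
# The "." in names such as Franklin.Locator matches any single character,
# so it stands in for the space in the real user-agent name.
RewriteCond %{HTTP_USER_AGENT} CheckWeb [NC,OR]
RewriteCond %{HTTP_USER_AGENT} CherryPicker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} China.Local.Browse [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Franklin.Locator [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Indie.Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Industry.Program [NC,OR]
# Last condition: no [OR]
RewriteCond %{HTTP_USER_AGENT} Indy.Library [NC]
RewriteRule .* - [F]
```

The same list could also be collapsed into a single condition using alternation, for example (CheckWeb|CherryPicker|Indy.Library) with the [NC] flag, which removes the need for [OR] altogether.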

Anchoring and wildcards in regular expression patterns [etext.lib.virginia.edu] are two things that are easily misunderstood, and there are lots of bad examples floating around. I like to encourage people to study regex enough to basically understand it, rather than just copying stuff off the boards. mod_rewrite can go catastrophically wrong with the slightest typo, so understanding what you deploy is a good idea.

Well rats, I was gonna give you a cookie, but your browser blocks third-party cookies. ;) Glad you got it working, though!

Jim

SeanL




 
Msg#: 241 posted 6:36 am on Jun 20, 2003 (gmt 0)

Thanks, Jim. I was thinking the anchors were necessary. I thought I had read it somewhere. And thanks for the attempt at a cookie. You are right; I do block all third-party cookies. So this means you have already had two parties and haven't invited me to either? I would have accepted a cookie from either of those first two, just not the third. Now your next party, the fourth, send me a cookie.

-- Sean

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved