Forum Moderators: phranque

G**gle webmaster tools Blank UA

htaccess user agent blank

         

cyberdyne

11:42 pm on Feb 16, 2012 (gmt 0)

10+ Year Member



I have the following rule in my .htaccess to block visitors with blank UAs. However, G**gle Webmaster Tools insists on trying to pull my favicon with exactly that, a blank UA, whenever I visit their site. The 'exception' conditions don't appear to be working.
What have I written incorrectly?
Many thanks in advance

RewriteCond %{HTTP_USER_AGENT} ^-?$ 
RewriteCond %{REMOTE_ADDR} !^209\.85\.(12[89]|1[3-9][0-9]|2[0-5][0-9])\. [OR]
RewriteCond %{REMOTE_ADDR} !^74\.125\.16\.
RewriteRule .* - [F]

g1smd

12:10 am on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Remove the [OR] flag or put the two ORed items in a single condition.

Think about it. :)

cyberdyne

12:21 am on Feb 17, 2012 (gmt 0)

10+ Year Member



This seems to have worked but only one IP has requested it so far.

RewriteCond %{HTTP_USER_AGENT} ^-?$ 
RewriteCond %{REMOTE_ADDR} !(^209\.85\.(12[89]|1[3-9][0-9]|2[0-5][0-9])\.|^74\.125\.16\.22[0-9])
RewriteRule .* - [F]


Thank you as always.

g1smd

12:52 am on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



No need to repeat the start anchor.

!^(209\.85\.(12[89]|1[3-9][0-9]|2[0-5][0-9])\.|74\.125\.16\.22[0-9])

g1smd

1:02 am on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you are still puzzling over your original logic error, I'll simplify the wording here by using "does not begin 74" for one pattern and "does not begin 209" for the other.

Your code blocks access when the IP "(doesn't begin 74) OR (doesn't begin 209)".

So an IP beginning 88 would be blocked, as would one beginning 99. These requests actually meet both conditions but only need to meet one of them (as it is an OR condition).

An IP beginning 209 meets one of the two conditions in "(not beginning 209) OR (not beginning 74)". As 209 "does not begin 74", that part of the OR rule is true (and only one part needs to be true for an OR rule), so the request is blocked.

The only requests to not be blocked would be those that "began 209" and at the very same time "began 74". That means that nothing would ever be let through by this rule.

That's the danger in having "(not this) OR (not that)" as the condition. What you wanted was "NOT (this OR that)" as a single condition or else you needed "(not this) AND (not that)" on two lines.

It's a long time since I studied logic formally but if I remember rightly "(not this) OR (not that)" is the same as "NOT ((this) AND (that))" and that's NOT what you wanted. :)

cyberdyne

1:21 am on Feb 17, 2012 (gmt 0)

10+ Year Member



I do hope you're benefiting in 'real life' - either financially or in other ways - from your excellent teaching skills!
I completely understood your explanation.
Thank you very much.

lucy24

1:41 am on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Overlapping preceding 2 posts:

Nothing bad will happen if g### can't get at your favicon. Your GWT page just won't look as zippy.

But here's Option B:

<Files "favicon.ico">
Order Allow,Deny
Allow from all
</Files>

and then...

BrowserMatch ^-?$ keep_out
(... and anything else using mod_setenvif)

Order Allow,Deny
Allow from all

Deny from env=keep_out
(etc. with all the other Deny from... lines)
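
For anyone on Apache 2.4 or later, where Order/Allow/Deny are deprecated in favour of Require, a rough equivalent might look like the sketch below. It assumes mod_authz_core and mod_setenvif are loaded and reuses the keep_out name from above; treat it as untested against any particular setup.

```apache
# Sketch for Apache 2.4+ (assumption: mod_authz_core and mod_setenvif are available)

# Flag requests whose User-Agent is empty or a lone hyphen
BrowserMatch ^-?$ keep_out

# Allow everyone except flagged requests
<RequireAll>
Require all granted
Require not env keep_out
</RequireAll>

# Exempt the favicon, so even flagged requests can still fetch it
<Files "favicon.ico">
Require all granted
</Files>
```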

I originally blocked blank UAs with mod_rewrite but for some reason it didn't work properly. (It was a while ago so I can't remember details.)

I put in the favicon exemption for a different reason: to help identify humans who got mislabeled as robots. The faviconbot was just a fringe benefit.

wilderness

2:00 am on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



g1smd,
I don't get this?

The only requests to not be blocked would be those that "began 209" and at the very same time "began 74". That means that nothing would ever be let through by this rule.

That's the danger in having "(not this) OR (not that)" as the condition. What you wanted was "NOT (this OR that)" as a single condition or else you needed "(not this) AND (not that)" on two lines.


How is it possible for one IP to begin with both "209" and "74" simultaneously?

Are you saying that it fails when the combined IPs are on one line, on two lines, or both?

g1smd

2:12 am on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It isn't possible to have an IP that "begins 74" while also "begins 209", and that's why the rule "(not 209) OR (not 74)" fails to operate in the way the designer intended.

When an IP beginning 209 makes a request, the "not 74" condition is true and the request is blocked. By using "OR" only one condition needs to be true to block the request. This leads to the unintended operation seen in the OP.

Using "(NOT 209) OR (NOT 74)" on two lines leads to all blank-UA requests being blocked.

Using "NOT (209 OR 74)" on one line leads to correct operation.

Using "(NOT 209) AND (NOT 74)" on two lines would also lead to correct operation.
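
As a sketch, the two-line AND form is simply the original rule with the [OR] flag dropped, since consecutive RewriteCond lines are ANDed by default. The patterns are copied from the OP and have not been checked against G**gle's current IP ranges.

```apache
# Block blank (or lone hyphen) user agents...
RewriteCond %{HTTP_USER_AGENT} ^-?$
# ...unless the request comes from one of the two exempt ranges.
# No [OR] flag: both "not" conditions must be true for the block to apply.
RewriteCond %{REMOTE_ADDR} !^209\.85\.(12[89]|1[3-9][0-9]|2[0-5][0-9])\.
RewriteCond %{REMOTE_ADDR} !^74\.125\.16\.
RewriteRule .* - [F]
```

This should behave the same as the single-condition "NOT (209 OR 74)" version; which one to use is mostly a matter of readability.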

wilderness

2:41 am on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Very sincere apologies.

None of my rules use NOT.

g1smd

10:27 am on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yeah, it's hard work thinking it through, but I have seen this issue several times before and alarm bells ring whenever I see the ! operator and the [OR] flag used together.

cyberdyne

4:46 pm on Feb 17, 2012 (gmt 0)

10+ Year Member



Lucy, I can confirm your suggestion worked well. Many thanks.

lucy24

10:20 pm on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



alarm bells ring whenever I see the ! operator and the [OR] flag used together

:)

That's in situations like the present one, where both lines apply to the identical element: the beginning of the IP address.

The construction "not-A OR not-B" may be exactly what you want if you're looking at unrelated* things, such as "not from such-and-such IP, OR not asking for such-and-such file".


* Functionally unrelated, that is. They may be related in real life, like the naked faviconbot.
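
To illustrate, a rule where ! and [OR] do belong together might look like the sketch below. The IP range is borrowed from earlier in the thread and the favicon path is an assumption. Because a run of [OR]ed conditions is evaluated as one group, this reads as "blank UA AND (not that IP range OR not the favicon)".

```apache
# Block blank user agents...
RewriteCond %{HTTP_USER_AGENT} ^-?$
# ...unless the request is from the exempt range AND is for the favicon.
# not-A OR not-B: a failure on either count is enough to block.
RewriteCond %{REMOTE_ADDR} !^74\.125\.16\. [OR]
RewriteCond %{REQUEST_URI} !^/favicon\.ico$
RewriteRule .* - [F]
```

Here the two negated conditions test different things (the IP and the requested file), so a request can fail one without failing the other, and the exemption stays narrow.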