Forum Moderators: phranque
I did consider removing 1.0 as well but haven't checked on its usage

On my sites, 1.0 is used by a handful of YMMV operators:
SetEnvIfExpr "%{HTTP:Accept-Language} =~ m#^en-us$|^ru$|zh-c#i " accept_lang=foreign:$0 # stop processing of errdocs
<if " (%{REQUEST_URI} =~ m#errdoc#) ">
SetEnvIf Request_URI /errdoc\.php exempt=errdoc
SetEnvIfExpr "%{REQUEST_URI} =~ m#robots\.txt# " bot=robot
Require all granted
</if>

Bing is also a trial. It sometimes comes in, as far as I can tell, straight into the errordoc mode, without first being a legitimate bot hit.

Some years ago, I used to find direct bingbot requests for one of my more specialized error documents. I never did figure out the how or why. But I do know from painful experience that if I have a typo in a link, and dash back an hour later to correct it, bingbot will have come by during that hour, and will continue requesting that incorrect URL for years to come.
<if " (%{REQUEST_URI} =~ m#errdoc#) ">I'm missing something. How can REQUEST_URI be concurrently errdoc.php and robots.txt? And, secondarily, why does it require a SetEnvIfExpr instead of a simple SetEnvIf like the previous line?
SetEnvIf Request_URI /errdoc\.php exempt=errdoc
SetEnvIfExpr "%{REQUEST_URI} =~ m#robots\.txt# " bot=robot
Require all granted
</if>
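For what it's worth, on the SetEnvIf-versus-SetEnvIfExpr question: for a plain substring match against the URL, the two forms look interchangeable. A hedged sketch (directive spellings per the mod_setenvif docs; the env var names are just placeholders, not anyone's actual config):

```apache
# Classic form: SetEnvIf matches a request characteristic against a regex.
SetEnvIf Request_URI "robots\.txt" bot=robot

# Expression form: the same test written in ap_expr syntax.
SetEnvIfExpr "%{REQUEST_URI} =~ m#robots\.txt#" bot=robot
```

The expression form only earns its keep when the condition needs things plain SetEnvIf can't express: boolean combinations, comparisons, or variables other than the fixed set of request characteristics.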
157.55.39.43 - - [15/Aug/2021:09:40:07 +0100] "GET /robots.txt HTTP/2.0" 403 212 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.39.43 - - [15/Aug/2021:09:40:08 +0100] "GET /robots.txt HTTP/2.0" 200 1063 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

BrowserMatch ^$ useragent=none
SetEnvIfExpr "%{HTTP_USER_AGENT} =~ m#^$# " useragent=none
Still getting stupid bingbot log entries, still trying to discover why.

Yeah, stupidity and success don't usually go together, but bing seems to have mastered the formula.
BrowserMatch ^$ useragent=none

Huh. But then you've got an env var called "useragent" which has content regardless of whether the UA exists or not. Or is this the only circumstance in which you set it? Might be a good idea to track environmental variables and make sure "useragent" isn't coming out with unexpected values, as with "foreign" a few posts back.
BrowserMatch ^-?$ noagent
where the -? part is a leftover from misunderstanding logs. (If a header field is empty, logs say "" while if it is absent they say "-".) File under: Not needed, but does no harm. And then noagent becomes one of the listed RequireNone conditions. For a while I had to unset it for facebook, but mercifully they seem to have dropped this unwise idea.
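As a sanity check, the shape I'd expect that to take — a minimal sketch, assuming Apache 2.4 authorization containers; the surrounding Require lines are placeholders, not the poster's actual config:

```apache
# Flag requests that arrive with an empty (or literal "-") User-Agent header.
BrowserMatch ^-?$ noagent

<RequireAll>
    # ...whatever other conditions must pass...
    Require all granted
    <RequireNone>
        # Veto the request when the noagent flag was set above.
        Require env noagent
    </RequireNone>
</RequireAll>
```

A <RequireNone> container vetoes the request when any of its conditions match; when none match it stays neutral, so the enclosing <RequireAll> is unaffected.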
BrowserMatch ^$ useragent=none
SetEnvIfExpr "%{HTTP_USER_AGENT} =~ m#^$# " useragent=none THAT'S what m#blahblah# means
A few of them didn't populate $0 until I enclosed the m#...# in parentheses as in m#(...)#.

Oh! I had a thought. I learned a while back, by direct painful experience, that in vanilla SetEnvIf, $0 only works as intended if the pattern includes something that could hypothetically be interpreted as a Regular Expression, like for example a . period. If the pattern consists only of characters that have no special RegEx meaning--i.e. no anchors, no parentheses and so on--then the intended $0 comes through as the literal string "$0". Perhaps SetEnvIfExpr works analogously.
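If that carries over, the parenthesizing workaround would look something like this — a hedged sketch; the header values and env var names here are made up for illustration:

```apache
# Pattern with no regex metacharacters: $0 has been observed to come
# through as the literal string "$0" in vanilla SetEnvIf.
SetEnvIf Accept-Language enus lang=$0

# Wrapping the pattern in a capture group gives the engine something
# to hold on to; $1 (or $0) then expands to the matched text.
SetEnvIfExpr "%{HTTP:Accept-Language} =~ m#(en-us|ru|zh-c)#i" lang=$1
```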