other people's cookies

         

lucy24

10:05 pm on Sep 10, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Stop me if you've heard this one. While looking into something else, I discovered that some robots have been sending cookies with their requests-- only they're not my own cookies (with rare exceptions, that means piwik). So what the ### are they?

Sometimes they will send an empty "Cookie:" header-- infuriating, because I can't figure out how to block the request upfront ("absent" and "empty" both translate to "!."). Or, for variety's sake, the cookie string will begin with "=true" meaning, I guess, "I don't know what it is, but I know it exists".

There are lots of session-ID type things: cookies with names beginning in
ASPSESSIONID
JSESSIONID
PHPSESSID
_session_id
phpbb3_
session-id
sid

Lots more beginning in CF:
CFCLIENT
CFGLOBALS
CFID
CFTOKEN

Sometimes the CFs are preceded by the equally inexplicable
Cookie2: $Version=1

Other frequent visitors:
TS01etcetera
guest_idv1 etcetera
ubvt54.209.60.etcetera, where the last bit is not the rest of an AWS IP address but an 18-digit number

Some look as if they're misplaced headers:
X-Mapping
cookie_policy1
countryGB
Countrybe
languageen etc
localede
locales.country=FR
nopop1
noredirect1

One that particularly puzzled me-- and prompted me to take a closer look-- was an apparent human whose second request was blocked due to a
netseer_cm=done
that showed up out of nowhere. (I always check piwik requests originating from the 403 page. Most are all too easily explained, because the referer is listed as semalt or equivalent; the rare others are wrongly excluded humans that need a closer look.)

What the heck?

dstiles

6:59 pm on Sep 11, 2015 (gmt 0)


> Cookie2: $Version=1

I have an exception for (old?) nokia devices but otherwise block it.

lucy24

8:49 pm on Sep 11, 2015 (gmt 0)


> Cookie2: $Version=1

I did a different search. Looky here:
IP: 107.23.45.196
Referer: http://example.com/
Host: example.com
Cookie: nopop=1; noexit=1; nosnd=1
Accept-Language: en-US,en;q=0.8,zh;q=0.6,es;q=0.4
Accept: */*
Connection: close
User-Agent: Mozilla/5.0 (compatible; DuckDuckGo-Favicons-Bot/1.0; +http://duckduckgo.com)
Cookie2: $Version="1"
or
IP: 107.23.45.196
Referer: http://example.com/
Host: example.com
Cookie: XTCsid=d612effe7e71950124d7e94555cc3e4c
Accept-Language: en-US,en;q=0.8,zh;q=0.6,es;q=0.4
Accept: */*
Connection: close
User-Agent: Mozilla/5.0 (compatible; DuckDuckGo-Favicons-Bot/1.0; +http://duckduckgo.com)
Cookie2: $Version="1"
<topic drift>
Honestly, somebody needs to have a talk with DDG and ask why they're trying so, so hard to get their faviconbot blocked six ways from Sunday. It looks as if they only did it for a few months, earlier this year, but still.
</td>

Those are the only anomalous occurrences of "Cookie2" I could find. That is, the only anomalous pattern; there were more than two DDGs of this kind. Everywhere else, "Cookie2" comes immediately before the bogus "Cookie:" line, and its value is 1 without quotation marks.

:: idly wondering if there's an arcane historical reason why "d" stands for "cell" ::

tangor

9:21 pm on Sep 11, 2015 (gmt 0)


@lucy24 ....

You are digging into things I have no clue about, but I will share this from JDMorgan many years back: a "null" in regex that has served me well in other areas. Would this be of use?

"^$"

Seriously, I'm not good at this stuff, just know that one thing has been helpful... and we all miss JD ...

At present, among those posting, you are about as close to JD as any I've seen. Take that as a compliment.

wilderness

1:59 am on Sep 12, 2015 (gmt 0)


> At present, among those posting, you are about as close to JD as any I've seen. Take that as a compliment.


ditto.

Jim was also fluent in Apache.

lucy does things with text editors (searches, sort and replace) that Jim never did.

blend27

2:26 pm on Sep 12, 2015 (gmt 0)


> Lots more beginning in CF:
>
> JSESSIONID ...
> CFCLIENT
> CFGLOBALS
> CFID
> CFTOKEN


urltoken=CFID=5903&CFTOKEN=7f59fd9da5b05c04-92DCDA13-BD53-85E6-C4E6CBC27034AD3E&jsessionid=05A4D2A7BEAD58398E9B8E4F2A8CFE46.cfusion

Those are set by Adobe ColdFusion when the site itself is written in ColdFusion. If the visitor claims to have that cookie in the request headers and your site is written in PHP, it is a bot. You could safely bag the request.
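
For anyone who wants to play with that idea outside the server config, here is a rough Python sketch of the check. It is only an illustration, not anybody's production rule: the prefix list is copied from the cookie names quoted earlier in this thread, the helper names `parse_cookie_names` and `looks_foreign` are made up, and it assumes your own site sets none of these cookies.

```python
# Hypothetical helpers. The prefixes are the framework cookie names quoted
# in this thread; the assumption is that this site never sets any of them.
FOREIGN_PREFIXES = (
    "ASPSESSIONID", "JSESSIONID", "PHPSESSID",
    "CFCLIENT", "CFGLOBALS", "CFID", "CFTOKEN",
)


def parse_cookie_names(cookie_header):
    """Return the cookie names found in a raw Cookie header value."""
    names = []
    for pair in cookie_header.split(";"):
        # Everything before the first "=" is the cookie name.
        name = pair.split("=", 1)[0].strip()
        if name:
            names.append(name)
    return names


def looks_foreign(cookie_header):
    """True if any cookie name starts with a prefix this site never sets."""
    return any(
        name.upper().startswith(FOREIGN_PREFIXES)
        for name in parse_cookie_names(cookie_header)
    )
```

Comparing in upper case means a lower-case "jsessionid" is caught as well. If the site actually runs PHP, PHPSESSID would have to come off the list.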

lucy24

7:27 pm on Sep 12, 2015 (gmt 0)


> If the visitor claims to have that cookie in the request headers and your site is written in PHP

Or, in my case, 95% hand-rolled HTML with the odd SSI.

> You could safely Bag the request.

In fact I recently started doing that very thing, via a RewriteRule that says "IF there's a cookie, and IF the said cookie doesn't start with {short list of cookies that I actually use} THEN 403 the suckers". And then I ran into one from an apparent human.
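
The logic of that rule, written out as a small Python sketch rather than as the actual RewriteRule. The allowlist here is an assumption: Piwik's first-party cookies normally have names starting with "_pk_", but the real short list is not given in this thread, and `status_for_cookie` is a made-up name.

```python
# Assumed allowlist: Piwik's first-party cookie names start with "_pk_".
# The real "short list" from the post is not shown, so this is a stand-in.
OWN_COOKIE_PREFIXES = ("_pk_",)


def status_for_cookie(cookie_header):
    """Return 403 for a non-empty Cookie value that does not start with
    one of the site's own cookie names, otherwise 200."""
    if not cookie_header:
        # Absent or empty: the rule as described does not fire.
        return 200
    if cookie_header.lstrip().startswith(OWN_COOKIE_PREFIXES):
        return 200
    return 403
```

Like the rule as described, this only looks at the start of the header, so a request that leads with one of your own cookies and then adds foreign ones gets through. It also returns 403 for `netseer_cm=done`, which is exactly the apparent-human case mentioned above.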

ColdFusion, eh? That tends to bolster my impression that robots collect all cookies that any site ever sends out, and add them to all subsequent requests everywhere, hoping it will make them look more human.

:: wandering off to test site to see if there's a secure way to express the concept of "present but empty" (as could theoretically also happen if they sent a blank UA or Referer header, resulting in "" rather than "-" in Apache logs) ::
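
To make the "absent" versus "present but empty" problem concrete, here is a toy Python sketch. It assumes the request headers arrive as a plain dict with exact-case names, which real servers may not give you, and both function names are invented for the example. The first helper flattens a missing header to an empty string, which is how the first post reports the rewrite condition behaving; the second keeps the three states apart.

```python
import re

# The "null" pattern quoted earlier in the thread.
EMPTY = re.compile(r"^$")


def as_rewrite_sees_it(headers, name):
    """Flatten a missing header to "", as the thread describes: absent and
    empty become indistinguishable."""
    return headers.get(name, "")


def header_state(headers, name):
    """Tell 'absent', 'empty' and 'present' apart for one header."""
    value = headers.get(name)  # None when the header was never sent
    if value is None:
        return "absent"
    if value.strip() == "":
        return "empty"
    return "present"
```

With the flattened value, `^$` matches both the request that sent no Cookie header and the one that sent an empty one, so the pattern alone cannot separate them. The distinction only survives if the lookup itself can report "never sent".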

keyplyr

1:56 am on Sep 13, 2015 (gmt 0)


The problem with many of the header IF rules is the cloud.

I know that some here don't support mobile to the extent that I do, but just as an FYI: mobile IP tunnels, VPNs, apps, social agents and mobile-specific ISPs often come from dynamic clouds, bouncing around from one cluster to another. I have found that header info can change from one node to another (possibly because some nodes run legacy configs?).

I've had to change several rules more than once because of this, unintentionally blocking beneficial agents/users in the process & causing me to simplify my header filters. Although I now pay no attention to cookies at all, I can see this applying to those that do.