Oh, the old UAs

Trying to archive some old data here...

         

blend27

9:31 pm on May 15, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Some of us, me included, love to keep old data on UAs, in my case going back to 2006-ish: usage, frequency..., whether it (or I) was a bot or not, or SV1 for that matter.

Rebuilding the indexes here, I like to keep it all organized.

UA, Headers usage and what not....?

What would you keep?

btw, I have 19303 unique UAs of '%NT 5.1%' here... :)
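
For anyone wanting to run the same kind of count outside the database, here is a minimal sketch. It assumes the UAs have already been pulled into a list of strings; the function name and the sample UA strings are illustrative only, not taken from the data above. Note that a plain substring test is case-sensitive, while SQL `LIKE` may not be, depending on the database and collation.

```python
def unique_uas_matching(user_agents, needle="NT 5.1"):
    """Return the set of distinct UA strings that contain `needle`.

    Rough equivalent of SELECT DISTINCT ... WHERE ua LIKE '%NT 5.1%',
    except that this substring test is always case-sensitive.
    """
    return {ua for ua in user_agents if needle in ua}


# Illustrative input only: two hits on the same UA, one non-matching UA.
sample = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
]
matches = unique_uas_matching(sample)
print(len(matches))
```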

[edited by: blend27 at 10:28 pm (utc) on May 15, 2021]

not2easy

10:22 pm on May 15, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Unless I was thinking of posterity trivia, not very many. But that's me.

blend27

10:25 pm on May 15, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



is Lucy home?

iamlost

2:25 am on May 16, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I, too, have been tracking headers for years as part of, initially, bot defence, and later personalisation.

These days there are two parts: (1) the historical data store for reference as required and (2) the various rules that have been developed (always subject to change) as a pattern against which to test a visitor.

The request header is the foundation of visitor analytics. Yes, much can be spoofed, obfuscated, or simply left blank but that too is a tale to track :)
But then iamlost

lucy24

2:37 am on May 16, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



-- is Lucy home? --
Yes, but the same cannot be said of Lucy's computer, which died last month with little-to-no warning.

:: insert various “sobbing brokenly” emoticons ad lib ::

As a result, I currently don't have access to logs covering November 2019 - mid-April 2021. But not2easy or someone like her can probably dig up the whole series of “At home with the robots” posts. Some information is also at (ahem, cough-cough) example.com/fun/robots/ though I've been a bit capricious about keeping or not keeping UA details for older versions.

I've never bothered to keep headers more than a year or so, meaning no accumulated information on headers past and present. Some robotic basics will never change (hurrah! no Accept header, blocked at the gate, out of sight out of mind), while some do change over the years. For example, Upgrade-Insecure-Requests used to convey information, but is now so common that you can't use it as a Human Identifier. Compare years ago, when you could safely assume that if the UA started with “Mozilla”, it was human. And then there was that spell of about a year and a half when FB sometimes didn't send a UA, so you had to poke all kinds of holes.
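
A rough sketch of that kind of gate check, assuming the request headers arrive as a plain dict. The function name and the missing-UA rule are assumptions for illustration, not anyone's actual ruleset; Upgrade-Insecure-Requests and the “Mozilla” prefix are deliberately left out, since neither says much any more.

```python
def gate_check(headers):
    """Return a reason string if the request fails a basic check, else None.

    `headers` is a dict of header name -> value. Names are compared
    case-insensitively, as HTTP header names are not case-sensitive.
    """
    seen = {name.lower(): value for name, value in headers.items()}

    # The rule from the post: no Accept header, blocked at the gate.
    if not seen.get("accept", "").strip():
        return "no Accept header"

    # Assumed extra rule: an empty or missing UA. In practice this one needs
    # exceptions (holes) for known senders that sometimes omit it.
    if not seen.get("user-agent", "").strip():
        return "no User-Agent header"

    return None


print(gate_check({"User-Agent": "ExampleBot/1.0"}))
print(gate_check({"Accept": "text/html", "User-Agent": "Mozilla/5.0"}))
```

The first call fails on the missing Accept header; the second passes both checks and returns None.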

not2easy

3:05 am on May 16, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



;) The whole series is pinned at the top of the Search Engine Spider and User Agent Identification [webmasterworld.com] forum.

The most recent we have (for which we are ever so grateful) is the 2020 version: [webmasterworld.com...]

blend27

5:15 pm on May 16, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



-- Yes, but the same cannot be said of Lucy's computer, which died last month with little-to-no warning. --

I keep headers in a DB: 2 tables.

headers & header_values.

headers:
header_id
header_name
date_added

header_values:
header_values_pair_id
header_value_text
header_id (from prev. table)
date_added

...then for each request there is a quick loop (inner join) to get the list of header_values_pair_id values, and that list is stored in the DB as a string for each request made. It gives me the flexibility to look things up on the fly, both when "investigating" and while "authenticating".
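
A minimal sketch of that layout using SQLite. The two table and column names follow the lists above; everything else is an assumption made for illustration: the `requests` table, the function names, the UNIQUE constraints, and the comma-separated format of the stored ID string.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the two lookup tables plus a hypothetical per-request table."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS headers (
            header_id   INTEGER PRIMARY KEY,
            header_name TEXT NOT NULL UNIQUE,
            date_added  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TABLE IF NOT EXISTS header_values (
            header_values_pair_id INTEGER PRIMARY KEY,
            header_value_text     TEXT NOT NULL,
            header_id             INTEGER NOT NULL REFERENCES headers(header_id),
            date_added            TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            UNIQUE (header_id, header_value_text)
        );
        -- Assumed: one row per request, holding the pair-ID list as a string.
        CREATE TABLE IF NOT EXISTS requests (
            request_id INTEGER PRIMARY KEY,
            pair_ids   TEXT NOT NULL
        );
    """)
    return conn


def pair_id(conn, name, value):
    """Return the ID for a (header name, value) pair, adding rows if new."""
    conn.execute("INSERT OR IGNORE INTO headers (header_name) VALUES (?)", (name,))
    conn.execute(
        "INSERT OR IGNORE INTO header_values (header_value_text, header_id) "
        "SELECT ?, header_id FROM headers WHERE header_name = ?",
        (value, name),
    )
    row = conn.execute(
        "SELECT v.header_values_pair_id FROM header_values v "
        "INNER JOIN headers h ON h.header_id = v.header_id "
        "WHERE h.header_name = ? AND v.header_value_text = ?",
        (name, value),
    ).fetchone()
    return row[0]


def log_request(conn, request_headers):
    """Store one request as a comma-separated string of pair IDs, in header order."""
    ids = [pair_id(conn, name.lower(), value) for name, value in request_headers]
    pair_ids = ",".join(str(i) for i in ids)
    conn.execute("INSERT INTO requests (pair_ids) VALUES (?)", (pair_ids,))
    conn.commit()
    return pair_ids


conn = open_store()
first = log_request(conn, [("User-Agent", "ExampleUA/1.0"), ("Accept", "*/*")])
second = log_request(conn, [("Accept", "*/*"), ("User-Agent", "ExampleUA/2.0")])
print(first, second)
```

Keeping the IDs in the order the headers arrived means the stored string also records header order, which can be a signal in its own right. Sorting the IDs first would instead make the string a canonical fingerprint for the header set.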

SumGuy

11:42 pm on May 18, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



I have logs going back to 1998-1999. Still running IIS-4 on NT4, motherboard changed once. The twin to that server, also running NT4, handles SMTP, and likewise I have logs going back to the same time frame.

blend27

1:17 am on May 24, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



-- IIS-4 on NT4 --
I'm the proud owner of an aluminum ATX tower that used to run that, now converted into a flower box. The motherboard is still in it; some snails destroyed the original power supply a few years back! :)

iamlost

4:57 am on May 24, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



blend27: you left the MCP intact and in control of toxic energised super snails?
No doubt it’s an Audrey II ‘flower’ box...

blend27

4:09 pm on May 24, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



-- toxic energised super snails --

Larger birds seem to go nuts after them, late at night!