homepage Welcome to WebmasterWorld Guest from 54.225.18.187
register, free tools, login, search, subscribe, help, library, announcements, recent posts, open posts,
Subscribe to WebmasterWorld
Visit PubCon.com
Home / Forums Index / Hardware and OS Related Technologies / Linux, Unix, and *nix like Operating Systems
Forum Library, Charter, Moderators: bakedjake

Linux, Unix, and *nix like Operating Systems Forum

    
Making easy graph XML output from log files
Works in Linux and Windows!
incrediBILL




msg:3939112
 10:29 pm on Jun 23, 2009 (gmt 0)

Remember, you can download Windows versions all the Linux command line tools so anyone can use these simple but powerful commands to process mass quantities of data without buying all sorts of fancy expensive software.

As a matter of fact, my target output was for the web, so I used the free version of FusionCharts which is Flash, avoiding tools like Excel as much as possible.

Many PHP programmers might have a first reaction to read in the comma delimited file and process it in PHP code, or LAMPers might convert it so MySQL, but both processes would take quite some time to code when a single command line written in a few seconds can easily produce quality results in a fraction of the time.

What I had was a log file specifically tracking local access to some pages and I finally needed to collate and sort this log file into a format I could import directly into FusionCharts and display a simple graph.

My starting log comma delimited log file looked like this:

date time,country,state,city,IP,etc.,etc.,etc.

or as AWK would see it:

$1,$2,$3,$4,$5,etc.,etc.,etc.

My goal was to extract just the US data and sort the hits on individual States and output the results in FusionChart compatible XML.

Here's the one liner that processes the file into XML:

grep "United States" somelogfile.txt awk -F "," '{print ","$3}' sort uniq -c sort -nr awk -F "," '{printf "<set name=\"%s\" value=\"%d\"/>\n",$2,$1}' > chartdata.xml

NOTE: WebmasterWorld breaks the solid PIPE bar and replaces it with which you'll have to fix for this to work.

Let's dissect this to make some sense of it.

grep "United States" somelogfile.txt

This grep extracts just the log file entries for the US, leaving out the rest of the world and pipes the results into the AWK command

awk -F "," '{print ","$3}'

In my logfile the fields were separated by commas (-F ",") to make parsing with AWK easier because I knew countries and states would have spaces in them so spaces as a delimiter wouldn't fly.

The $3 is the third field, which in my file was the state.

This field ($3) would obviously change for your files or purposes but what comes next should work as-is for most applications.

Note that I added a comma "," in front of every state which will be used later to save some trouble.

sort

As simple as it says, the whole output of states, including the comma prefix, is sorted alphabetically.

uniq -c

Count out all the unique instances of each state and preface the state with the counter, and it will include the comma for easy AWK'ing shortly such as " 102,California"

sort -nr

Now sort the " count, state" organized data numerically and in reverse order so you have the file ordered from the most popular state down to the least.

awk -F "," '{printf "<set name=\"%s\" value=\"%d\"/>\n",$2,$1}' > chartdata.xml

This is the final XML output command using AWK to chop up the " count, state" data into $1 and $2 and then rewrite it in XML format.

Notice I used the "printf" in AWK to do a formatted output which removes all the spaces from the number field via the decimal format output of "%d". Had the spaces remained it would cause FusionGraphs to not like the data and it's a nice clean step.

The final output looks like this..

<set name="California" value="2608"/>
<set name="Florida" value="2601"/>
<set name="Texas" value="2205"/>
<set name="New York" value="1604"/>
<set name="Ohio" value="1408"/>

...etc

Now you just need to preface the XML data with the <graph...> element and append the file with the closing </graph> element and you're done.

Place those in files graphstart.txt and graphend.txt and cat them together
cat graphstart.txt chartdata.xml graphend.txt > mygraphdata.xml

Voila! There you have a nice XML file "mygraphdata.xml" ready to be graphed.

Of course you might need to manually specify the bar or column color codes to make it look pretty, but I can't answer all your problems in a single command line now can I?

:)

[edited by: encyclo at 5:53 pm (utc) on June 24, 2009]
[edit reason] fixed typo [/edit]

 

httpwebwitch




msg:3939473
 1:08 pm on Jun 24, 2009 (gmt 0)

How did you get the nickname "incrediBill"? By beautiful performances like this one. Fantastic!

vincevincevince




msg:3939552
 3:53 pm on Jun 24, 2009 (gmt 0)

Brilliant; awk is something I need to get more used to using instead of clumsy list loops.

If anyone (on windows) wants to get most of the important tools and a linux shell - install CYGWIN - it gives you a real linux prompt with all the fittings and fixtures you expect, within Windows.

Global Options:
 top home search open messages active posts  
 

Home / Forums Index / Hardware and OS Related Technologies / Linux, Unix, and *nix like Operating Systems
rss feed

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Home ¦ Free Tools ¦ Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About ¦ Library ¦ Newsletter
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved