Log Analysis

The beet distribution includes a command-line tool suite for analyzing beet logs. The tool suite can be used to convert binary logs to CSV or XML, apply custom XSL transforms, and perform efficient bulk loads of beet data to an Oracle database. The tools require the following:

You can run any of the following utilities from a command prompt within the created beet directory. All of these instructions assume you are logging in the default format, GZIP-compressed FastInfoSet (i.e. binary XML).

Upload to an Oracle Database

  1. (first time only) Run the provided etl/create_etl.sql script to create the required data structures in your target Oracle database.

  2. Run the import script:

    > ./load-event.sh user/pass@sid path/to/log.bxml.gz

  3. The time required for this process will vary with available system resources, the size of the log, the speed of your connection to the database, and so on. Examine the resulting log files load_event_csv.log and (if any records were rejected) BAD_EVENT_CSV.log. Errors typically occur only if you have tried to insert values too large for the target schema. In that case, you can either update the schema to accommodate the larger values, or truncate the bad data (in BAD_EVENT_CSV.log) and attempt the load again. The provided structures are adequate to handle most needs, so errors should be rare.
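As a minimal sketch of the check described in step 3, the following shell function reports whether the load left any rejected records behind. The function name check_load is hypothetical and not part of the beet distribution. It assumes the two log files are written to the directory you pass in, and that BAD_EVENT_CSV.log holds roughly one rejected record per line.

```shell
#!/bin/sh
# check_load: hypothetical helper, not shipped with beet.
# Usage: check_load [directory-containing-the-load-logs]
# Returns 0 if no rejected records are found, 1 otherwise.
check_load() {
    dir=${1:-.}
    bad="$dir/BAD_EVENT_CSV.log"
    # -s is true only if the file exists and is not empty.
    if [ -s "$bad" ]; then
        # Assumption: one rejected record per line.
        count=$(wc -l < "$bad" | tr -d ' ')
        echo "rejected records: $count line(s) in $bad"
        return 1
    fi
    echo "no rejected records found in $dir"
    return 0
}
```

Running check_load in the directory where you invoked the import script gives a quick pass/fail signal. You should still read load_event_csv.log itself for details.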

Important Note for Cygwin Users:

While the script supports Cygwin, an Oracle sqlldr limitation requires the use of very large temporary files in a Cygwin environment. It is strongly recommended that you execute the upload scripts from a true Unix environment with stronger pipeline support, such as Solaris or Linux.

Use of the provided script is simple, but database administration is up to you. Depending on what you hope to do with your data, you will likely want to customize the ETL process to suit your needs. Familiarity with sqlldr and basic Oracle database administration is therefore assumed here. Examine the provided scripts and make sure you understand what they do before using them.

Export to XML

You can easily export a binary log to a simple XML format legible to humans or other XML processing utilities:

> zcat path/to/log.bxml.gz | java -jar beet-utils.jar -tool xml > result.xml

Problems with zcat

Some systems (like OS X) may not ship with zcat, or may contain a version that is incompatible with the above command line. If zcat is missing or doesn't work, try gzcat or the equivalent gzip -dc. Otherwise, you may have to install zcat, or research the available compression utilities on your host platform.
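One way to avoid depending on a particular zcat is a small wrapper that picks a working decompressor. The sketch below is an illustration only. The name beet_zcat is hypothetical, and it tries gzip -dc first on the assumption that gzip is the most widely available of the three commands.

```shell
#!/bin/sh
# beet_zcat: hypothetical helper, not shipped with beet.
# Writes the decompressed contents of a gzip file to standard output,
# using whichever decompressor is available on this host.
beet_zcat() {
    if command -v gzip >/dev/null 2>&1; then
        gzip -dc "$1"      # decompress to stdout, leaving the file in place
    elif command -v gzcat >/dev/null 2>&1; then
        gzcat "$1"
    else
        zcat "$1"          # last resort; may not handle .gz on every platform
    fi
}
```

The wrapper can then stand in for zcat in any of the pipelines in this section, for example: beet_zcat path/to/log.bxml.gz | java -jar beet-utils.jar -tool xml > result.xml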

Example 1.3. Sample XML Data

<event id="244f5f4e-21d0-4044-b71f-ba39bb96cfbd"
       parent-id="778e4a40-48a8-4386-b93c-72e322663a90">
    <type>jdbc</type><name>executeBatch</name><application>beet-hello</application>
    <start>2009-04-23T13:50:37.625-07:00</start><duration-ms>0</duration-ms>
    <session-id>53FAB2406DB3CFAE8BD4201D6DD60D73</session-id>
    <event-data><sql>delete from HelloData where id=?</sql>
        <batch><parameters><param>1</param></parameters></batch>
    </event-data>
</event>
<event id="778e4a40-48a8-4386-b93c-72e322663a90" parent-id="9d2f5856-193f-4fe7-a7d5-51b5a355e4cf">
    <type>method</type><name>com.mtgi.analytics.example.service.HelloService.delete</name><application>beet-hello</application>
    <start>2009-04-23T13:50:37.531-07:00</start><duration-ms>109</duration-ms>
    <session-id>53FAB2406DB3CFAE8BD4201D6DD60D73</session-id>
    <event-data><parameters><param>{object}</param></parameters><result/></event-data>
</event>
<event id="9d2f5856-193f-4fe7-a7d5-51b5a355e4cf">
    <type>http-request</type><name>/beet-hello/</name><application>beet-hello</application>
    <start>2009-04-23T13:50:37.515-07:00</start><duration-ms>157</duration-ms>
    <event-data uri="/beet-hello/" protocol="HTTP/1.1" method="POST" remote-address="127.0.0.1" remote-host="127.0.0.1">
        <parameters><param name="command"><value>delete</value></param></parameters>
    </event-data>
</event>

Notes on the sample data:

  1. Each event is identified by a globally unique identifier (the id attribute).

  2. Events that are triggered by an enclosing event include a reference to their parent's ID (the parent-id attribute).

  3. Events include user and session ID, if applicable (here, the session-id element).

  4. The event-data element contains flexible type-specific information about a recorded event, such as parameter data, SQL text, and so on.


Be careful: the default compressed-binary format has a compression ratio of around 20:1 compared to its plain-text counterpart, so an exported XML file can take roughly 20 times the disk space of the original log. If you plan to apply an XSL transform to the output document, consider the XSLT mode of the export tool as outlined below.
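Before exporting a large log, it can help to estimate how much disk the result will need. The following sketch multiplies the size of the compressed log by 20, the rough ratio quoted above. The function name estimate_export_bytes is hypothetical, and the real ratio will vary with your data, so treat the result as an order-of-magnitude guide only.

```shell
#!/bin/sh
# estimate_export_bytes: hypothetical helper, not shipped with beet.
# Prints an estimated size in bytes for the plain-text export of a log.
# The factor of 20 is the approximate compression ratio quoted in the
# text, not a measured value.
estimate_export_bytes() {
    compressed=$(wc -c < "$1" | tr -d ' ')   # size of the .bxml.gz in bytes
    echo $((compressed * 20))
}
```

You can compare the printed figure with the free space reported by df for the target directory before starting the export.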

Export to CSV

Similarly, you can export to a CSV file for use in a spreadsheet or older EDI tools:

> zcat path/to/log.bxml.gz | java -jar beet-utils.jar -tool csv > result.csv

Again, assume that your CSV data will be quite a bit larger than the compressed-binary data.

Export with custom XSL

You can certainly use the XML export and stream the result to an XSL transformer. However, executing XSL transforms on large XML documents can be extremely resource-intensive without specialized tools. Included with the log analysis package is an analyzer that splits a large input document into fragments based on an XPath query, and then applies an XSL transform to each fragment. For transforms that are stateless or only need to examine a small part of a document, this is vastly more efficient than loading an entire document to invoke the transform.

The following example splits the input document into one fragment per 'event' element, applying the given XSL transform file to each fragment and streaming the result to standard out:

> zcat path/to/log.bxml.gz | java -jar beet-utils.jar -tool xslt -split event-log/event -xsl sql/etl/insert_events.xsl > result.csv

The included file sql/etl/insert_events.xsl provides an example transform document, including some custom XSL functions available to transforms invoked in this way.