Friday, September 27, 2013

Classifying metrics

To create meaningful data representations of metrics that are useful to a business, it is important to identify the metrics that matter most. This holds for any type of data representation, be it a classic dashboard, an interactive data environment or a data sonification. Those particular metrics are not always obvious, and companies often invest considerable effort in identifying the metrics that matter most for their business. It is not unusual for those metrics to be hidden in the raw data the company produces during its everyday processes, which means they can only be revealed through calculation. Such a calculation could be the difference between two correlating signals, or the amount by which a particular signal exceeds its standard variance. One can differentiate between metrics that have immediate relevance, such as the availability of the company's services, and metrics that matter in the long run and are examined retrospectively. The data sonification project "Listening to the Heart of Business" focuses on live metrics that can potentially have an immediate impact on the business.
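As a rough, hypothetical sketch of what such a derived calculation might look like (the function names, window size and threshold below are illustrative and not taken from DataShaka's actual pipeline):

import statistics

def deviation_points(signal, window=30, tolerance=2.0):
    """Yield (timestamp, value) pairs where a signal exceeds its usual variance.

    'signal' is a list of (timestamp, value) tuples; 'window' and 'tolerance'
    are illustrative parameters, not actual DataShaka settings.
    """
    values = [v for _, v in signal]
    for i in range(window, len(signal)):
        recent = values[i - window:i]
        mean = statistics.mean(recent)
        stdev = statistics.stdev(recent)
        timestamp, value = signal[i]
        # Flag the point if it strays more than 'tolerance' standard
        # deviations from the recent mean.
        if stdev > 0 and abs(value - mean) > tolerance * stdev:
            yield timestamp, value

def signal_difference(signal_a, signal_b):
    """Pointwise difference between two correlating signals sampled at the same times."""
    return [(t, a - b) for (t, a), (_, b) in zip(signal_a, signal_b)]

Either function turns raw data the company already produces into a metric that can actually be watched, or listened to.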

From a DataShaka perspective, there is a large number of metrics that have to be monitored constantly. DataShaka is a data unification platform: it harvests data from different sources on behalf of its clients, unifies that data, then stores and delivers it. Many data files are constantly harvested, processed, unified and validated on a cloud machine. Furthermore, a Microsoft Azure powered storage platform named DISQ (Dynamic Intelligent Storage Query) stores and delivers that data. As all these processes run constantly in the background and are vital for keeping the heartbeat of the business alive, it is important to know whether everything is running smoothly or whether problems are occurring, and where those problems come from.

Below is a list identifying the specific metrics that matter most for DataShaka, from a business perspective as well as a developer's perspective:

The metrics that matter
  • Number of Data Files
    • processing
    • stuck
  • Cloud Machine Statistics
    • CPU
    • Memory
    • Network
    • Free Disc Space
  • Query Response Speed
    • duration
    • failure
  • UDPs (Unified Data Points)
    • uploaded
    • downloaded
  • Steps processing Data Files
    • failed
    • completed
  • User Login
  • Users logged in
  • Data file process kicked off

All these business metrics are structurally time series data. Additionally, each time series point (TSP) for these metrics contains some pieces of context.
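A minimal sketch of what such a point could look like in code (the field names and example context are purely illustrative, not DataShaka's actual schema):

from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class TimeSeriesPoint:
    """One time series point (TSP): a timestamped value plus its context."""
    time: datetime    # when the measurement or event occurred
    context: dict     # what the value represents, e.g. {"metric": "cpu", "machine": "worker-1"}
    value: float      # the measured value (0 or 1 for simple events)

# Illustrative points: one continuous metric sample and one simple event.
cpu_sample = TimeSeriesPoint(datetime.now(timezone.utc), {"metric": "cpu"}, 73.5)
login_event = TimeSeriesPoint(datetime.now(timezone.utc), {"metric": "user_login"}, 1)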

The first classification that can be made on those metrics is to differentiate between metrics that are essentially just events and only communicate that something particular has happened, and continuous metrics where the actual numbers are relevant. There are also metrics, however, that are essentially events, but where the event carries a particular value that is very relevant. Consequently, there are three categories these metrics can be classified into:
  • Binary Event Metrics
  • Complex Event Metrics
  • Continuous Metrics
Looking at classic sonification techniques, this could be a possible way to apply sound to each type of metric:

Binary Event Metrics => Auditory Icons
Complex Event Metrics => Earcons
Continuous Metrics => Parameter Mapping
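A minimal sketch of how this mapping could be wired up in code, assuming hypothetical placeholder playback functions (a real implementation would drive an audio engine instead of printing):

from enum import Enum

class MetricCategory(Enum):
    BINARY_EVENT = "binary_event"      # something happened (yes/no)
    COMPLEX_EVENT = "complex_event"    # an event that carries a relevant value
    CONTINUOUS = "continuous"          # a stream where the numbers themselves matter

# Placeholder playback functions, purely for illustration.
def play_auditory_icon(context):
    print("auditory icon for", context)

def play_earcon(context, value):
    print("earcon for", context, "shaped by", value)

def update_parameter_mapping(context, value):
    print("mapping", value, "of", context, "onto a sound parameter")

def sonify(category, context, value):
    """Route a metric to a sonification technique based on its category."""
    if category is MetricCategory.BINARY_EVENT:
        play_auditory_icon(context)
    elif category is MetricCategory.COMPLEX_EVENT:
        play_earcon(context, value)
    else:
        update_parameter_mapping(context, value)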

An explanation of sonification techniques can be found in a previous blog post here.

In every case, each metric is a point in time that carries a particular value, whether it is a constantly changing metric (such as the CPU usage of a cloud machine) or a simple event, where the value is binary and only switches between 0 and 1. All these metrics additionally carry context, which is essentially what their value represents (query response time, CPU, etc.).

This particular way of looking at data is consistent with DataShaka's data ontology TCSV, which describes a time-based, content-agnostic and context-driven data representation. This data ontology and its relation to the sonification project will be discussed further in future posts.

Taking the metrics that matter identified above and applying them to the three classes just defined, they can be structured the following way (a small code sketch of this classification follows the list):
  • Binary Event Metrics (Auditory Icon)
    • Query Response Speed
      • failure
    • Steps processing Data Files
      • failed
      • completed
    • User Login
  • Complex Event Metrics (Earcon)
    • Query Response Speed
      • duration (Time)
    • UDPs (Unified Data Points)
      • uploaded (Amount)
      • downloaded (Amount)
    • Data file process kicked off (Size of File)
  • Continuous Metrics (Parameter Mapping)
    • Number of Data Files
      • processing
      • stuck
    • CPU/Memory/Network/etc
    • Free Disc Space
    • Users logged in
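Encoded as a simple lookup (the metric keys are just illustrative names derived from the list above), this classification could then drive the routing of each incoming metric to its sonification technique:

# Illustrative metric keys -> category; in practice this lookup would feed
# the category into the sonification routing sketched earlier.
METRIC_CATEGORIES = {
    # Binary event metrics -> auditory icons
    "query_failure": "binary_event",
    "processing_step_failed": "binary_event",
    "processing_step_completed": "binary_event",
    "user_login": "binary_event",
    # Complex event metrics -> earcons
    "query_duration": "complex_event",
    "udps_uploaded": "complex_event",
    "udps_downloaded": "complex_event",
    "data_file_process_kicked_off": "complex_event",
    # Continuous metrics -> parameter mapping
    "data_files_processing": "continuous",
    "data_files_stuck": "continuous",
    "cpu": "continuous",
    "memory": "continuous",
    "network": "continuous",
    "free_disc_space": "continuous",
    "users_logged_in": "continuous",
}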
