The Lattice Monitoring Framework
The Lattice Monitoring Framework provides functionality to add powerful and flexible monitoring facilities to systems. Lattice has a minimal runtime footprint and is not intrusive, so as not to adversely affect the performance of the system itself or any running applications. The monitoring can be built up from various components provided by the framework, making it possible to create a bespoke monitoring sub-system.
The framework provides data sources, data consumers, and a control strategy. In a large distributed system there may be hundreds or thousands of measurement probes which can generate data. It would not be effective to have all of these probes sending data all of the time, so a mechanism is needed that controls and manages the relevant probes.
Lattice has been utilized within the RESERVOIR service cloud project. For full operation of a RESERVOIR service cloud, monitoring is a vital part of the full control loop that goes from the service management, through a control path, to the Probes which collect and send data, back to the service management which makes various decisions based on the data. The monitoring is a small but fundamental part of RESERVOIR as it allows the integration of components in all of the layers.
Monitoring is a fundamental aspect of a service cloud such as RESERVOIR because it is used by the infrastructure itself and for service management. The monitoring system needs to be pervasive as:
- it is required by most of the components of the service cloud;
- it cuts across the layers of the cloud system creating vertical paths; and
- it spans out across all the service clouds in a federation in order to link all the elements of a service.
Producers and Consumers
The monitoring system itself is designed around the concept of producers and consumers. That is, there are producers of monitoring data, which collect data from probes in the system, and there are consumers of monitoring data, which read the monitoring data. The producers and the consumers are connected via a network which can distribute the measurements collected. The collection of the data and the distribution of data are dealt with by different elements of the monitoring system, so that it is possible to change the distribution framework without changing all the producers and consumers. For example, the distribution framework can change over time, say from IP multicast to an event bus or a publish/subscribe framework. This should not affect too many other parts of the system.
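The separation between collection and distribution can be illustrated with a minimal Java sketch. The names Distribution, InMemoryDistribution and Producer are hypothetical and are not taken from the Lattice API; the in-memory class merely stands in for a real transport such as IP multicast or an event bus.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class DistributionSketch {
    /** Hypothetical abstraction over the distribution framework. */
    interface Distribution {
        void publish(String measurement);
        void subscribe(Consumer<String> consumer);
    }

    /** In-process stand-in for IP multicast, an event bus, or publish/subscribe. */
    static final class InMemoryDistribution implements Distribution {
        private final List<Consumer<String>> subscribers = new ArrayList<>();

        @Override
        public void publish(String measurement) {
            // One publish call reaches every subscriber.
            for (Consumer<String> subscriber : subscribers) {
                subscriber.accept(measurement);
            }
        }

        @Override
        public void subscribe(Consumer<String> consumer) {
            subscribers.add(consumer);
        }
    }

    /** A producer only knows the Distribution interface, not the transport. */
    static final class Producer {
        private final Distribution distribution;

        Producer(Distribution distribution) {
            this.distribution = distribution;
        }

        void report(String measurement) {
            distribution.publish(measurement);
        }
    }

    public static void main(String[] args) {
        Distribution distribution = new InMemoryDistribution();
        distribution.subscribe(m -> System.out.println("consumer A received " + m));
        distribution.subscribe(m -> System.out.println("consumer B received " + m));
        new Producer(distribution).report("cpu.load=0.42");
    }
}
```

Because Producer depends only on the interface, replacing InMemoryDistribution with another implementation would leave the producer and consumer code unchanged.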
Data Sources and Probes
In many systems probes are used to collect data for system management. In this regard, Lattice follows suit. However, to increase the power and flexibility of the monitoring we introduce the concept of a data source. A data source represents an interaction and control point within the system that encapsulates one or more probes. A probe sends a well-defined set of attributes and values to the consumers. This can be done by transmitting the data out at a predefined interval, or transmitting when some change has occurred.
The measurement data itself is sent via a distribution framework. These measurements are encoded to be as small as possible in order to minimise network utilization. Consequently, the measurement meta-data is not transmitted each time, but is kept separately in an information model. This information model can be updated at key points in the lifecycle of a probe and can be accessed as required by consumers.
In order to distribute the measurements collected by the monitoring system, it is necessary to use a mechanism that fits well into a distributed architecture such as the management overlay. We need a mechanism that allows for multiple submitters and multiple receivers of data without having vast numbers of network connections. For example, having many TCP connections from each producer to all of the consumers of the data for that producer would create a combinatorial explosion of connections. Solutions to this include IP multicast, an event service bus, or a publish/subscribe mechanism. In each of these, a producer of data only needs to send one copy of a measurement onto the network, and each of the consumers will be able to collect the same packet of data concurrently from the network.
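As an illustration of a data source acting as a control point over its probes, and of the two transmission styles just described, here is a small Java sketch. The interface and class names (Probe, PeriodicProbe, OnChangeProbe, DataSource, tick, setActive) are assumptions made for this example, not the actual Lattice classes, and a "tick" stands in for one collection interval.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.IntSupplier;

final class DataSourceSketch {
    /** Hypothetical probe contract: poll() is called once per collection tick. */
    interface Probe {
        String name();
        Optional<Integer> poll();
    }

    /** Sends the current reading on every tick (the predefined-interval style). */
    static final class PeriodicProbe implements Probe {
        private final String name;
        private final IntSupplier reader;

        PeriodicProbe(String name, IntSupplier reader) {
            this.name = name;
            this.reader = reader;
        }

        @Override public String name() { return name; }

        @Override public Optional<Integer> poll() {
            return Optional.of(reader.getAsInt());
        }
    }

    /** Sends a reading only when it differs from the last one sent. */
    static final class OnChangeProbe implements Probe {
        private final String name;
        private final IntSupplier reader;
        private Integer lastSent; // null until the first reading is sent

        OnChangeProbe(String name, IntSupplier reader) {
            this.name = name;
            this.reader = reader;
        }

        @Override public String name() { return name; }

        @Override public Optional<Integer> poll() {
            int current = reader.getAsInt();
            if (lastSent != null && lastSent == current) {
                return Optional.empty();
            }
            lastSent = current;
            return Optional.of(current);
        }
    }

    /** Control point that owns probes and can switch each one on or off. */
    static final class DataSource {
        private final List<Probe> probes = new ArrayList<>();
        private final Set<String> disabled = new HashSet<>();

        void addProbe(Probe probe) { probes.add(probe); }

        void setActive(String probeName, boolean active) {
            if (active) disabled.remove(probeName); else disabled.add(probeName);
        }

        /** Runs one tick and returns "name=value" for each value sent. */
        List<String> tick() {
            List<String> sent = new ArrayList<>();
            for (Probe probe : probes) {
                if (disabled.contains(probe.name())) continue;
                probe.poll().ifPresent(v -> sent.add(probe.name() + "=" + v));
            }
            return sent;
        }
    }

    public static void main(String[] args) {
        int[] readings = {5, 5, 7};
        int[] index = {0};
        DataSource source = new DataSource();
        source.addProbe(new PeriodicProbe("periodic", () -> readings[index[0]]));
        source.addProbe(new OnChangeProbe("onChange", () -> readings[index[0]]));
        for (index[0] = 0; index[0] < readings.length; index[0]++) {
            System.out.println(source.tick());
        }
    }
}
```

With the readings 5, 5, 7 the periodic probe sends on all three ticks, while the on-change probe stays silent on the second tick because the value has not changed.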
Design and Implementation Overview
Within Lattice there are implementations of the elements presented in the relationship model shown in figure 3. In this model we see a DataSource, which acts as the control point and a container for one or more Probes. Each Probe defines the attributes that it can send. These are set in a collection of ProbeAttribute objects that specify the name, the type, and the units of each value that can be sent within a measurement.
When a Probe sends a Measurement, the Measurement has a set of values called ProbeValues. The ProbeValues that are sent are directly related to the ProbeAttributes defined within the Probe.
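The relationship between Probe, ProbeAttribute, Measurement and ProbeValue might be sketched in Java as follows. The class names come from the description above, but every field, constructor and method shown here (including the field index and the AttributeType enum) is an assumption for illustration; the real Lattice classes may look quite different.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ProbeModelSketch {
    /** Value types an attribute may carry (an assumed, simplified set). */
    enum AttributeType {
        INTEGER(Integer.class), DOUBLE(Double.class), STRING(String.class);

        final Class<?> javaType;

        AttributeType(Class<?> javaType) {
            this.javaType = javaType;
        }
    }

    /** Definition of one value a probe can send: name, type and units. */
    static final class ProbeAttribute {
        final int field;
        final String name;
        final AttributeType type;
        final String units;

        ProbeAttribute(int field, String name, AttributeType type, String units) {
            this.field = field;
            this.name = name;
            this.type = type;
            this.units = units;
        }
    }

    /** One value inside a measurement, tied to its attribute by field index. */
    static final class ProbeValue {
        final int field;
        final Object value;

        ProbeValue(int field, Object value) {
            this.field = field;
            this.value = value;
        }
    }

    /** The set of values a probe sends at one point in time. */
    static final class Measurement {
        final String probeName;
        final List<ProbeValue> values;

        Measurement(String probeName, List<ProbeValue> values) {
            this.probeName = probeName;
            this.values = Collections.unmodifiableList(values);
        }
    }

    static final class Probe {
        final String name;
        final List<ProbeAttribute> attributes = new ArrayList<>();

        Probe(String name) {
            this.name = name;
        }

        Probe addAttribute(String attributeName, AttributeType type, String units) {
            attributes.add(new ProbeAttribute(attributes.size(), attributeName, type, units));
            return this;
        }

        /** Builds a measurement, checking each raw value against its attribute. */
        Measurement measure(Object... raw) {
            if (raw.length != attributes.size()) {
                throw new IllegalArgumentException("expected " + attributes.size() + " values");
            }
            List<ProbeValue> values = new ArrayList<>();
            for (ProbeAttribute attribute : attributes) {
                Object value = raw[attribute.field];
                if (!attribute.type.javaType.isInstance(value)) {
                    throw new IllegalArgumentException(attribute.name + " expects " + attribute.type);
                }
                values.add(new ProbeValue(attribute.field, value));
            }
            return new Measurement(name, values);
        }
    }

    public static void main(String[] args) {
        Probe memory = new Probe("memory")
                .addAttribute("used", AttributeType.INTEGER, "kilobytes")
                .addAttribute("load", AttributeType.DOUBLE, "percent");
        Measurement measurement = memory.measure(2048, 37.5);
        for (ProbeValue value : measurement.values) {
            ProbeAttribute attribute = memory.attributes.get(value.field);
            System.out.println(attribute.name + " = " + value.value + " " + attribute.units);
        }
    }
}
```

In this sketch a ProbeValue carries only a field index and a value; the name and units are recovered from the matching ProbeAttribute.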
When the system is operating, each Probe reports the collected measurement to the Data Source. The Data Source passes these measurements to a networking layer, where they are encoded into an on-the-wire format, and then sent over the distribution network. The receiver of the monitoring data decodes the data and passes reconstructed Measurements to the monitoring consumer. Encoding measurement data is a common function of monitoring systems as it increases speed and decreases network utilization.
In Lattice, the measurement encoding is made as small as possible by only sending the values for a measurement on the data distribution framework. The definitions for the ProbeAttributes, such as the name and the units, are not transmitted with each measurement, but are held in the information model and are accessed as required.
Lattice is distributed under the GPL licence.
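A rough Java sketch of this idea follows. The wire layout used here (a probe id, a timestamp, then the raw values, all as doubles) and the map-based information model are assumptions chosen for brevity; they are not the actual Lattice on-the-wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class WireEncodingSketch {
    /** Attribute meta-data held in the information model, never sent per measurement. */
    static final class Attribute {
        final String name;
        final String units;

        Attribute(String name, String units) {
            this.name = name;
            this.units = units;
        }
    }

    /** Encodes probe id, timestamp and raw values only: 4 + 8 + 8n bytes. */
    static byte[] encode(int probeId, long timestamp, double... values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(probeId);
            out.writeLong(timestamp);
            for (double value : values) {
                out.writeDouble(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Decodes a packet, looking up names and units in the information model. */
    static Map<String, Double> decode(byte[] packet, Map<Integer, List<Attribute>> infoModel) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(packet))) {
            int probeId = in.readInt();
            in.readLong(); // timestamp, not used further in this sketch
            List<Attribute> attributes = infoModel.get(probeId);
            if (attributes == null) {
                throw new IllegalStateException("no meta-data for probe " + probeId);
            }
            Map<String, Double> decoded = new LinkedHashMap<>();
            for (Attribute attribute : attributes) {
                decoded.put(attribute.name + " (" + attribute.units + ")", in.readDouble());
            }
            return decoded;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical information model: probe id -> ordered attribute definitions.
        Map<Integer, List<Attribute>> infoModel = new HashMap<>();
        infoModel.put(7, Arrays.asList(
                new Attribute("cpu.load", "percent"),
                new Attribute("mem.used", "kilobytes")));

        byte[] packet = encode(7, 1000L, 37.5, 2048.0);
        System.out.println(packet.length + " bytes on the wire");
        System.out.println(decode(packet, infoModel));
    }
}
```

With two values the packet in this sketch is 28 bytes, and it stays that size however long the attribute names and units are, since those live only in the information model.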
You can download the source
You can download the compiled version
You can download the dependent libraries
See the release notes, in PDF.
See the online documentation