Metrics Data Model

Status: Mixed

Overview

Status: Stable

The OpenTelemetry data model for metrics consists of a protocol specification and semantic conventions for delivery of pre-aggregated metric timeseries data. The data model is designed for importing data from existing systems and exporting data into existing systems, as well as to support internal OpenTelemetry use-cases for generating Metrics from streams of Spans or Logs.

Popular existing metrics data formats can be unambiguously translated into the OpenTelemetry data model for metrics, without loss of semantics or fidelity. Translation from the Prometheus and Statsd exposition formats is explicitly specified.

The data model specifies a number of semantics-preserving data transformations for use on the collection path, supporting flexible system configuration. The model supports reliability and statelessness controls, through the choice of cumulative and delta transport. The model supports cost controls, through spatial and temporal reaggregation.

The OpenTelemetry collector is designed to accept metrics data in a number of formats, transport data using the OpenTelemetry data model, and then export into existing systems. The data model can be unambiguously translated into the Prometheus Remote Write protocol without loss of features or semantics, through well-defined translations of the data, including the ability to automatically remove attributes and lower histogram resolution.

Events → Data Stream → Timeseries

Status: Stable

The OTLP Metrics protocol is designed as a standard for transporting metric data. To describe the intended use of this data and the associated semantic meaning, OpenTelemetry metric data stream types will be linked into a framework containing a higher-level model, about Metrics APIs and discrete input values, and a lower-level model, defining the Timeseries and discrete output values. The relationship between models is displayed in the diagram below.

[Diagram: Events → Data Stream → Timeseries]

This protocol was designed to meet the requirements of the OpenCensus Metrics system, particularly to meet its concept of Metrics Views. Views are accomplished in the OpenTelemetry Metrics data model through support for data transformation on the collection path.

OpenTelemetry has identified three kinds of semantics-preserving Metric data transformation that are useful in building metrics collection systems as ways of controlling cost, reliability, and resource allocation. The OpenTelemetry Metrics data model is designed to support these transformations both inside an SDK, as the data originates, and as a reprocessing stage inside the OpenTelemetry collector. These transformations are:

  1. Temporal reaggregation: Metrics that are collected at a high-frequency can be re-aggregated into longer intervals, allowing low-resolution timeseries to be pre-calculated or used in place of the original metric data.
  2. Spatial reaggregation: Metrics that are produced with unwanted attributes can be re-aggregated into metrics having fewer attributes.
  3. Delta-to-Cumulative: Metrics that are input and output with Delta temporality unburden the client from keeping high-cardinality state. The use of deltas allows downstream services to bear the cost of conversion into cumulative timeseries, or to forego the cost and calculate rates directly.

OpenTelemetry Metrics data streams are designed so that these transformations can be applied automatically to streams of the same type, subject to conditions outlined below. Every OTLP data stream has an intrinsic decomposable aggregate function making it semantically well-defined to merge data points across both temporal and spatial attributes. Every OTLP data point also has two meaningful timestamps which, combined with intrinsic aggregation, make it possible to carry out the standard metric data transformations for each of the model’s basic points while ensuring that the result carries the intended meaning.
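
For example, merging two adjacent delta Sum points is well-defined because addition is the Sum's intrinsic aggregate function, and the two timestamps on each point determine the merged window. Below is a minimal sketch of temporal reaggregation in Go; the `DeltaPoint` type and `mergeAdjacent` helper are hypothetical illustrations, not part of any OpenTelemetry API:

```go
package main

import (
	"fmt"
	"time"
)

// DeltaPoint is a hypothetical delta-temporality Sum data point:
// a value observed over the half-open window (Start, End].
type DeltaPoint struct {
	Start, End time.Time
	Value      float64
}

// mergeAdjacent performs temporal reaggregation: because a Sum's
// intrinsic aggregate function is addition, two adjacent delta points
// merge into a single point covering the combined interval.
func mergeAdjacent(a, b DeltaPoint) DeltaPoint {
	return DeltaPoint{Start: a.Start, End: b.End, Value: a.Value + b.Value}
}

func main() {
	t0 := time.Now()
	// Two 10-second delta points...
	p1 := DeltaPoint{t0, t0.Add(10 * time.Second), 5}
	p2 := DeltaPoint{t0.Add(10 * time.Second), t0.Add(20 * time.Second), 3}
	// ...become one 20-second point with value 8.
	fmt.Println(mergeAdjacent(p1, p2))
}
```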

As in OpenCensus Metrics, metrics data can be transformed into one or more Views, just by selecting the aggregation interval and the desired attributes. One stream of OTLP data can be transformed into multiple timeseries outputs by configuring different Views, and the required Views processing may be applied inside the SDK or by an external collector.
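
To make the spatial case concrete, the sketch below drops one unwanted attribute from a set of Sum points and re-merges the streams that collide, again via addition. The `point` type, the `streamKey` encoding, and the attribute names are hypothetical; real Views processing is configured through the SDK or collector:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// point is a hypothetical Sum data point: its identifying attributes
// plus a value.
type point struct {
	attrs map[string]string
	value float64
}

// streamKey encodes an attribute set canonically (sorted) so that
// points belonging to the same output stream collide on one map key.
func streamKey(attrs map[string]string, drop string) string {
	var parts []string
	for k, v := range attrs {
		if k != drop {
			parts = append(parts, k+"="+v)
		}
	}
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

// dropAttribute performs spatial reaggregation: remove one attribute
// and merge the colliding streams by addition, the Sum's natural merge.
func dropAttribute(points []point, drop string) map[string]float64 {
	out := map[string]float64{}
	for _, p := range points {
		out[streamKey(p.attrs, drop)] += p.value
	}
	return out
}

func main() {
	points := []point{
		{map[string]string{"host": "a", "path": "/users"}, 5},
		{map[string]string{"host": "b", "path": "/users"}, 7},
	}
	// Dropping "host" merges the two streams: map[path=/users:12]
	fmt.Println(dropAttribute(points, "host"))
}
```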

Example Use-cases

The metric data model is designed around a series of “core” use cases. While this list is not exhaustive, it is meant to be representative of the scope and breadth of OTel metrics usage.

  1. OTel SDK exports 10 second resolution to a single OTel collector, using cumulative temporality for a stateful client, stateless server:
    • Collector passes through original data to an OTLP destination
    • Collector re-aggregates into longer intervals without changing attributes
    • Collector re-aggregates into several distinct views, each with a subset of the available attributes, outputs to the same destination
  2. OTel SDK exports 10 second resolution to a single OTel collector, using delta temporality for a stateless client, stateful server:
    • Collector re-aggregates into 60 second resolution
    • Collector converts delta to cumulative temporality
  3. OTel SDK exports both 10 second resolution (e.g. CPU, request latency) and 15 minute resolution (e.g. room temperature) to a single OTel Collector. The collector exports streams upstream with or without aggregation.
  4. A number of OTel SDKs running locally each export 10 second resolution data to a single (local) OTel collector:
    • Collector re-aggregates into 60 second resolution
    • Collector re-aggregates to eliminate the identity of individual SDKs (e.g., distinct service.instance.id values)
    • Collector outputs to an OTLP destination
  5. Pool of OTel collectors receive OTLP and export Prometheus Remote Write
    • Collector joins service discovery with metric resources
    • Collector computes the “up” metric and staleness markers
    • Collector applies a distinct external label
  6. OTel collector receives Statsd and exports OTLP
    • With delta temporality: stateless collector
    • With cumulative temporality: stateful collector
  7. OTel SDK exports directly to 3P backend

These are considered the “core” use-cases used to analyze tradeoffs and design decisions within the metrics data model.

Out of Scope Use-cases

The metrics data model is NOT designed to be a perfect rosetta stone of metrics. Here is a set of use cases that, while not outright unsupported, are not in scope for key design decisions:

  • Using OTLP as an intermediary format between two non-compatible formats
    • Importing statsd => Prometheus PRW
    • Importing collectd => Prometheus PRW
    • Importing Prometheus endpoint scrape => [statsd push | collectd | opencensus]
    • Importing OpenCensus “oca” => any non OC or OTel format
  • TODO: define others.

Model Details

Status: Stable

OpenTelemetry fragments metrics into three interacting models:

  • An Event model, representing how instrumentation reports metric data.
  • A Timeseries model, representing how backends store metric data.
  • A Metric Stream model, defining the OpenTelemetry Protocol (OTLP), representing how metric data streams are manipulated and transmitted between the Event model and the Timeseries storage. This is the model specified in this document.

Event Model

The event model is where recording of data happens. Its foundation is made of Instruments, which are used to record data observations via events. These raw events are then transformed in some fashion before being sent to some other system. OpenTelemetry metrics are designed such that the same instrument and events can be used in different ways to generate metric streams.

Events → Streams

Even though observation events could be reported directly to a backend, in practice this would be infeasible due to the sheer volume of data used in observability systems, and the limited amount of network/CPU resources available for telemetry collection purposes. The best example of this is the Histogram metric where raw events are recorded in a compressed format rather than individual timeseries.

[Diagram: Events → Streams]

Note: The above diagram shows how one instrument can transform events into more than one type of metric stream. There are caveats and nuances for when and how to do this. Instrument and metric configuration are outlined in the metrics API specification.

While OpenTelemetry provides flexibility in how instruments can be transformed into metric streams, the instruments are defined such that a reasonable default mapping can be provided. The exact OpenTelemetry instruments are detailed in the API specification.

In the Event model, the primary data are (instrument, number) points, originally observed in real time or on demand (for the synchronous and asynchronous cases, respectively).

Timeseries Model

In this low-level metrics data model, a Timeseries is defined by an entity consisting of several metadata properties:

  • Metric name
  • Attributes (dimensions)
  • Value type of the point (integer, floating point, etc.)
  • Unit of measurement

The primary data of each timeseries are ordered (timestamp, value) points, with one of the following value types:

  1. Counter (Monotonic, Cumulative)
  2. Gauge
  3. Histogram
  4. Exponential Histogram
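
Put together, a timeseries in this low-level model might be rendered as the following Go types. This is a hypothetical sketch for illustration only, not an actual OTLP or Prometheus type:

```go
package metricsmodel

// Timeseries is a hypothetical rendering of the low-level model:
// identity is the metadata tuple; the data are ordered points.
type Timeseries struct {
	Name       string            // metric name
	Attributes map[string]string // attributes (dimensions)
	ValueType  string            // e.g. "int64" or "double"
	Unit       string            // unit of measurement
	Points     []Point           // ordered by timestamp
}

// Point is one (timestamp, value) entry. Counter and Gauge points
// carry a single number; Histogram points carry more structure.
type Point struct {
	TimestampUnixNano uint64
	Value             float64
}
```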

This model may be viewed as an idealization of Prometheus Remote Write. Like that protocol, we are additionally concerned with knowing when a point value is defined, as compared with being implicitly or explicitly absent. A metric stream of delta data points defines time-interval values, not point-in-time values. To precisely define presence and absence of data requires further development of the correspondence between these models.

Note: Prometheus is not the only possible timeseries model for OpenTelemetry to map into, but is used as a reference throughout this document.

OpenTelemetry Protocol data model

The OpenTelemetry protocol (OTLP) data model is composed of Metric data streams. These streams are in turn composed of metric data points. Metric data streams can be converted directly into Timeseries.

Metric streams are grouped into individual Metric objects, identified by:

  • The originating Resource attributes
  • The instrumentation Scope (e.g., instrumentation library name, version)
  • The metric stream’s name

Including name, the Metric object is defined by the following properties:

  • The data point type (e.g. Sum, Gauge, Histogram, ExponentialHistogram, Summary)
  • The metric stream’s unit
  • The metric stream’s description
  • Intrinsic data point properties, where applicable: AggregationTemporality, Monotonic

The data point type, unit, and intrinsic properties are considered identifying, whereas the description field is explicitly not identifying in nature.

Extrinsic properties of specific points are not considered identifying; these include but are not limited to:

  • Bucket boundaries of a Histogram data point
  • Scale or bucket count of an ExponentialHistogram data point

The Metric object contains individual streams, identified by the set of Attributes. Within the individual streams, points are identified by one or two timestamps; details vary by data point type.

Within certain data point types (e.g., Sum and Gauge) there is variation permitted in the numeric point value; in this case, the associated variation (i.e., floating-point vs. integer) is not considered identifying.
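
The identifying properties above can be summarized as a tuple; the Go struct below is a hypothetical sketch of that identity (not an OTLP message), with the non-identifying description deliberately left out:

```go
package metricsmodel

// MetricIdentity sketches the identifying properties of an OTLP
// Metric object. Description is absent because it is explicitly
// non-identifying; likewise, int-vs-float point value variation and
// extrinsic point properties (e.g. bucket boundaries) do not appear.
type MetricIdentity struct {
	Resource    string // canonicalized Resource attributes
	Scope       string // instrumentation scope name and version
	Name        string // the metric stream's name
	Unit        string // the metric stream's unit
	PointType   string // Sum, Gauge, Histogram, ExponentialHistogram, Summary
	Temporality string // intrinsic, where applicable: delta or cumulative
	Monotonic   bool   // intrinsic, where applicable (Sum only)
}
```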

OpenTelemetry Protocol data model: Producer recommendations

Producers SHOULD prevent the presence of multiple Metric identities for a given name with the same Resource and Scope attributes. Producers are expected to aggregate data for identical Metric objects as a basic feature, so the appearance of multiple Metric identities, considered a “semantic error”, generally requires duplicate conflicting instrument registrations to have occurred somewhere.

Producers MAY be able to remediate the problem, depending on whether they are an SDK or a downstream processor:

  1. If the potential conflict involves a non-identifying property (i.e., description), the producer SHOULD choose the longer string (see the sketch after this list).
  2. If the potential conflict involves similar but disagreeing units (e.g., “ms” and “s”), an implementation MAY convert units to avoid semantic errors; otherwise an implementation SHOULD inform the user of a semantic error and pass through conflicting data.
  3. If the potential conflict involves an AggregationTemporality property, an implementation MAY convert temporality using a Cumulative-to-Delta or a Delta-to-Cumulative transformation; otherwise, an implementation SHOULD inform the user of a semantic error and pass through conflicting data.
  4. Generally, for potential conflicts involving an identifying property (i.e., all properties except description), the producer SHOULD inform the user of a semantic error and pass through conflicting data.
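
As a small illustration of rule 1 above: a producer that sees two descriptions for otherwise identical Metric objects keeps the longer one. The helper below is a hypothetical sketch, not part of any SDK:

```go
package metricsmodel

// mergeDescriptions applies recommendation 1: when two otherwise
// identical Metric objects disagree only on the non-identifying
// description, keep the longer string.
func mergeDescriptions(a, b string) string {
	if len(b) > len(a) {
		return b
	}
	return a
}
```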

When semantic errors such as these occur inside an implementation of the OpenTelemetry API, there is a presumption of a fixed Resource value. Consequently, SDKs implementing the OpenTelemetry API have complete information about the origin of duplicate instrument registration conflicts and are sometimes able to help users avoid semantic errors. See the SDK specification for specific details.

OpenTelemetry Protocol data model: Consumer recommendations

Consumers MAY reject OpenTelemetry Metrics data containing semantic errors (i.e., more than one Metric identity for a given name, Resource, and Scope).

OpenTelemetry does not specify any means for conveying such an outcome to the end user, although this subject deserves attention.

Point kinds

A metric stream can use one of these basic point kinds, all of which satisfy the requirements above, meaning they define a decomposable aggregate function (also known as a “natural merge” function) for points of the same kind.

The basic point kinds are:

  1. Sum
  2. Gauge
  3. Histogram
  4. Exponential Histogram

Comparing the OTLP Metric Data Stream and Timeseries data models, OTLP does not map 1:1 from its point types into timeseries points. In OTLP, a Sum point can represent a monotonic count or a non-monotonic count. This means an OTLP Sum is either translated into a Timeseries Counter, when the sum is monotonic, or a Gauge when the sum is not monotonic.

[Diagram: Stream → Timeseries]

Specifically, Sums in OpenTelemetry always have an intrinsic aggregate function that combines values via addition, so even non-monotonic sums can be aggregated (naturally) via addition. In the timeseries model, you cannot assume that any particular Gauge is a sum, so the default aggregation would not be addition.
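
A sketch of this mapping decision follows; the `TimeseriesKind` type and `kindForSum` function are hypothetical, standing in for the translation logic found in real exporters:

```go
package metricsmodel

// TimeseriesKind is the point kind in the timeseries model.
type TimeseriesKind int

const (
	Counter TimeseriesKind = iota // monotonic, cumulative sum
	Gauge                         // no default additive aggregation
)

// kindForSum maps an OTLP Sum onto the timeseries model: monotonic
// sums translate to Counters, while non-monotonic sums translate to
// Gauges, since a timeseries Gauge is not assumed to be a sum.
func kindForSum(monotonic bool) TimeseriesKind {
	if monotonic {
		return Counter
	}
	return Gauge
}
```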

In addition to the core point kinds used in OTLP, there are also data types designed for compatibility with existing metric formats.

Metric points

Status: Stable

Sums

Sums in OTLP consist of the following:

  • An Aggregation Temporality of delta or cumulative.
  • A flag denoting whether the Sum is monotonic. In the case of metrics, this means the sum is nominally increasing, which we assume without loss of generality.
    • For delta monotonic sums, this means the reader SHOULD expect non-negative values.
    • For cumulative monotonic sums, this means the reader SHOULD expect values that are not less than the previous value.
  • A set of data points, each containing:
    • An independent set of Attribute name-value pairs.
    • A time window of (start, end] for which the Sum was calculated.
      • The time interval is inclusive of the end time.
      • Time values are specified as UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
    • (optional) a set of exemplars (see Exemplars).

The aggregation temporality is used to understand the context in which the sum was calculated. When the aggregation temporality is “delta”, we expect to have no overlap in time windows for metric streams, e.g.

[Diagram: Delta Sum]

Contrast with cumulative aggregation temporality where we expect to report the full sum since ‘start’ (where usually start means a process/application start):

[Diagram: Cumulative Sum]
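
As a worked example with assumed values: a counter that sees 5, 3, and 7 new events across three successive windows reports delta points 5, 3, 7, or cumulative points 5, 8, 15 against a single fixed start time. The conversion sketch below shows why delta-to-cumulative is the stateful direction:

```go
package main

import "fmt"

// deltaToCumulative converts delta Sum values that share one start
// time into the equivalent cumulative sequence. The running total is
// exactly the state a stateless delta client avoids keeping, and that
// a downstream consumer must keep instead.
func deltaToCumulative(deltas []float64) []float64 {
	out := make([]float64, len(deltas))
	var total float64
	for i, d := range deltas {
		total += d
		out[i] = total
	}
	return out
}

func main() {
	fmt.Println(deltaToCumulative([]float64{5, 3, 7})) // [5 8 15]
}
```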

There are various tradeoffs between using Delta vs. Cumulative aggregation, depending on the use case, e.g.:

  • Detecting process restarts
  • Calculating rates
  • Push vs. Pull based metric reporting

OTLP supports both models, and allows APIs, SDKs and users to determine the best tradeoff for their use case.

Gauge

A Gauge in OTLP represents a sampled value at a given time. A Gauge stream consists of:

  • A set of data points, each containing:
    • An independent set of Attribute name-value pairs.
    • A sampled value (e.g. current CPU temperature)
    • A timestamp when the value was sampled (time_unix_nano)
    • (optional) A timestamp (start_time_unix_nano) which best represents the first possible moment a measurement could be recorded. This is commonly set to the timestamp when a metric collection system started.
    • (optional) a set of exemplars (see Exemplars).

In OTLP, a point within a Gauge stream represents the last-sampled event for a given time window.

[Diagram: Gauge]

In this example, we can see an underlying timeseries we are sampling with our Gauge. While the event model can sample more than once for a given metric reporting interval, only the last value is reported in the metric stream via OTLP.

Gauges do not provide an aggregation semantic; instead, the “last sample value” is used when performing operations like temporal alignment or adjusting resolution.
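
For example, lowering the resolution of a Gauge stream keeps only the last sample within each wider interval. The sketch below is a hypothetical illustration of that rule; the `sample` type and `lastPerInterval` helper are not real API surface:

```go
package main

import "fmt"

// sample is a hypothetical (timestamp, value) gauge observation;
// timestamps are nanoseconds since the UNIX Epoch.
type sample struct {
	timeUnixNano uint64
	value        float64
}

// lastPerInterval adjusts resolution using the Gauge semantic:
// within each interval, only the last-sampled value survives.
// Input must be ordered by timestamp.
func lastPerInterval(samples []sample, intervalNano uint64) []sample {
	var out []sample
	for _, s := range samples {
		bucket := s.timeUnixNano / intervalNano
		if len(out) > 0 && out[len(out)-1].timeUnixNano/intervalNano == bucket {
			out[len(out)-1] = s // later sample in the same interval wins
			continue
		}
		out = append(out, s)
	}
	return out
}

func main() {
	in := []sample{{1e9, 20.5}, {5e9, 21.0}, {12e9, 19.8}}
	// With a 10-second interval, the first two samples collapse to the
	// one taken at 5s; the 12s sample starts a new interval.
	fmt.Println(lastPerInterval(in, 10e9))
}
```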

Gauges can be aggregated through transformation into histograms, or other metric types. These operations are not done by default, and require direct user configuration.

Histogram

Histogram metric data points convey a population of recorded measurements in a compressed format. A histogram bundles a set of events into divided populations with an overall event count and aggregate sum for all events.

[Diagram: Delta Histogram]

Histograms consist of the following:

  • An Aggregation Temporality of delta or cumulative.
  • A set of data points, each containing:
    • An independent set of Attribute name-value pairs.
    • A time window of (start, end] for which the Histogram was bundled.
      • The time interval is inclusive of the end time.
      • Time values are specified as nanoseconds since the UNIX Epoch (00:00:00 UTC on 1 January 1970).
    • A count (count) of the total population of points in the histogram.
    • A sum (sum) of all the values in the histogram.
    • (optional) The min (min) of all values in the histogram.
    • (optional) The max (max) of all values in the histogram.
    • (optional) A series of buckets with:
      • Explicit boundary values. These values denote the lower and upper bounds for buckets and whether or not a given observation would be recorded in this bucket.
      • A count of the number of observations that fell within this bucket.
    • (optional) a set of exemplars (see Exemplars).

Like Sums, Histograms also define an aggregation temporality. The picture above denotes Delta temporality where accumulated event counts are reset to zero after reporting and a new aggregation occurs. Cumulative, on the other hand, continues to aggregate events, resetting with the use of a new start time.

The aggregation temporality also has implications on the min and max fields. Min and max are more useful for Delta temporality, since the values represented by Cumulative min and max will stabilize as more events are recorded. Additionally, it is possible to convert min and max from Delta to Cumulative, but not from Cumulative to Delta. When converting from Cumulative to Delta, min and max can be dropped, or captured in an alternative representation such as a gauge.

Bucket counts are optional. A Histogram without buckets conveys a population in terms of only the sum and count, and may be interpreted as a histogram with a single bucket covering (-Inf, +Inf).
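
These definitions give Histograms their natural merge function: for two delta points with identical bucket boundaries, counts, sums, and per-bucket counts add, while min takes the lesser value and max the greater. A hypothetical sketch, assuming the boundaries have already been validated as identical:

```go
package main

import "fmt"

// HistogramPoint is a hypothetical delta histogram data point with
// shared, already-validated bucket boundaries.
type HistogramPoint struct {
	Count        uint64
	Sum          float64
	Min, Max     float64
	BucketCounts []uint64
}

// merge combines two delta points over adjacent windows: everything
// adds, except min and max, which take the extremes.
func merge(a, b HistogramPoint) HistogramPoint {
	out := HistogramPoint{
		Count: a.Count + b.Count,
		Sum:   a.Sum + b.Sum,
		Min:   a.Min,
		Max:   a.Max,
	}
	if b.Min < out.Min {
		out.Min = b.Min
	}
	if b.Max > out.Max {
		out.Max = b.Max
	}
	out.BucketCounts = make([]uint64, len(a.BucketCounts))
	for i := range a.BucketCounts {
		out.BucketCounts[i] = a.BucketCounts[i] + b.BucketCounts[i]
	}
	return out
}

func main() {
	a := HistogramPoint{3, 9.0, 1, 5, []uint64{1, 2, 0}}
	b := HistogramPoint{2, 14.0, 4, 10, []uint64{0, 1, 1}}
	// Merged: Count=5, Sum=23, Min=1, Max=10, BucketCounts=[1 3 1]
	fmt.Printf("%+v\n", merge(a, b))
}
```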

Histogram: Bucket inclusivity

Bucket upper-bounds are inclusive (except for the case where the upper-bound is +Inf) while bucket lower-bounds are exclusive. That is, buckets express the number of values that are greater than their lower bound and less than or equal to their upper bound. Importers and exporters working with OpenTelemetry Metrics data are meant to disregard this specification when translating to and from histogram formats that use inclusive lower bounds and exclusive upper bounds. Changing the inclusivity and exclusivity of bounds is an example of worst-case Histogram error; users should choose Histogram boundaries so that worst-case error is within their error tolerance.
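
In code, upper-inclusive bounds mean a value lands in the first bucket whose upper boundary is greater than or equal to it, which a standard binary search provides directly. A minimal sketch, assuming explicit boundaries sorted in increasing order:

```go
package main

import (
	"fmt"
	"sort"
)

// bucketIndex returns the index of the bucket that records value,
// given explicit boundaries in increasing order. Bucket i covers
// (bounds[i-1], bounds[i]]; index len(bounds) is the overflow bucket
// with upper bound +Inf. SearchFloat64s finds the smallest i such
// that bounds[i] >= value, which makes upper bounds inclusive and
// lower bounds exclusive, as specified.
func bucketIndex(bounds []float64, value float64) int {
	return sort.SearchFloat64s(bounds, value)
}

func main() {
	bounds := []float64{0, 10, 100}
	fmt.Println(bucketIndex(bounds, 10))   // 1: 10 is inclusive in (0, 10]
	fmt.Println(bucketIndex(bounds, 10.5)) // 2: falls in (10, 100]
	fmt.Println(bucketIndex(bounds, 500))  // 3: overflow bucket (100, +Inf)
}
```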