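Getting Started

Get telemetry for your app in less than 5 minutes!

This page shows you how to get started with OpenTelemetry in Python.

You will learn how to instrument a simple application automatically, so that traces and metrics are emitted to the console.

Prerequisites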

Ensure that you have the following installed locally:

  • Python 3

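Example Application

The following example uses a basic Flask application. If you are not using Flask, that's OK: you can use OpenTelemetry Python with other web frameworks as well, such as Django and FastAPI. For a complete list of supported frameworks, see the registry.

For more elaborate examples, see examples.

Installation

To begin, set up an environment in a new directory: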

mkdir otel-getting-started
cd otel-getting-started
python3 -m venv .
source ./bin/activate

Now install Flask:

pip3 install 'flask<3'

Create and launch an HTTP Server

Create a file named app.py and add the following code to it:

from random import randint
from flask import Flask

app = Flask(__name__)

@app.route("/rolldice")
def roll_dice():
    return str(roll())

def roll():
    return randint(1, 6)

Run the application with the following command and open http://localhost:8080/rolldice in your web browser to ensure it is working.

flask run -p 8080
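
To check the endpoint without a browser, a short script can request it a few times. The following is a minimal sketch that uses only the standard library; the helper names fetch_text and roll_many are made up for this illustration, and the URL in the usage note assumes the server is listening on port 8080 as started above.

```python
from urllib.request import urlopen


def fetch_text(url):
    """Return the body of a GET request to `url` as text."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def roll_many(url, count, fetch=fetch_text):
    """Request `url` `count` times and return the rolls as integers.

    `fetch` is a hypothetical hook so that the HTTP call can be swapped out.
    """
    return [int(fetch(url)) for _ in range(count)]
```

With the server running in another terminal, calling roll_many("http://localhost:8080/rolldice", 5) returns a list of five rolls.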

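Instrumentation

Automatic instrumentation generates telemetry data on your behalf. There are several options you can take, covered in more detail in Automatic Instrumentation. Here we'll use the opentelemetry-instrument agent.

Install the opentelemetry-distro package, which contains the OpenTelemetry API, the SDK, and the tools opentelemetry-bootstrap and opentelemetry-instrument: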

pip install opentelemetry-distro

Run the opentelemetry-bootstrap command:

opentelemetry-bootstrap -a install

This installs the Flask instrumentation.
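
One way to confirm what the bootstrap step installed is to list the instrumentation packages in the active environment. The sketch below uses the standard library's importlib.metadata; the function name installed_instrumentations is illustrative, and the name prefix is an assumption based on how packages such as the Flask instrumentation are named.

```python
from importlib.metadata import distributions


def installed_instrumentations(prefix="opentelemetry-instrumentation"):
    """Return sorted (name, version) pairs for installed packages
    whose normalized name starts with `prefix`."""
    found = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        # Normalize underscores so both spellings of a name match.
        if name and name.lower().replace("_", "-").startswith(prefix):
            found.add((name, dist.version))
    return sorted(found)


for name, version in installed_instrumentations():
    print(f"{name}=={version}")
```

If the list is empty, the bootstrap command most likely ran in a different environment than the one the script is using.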

Run the instrumented app

You can now run your instrumented app with opentelemetry-instrument and have it print telemetry to the console:

opentelemetry-instrument \
    --traces_exporter console \
    --metrics_exporter console \
    flask run -p 8080

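Open http://localhost:8080/rolldice in your web browser and reload the page a few times. After a while you should see the spans printed in the console, such as the following:

View example output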
{
  "name": "/rolldice",
  "context": {
    "trace_id": "0xdcd253b9501348b63369d83219da0b14",
    "span_id": "0x886c05bc23d2250e",
    "trace_state": "[]"
  },
  "kind": "SpanKind.SERVER",
  "parent_id": null,
  "start_time": "2022-04-27T23:53:11.533109Z",
  "end_time": "2022-04-27T23:53:11.534097Z",
  "status": {
    "status_code": "UNSET"
  },
  "attributes": {
    "http.method": "GET",
    "http.server_name": "127.0.0.1",
    "http.scheme": "http",
    "net.host.port": 5000,
    "http.host": "localhost:5000",
    "http.target": "/rolldice",
    "net.peer.ip": "127.0.0.1",
    "http.user_agent": "curl/7.68.0",
    "net.peer.port": 52538,
    "http.flavor": "1.1",
    "http.route": "/rolldice",
    "http.status_code": 200
  },
  "events": [],
  "links": [],
  "resource": {
    "attributes": {
      "telemetry.sdk.language": "python",
      "telemetry.sdk.name": "opentelemetry",
      "telemetry.sdk.version": "1.14.0",
      "telemetry.auto.version": "0.35b0",
      "service.name": "unknown_service"
    },
    "schema_url": ""
  }
}

The generated span tracks the lifetime of a request to the /rolldice route.
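
Because the span records both a start and an end time, the request's duration can be derived from the console output. The following sketch parses the two timestamps from the example span above; the function name span_duration_us is illustrative, and the timestamp layout is assumed to match what the console exporter printed in that example.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the example span output.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def span_duration_us(span):
    """Return the span's duration in whole microseconds."""
    start = datetime.strptime(span["start_time"], TIME_FORMAT)
    end = datetime.strptime(span["end_time"], TIME_FORMAT)
    return (end - start) // timedelta(microseconds=1)


span = {
    "name": "/rolldice",
    "start_time": "2022-04-27T23:53:11.533109Z",
    "end_time": "2022-04-27T23:53:11.534097Z",
}
print(span_duration_us(span))  # 988
```

For the example span, the request took a little under one millisecond.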

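Send a few more requests to the endpoint, then either wait a little or terminate the app, and you'll see metrics in the console output, such as the following:

View example output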
{
  "resource_metrics": [
    {
      "resource": {
        "attributes": {
          "service.name": "unknown_service",
          "telemetry.auto.version": "0.34b0",
          "telemetry.sdk.language": "python",
          "telemetry.sdk.name": "opentelemetry",
          "telemetry.sdk.version": "1.13.0"
        },
        "schema_url": ""
      },
      "schema_url": "",
      "scope_metrics": [
        {
          "metrics": [
            {
              "data": {
                "aggregation_temporality": 2,
                "data_points": [
                  {
                    "attributes": {
                      "http.flavor": "1.1",
                      "http.host": "localhost:5000",
                      "http.method": "GET",
                      "http.scheme": "http",
                      "http.server_name": "127.0.0.1"
                    },
                    "start_time_unix_nano": 1666077040061693305,
                    "time_unix_nano": 1666077098181107419,
                    "value": 0
                  }
                ],
                "is_monotonic": false
              },
              "description": "measures the number of concurrent HTTP requests that are currently in-flight",
              "name": "http.server.active_requests",
              "unit": "requests"
            },
            {
              "data": {
                "aggregation_temporality": 2,
                "data_points": [
                  {
                    "attributes": {
                      "http.flavor": "1.1",
                      "http.host": "localhost:5000",
                      "http.method": "GET",
                      "http.scheme": "http",
                      "http.server_name": "127.0.0.1",
                      "http.status_code": 200,
                      "net.host.port": 5000
                    },
                    "bucket_counts": [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    "count": 1,
                    "explicit_bounds": [
                      0, 5, 10, 25, 50, 75, 100, 250, 500, 1000
                    ],
                    "max": 1,
                    "min": 1,
                    "start_time_unix_nano": 1666077040063027610,
                    "sum": 1,
                    "time_unix_nano": 1666077098181107419
                  }
                ]
              },
              "description": "measures the duration of the inbound HTTP request",
              "name": "http.server.duration",
              "unit": "ms"
            }
          ],
          "schema_url": "",
          "scope": {
            "name": "opentelemetry.instrumentation.flask",
            "schema_url": "",
            "version": "0.34b0"
          }
        }
      ]
    }
  ]
}
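
The http.server.duration data point above is a histogram: explicit_bounds lists the bucket boundaries and bucket_counts holds one count per bucket, so there is always one more count than there are bounds. The sketch below pairs each count with the range it covers, assuming the OpenTelemetry convention that upper bounds are inclusive; the function name describe_buckets is illustrative.

```python
def describe_buckets(explicit_bounds, bucket_counts):
    """Pair each bucket count with the (lower, upper] range it covers.

    The first bucket is open below and the last one is open above.
    """
    if len(bucket_counts) != len(explicit_bounds) + 1:
        raise ValueError("expected one more bucket count than bounds")
    lowers = [float("-inf")] + list(explicit_bounds)
    uppers = list(explicit_bounds) + [float("inf")]
    return list(zip(lowers, uppers, bucket_counts))


# Values copied from the example output above.
bounds = [0, 5, 10, 25, 50, 75, 100, 250, 500, 1000]
counts = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
for lower, upper, count in describe_buckets(bounds, counts):
    if count:
        print(f"({lower}, {upper}]: {count}")  # (0, 5]: 1
```

The single recorded request (min and max of 1 ms) lands in the bucket covering values above 0 and up to 5 ms.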

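Add manual instrumentation to automatic instrumentation

Automatic instrumentation captures telemetry at the edges of your systems, such as inbound and outbound HTTP requests, but it doesn't capture what's going on inside your application. For that you'll need to write some manual instrumentation. Here's how you can easily link up manual instrumentation with automatic instrumentation.

Traces

First, modify app.py to include code that initializes a tracer and uses it to create a span that is a child of the automatically generated one:

# These are the necessary import declarations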
from opentelemetry import trace

from random import randint
from flask import Flask

# Acquire a tracer
tracer = trace.get_tracer("diceroller.tracer")

app = Flask(__name__)

@app.route("/rolldice")
def roll_dice():
    return str(roll())

def roll():
    # This creates a new span that's the child of the current one
    with tracer.start_as_current_span("roll") as rollspan:
        res = randint(1, 6)
        rollspan.set_attribute("roll.value", res)
        return res

Now run the app again:

opentelemetry-instrument \
    --traces_exporter console \
    --metrics_exporter console \
    flask run -p 8080

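When you send a request to the server, you'll see two spans in the trace emitted to the console, and the one called roll registers its parent as the automatically created one:

View example output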
{
    "name": "roll",
    "context": {
        "trace_id": "0x48da59d77e13beadd1a961dc8fcaa74e",
        "span_id": "0x40c38b50bc8da6b7",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0x84f8c5d92970d94f",
    "start_time": "2022-04-28T00:07:55.892307Z",
    "end_time": "2022-04-28T00:07:55.892331Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "roll.value": 4
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.14.0",
            "telemetry.auto.version": "0.35b0",
            "service.name": "unknown_service"
        },
        "schema_url": ""
    }
}
{
    "name": "/rolldice",
    "context": {
        "trace_id": "0x48da59d77e13beadd1a961dc8fcaa74e",
        "span_id": "0x84f8c5d92970d94f",
        "trace_state": "[]"
    },
    "kind": "SpanKind.SERVER",
    "parent_id": null,
    "start_time": "2022-04-28T00:07:55.891500Z",
    "end_time": "2022-04-28T00:07:55.892552Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.server_name": "127.0.0.1",
        "http.scheme": "http",
        "net.host.port": 5000,
        "http.host": "localhost:5000",
        "http.target": "/rolldice",
        "net.peer.ip": "127.0.0.1",
        "http.user_agent": "curl/7.68.0",
        "net.peer.port": 53824,
        "http.flavor": "1.1",
        "http.route": "/rolldice",
        "http.status_code": 200
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.14.0",
            "telemetry.auto.version": "0.35b0",
            "service.name": "unknown_service"
        },
        "schema_url": ""
    }
}

The parent_id of roll is the same as the span_id of /rolldice, indicating a parent-child relationship!
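
The same check can be done programmatically by matching each span's parent_id against the span_id of the other spans. This is a minimal sketch over the two example spans, trimmed to the fields it needs; the function name children_by_parent is illustrative.

```python
def children_by_parent(spans):
    """Group span names under the name of their parent span."""
    names = {span["context"]["span_id"]: span["name"] for span in spans}
    tree = {}
    for span in spans:
        parent_id = span["parent_id"]
        if parent_id is not None:
            # Fall back to the raw ID when the parent span was not captured.
            parent = names.get(parent_id, parent_id)
            tree.setdefault(parent, []).append(span["name"])
    return tree


# IDs copied from the example output above.
spans = [
    {
        "name": "roll",
        "context": {"span_id": "0x40c38b50bc8da6b7"},
        "parent_id": "0x84f8c5d92970d94f",
    },
    {
        "name": "/rolldice",
        "context": {"span_id": "0x84f8c5d92970d94f"},
        "parent_id": None,
    },
]
print(children_by_parent(spans))  # {'/rolldice': ['roll']}
```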

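Metrics

Now modify app.py to include code that initializes a meter and uses it to create a counter instrument that counts the number of rolls for each possible roll value:

# These are the necessary import declarations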
from opentelemetry import trace
from opentelemetry import metrics

from random import randint
from flask import Flask

tracer = trace.get_tracer("diceroller.tracer")
# Acquire a meter.
meter = metrics.get_meter("diceroller.meter")

# Create a counter instrument to make measurements with
roll_counter = meter.create_counter(
    "dice.rolls",
    description="The number of rolls by roll value",
)

app = Flask(__name__)

@app.route("/rolldice")
def roll_dice():
    return str(roll())

def roll():
    with tracer.start_as_current_span("roll") as rollspan:
        res = randint(1, 6)
        rollspan.set_attribute("roll.value", res)
        # This adds 1 to the counter for the given roll value
        roll_counter.add(1, {"roll.value": res})
        return res

Now run the app again:

opentelemetry-instrument \
    --traces_exporter console \
    --metrics_exporter console \
    flask run -p 8080

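When you send a request to the server, you'll see the roll counter metric emitted to the console, with separate counts for each roll value:

View example output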
{
  "resource_metrics": [
    {
      "resource": {
        "attributes": {
          "telemetry.sdk.language": "python",
          "telemetry.sdk.name": "opentelemetry",
          "telemetry.sdk.version": "1.12.0rc1",
          "telemetry.auto.version": "0.31b0",
          "service.name": "unknown_service"
        },
        "schema_url": ""
      },
      "scope_metrics": [
        {
          "scope": {
            "name": "app",
            "version": "",
            "schema_url": null
          },
          "metrics": [
            {
              "name": "dice.rolls",
              "description": "The number of rolls by roll value",
              "unit": "",
              "data": {
                "data_points": [
                  {
                    "attributes": {
                      "roll.value": 4
                    },
                    "start_time_unix_nano": 1654790325350232600,
                    "time_unix_nano": 1654790332211598800,
                    "value": 3
                  },
                  {
                    "attributes": {
                      "roll.value": 6
                    },
                    "start_time_unix_nano": 1654790325350232600,
                    "time_unix_nano": 1654790332211598800,
                    "value": 4
                  },
                  {
                    "attributes": {
                      "roll.value": 5
                    },
                    "start_time_unix_nano": 1654790325350232600,
                    "time_unix_nano": 1654790332211598800,
                    "value": 1
                  },
                  {
                    "attributes": {
                      "roll.value": 1
                    },
                    "start_time_unix_nano": 1654790325350232600,
                    "time_unix_nano": 1654790332211598800,
                    "value": 2
                  },
                  {
                    "attributes": {
                      "roll.value": 3
                    },
                    "start_time_unix_nano": 1654790325350232600,
                    "time_unix_nano": 1654790332211598800,
                    "value": 1
                  }
                ],
                "aggregation_temporality": 2,
                "is_monotonic": true
              }
            }
          ],
          "schema_url": null
        }
      ],
      "schema_url": ""
    }
  ]
}
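
Each data point in the output above carries the running count for one roll.value attribute, so the per-value counts and the total number of rolls can be read straight from the data points. The sketch below does that for the example output, trimmed to the fields it needs; the function name rolls_by_value is illustrative.

```python
def rolls_by_value(metric):
    """Map each roll.value attribute to its counter value."""
    return {
        point["attributes"]["roll.value"]: point["value"]
        for point in metric["data"]["data_points"]
    }


# Data points copied from the example output above.
metric = {
    "name": "dice.rolls",
    "data": {
        "data_points": [
            {"attributes": {"roll.value": 4}, "value": 3},
            {"attributes": {"roll.value": 6}, "value": 4},
            {"attributes": {"roll.value": 5}, "value": 1},
            {"attributes": {"roll.value": 1}, "value": 2},
            {"attributes": {"roll.value": 3}, "value": 1},
        ]
    },
}
counts = rolls_by_value(metric)
print(sorted(counts.items()))  # [(1, 2), (3, 1), (4, 3), (5, 1), (6, 4)]
print(sum(counts.values()))  # 11
```

There is no data point for a roll value of 2 in the example, because a data point only appears once its attribute combination has been recorded at least once.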

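Send telemetry to an OpenTelemetry Collector

The OpenTelemetry Collector is a critical component of most production deployments. Some examples of when it's beneficial to use a collector:

  • A single telemetry sink shared by multiple services, to reduce the overhead of switching exporters
  • Aggregating traces across multiple services running on multiple hosts
  • A central place to process traces prior to exporting them to a backend

Unless you have just a single service or are experimenting, you'll want to use a collector in production deployments.

Configure and run a local collector

First, save the following collector configuration code to a file in the /tmp/ directory: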

# /tmp/otel-collector-config.yaml
receivers:
  otlp:
    protocols:
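      grpc:
        # Listen on all interfaces so that the port published by Docker is
        # reachable; recent Collector releases bind to localhost by default.
        endpoint: 0.0.0.0:4317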
exporters:
  # NOTE: Prior to v0.86.0 use "logging" instead of "debug".
  debug:
    verbosity: detailed
processors:
  batch:
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
      processors: [batch]
    metrics:
      receivers: [otlp]
      exporters: [debug]
      processors: [batch]

Then run the following Docker command to acquire and run the collector based on this configuration:

docker run -p 4317:4317 \
    -v /tmp/otel-collector-config.yaml:/etc/otel-collector-config.yaml \
    otel/opentelemetry-collector:latest \
    --config=/etc/otel-collector-config.yaml

You will now have a collector instance running locally, listening on port 4317.
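
Before pointing the application at the collector, it can help to confirm that something is accepting connections on that port. The following is a minimal sketch using a plain TCP connection; the function name port_is_open is illustrative, and a successful connection only shows that a process is listening, not that it speaks OTLP.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Prints True when the collector container is up and the port is published.
print(port_is_open("localhost", 4317))
```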

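Modify the command to export spans and metrics via OTLP

The next step is to modify the command to send spans and metrics to the collector via OTLP instead of the console.

To do this, install the OTLP exporter package: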

pip install opentelemetry-exporter-otlp

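The opentelemetry-instrument agent will detect the package you just installed and default to OTLP export the next time it runs.

Run the application

Run the application like before, but don't export to the console: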

opentelemetry-instrument flask run -p 8080

By default, opentelemetry-instrument exports traces and metrics over OTLP/gRPC to localhost:4317, where the collector is listening.
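
These defaults can also be set explicitly through the standard OpenTelemetry environment variables, which is useful when the collector runs elsewhere or when the service should report a name other than unknown_service. The sketch below builds such an environment; the function name otel_env and the service name dice-server are made up for this illustration.

```python
import os


def otel_env(service_name, endpoint="http://localhost:4317", base_env=None):
    """Return a copy of the environment with OpenTelemetry settings added."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(
        {
            "OTEL_SERVICE_NAME": service_name,
            "OTEL_TRACES_EXPORTER": "otlp",
            "OTEL_METRICS_EXPORTER": "otlp",
            "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        }
    )
    return env


env = otel_env("dice-server")  # hypothetical service name
command = ["opentelemetry-instrument", "flask", "run", "-p", "8080"]
print(env["OTEL_EXPORTER_OTLP_ENDPOINT"])  # http://localhost:4317

# To launch the app with these settings:
#   import subprocess
#   subprocess.run(command, env=env, check=True)
```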

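When you access the /rolldice route now, you'll see output in the collector process instead of the flask process, which should look something like this:

View example output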
2022-06-09T20:43:39.915Z        DEBUG   debugexporter/debug_exporter.go:51  ResourceSpans #0
Resource labels:
     -> telemetry.sdk.language: STRING(python)
     -> telemetry.sdk.name: STRING(opentelemetry)
     -> telemetry.sdk.version: STRING(1.12.0rc1)
     -> telemetry.auto.version: STRING(0.31b0)
     -> service.name: STRING(unknown_service)
InstrumentationLibrarySpans #0
InstrumentationLibrary app
Span #0
    Trace ID       : 7d4047189ac3d5f96d590f974bbec20a
    Parent ID      : 0b21630539446c31
    ID             : 4d18cee9463a79ba
    Name           : roll
    Kind           : SPAN_KIND_INTERNAL
    Start time     : 2022-06-09 20:43:37.390134089 +0000 UTC
    End time       : 2022-06-09 20:43:37.390327687 +0000 UTC
    Status code    : STATUS_CODE_UNSET
    Status message :
Attributes:
     -> roll.value: INT(5)
InstrumentationLibrarySpans #1
InstrumentationLibrary opentelemetry.instrumentation.flask 0.31b0
Span #0
    Trace ID       : 7d4047189ac3d5f96d590f974bbec20a
    Parent ID      :
    ID             : 0b21630539446c31
    Name           : /rolldice
    Kind           : SPAN_KIND_SERVER
    Start time     : 2022-06-09 20:43:37.388733595 +0000 UTC
    End time       : 2022-06-09 20:43:37.390723792 +0000 UTC
    Status code    : STATUS_CODE_UNSET
    Status message :
Attributes:
     -> http.method: STRING(GET)
     -> http.server_name: STRING(127.0.0.1)
     -> http.scheme: STRING(http)
     -> net.host.port: INT(5000)
     -> http.host: STRING(localhost:5000)
     -> http.target: STRING(/rolldice)
     -> net.peer.ip: STRING(127.0.0.1)
     -> http.user_agent: STRING(curl/7.82.0)
     -> net.peer.port: INT(53878)
     -> http.flavor: STRING(1.1)
     -> http.route: STRING(/rolldice)
     -> http.status_code: INT(200)

2022-06-09T20:43:40.025Z        INFO    debugexporter/debug_exporter.go:56  MetricsExporter {"#metrics": 1}
2022-06-09T20:43:40.025Z        DEBUG   debugexporter/debug_exporter.go:66  ResourceMetrics #0
Resource labels:
     -> telemetry.sdk.language: STRING(python)
     -> telemetry.sdk.name: STRING(opentelemetry)
     -> telemetry.sdk.version: STRING(1.12.0rc1)
     -> telemetry.auto.version: STRING(0.31b0)
     -> service.name: STRING(unknown_service)
InstrumentationLibraryMetrics #0
InstrumentationLibrary app
Metric #0
Descriptor:
     -> Name: roll_counter
     -> Description: The number of rolls by roll value
     -> Unit:
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: AGGREGATION_TEMPORALITY_CUMULATIVE
NumberDataPoints #0
Data point attributes:
     -> roll.value: INT(5)
StartTimestamp: 2022-06-09 20:43:37.390226915 +0000 UTC
Timestamp: 2022-06-09 20:43:39.848587966 +0000 UTC
Value: 1

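Next steps

There are several options available for automatic instrumentation with Python. See Automatic Instrumentation to learn about them and how to configure them.

There's a lot more to manual instrumentation than just creating a child span. To learn more about initializing manual instrumentation and the OpenTelemetry API, see Manual Instrumentation.

There are several options for exporting your telemetry data with OpenTelemetry. To learn how to export your data to a preferred backend, see Exporters.

If you'd like to explore a more complex example, take a look at the OpenTelemetry Demo, which includes the Python-based Recommendation Service and Load Generator.

Last modified December 13, 2023: improve glossary translation (46f8201b)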