Jaeger: Open Source distributed tracing from Uber

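Jaeger, created at Uber by Yuri Shkuro and released in April 2017, is a distributed tracing system for microservices. It is compatible with OpenTracing and OpenTelemetry, stores traces in Cassandra or Elasticsearch (optionally buffered through Kafka), and ships with a rich UI. It is emerging as a standard tracing tool in observability stacks.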


The distributed tracing problem

Microservice architectures generate call cascades: an HTTP request enters and traverses 5-50 services, each with its own latencies, errors, and downstream dependencies. Per-service logs alone cannot reconstruct that path. Distributed tracing solves this by tagging every span (one unit of work in one service) with a shared trace ID, so that all spans of a single transaction can be correlated.
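As a minimal sketch of how a shared trace ID travels between services, the snippet below builds headers in the W3C Trace Context `traceparent` format (version, trace ID, parent span ID, flags), which OpenTelemetry uses by default. The helper names and the `edge`, `checkout`, and `payments` hops are hypothetical and exist only for illustration.

```python
import secrets


def new_traceparent() -> str:
    """Start a new trace: random 16-byte trace ID, 8-byte span ID, sampled flag set."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def child_traceparent(incoming: str) -> str:
    """Keep the caller's trace ID and mint a fresh span ID for this hop."""
    version, trace_id, _parent_span_id, flags = incoming.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


def trace_id_of(header: str) -> str:
    """Extract the trace ID field from a traceparent header."""
    return header.split("-")[1]


# A request enters at the edge and passes through two more (hypothetical) services.
edge = new_traceparent()
checkout = child_traceparent(edge)
payments = child_traceparent(checkout)

# Every hop carries the same trace ID, which is what lets a backend group the spans.
same_trace = trace_id_of(edge) == trace_id_of(checkout) == trace_id_of(payments)
print(same_trace)
```

Only the span ID changes from hop to hop; the trace ID stays constant for the whole transaction.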

Dapper (Google, 2010) and Zipkin (Twitter, 2012) are the precursors. Uber built Jaeger after outgrowing Zipkin at its internal scale.

The release

Jaeger was released as open source by Uber in April 2017. Lead engineer: Yuri Shkuro. Written in Go, Apache 2.0 licence. Donated to the CNCF in September 2017 as an incubating project; it graduated in October 2019. The name comes from Jäger, German for "hunter".

Architecture

  • Agent — sidecar/DaemonSet that collects spans from services
  • Collector — receives spans, processes, writes to storage
  • Query service + UI — queries storage to display traces
  • Storage — Cassandra (original), Elasticsearch, Badger (embedded), custom backends via gRPC plugin; Kafka can sit in front as a buffer
  • Ingester — Kafka → storage (async deployment)
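The flow between these components can be pictured with a toy in-memory model. This is only a sketch of the data path (agent batches spans, collector validates and writes, query reads by trace ID). The class names, the `batch_size` parameter, and the validation rule are assumptions for illustration and do not mirror Jaeger's actual code or wire protocols.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    trace_id: str
    span_id: str
    service: str
    operation: str
    duration_ms: float


class Storage:
    """Stands in for Cassandra/Elasticsearch: spans indexed by trace ID."""

    def __init__(self):
        self.by_trace = defaultdict(list)

    def write(self, span):
        self.by_trace[span.trace_id].append(span)


class Collector:
    """Receives batches, drops spans without IDs, writes the rest to storage."""

    def __init__(self, storage):
        self.storage = storage

    def receive(self, batch):
        for span in batch:
            if span.trace_id and span.span_id:
                self.storage.write(span)


class Agent:
    """Buffers spans from local services and forwards them in batches."""

    def __init__(self, collector, batch_size=2):
        self.collector = collector
        self.batch_size = batch_size
        self.buffer = []

    def report(self, span):
        self.buffer.append(span)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.collector.receive(self.buffer)
            self.buffer = []


class Query:
    """Reads all stored spans of one trace, as the UI would."""

    def __init__(self, storage):
        self.storage = storage

    def get_trace(self, trace_id):
        return list(self.storage.by_trace.get(trace_id, []))


storage = Storage()
agent = Agent(Collector(storage))
agent.report(Span("t1", "a", "frontend", "GET /cart", 12.0))
agent.report(Span("t1", "b", "cart", "load", 7.5))
agent.report(Span("t2", "c", "frontend", "GET /home", 3.0))
agent.flush()  # push the last partial batch

trace = Query(storage).get_trace("t1")
print([s.service for s in trace])
```

In the asynchronous deployment, Kafka would sit between the collector and storage, with the ingester playing the role of the `write` call here.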

Instrumentation

Supports:

  • OpenTracing API (historical) — earlier cross-vendor interface
  • OpenTelemetry (OTel) — current CNCF standard; Jaeger is an OTel-compatible backend
  • Jaeger native clients — Go, Java, Python, Node, C++ (since deprecated in favour of the OpenTelemetry SDKs)

Many libraries/frameworks have auto-instrumentation: Express, Spring Boot, gRPC, Kafka client, HTTP client, SQL driver.

UI features

  • Trace search — by service, operation, tag, duration
  • Service dependency graph — dependencies between microservices
  • Flame graph — per-span latency
  • Compare traces — diff between two executions
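The first two features reduce to simple operations over stored spans. The sketch below, using made-up sample spans and hypothetical function names, filters traces by service, operation, tag, and minimum duration, then derives service-to-service edges from parent/child links. It illustrates the idea only and is not how Jaeger's query service is implemented.

```python
from collections import Counter

# Made-up sample data: two traces through hypothetical services.
spans = [
    {"trace_id": "t1", "span_id": "a", "parent_id": None, "service": "frontend",
     "operation": "GET /cart", "duration_ms": 120, "tags": {"http.status_code": 200}},
    {"trace_id": "t1", "span_id": "b", "parent_id": "a", "service": "cart",
     "operation": "load", "duration_ms": 80, "tags": {}},
    {"trace_id": "t1", "span_id": "c", "parent_id": "b", "service": "inventory",
     "operation": "check", "duration_ms": 30, "tags": {}},
    {"trace_id": "t2", "span_id": "d", "parent_id": None, "service": "frontend",
     "operation": "GET /cart", "duration_ms": 900, "tags": {"http.status_code": 500}},
    {"trace_id": "t2", "span_id": "e", "parent_id": "d", "service": "cart",
     "operation": "load", "duration_ms": 850, "tags": {"error": True}},
]


def search(spans, service=None, operation=None, tags=None, min_duration_ms=0):
    """Return IDs of traces with at least one span matching every filter."""
    tags = tags or {}
    hits = set()
    for s in spans:
        if service and s["service"] != service:
            continue
        if operation and s["operation"] != operation:
            continue
        if s["duration_ms"] < min_duration_ms:
            continue
        if any(s["tags"].get(k) != v for k, v in tags.items()):
            continue
        hits.add(s["trace_id"])
    return sorted(hits)


def dependency_edges(spans):
    """Count caller -> callee service pairs from parent/child span links."""
    by_id = {(s["trace_id"], s["span_id"]): s for s in spans}
    edges = Counter()
    for s in spans:
        parent = by_id.get((s["trace_id"], s["parent_id"]))
        if parent and parent["service"] != s["service"]:
            edges[(parent["service"], s["service"])] += 1
    return edges


slow = search(spans, service="frontend", min_duration_ms=500)
edges = dependency_edges(spans)
print(slow, dict(edges))
```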

Competitors

  • Zipkin (Twitter, 2012) — historical, simpler
  • AWS X-Ray, Google Cloud Trace — managed
  • Datadog APM, New Relic, Dynatrace — commercial
  • Honeycomb — analytics-oriented

In the Italian context

Jaeger is gaining ground among Italian teams adopting microservices:

  • Telco — TIM, Vodafone for distributed B/OSS traces
  • Fintech and challenger banks — debug transactional flows
  • Large e-commerce with microservices
  • Digital PA — projects with many inter-agency integrations
  • B2B SaaS selling APIs

References: Jaeger (Uber, April 2017). Yuri Shkuro. Apache 2.0 licence. Written in Go. CNCF incubating project since September 2017. OpenTracing / OpenTelemetry backend. Cassandra, Elasticsearch, Kafka storage.
