Prometheus: monitoring and alerting for the cloud-native era

Prometheus combines a pull model for monitoring with multidimensional time series, PromQL, service discovery and built-in alerting. It became the second project hosted by the CNCF, after Kubernetes.


From SoundCloud to the CNCF

Traditional monitoring tools struggle with infrastructures where services appear and disappear constantly: orchestrated containers, dynamically scaled microservices, ephemeral instances that change IP address every few minutes. Prometheus addresses precisely this scenario. It grew out of work at SoundCloud by former Google engineers, including Matt Proud and Julius Volz, who drew on their experience with Borgmon, Google's internal monitoring system.

In 2016 Prometheus joined the Cloud Native Computing Foundation (CNCF) as its second hosted project, right after Kubernetes. The inclusion signalled that the cloud-native community considers monitoring to be as important as container orchestration.

Pull model and multidimensional time series

Unlike traditional push-based monitoring systems, where agents send metrics to a central collector, Prometheus adopts a pull model: the server periodically scrapes the /metrics endpoints exposed by its targets. This approach simplifies configuration of the monitored services, which only need to expose an HTTP endpoint. It also means a failed scrape is itself a signal: Prometheus learns that a target is unreachable at the next scrape interval, without waiting for a push that never arrives.
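To make the target side of the pull model concrete, here is a minimal sketch of a service exposing a /metrics endpoint using only the Python standard library. The counter store REQUEST_COUNTS, the helper render_metrics, the serve function and the port 8000 are all hypothetical names chosen for illustration; real services would normally use an official client library rather than rendering the text format by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters keyed by (method, status).
# A real service would increment these as it handles requests.
REQUEST_COUNTS = {("GET", "200"): 0}


def render_metrics(counts):
    """Render counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for (method, status), value in sorted(counts.items()):
        lines.append(
            f'http_requests_total{{method="{method}",status="{status}"}} {value}'
        )
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the scrape path is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUEST_COUNTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=8000):
    """Block forever, answering scrapes on http://host:port/metrics."""
    HTTPServer((host, port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint; a Prometheus server configured with this address as a target would then fetch the page at every scrape interval. The service itself holds no knowledge of where the monitoring server lives.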

Metrics are organised as multidimensional time series: each series is identified by a name and a set of key-value pairs called labels. For instance, a metric http_requests_total can have labels like method="GET", handler="/api/users", status="200". Labels allow filtering, aggregating and correlating metrics in ways that traditional name-hierarchy-based systems cannot.
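The label model can be illustrated with a small in-memory sketch. It is not how Prometheus stores data internally; the helpers series_key and sum_by and the sample values are invented for illustration. The point is that a series is identified by its name plus its full label set, so any label can serve as a filtering or aggregation dimension.

```python
from collections import defaultdict


def series_key(name, **labels):
    """Identify a series by metric name plus its complete set of label pairs."""
    return (name, frozenset(labels.items()))


# Illustrative latest values for three series of the same metric.
samples = {
    series_key("http_requests_total", method="GET", handler="/api/users", status="200"): 120,
    series_key("http_requests_total", method="GET", handler="/api/users", status="500"): 3,
    series_key("http_requests_total", method="POST", handler="/api/users", status="200"): 40,
}


def sum_by(samples, name, label):
    """Sum the values of one metric, grouped by a single label."""
    totals = defaultdict(float)
    for (metric, labels), value in samples.items():
        if metric == name:
            # Series that lack the label fall into the empty-string group.
            totals[dict(labels).get(label, "")] += value
    return dict(totals)
```

With the sample data above, sum_by(samples, "http_requests_total", "status") groups the three series into two totals, one per status code, and the same call with "method" regroups them by HTTP method. A name-hierarchy scheme would need a separate metric name for each combination to support both views.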

PromQL and service discovery

PromQL is the language designed to query time series. It supports operations such as rate of change, aggregations, label filters and join-like matching across different metrics. Service discovery enables Prometheus to find targets automatically, integrating with Kubernetes, Consul, DNS, static files and other mechanisms, without requiring manual configuration for every new instance.
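The idea behind a rate-of-change query over a counter can be sketched in a few lines. This is a deliberately simplified approximation, not PromQL's actual rate() implementation, which also extrapolates to the edges of the query window. The function name simple_rate and the (timestamp, value) sample layout are assumptions made for this example.

```python
def simple_rate(samples):
    """Per-second increase of a counter over a list of (timestamp, value) samples.

    Returns None when fewer than two samples or no elapsed time is available.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            # A drop means the counter restarted from zero,
            # so the whole current value counts as new increase.
            increase += cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None
```

Handling the reset case is what makes counters usable across process restarts: the raw value falls back to zero, but the computed rate stays continuous.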

Built-in alerting

Alertmanager, a separate component within the ecosystem, handles notifications: it receives the alerts fired by rules defined in Prometheus, groups them, deduplicates them and routes them to channels such as email, Slack or PagerDuty.

Prometheus stores its data in a local time series database, with a persistence model designed for efficiency and speed rather than long-term durability.
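The grouping and deduplication that Alertmanager performs on incoming alerts can be pictured with the following sketch. It only loosely mirrors the real behaviour: alerts are modelled as plain dictionaries of labels, and the function group_alerts and its group_by parameter are hypothetical names for this example, not Alertmanager's code or API.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Drop alerts with identical label sets, then bucket the rest by `group_by` labels."""
    # Two alerts with exactly the same labels are treated as the same alert.
    unique = {frozenset(alert.items()): alert for alert in alerts}
    groups = defaultdict(list)
    for alert in unique.values():
        key = tuple(alert.get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


# Illustrative input: the first two alerts are exact duplicates.
incoming = [
    {"alertname": "HighErrorRate", "service": "api", "instance": "a"},
    {"alertname": "HighErrorRate", "service": "api", "instance": "a"},
    {"alertname": "HighErrorRate", "service": "api", "instance": "b"},
    {"alertname": "DiskFull", "service": "db", "instance": "c"},
]
grouped = group_alerts(incoming, ("alertname", "service"))
```

Here the four incoming alerts collapse into two groups, so an on-call engineer would receive one notification per group rather than one per firing instance.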

Link: prometheus.io
