HPA isn’t enough
Kubernetes' HPA (Horizontal Pod Autoscaler) scales on CPU and memory — metrics too coarse for event-driven workloads, where the real signal is Kafka consumer lag, RabbitMQ queue depth, SQS message counts, or a custom metric. These workloads need an autoscaler that watches external event sources.
The release
KEDA (Kubernetes Event-Driven Autoscaling) was announced by Microsoft and Red Hat in May 2019 at KubeCon Barcelona. Version 1.0 was released on 19 November 2019. Apache 2.0 licence. Written in Go. Donated to the CNCF in 2020, graduated in August 2023.
How it works
KEDA extends K8s HPA with ScaledObject (or ScaledJob) CRD:
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer
spec:
  scaleTargetRef:
    name: my-consumer          # the Deployment to scale
  minReplicaCount: 0           # allow scale-to-zero
  maxReplicaCount: 50
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092
        consumerGroup: my-group   # group whose lag is measured (name illustrative)
        topic: orders
        lagThreshold: "100"       # target lag per replica
```
KEDA watches the consumer group's lag on the topic and scales the Deployment accordingly. It also supports scale-to-zero: when no messages are pending, the Deployment drops to 0 replicas.
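Scale-to-zero behaviour can be tuned on the ScaledObject itself via its polling and cooldown settings. A minimal sketch (the values are illustrative, not recommendations):

```yaml
spec:
  pollingInterval: 15    # query the event source every 15 s (default: 30)
  cooldownPeriod: 120    # wait 120 s with no events before dropping to zero (default: 300)
  minReplicaCount: 0     # zero is only reached if this is 0
  maxReplicaCount: 50
```

Note that `cooldownPeriod` only governs the final step to zero; scaling between 1 and `maxReplicaCount` is handled by the underlying HPA and its own stabilization window.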
Supported scalers
60+ out-of-the-box scalers:
- Queues/Streaming — Kafka, RabbitMQ, ActiveMQ, NATS, Redis Streams, Azure Service Bus, AWS SQS/Kinesis, GCP Pub/Sub
- Metrics — Prometheus, Datadog, New Relic, Azure Monitor, CloudWatch
- Database — PostgreSQL, MySQL, MSSQL, MongoDB (row count)
- Triggers — Cron, external webhook, Redis, etcd, Solace
- AI/ML — Seldon, custom predictor via gRPC
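Every scaler follows the same pattern as the Kafka example: a trigger type plus scaler-specific metadata. As a second sketch, an AWS SQS trigger (the queue URL, region, and credentials resource are placeholders):

```yaml
triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.eu-south-1.amazonaws.com/123456789012/orders  # placeholder
      queueLength: "20"        # target messages per replica
      awsRegion: eu-south-1
    authenticationRef:
      name: aws-credentials    # a TriggerAuthentication resource, assumed to exist
```

The `authenticationRef` points at a separate TriggerAuthentication object, which is how KEDA keeps credentials out of the ScaledObject.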
Use cases
- Kafka consumer workers that scale on lag
- Batch processing triggered by queue
- Serverless-like on K8s (scale-to-zero)
- Scheduled scaling — e.g. scale up at 8:00, back down at 20:00
- Spot/preemptible workloads — burst capacity on low-priority nodes
- ML inference — scaling based on requests
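The scheduled-scaling case maps directly onto KEDA's Cron scaler. A sketch for the 8:00–20:00 window (timezone and replica count are assumptions):

```yaml
triggers:
  - type: cron
    metadata:
      timezone: Europe/Rome    # IANA timezone name (assumed)
      start: 0 8 * * *         # cron expression: scale up at 08:00
      end: 0 20 * * *          # scale back down at 20:00
      desiredReplicas: "10"    # replicas held during the window
```

Outside the window the workload falls back to `minReplicaCount`, so combining this with `minReplicaCount: 0` gives business-hours-only capacity.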
Integration
- Knative — complementary: Knative covers HTTP-driven serverless, KEDA covers event sources
- Argo Events — in-cluster events
- Istio + KEDA — scale based on requests-per-second
- Prometheus Operator — custom metrics
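Request-per-second scaling with Istio typically goes through the Prometheus scaler, querying Istio's standard request metric. A sketch (the Prometheus address, workload label, and threshold are assumptions):

```yaml
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring:9090   # assumed in-cluster address
      query: sum(rate(istio_requests_total{destination_workload="my-consumer"}[2m]))
      threshold: "100"   # target requests/s per replica
```

KEDA evaluates the PromQL query on each polling interval and feeds the result to the HPA as an external metric.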
In the Italian context
KEDA is used in:
- Italian e-commerce with burst traffic (Black Friday, Christmas)
- Fintech — async order processing
- Telco — event-driven B/OSS pipelines
- Industrial IoT — sensor processing
- AI/ML — model inference autoscaling
- Digital PA — PNRR/PSN batch pipelines
KEDA helps contain cloud costs by scaling idle workloads to zero, and absorbs unpredictable peaks without static over-provisioning.
References: KEDA 1.0 (19 November 2019). Microsoft + Red Hat. Apache 2.0 licence. Written in Go. CNCF graduated (August 2023). ScaledObject/ScaledJob CRD. 60+ scalers. Scale-to-zero.
