Kubernetes Fairbid Kafka Cluster
Archived (pre-2022)
Preserved for reference only -- likely outdated. Last updated: September 2020
Working notes that will eventually become the manual.
Based on the ticket: DEVOPSBLN-1260
Monitoring & Escalations: DEVOPSBLN-1550
Basic design
The Fairbid Kafka cluster runs on the 'bln-fairbid-production' Kubernetes cluster in the Inneractive AWS Production account. All infrastructure is defined as code. The Kubernetes cluster is deployed with Terraform (version >= 0.12). Kafka and Zookeeper are deployed with Helm, using community charts provided by Bitnami. We copied and reworked these charts; the resulting code is in our bln-k8s-common-helm repository.
For this Kafka cluster, we are using specific Spotinst Launch Configurations with appropriate labels and taints.
The cluster is currently in the testing stage, so the exact resource limits and cluster configuration have not been decided yet. For now, the setup is as follows:
- 5 Kafka brokers running on OnDemand nodes (m4.xlarge - m5.xlarge) with 80Gi persistent storage. Nodes are deployed in MultiAZ mode, so we make use of all 3 availability zones.
- Kafka pods are isolated from all other pods by Kubernetes taints on their nodes
- 3 Zookeeper pods running on Spotinst nodes
- Kafka Exporter and JMX Exporter are exposed to Prometheus
- Kafka brokers are available to any service running inside the Kubernetes cluster through a headless service
- Kafka brokers are unavailable externally, but this behavior can be changed.
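Because the brokers sit behind a headless service, each broker pod gets its own stable DNS name that in-cluster clients can use as a bootstrap address. Below is a minimal sketch of building that address list, assuming the usual StatefulSet pattern `<pod>.<service>.<namespace>.svc.cluster.local`. The helper name `bootstrap_servers` and the StatefulSet, service and namespace names (`kafka`, `kafka-headless`, `kafka`) are hypothetical placeholders, not values taken from our charts, and `cluster.local` is only the default cluster domain.

```shell
# bootstrap_servers is a hypothetical helper for illustration only.
# Usage: bootstrap_servers STATEFULSET SERVICE NAMESPACE BROKER_COUNT PORT
# Prints a comma-separated list of per-pod DNS names behind a headless service.
bootstrap_servers() {
  sts="$1"; svc="$2"; ns="$3"; count="$4"; port="$5"
  list=""
  i=0
  while [ "$i" -lt "$count" ]; do
    # Assumes the default cluster domain "cluster.local".
    host="${sts}-${i}.${svc}.${ns}.svc.cluster.local:${port}"
    if [ -z "$list" ]; then
      list="$host"
    else
      list="${list},${host}"
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$list"
}

# Placeholder names; replace with the real release, service and namespace.
bootstrap_servers kafka kafka-headless kafka 5 9092
```

Listing all 5 brokers is not strictly required, since a client only needs one reachable bootstrap address to discover the rest, but a longer list tolerates individual brokers being down at connect time.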
Links
Helm Charts for Kafka and Zookeeper: kafka (Bitbucket)
Bitnami Kafka helm chart: kafka (Github)
Grafana dashboard: http://grafana.production.fyber.com:3000/d/D40Go0lWz/eks-fairbid-kafka
Launch Specification for Ocean in Terraform: launch_specs.tf (Bitbucket)
Terragrunt file for Ocean deployment via Terraform: terragrunt.hcl (Bitbucket)
To Do
1. Complete management of Kafka topics
2. Complete alerting
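How To's
Create a topic from inside a Kafka broker pod (JMX_PORT is unset first, otherwise the CLI tool tries to bind the JMX port already used by the broker):
unset JMX_PORT; kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 3 --partitions 5 --topic KAFKA_RULES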
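To check the result of the create command, the topic can be described with the same tool. This is a minimal sketch, assuming it runs in the same place as the create command, where `kafka-topics.sh` is on the PATH and `localhost:9092` reaches a broker. `describe_topic` is a hypothetical helper name, not part of Kafka.

```shell
# describe_topic is a hypothetical helper, not a Kafka tool.
# Usage: describe_topic TOPIC [BOOTSTRAP_SERVER]
describe_topic() {
  topic="$1"
  bootstrap="${2:-localhost:9092}"
  # Run in a subshell so that unsetting JMX_PORT does not leak into the caller.
  (
    unset JMX_PORT
    kafka-topics.sh --describe --bootstrap-server "$bootstrap" --topic "$topic"
  )
}

# Example (needs a reachable broker):
#   describe_topic KAFKA_RULES
```

If the create command succeeded, the output should report 5 partitions and a replication factor of 3 for KAFKA_RULES.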