PREVIEW: Taming Billions of Metrics and Logs at Scale (Luca Magnoni, CERN) Kafka Summit 2018

Apache Flume is currently used as the collector agent. A processing infrastructure is provided to users for streaming analytics of monitoring data, based on Mesos/Marathon and Docker for job orchestration and deployment, with users developing the processing logic in their preferred framework (mainly Apache Spark, with Kafka Streams/KSQL as an option too). The results of the analysis, as well as the raw data, are stored in HDFS as a long-term archive, and in Elasticsearch and InfluxDB as backends for the visualisation layer, based on Grafana. This talk discusses the monitoring architecture and the challenges encountered in operating and scaling Kafka to handle billions of events per day, and presents how users benefit from Kafka as a central data hub for stream processing and analysis of monitoring data.
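The streaming analytics jobs mentioned above typically boil down to windowed aggregations over metric events. As a minimal, framework-agnostic sketch (the talk's users would write this in Spark or Kafka Streams; the event fields `ts`, `metric`, and `value` are hypothetical, since the abstract does not specify the actual schema used at CERN):

```python
from collections import defaultdict

def tumbling_window_avg(events, window_s=60):
    """Average metric values per (metric name, tumbling window).

    `events` is an iterable of dicts with hypothetical fields:
    'ts' (epoch seconds), 'metric' (name), and 'value' (float).
    Returns {(metric, window_start): average_value}.
    """
    # (metric, window_start) -> [running sum, event count]
    acc = defaultdict(lambda: [0.0, 0])
    for e in events:
        # Align the timestamp to the start of its tumbling window.
        window_start = int(e["ts"] // window_s) * window_s
        bucket = acc[(e["metric"], window_start)]
        bucket[0] += e["value"]
        bucket[1] += 1
    return {key: total / count for key, (total, count) in acc.items()}

events = [
    {"ts": 10, "metric": "cpu", "value": 1.0},
    {"ts": 30, "metric": "cpu", "value": 3.0},
    {"ts": 70, "metric": "cpu", "value": 5.0},
]
print(tumbling_window_avg(events, window_s=60))
# → {('cpu', 0): 2.0, ('cpu', 60): 5.0}
```

In a real deployment this aggregation would run continuously over the Kafka topic rather than a finite list, with the stream processor (Spark Structured Streaming or Kafka Streams) handling window state, late events, and output to the storage backends.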
