Optimizing Kafka Producers and Consumers: A Hands-On Guide

preview_player
Показать описание

Contents:
0:00 intro, setup
4:10 code walkthrough
10:14 optimizing Kafka producers
18:05 optimizing Kafka consumers

This video is a hands-on guide to maximizing the performance of your Kafka producers and consumers. We will talk about the artchitecture of a typical application with Kafka, how producers work and how to tune Kafka producers for either max throughput or min latency. Then, we will switch sides to consumers and will discuss how you can scale your Kafka consumers vs number of partitions, and how to leverage multithreading inside a single consumer for 1000x increase in consumption performance.

Follow Rock the JVM on:

-------------------------------------------------------------------------
-------------------------------------------------------------------------
Рекомендации по теме
Комментарии
Автор

That was an amazing demo, literally the same ideas can be applied using the Python client and I love the way you described the problem, pretty language agnostic. The blog post is also neat as supplementary material. Thanks for your time.

ShahriyarRzayev
Автор

My naive solution for this optimization is (haven't run it to see if it works):
- Producer: KafkaProducer is thread safe, so I will create a coroutine for each message. Each coroutine takes the same kafkaProducer, uses it to send a message, waits until the Future returned from kafkaProducer.send resolves then finishes. So with 10M messages, we'll have 10M coroutines. If 10M coroutines is too many for the Kotlin coroutine library, I hope there'll be a coroutine pool executor service (like the thread pool executor service in Java) for us to use.
- Consumer: of course we need a number of kafka consumers which is equals to the number of partitions. To make each kafka consumer consume messages at the highest speed, I won't use the JsonDeserializer (decoding a byte array to a json string takes time) but use the ByteArrayDeserializer (i.e. no decoding is needed). The kafka consumer will send the byte array to a blocking queue / ring buffer for other coroutines to consume and process.Disadvantage of this method is that if our message processing is slow, there'll be a lot of messages in the blocking queue.

What do you think of this solution?

avalagum