You want to use Kafka? Or do you really need a Queue?

Do you want to use Kafka? Or do you need a message broker and queues? While they can seem similar, they have different purposes. I'm going to explain the differences, so you don't try to brute force patterns and concepts in Kafka that are better used for a message broker.

0:00 Intro
0:43 Log
2:48 Messages
5:19 Broker
7:29 Partitions

#eventdrivenarchitecture #softwarearchitecture #softwaredesign

Comments

Another great video Derek.

For what it's worth, the metamodel that I always use to explain messaging to people is this: Messages break down into two types, Requests and Events. Requests further break down into two types, Commands and Queries. So, from a modelling perspective, Messages and Requests are abstract concepts, while Queries, Commands and Events are concrete. You can draw a nice UML diagram that shows this.

Requests are owned by the consumer, so you can have multiple logical producers but a single logical consumer. Events are owned by the publisher, so you can have a single logical producer but zero or many logical consumers. Requests are unidirectional: they require a consumer, a specific consumer, and exactly one consumer on the end of the line to process the request. The producer is aware of the consumer and is logically coupled to it. The producer needs the consumer. Due to this coupling, requests are blocking for the producer, because the producer requests something that it needs in order to continue the business process it was executing. Whether it asked for something to happen, via a command, or asked for some information, via a query, the producer needs a response before it can continue. Events, on the other hand, are broadcast: they do not require any consumer, and there can be multiple consumers. The producer is neither aware of nor interested in these consumers and has no expectations of them. Events are fire-and-forget. As part of the business process it was executing, the producer can publish an event and continue with what it was doing without expecting anything from anyone.

So indeed, the idea that Events break down into Messages and Commands is completely wrong. Messages break down into Requests (Commands and Queries) and Events. Requests ask someone specific for something specific - whether it is to do something or to provide some information. Requests imply coupling and are blocking. Events notify the world and whoever may or may not be interested that something has happened without expecting anything in return. Events do not imply coupling, they imply that the producer is completely de-coupled and unaware of the consumers. Events are non-blocking.
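
The metamodel in this comment can be sketched as a small type hierarchy. This is only an illustration: the concrete message names (`PlaceOrder`, `GetOrderStatus`, `OrderPlaced`) and the `expects_response` helper are hypothetical and do not come from the video or the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Abstract root: anything sent between components."""


@dataclass(frozen=True)
class Request(Message):
    """Abstract: owned by the consumer, handled by exactly one logical consumer."""


@dataclass(frozen=True)
class Event(Message):
    """Owned by the publisher, observed by zero or many consumers."""


@dataclass(frozen=True)
class Command(Request):
    """Asks a specific consumer to do something."""


@dataclass(frozen=True)
class Query(Request):
    """Asks a specific consumer for information."""


# Hypothetical concrete messages, for illustration only.
@dataclass(frozen=True)
class PlaceOrder(Command):
    order_id: str


@dataclass(frozen=True)
class GetOrderStatus(Query):
    order_id: str


@dataclass(frozen=True)
class OrderPlaced(Event):
    order_id: str


def expects_response(message: Message) -> bool:
    """Requests block the producer until answered; events are fire-and-forget."""
    return isinstance(message, Request)
```

In this shape a "Command Event" cannot be expressed at all, since `Command` and `Event` sit on separate branches under `Message`.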

The whole idea of a Command Event is idiotic and whoever came up with it needs to learn more about messaging and distributed systems in particular and about architecture in general. What they need not do is publish articles that confuse people when the general awareness and knowledge of messaging and distributed systems is already very low.

People also need to learn about logical versus system versus physical boundaries. This is really key to actually understanding and defining architectures.

andreipacurariu

8:00 I think there can be multiple consumers for a partition in Kafka. They just have to be assigned to different consumer groups. You can't have two consumers assigned to the same partition within the same consumer group. That actually helps process each message only once if you have only one consumer group. Kafka stores how far you've read into a partition; it's up to the consumer to decide when to "ACK" how far it has read. There is also a built-in hashing algorithm called murmur that helps spread messages across partitions efficiently.
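
The partition and consumer-group rules described above can be modelled in a few lines of plain Python. This is a toy model, not Kafka client code: `zlib.crc32` stands in for Kafka's murmur-based key hashing, the round-robin `assign` function is a simplification of real group rebalancing, and the group and consumer names are made up.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition.

    Stand-in hash: crc32 is used here only because it is deterministic and in
    the standard library. It is NOT the hash Kafka uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(partitions: list[int], consumers: list[str]) -> dict[int, str]:
    """Give every partition exactly one owner within a single consumer group."""
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    return {p: consumers[i % len(consumers)] for i, p in enumerate(partitions)}


partitions = [0, 1, 2]

# Two independent (hypothetical) consumer groups reading the same topic.
billing = assign(partitions, ["billing-a", "billing-b"])
audit = assign(partitions, ["audit-a"])

# Partition 0 has one owner per group, so two consumers read it in total,
# but never two consumers from the same group.
owners_of_partition_0 = {billing[0], audit[0]}
```

Because the same key always hashes to the same partition, all messages for one key are read in order by a single consumer per group.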

MrHuno

Kafka has many configuration options. It can be configured to act as a message broker as you described. Now the question is, "Should it be done?" I don't know, but it sure can be done.

sodrechavessodre

Really glad you got a chance to cover Kafka, and I really like that you're suggesting it's not for everything. But there are some things that seem confusing when explained. My concern is with the consumer and its ability to ack that state has been persisted correctly (not still attempting to be persisted, as in competing-consumer scenarios).

Kafka was first designed (and is best known) to handle at-least-once message delivery. In this model, a partition acts as a means for consensus, so messages can be handled in the correct order (tracked by the Kafka offset).

In a competing-consumer setup you have multiple opportunities for downstream state to be persisted. This leads people to start writing logic in their consumers to handle all sorts of async blocking scenarios (frequently leading to messy developer assumptions). Instead, Kafka users simply tell the Kafka partition they're acking, "you're ready to pull the next message". This is why people use Kafka for communicating in transactional scenarios like finance and across data centers (where the arrival of messages can pollute data communicated to another boundary that has to persist it reliably). It's also why tools like connector solutions are heavily built around the Kafka ecosystem (because they reliably replicate data between multiple locations). It doesn't mean Kafka is blocked the way things were in the service bus days (blocking the world)... We can still have multiple consumers by providing multiple partitions, with a unique key deciding which partition each message feeds (a good topic to cover here is aggregate root IDs).

With at-most-once delivery semantics, where a queue pops and fires-and-forgets, you'll find competing consumers more often, because the focus is not on guaranteeing that data arrives and persists in order... It's notification (like broadcast messages). Think communication bridges for notification (MQTT and otherwise) or telemetry data, where data that's lost or polluted by a dirty read/write/retry on restart isn't a big deal.
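
The difference between the two delivery semantics comes down to whether the consumer acknowledges before or after it handles a message. Here is a minimal simulation, assuming an in-memory list as the log and a handler that crashes once; all function names are hypothetical and nothing here is a real broker API.

```python
def make_flaky_handler(processed, crash_on, apply_before_crash):
    """Build a handler that crashes once, the first time it sees `crash_on`."""
    crashed = False

    def handle(message):
        nonlocal crashed
        if message == crash_on and not crashed:
            crashed = True
            if apply_before_crash:
                processed.append(message)  # side effect happened, ack did not
            raise RuntimeError("simulated crash")
        processed.append(message)

    return handle


def consume(log, committed, handler, commit_first):
    """Read from the committed offset until the end of the log or a crash."""
    offset = committed
    while offset < len(log):
        if commit_first:
            committed = offset + 1  # at-most-once: ack before handling
        try:
            handler(log[offset])
        except RuntimeError:
            return committed  # a restart resumes from the last ack
        if not commit_first:
            committed = offset + 1  # at-least-once: ack after handling
        offset += 1
    return committed


def run(log, handler, commit_first):
    """Restart the consumer after every crash until the log is drained."""
    committed = 0
    while committed < len(log):
        committed = consume(log, committed, handler, commit_first)


log = ["a", "b", "c"]

at_least_once = []
run(log, make_flaky_handler(at_least_once, "b", True), commit_first=False)

at_most_once = []
run(log, make_flaky_handler(at_most_once, "b", False), commit_first=True)
```

After the run, `at_least_once` contains `"b"` twice (a duplicate the handler must tolerate), while `at_most_once` is missing `"b"` entirely (a loss that is acceptable only for data like telemetry).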

baseman

Thank you once again for your clear thoughts on commands and events! The practical problem in industry, I think, is this: once a central Kafka infrastructure is set up, it is very easy to add new topics and start using it, so there is very little friction to adoption.
On the low-level development side, tried and tested integrations with established Kafka infrastructure in an already existing microservice give reliability and confidence, and need near-zero code changes in that producer/consumer module. Any other broker integration (ActiveMQ, for example) would require additional development effort.
In such situations, people generally convince themselves with technical word juggling: they call them Command Events, treat commands and events as the same, or stop caring about the implementation and keep the logical distinction between commands and events only in their minds.

krozaine

I really liked the video, since I recently read on Twitter that Kafka is not a proper message broker but didn't really know the difference. Just to point out, Kafka has an ack feature as well: consumers can commit the offset of the last message they consumed, and Kafka stores it in a special topic. This prevents a well-configured consumer in the same consumer group from processing old messages from the topic; only consumers from a new consumer group would read old messages.

mathiasdemestral

Very well explained. The distinction between a message being a command and an event is critical to making key architectural decisions on how they are produced, stored, and consumed. “Command event” is an idea that only adds needless confusion. It more often than not leads to compromised solutions.

frozencanuck

My workplace treats Kafka as "the correct broker for microservices" and ignores alternatives like RabbitMQ or AMQP. I don't even think about commands and events; I usually treat them as the same. Thanks for the video!

FahmiNoorFiqri

Very well spoken examinations, your knowledge and channel are very well refined :) good job ❤️

scottspitlerII

I feel almost personally responsible for inspiring this video because it was exactly related to my question on one of your previous videos. That said, I think I still disagree that I “need” a message broker, even if a message broker is technically the more appropriate model. At my current company we have this problem of “technology proliferation”, where we find ourselves paying for all kinds of software that is used only rarely or used incorrectly anyway, so I feel like just “making it work” with what we have, which is Kafka, seems like an OK compromise to avoid adding complexity. Plus, in our case, users require that those commands can be audited later, so a log also seems appropriate.

essamal-mansouri

So true. I have been telling others the same things when someone mentions Kafka or is planning to use it.

Boss-grjw

Thanks Derek for another video! :)
Great explanation of the difference between commands and events, but when it comes to the technologies behind them I think it misses some points.
Kafka also has a way to acknowledge messages: consumers commit processed messages, which moves the committed offset forward, but the key thing is that this only happens within a consumer GROUP. It's true that you can add a new consumer group which could start consuming messages from the beginning, but as long as you're using the same consumer group you can keep track of which messages were already consumed, so you won't process the same message multiple times.
That's an assumption you need to make, but you also make a similar assumption in the case of events: there should be only one publisher. However, you cannot technically enforce in Kafka that only one publisher instance sends messages to a given topic. Does that make Kafka bad for handling events?
I'm not saying that Kafka is the best way to handle commands, but I wouldn't avoid it at all costs just because you may misconfigure consumer groups and read commands again (reading commands again doesn't even imply that they will be processed again). Maybe in some cases, in smaller projects, it would even be beneficial to keep the tech stack simpler and handle both commands and events using a single mechanism.
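
The per-group offset tracking described in this comment can be pictured with a toy in-memory log. It is a sketch under stated assumptions, not Kafka's API: `PartitionLog` and its methods are hypothetical, and a group with no committed offset is assumed to start from the beginning (in Kafka that corresponds to `auto.offset.reset=earliest`; the default is `latest`).

```python
class PartitionLog:
    """Toy single-partition log that remembers a read position per group."""

    def __init__(self):
        self._records = []
        self._next_offset = {}  # group id -> next offset that group will read

    def append(self, record):
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return (offset, record) pairs the group has not committed yet."""
        start = self._next_offset.get(group, 0)  # assumption: new groups start at 0
        batch = self._records[start:start + max_records]
        return list(enumerate(batch, start=start))

    def commit(self, group, offset):
        """Record that `group` has finished everything up to and including `offset`."""
        self._next_offset[group] = offset + 1


log = PartitionLog()
for command in ["cmd-1", "cmd-2", "cmd-3"]:
    log.append(command)

first = log.poll("workers")            # the group sees all three commands
log.commit("workers", first[-1][0])    # ack up to the last offset it handled
second = log.poll("workers")           # nothing left for the same group

late = log.poll("reporting")           # a new group reads from the start
```

The records themselves are never removed; only each group's position moves, which is why a misconfigured or brand-new group id would see the commands again.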

TL-zype

Good explanation, as always, but since you namedropped Kafka, it'd make some sense to namedrop what you consider to be some good message brokers too? Azure /Service Bus/ Queues? RabbitMQ? Perhaps a good follow-up video to this :)

digitalhome

Well, YouTube decided to eat my comment and I don't have the will to rewrite it, but briefly: Kafka gives you a lot of power to change how you produce and consume events at runtime, at the broker level and without changing its infrastructure. Adding consumer groups, reading from the start of the stream, etc. That can be extremely useful, especially when you later want to add services without having to write special code to load historic data from another store. You can write the new consumer service, have it read from the start of the stream, and let it catch up.

everydreamai

Once again excellent video but I have 2 concerns:

1) Publishing commands: in CQRS, commands are executed by a command processor and, upon a successful state change, one or more events are published. Therefore, to me, commands should be executed immediately by a command processor or queued to the command processor for asynchronous execution. I only publish events to be handled asynchronously, and I am only concerned if the message does not get accepted by the message broker for processing, which I can handle as an exception. I only use RPC calls like REST or gRPC to process my commands or queries.

2) The basic architecture of Kafka is totally geared towards fully distributed asynchronous streaming of events. What Kafka is extremely good at is pull-based consumers that subscribe to events and only process the next event when the consumer is ready to accept it. Most message brokers, on the other hand, are centrally managed and push-based, with more complicated message handling patterns, which Kafka clearly minimized in its design for extremely high performance. While Kafka can work as a message broker, I would use it purely for high-performance asynchronous event stream processing.

basilthomas

So Kafka is good to let multiple microservices react to the same event, but it's not too good if an event needs to be consumed exactly once. Correct? In that case, what would your recommendation be? Use a message queue for such "event-commands"? or reconsider the design? (i.e. why a command is required rather than an event)

Fred-yqfs

I remember going on that site and seeing the term "Command Event" and immediately closing the website. I agree with you. It's absolutely garbage

YazanAlaboudi

Using the outbox pattern with RabbitMQ you can reproduce the events for a new consumer as well. That’s my plan. Can’t use Kafka as there’s no support for it where I work. It’s taken a long time to get them to move away from monolithic massive transactions.
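
A rough idea of how an outbox table makes replay possible, sketched with the standard-library `sqlite3` module. The table layout, the function names, and the `publish` callback (standing in for a real RabbitMQ publish call) are all assumptions made for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id TEXT PRIMARY KEY, total REAL);
    CREATE TABLE outbox (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        type TEXT NOT NULL,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
    """
)


def place_order(order_id, total):
    """Write the state change and its event in one local transaction."""
    payload = json.dumps({"order_id": order_id, "total": total})
    with conn:  # commits both inserts together, or rolls both back
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))
        conn.execute(
            "INSERT INTO outbox (type, payload) VALUES (?, ?)",
            ("OrderPlaced", payload),
        )


def relay(publish):
    """Publish rows not yet sent, marking each one after the publish call."""
    rows = conn.execute(
        "SELECT seq, type, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)


def replay(publish):
    """Re-send the full history, e.g. to a queue bound for a new consumer."""
    rows = conn.execute("SELECT type, payload FROM outbox ORDER BY seq").fetchall()
    for event_type, payload in rows:
        publish(event_type, json.loads(payload))
    return len(rows)


place_order("o-1", 10.0)
place_order("o-2", 5.0)

sent = []
first_relay = relay(lambda t, p: sent.append((t, p["order_id"])))
second_relay = relay(lambda t, p: sent.append((t, p["order_id"])))

replayed = []
replay_count = replay(lambda t, p: replayed.append(p["order_id"]))
```

Replay works here only because outbox rows are kept after publishing; a relay that deletes rows once they are sent would lose that history.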

MiningForPies

You can actually have multiple consumer instances process event logs from the same topic if they are part of a common consumer group.

gertrude

Hmm, interesting. But I can still use events with a message broker using topics.
So in general we can use it for both commands and events. The difference is that we can't store data as we can in log-based message brokers like Kafka, and we can't process streams.
But I agree that when we need commands for things like request-reply, it's more natural to use Service Bus or RabbitMQ, or even simple queues, rather than Kafka.
But if you already have Kafka for events, it's difficult to say whether we should add a broker for command-like messages or try to use Kafka, as mentioned in one of the comments below.

alexanderbikk
