Apache Kafka: Keeping the order of events when retrying due to failure

How do you handle message retries and failures in event-driven systems while keeping events in order?
In this video, I explain how you can approach this problem.

#eventdrivenarchitecture #danieltammadge #ApacheKafka #microservices
Comments

Thank you for sharing. A very clear, detailed approach to keeping the order of events when using Apache Kafka.

StephenTD

Hi Daniel, I am a little confused: what happens when the event at consumer index 4 fails while it is in the retry topic? Should we put that event into the failing event table along with consumer index 5? Also, below is my understanding; please correct me if I have interpreted anything wrong.

Main consumer -- check whether any event for the incoming customer is present in the failing table. If yes, put the incoming event into the holding table. If no, process it; if processed successfully, acknowledge the offset, else put it into the failing table.

Retry Producer - polls the failing table, creates an event, and pushes it to the retry topic.

Retry Consumer - checks for any event in the retry topic and processes it; if successful, gets all the messages from the holding table and pushes them to the retry topic. What happens if three messages were in the holding table for the same customer id, all three got pushed to the retry topic, and the first message fails? What does the retry consumer do then?
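
A minimal sketch of the main-consumer routing described above, assuming hypothetical FailingEventStore / HoldingEventStore abstractions over the failing-event and holding tables and a placeholder processEvent() for the business logic; none of these names come from the video:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;

class MainConsumerSketch {

    interface FailingEventStore {
        boolean hasPendingFailure(String customerId);
        void add(String customerId, String payload, String error);
    }

    interface HoldingEventStore {
        void append(String customerId, String topic, int partition, long offset, String payload);
    }

    void run(KafkaConsumer<String, String> consumer,
             FailingEventStore failingStore,
             HoldingEventStore holdingStore) {
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                String customerId = record.key();
                if (failingStore.hasPendingFailure(customerId)) {
                    // An earlier event for this customer already failed: park the new
                    // event in the holding table so its relative order is preserved.
                    holdingStore.append(customerId, record.topic(), record.partition(),
                                        record.offset(), record.value());
                } else {
                    try {
                        processEvent(record);                    // business logic
                    } catch (Exception e) {
                        // First failure for this customer: record it so the retry
                        // producer can schedule it onto the retry topic.
                        failingStore.add(customerId, record.value(), e.getMessage());
                    }
                }
            }
            consumer.commitSync();   // acknowledge the batch once every record is routed
        }
    }

    void processEvent(ConsumerRecord<String, String> record) { /* domain logic */ }
}
```

The point of the sketch is that every record is either processed, parked, or logged as failing before its offset is committed, so nothing is lost and per-customer order is never reshuffled.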

abhishekanand

At 2:50, when you said that after successfully processing the failed event you'd republish all the holding events, which topic will you publish them to? I believe you meant the retry topic, right? Not the main topic, as the events could get out of order.
However, there is a catch here.
Once a failed event has been successfully processed, will you delete that record? If a new event is still in the main topic, it doesn't know that there is a failed event or holding event. So are you suggesting that we always need to check both the failed events table and the holding events table, and if there is any record in the holding events table, add new events to the holding table? With a lot of traffic, one failed event can lead to all events going into the holding table and slowing down event processing.
How do we solve this?
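
On the republish question, a hedged sketch of one way the retry consumer could behave once the failed event finally succeeds: remove it from the failing table and replay that customer's held events onto the retry topic (not the main topic) in the order they were parked. The store methods and the topic name customer-events.retry are assumptions, not taken from the video:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.List;

class RetryConsumerSketch {

    interface FailingEventStore {
        void remove(String customerId);
    }

    interface HoldingEventStore {
        // Returns this customer's held payloads in the order they were parked
        // and clears them from the holding table.
        List<String> drainInOrder(String customerId);
    }

    void handle(ConsumerRecord<String, String> retryRecord,
                KafkaProducer<String, String> producer,
                FailingEventStore failingStore,
                HoldingEventStore holdingStore) {
        String customerId = retryRecord.key();
        try {
            processEvent(retryRecord);               // business logic
            failingStore.remove(customerId);         // the customer is no longer blocked
            // Replay the parked events onto the retry topic, keeping the order
            // in which they were held back.
            for (String payload : holdingStore.drainInOrder(customerId)) {
                producer.send(new ProducerRecord<>("customer-events.retry", customerId, payload));
            }
        } catch (Exception e) {
            // Still failing: keep the failing-table row so another retry can be
            // scheduled; the held events stay parked behind it.
        }
    }

    void processEvent(ConsumerRecord<String, String> record) { /* domain logic */ }
}
```

Under this sketch, a replayed event that fails again would simply be recorded as that customer's failing event once more, leaving the rest parked; whether the video handles it exactly this way is not stated.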

hemanthaugust

Hi Daniel, I am facing a similar problem. Please let me know where to create the Failing Event Log Table and the Holding Table. Should they be in a separate database, or in ksqlDB for example?

rajind

Hello! Great material :)
I have some questions regarding it.
How do you deal with the scenario where you have already processed a failed message, but before you process the holding events a new event is consumed from the topic?
If it is processed first, the current state will be overwritten by the holding events, right?
So what comes to my mind is to also check the holding events, but what then? Attach the newest event at the end and fire the holding events?
You don't mention it in the video, so there is a high chance that I got something wrong; I would appreciate it if you could clarify.
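
One way to express the guard this comment suggests, as a sketch rather than the video's stated answer: before a fresh main-topic event is processed, check the holding table as well as the failing table, so an event that arrives while older events are still parked is appended behind them instead of overtaking them. All names are illustrative:

```java
class OrderingGuardSketch {

    interface FailingEventStore {
        boolean hasPendingFailure(String customerId);
    }

    interface HoldingEventStore {
        boolean hasHeldEvents(String customerId);
    }

    private final FailingEventStore failingStore;
    private final HoldingEventStore holdingStore;

    OrderingGuardSketch(FailingEventStore failingStore, HoldingEventStore holdingStore) {
        this.failingStore = failingStore;
        this.holdingStore = holdingStore;
    }

    // True when a fresh main-topic event must be appended to the holding table
    // instead of being processed immediately, so it cannot overtake older events
    // for the same customer that are still waiting to be replayed.
    boolean mustPark(String customerId) {
        return failingStore.hasPendingFailure(customerId)
            || holdingStore.hasHeldEvents(customerId);
    }
}
```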

mateusz

Hi Daniel, what is the use of the partition and offset stored in the holding table?
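
One plausible use, offered as an assumption rather than the video's answer: the partition and offset give each held event a stable, unique position in the source topic, so held events can be replayed in their original order and deduplicated if the same record is ever parked twice. A sketch of such a holding-table row:

```java
// Held events sorted by (partition, offset) replay in their original per-partition order.
record HeldEvent(String customerId, String topic, int partition, long offset, String payload)
        implements Comparable<HeldEvent> {

    @Override
    public int compareTo(HeldEvent other) {
        int byPartition = Integer.compare(partition, other.partition);
        return byPartition != 0 ? byPartition : Long.compare(offset, other.offset);
    }
}
```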

skblabla

Should this be done in cases where creating a customer failed because of third-party service downtime? The number of events failing in such a case will be huge, so what is the recommendation for that scenario?
I need to read more on error handling, I guess, but I want to know what kinds of exceptions can be handled this way. Will the approach change if we don't know the volume of exceptions, since we won't know how long the third-party service downtime will last?

skblabla

What database system do you recommend for storing these problematic events?

frankcoutinho

Have you ever implemented a circuit breaker in a Kafka consumer?
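
Not something the video confirms, but a common way to approximate a circuit breaker inside a Kafka consumer is to pause() the assigned partitions once a downstream dependency keeps failing and resume() them after a cool-down. A sketch, assuming enable.auto.commit=false and max.poll.records=1 so no fetched record is silently skipped when the breaker opens; the failure threshold and cool-down are placeholder values:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

import java.time.Duration;
import java.time.Instant;

class CircuitBreakingConsumerSketch {

    private int consecutiveFailures = 0;
    private Instant pausedUntil = Instant.MIN;

    void run(KafkaConsumer<String, String> consumer) {
        while (true) {
            if (Instant.now().isBefore(pausedUntil)) {
                consumer.poll(Duration.ofMillis(500));      // breaker open: heartbeat only
                continue;
            }
            if (!consumer.paused().isEmpty()) {
                consumer.resume(consumer.paused());         // half-open: try again
            }
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                try {
                    callThirdParty(record);                 // downstream dependency
                    consumer.commitSync();
                    consecutiveFailures = 0;
                } catch (Exception e) {
                    // Rewind so the failed record is fetched again after the cool-down.
                    consumer.seek(new TopicPartition(record.topic(), record.partition()),
                                  record.offset());
                    if (++consecutiveFailures >= 5) {       // open the breaker
                        consumer.pause(consumer.assignment());
                        pausedUntil = Instant.now().plusSeconds(30);
                    }
                }
            }
        }
    }

    void callThirdParty(ConsumerRecord<String, String> record) { /* external call */ }
}
```

Pausing keeps the consumer in its group (so partitions are not reassigned) while no new records are fetched, which also avoids flooding a failing-event table during a long third-party outage.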

hernanisilang