Using your Database as a Queue? Good or bad idea?

Do you need a message broker, or can you use your database as a queue? Well, as in most cases, it depends. Let me explain how it's possible and some trade-offs and things to consider, such as volume, processing failures, and more.
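For orientation, here is the core pattern the video discusses, sketched in Postgres-flavored SQL (the table and column names are illustrative, not taken from the video):

-- Claim the next visible message and hide it for 30 seconds
-- (the "invisibility timeout"). FOR UPDATE SKIP LOCKED lets
-- competing consumers claim different rows without blocking.
UPDATE messages
SET invisible_until = now() + interval '30 seconds'
WHERE id = (
    SELECT id
    FROM messages
    WHERE processed_at IS NULL
      AND (invisible_until IS NULL OR invisible_until < now())
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload;

If a consumer crashes, invisible_until simply expires and another worker picks the row up - the processing-failure case the description mentions.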

🔗 EventStoreDB

💥 Join this channel to get access to a private Discord Server and any source code in my videos.

🔥 Join via Patreon

✔️ Join via YouTube

0:00 Intro
0:59 Invisibility Timeout
2:14 Competing Consumers
3:40 Table
5:35 Trade-offs

Original Blog Post:

#softwarearchitecture #postgresql #rabbitmq
Comments

Yeah, there is a really complex queue system in my current project (hopefully not for long), built on top of SQL Server, and the learning curve is very steep.

adrian_franczak

I'm currently implementing this exact approach in my project: removing RabbitMQ and replacing it with a table-based queue in Postgres. And our volumes aren't high enough that I need to worry about running into problems.

Two motivations for this:

1. Less complexity.
2. Saving infra costs on RabbitMQ.

sirg

Where I see the most benefit is for people running on a small VPS who don't want to install a message broker with a 1 GB RAM requirement off the bat. This strategy would work really well until you need advanced features.

buildingphase

There's a lot more to consider than meets the eye at first glance: push/pull model, DLQ, priorities, throughput, latency, multiple consumers for one queue, producing messages with fan-outs, etc. - not to mention delivery guarantees, failover, and the need to keep full history. There is no silver bullet. We actually chose Postgres for a system that needed specifics from that list - pull model, priorities, DLQ, failover - and implemented the queue logic in PL/pgSQL, since it's also a transport for messages and not a single service's queue.
The best way to go, IMO, is to try different solutions and see their shortcomings in a POC environment built SPECIFICALLY for your project's needs, not as a general solution. General solutions fail most of the time when you're working on medium-to-large projects, or on projects that weren't built AROUND the queue. Get the right tool for the job and tune out the marketing/hype noise of "tool X is much better than Y because it does Z" - do you really need Z? Is Z more important than all the things you lose by switching from Y? Very good video on an important topic; I see it as a continuation of the "why best practices are bad" video (or something along those lines). Context is KING.
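For flavor, here is roughly what a priority-aware pull looks like in plain SQL (the commenter used PL/pgSQL; the schema below is invented for illustration):

-- Claim the highest-priority pending message; SKIP LOCKED keeps
-- concurrent pullers from blocking each other. Rows that exhaust
-- their retries would move to a dead-letter table separately.
UPDATE queue
SET status = 'processing', claimed_at = now()
WHERE id = (
    SELECT id
    FROM queue
    WHERE status = 'pending'
    ORDER BY priority DESC, id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload;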

vlakarados

When I was working as a beginner developer, I implemented a similar queue on an Oracle database without really understanding what I was doing. Later, when I learned about real message brokers, I assumed that using a database for that had been a crazy idea. Fun to see today that it could actually have made sense after all :)

nitreus

I refused to use a message broker for a simple scenario in my previous project. It has been running on a database table with polling consumers for several years now, with zero issues and zero maintenance.
We should introduce complexity only when there are no suitable simple solutions left.

penaplaster

I would love a video or a series where you focus on how you use EventStoreDB in your architecture. Also maybe when to use a message broker vs events.

jonthoroddsen

We do have our own messaging library built on top of MySQL and/or Redis. We've debated moving to a real message queue, but we're a small company and production is Hard - way harder than just slapping a random piece of tech into your project, especially when things hit the fan - and we'd rather minimize the number of technologies we run in production.

bobbycrosby

I have a personal project that uses Postgres as a queue system.

It handles about 1k operations a day, and those operations can take anywhere from 500 ms to 3 minutes.

Personally, I have two tables: a queue table and a dispatches table. I issue a queue item by selecting an item in the queue that doesn't have a matching dispatch row from the last 3 minutes.

Once a queue item has 3 matching dispatch rows, it is considered failed.

It was pretty easy to implement. Not sure if I did it in the optimal way or not - I'm a complete noob when it comes to Postgres (and SQL in general).

It's a nice system so far. The dispatches table has a "fail reason" column, which makes it pretty easy to figure out why things are failing when they do.
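A rough guess at what that two-table check might look like in SQL (a sketch with invented names, not the commenter's actual code):

-- Pick a queue item with no dispatch row in the last 3 minutes
-- and fewer than 3 dispatch rows total (3 dispatches = failed).
SELECT q.id
FROM queue q
WHERE NOT EXISTS (
    SELECT 1
    FROM dispatches d
    WHERE d.queue_id = q.id
      AND d.created_at > now() - interval '3 minutes'
)
  AND (SELECT count(*) FROM dispatches d WHERE d.queue_id = q.id) < 3
LIMIT 1;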

CatMeowMeow

Transactions do have a timeout, so when processing a message takes longer than the transaction timeout, you must commit twice and have the select include a condition like "processing" IS NULL.
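In other words, something like this two-commit shape (column names invented for the sketch):

BEGIN;
-- Commit 1: claim the row so other workers' selects skip it.
UPDATE messages
SET processing = now()
WHERE id = $1 AND processing IS NULL;
COMMIT;

-- ...do the long-running work outside any transaction...

BEGIN;
-- Commit 2: mark the row done (or reset processing to NULL on
-- failure so the row becomes visible to workers again).
UPDATE messages
SET processed = now()
WHERE id = $1;
COMMIT;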

tomasprochazka

We have a queue based on a table. We have topics, and each topic has its own worker polling for messages. We also have VisibleFrom and ExpiresAt columns. The table is properly indexed, but those indexes are not used by the SQL engine: the row count is highly variable and changes a lot, so the engine never has up-to-date statistics, which leads to plain scans. A scan is not a problem with 1k rows, but our clients sometimes put/generate 200k messages at once (or more), and a scan (to find a topic) over that many rows is significant.
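A common mitigation for exactly that statistics problem, sketched in Postgres syntax (SQL Server has equivalents such as UPDATE STATISTICS and filtered indexes; the names below are invented):

-- Refresh planner statistics by hand right after a client
-- bulk-loads 200k rows, so the engine stops assuming the
-- table is tiny:
ANALYZE messages;

-- A partial index covering only unprocessed rows keeps the
-- per-topic lookup narrow regardless of total table size:
CREATE INDEX idx_messages_ready
    ON messages (topic, visible_from)
    WHERE processed_at IS NULL;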

Now we are on the path to ditch SQL tables as a queue and move to RabbitMQ :)

dariuszlenartowicz

Had to come back and say that MassTransit will soon get support for SQL databases as a "transport". So there will be no need to create your own queue abstraction over a database, and you get all the goodies of that framework.

marna_li

I used MySQL/MariaDB as a message queue - with a dead-letter queue, failure handling, and retries - while ensuring ordered processing on it, for years.
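As a sketch of what ordered processing on MySQL can look like (MySQL 8+ syntax, invented schema; note that SKIP LOCKED would sacrifice strict ordering, so a single consumer locks the head of the queue instead):

START TRANSACTION;
-- Lock the oldest pending message; a second consumer blocks here,
-- which is exactly what preserves ordering.
SELECT id, payload
FROM messages
WHERE status = 'pending'
ORDER BY id
LIMIT 1
FOR UPDATE;
-- ...process the message, then mark it done by its id:
-- UPDATE messages SET status = 'done' WHERE id = ?;
COMMIT;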

raghuveerdendukuri

Faced precisely that problem (in Ruby) a long, long time ago, the first time the team I was on interacted with RabbitMQ: we had 4 workers, and one of them took a thousand messages at a time and blew up all the time, causing all the prefetched - some already processed - messages to go back on the queue and causing havoc. We couldn't figure out if we were doing something wrong or if it was just counterintuitive by design, and I think we only solved it after setting max_prefetch=1 and manually .ack():ing each message. 🤔

cbrunnkvist

Somehow, the solution used in this case feels like a workaround for a misconfigured setup in the first place, though. I've only ever used DBs as queues for simplicity and accessibility, but with growing complexity it feels like the feature set DB queues come with can't quite keep up.

allinvanguard

Postgres is one of those things where, once you master it, you mostly don't need anything else. I'd argue most infrastructure is overly complicated.

qrjftvx

Without a transaction, you can do:

UPDATE messages
SET status = 'processing'
WHERE status = 'pending'
RETURNING id;

That reserves records for processing so you can do whatever you want with them. If processing finishes successfully, move them to the next status; otherwise, return them to the previous one.
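As written, that statement claims every pending row in one go. A hedged variant (Postgres 9.5+, same invented schema) reserves a bounded batch and lets parallel workers skip each other's rows:

UPDATE messages
SET status = 'processing'
WHERE id IN (
    SELECT id
    FROM messages
    WHERE status = 'pending'
    ORDER BY id
    LIMIT 10
    FOR UPDATE SKIP LOCKED
)
RETURNING id;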

belmirp

For a fast-processing context, what about the cost of many processes each querying the database every 2 seconds?
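One Postgres-specific way to avoid timer-based polling entirely is LISTEN/NOTIFY, so consumers sleep on the connection until a producer signals (channel name invented for the sketch):

-- Producer, in the same transaction as the INSERT:
NOTIFY new_message;

-- Consumer: subscribe once, then block until a notification
-- arrives, and only then run the dequeue query.
LISTEN new_message;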

MorkorOuzor

Would keeping the rows (not deleting them at the end of the transaction) cause significant performance issues? I'd like to keep rows after they've finished processing so I can show their status in an admin dashboard.

Also, would polling multiple rows (10 to 100) at once cause issues?
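Keeping finished rows is workable as long as the dequeue query never has to scan them - a sketch, assuming Postgres and invented names:

-- A partial index means the queue lookup only ever touches
-- unfinished rows, so millions of completed rows kept for the
-- dashboard add no cost to dequeuing:
CREATE INDEX idx_queue_pending
    ON queue (id)
    WHERE completed_at IS NULL;

-- Claiming 100 rows at once is the same idiom as claiming one:
UPDATE queue
SET claimed_at = now()
WHERE id IN (
    SELECT id
    FROM queue
    WHERE completed_at IS NULL AND claimed_at IS NULL
    ORDER BY id
    LIMIT 100
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload;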

dimitriborgers

Did anyone ever figure out whether it's possible to disable the prefetch function of RabbitMQ?

Because it might be a situation where they reached for a DB queue because it's something they could control, and RabbitMQ wasn't.

quincymitchell