Shared Database between Services? Maybe!

Is a shared database a good or bad idea when working in a large system that's decomposed into many different services? Or should every service have its own database? My answer is yes and no, but it's all about data ownership. Let me explain all kinds of situations that end up answering this question.

🔗 EventStoreDB

💥 Join this channel to get access to source code & demos!

🔥 Don't have the JOIN button? Support me on Patreon!

0:00 Intro
1:26 Monolith
2:33 Schema & Data Ownership
5:26 Query & View Composition
8:51 Command Data Consistency

#softwarearchitecture #softwaredesign #codeopinion
Comments

We were literally just having this conversation with another team within our company. We have separate, distributed "feature slices" with their own data. And their product is a monolithic code base that suffers from all of the issues that we expect. I had to field lots of "but what if" questions. I shall be forwarding this video to them as a much better explanation! Thanks!

LeeOades

An interesting alternative that I love is using the Saga orchestration pattern (not the choreography one) to handle a situation like your ordering and payment example. You can still use eventing as a way of communicating between your bounded contexts, but it helps a lot with tracking/monitoring a full business use case. And naturally, it helps with troubleshooting and data/process repair if required, for example when a business flow halts halfway through. A big contrast compared to just multiple systems raising events and cascading all the effects everywhere, where it's too easy to lose track of everything.

TheKhloroform
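
A minimal sketch of the orchestration idea described in the comment above, assuming Python and hypothetical OrderService/PaymentService clients (not the approach from the video): the saga owns the flow and the compensating action in one place, instead of every service reacting to everyone else's events.

    # Illustrative saga orchestrator for an order/payment flow.
    # OrderService/PaymentService are assumed clients, and a real
    # orchestrator would also persist the saga's state between steps.

    class PaymentFailed(Exception):
        pass

    class OrderSaga:
        def __init__(self, order_service, payment_service):
            self.orders = order_service
            self.payments = payment_service

        def place_order(self, order_data: dict) -> str:
            # Step 1: create the order in its own boundary.
            order_id = self.orders.create_order(order_data)
            try:
                # Step 2: ask the payment boundary to collect payment.
                self.payments.collect_payment(order_id, order_data["total"])
            except PaymentFailed:
                # Compensating action: the orchestrator knows how to unwind
                # the partially completed business flow in one place.
                self.orders.cancel_order(order_id, reason="payment failed")
                raise
            # Step 3: record that payment was collected.
            self.orders.mark_paid(order_id)
            return order_id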

We are currently in the process of decomposing our monolith. We have individual services but one database. In order to act like we have separate databases, we have separate schemas per service. Every service has shared code for accessing the authorization databases. Eventually, when we actually split apart the database, this will change, but for the time being it is a nice solution.

logantcooper

Awesome video. This is the best explanation of databases in a microservice architecture. Love the different patterns and examples, especially the BFF and the order-payment scenario. Learned so much in this short video. Thank you!

clnguye

Thanks, we had a similar situation, so we created boundaries in our DB with a schema for each microservice. A service can read, but is not able to write to, a schema it doesn't own.

ydkumar
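
A rough illustration of that kind of boundary, assuming Postgres and psycopg2; the schema names, role names, and connection string are made up:

    # Sketch: per-service schemas in one Postgres instance, where a service
    # may read other schemas but only gets write grants on its own.
    import psycopg2

    DDL = """
    CREATE SCHEMA IF NOT EXISTS ordering AUTHORIZATION ordering_svc;
    CREATE SCHEMA IF NOT EXISTS payments AUTHORIZATION payments_svc;

    -- ordering_svc may read the payments schema but gets no write grants there.
    GRANT USAGE  ON SCHEMA payments TO ordering_svc;
    GRANT SELECT ON ALL TABLES IN SCHEMA payments TO ordering_svc;
    """

    with psycopg2.connect("dbname=app user=admin") as conn:
        with conn.cursor() as cur:
            cur.execute(DDL)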

You didn't describe the scenario where: i) the schema and DB write access are limited to a single service; and ii) other services are given read access through VIEWS, enabling JOINs and orchestration in the RDBMS.

alivateRocket
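
A small sketch of that scenario (again assuming Postgres, with invented names): the owning service publishes a read-only view, and only that view is granted to other services for their SELECTs and JOINs.

    # Sketch: the payments service exposes a read-only view that other
    # services may SELECT from and JOIN against. Names are illustrative.
    PAYMENTS_VIEW_DDL = """
    CREATE SCHEMA IF NOT EXISTS payments_public;
    CREATE VIEW payments_public.payment_summary AS
        SELECT order_id, status, amount
        FROM payments.payment;  -- expose only what the owner chooses to share

    GRANT USAGE  ON SCHEMA payments_public TO ordering_svc;
    GRANT SELECT ON payments_public.payment_summary TO ordering_svc;
    """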

The BFF approach is the way to go when composition is needed, but in some cases we need to show data in a grid or some other format that is heavily dependent on how the data is structured, thus impacting performance. In that case I tend to aggregate (by listening to events) in the BFF service itself; there we can optimize indexes, cache data and so on. What do you think about that?

arielmoraes
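
A minimal sketch of that approach: the BFF subscribes to events from the owning services and maintains its own denormalized read model tuned for the grid. The event shapes, handler names, and in-memory storage below are assumptions, not from the video.

    # Sketch: a BFF keeps a denormalized, query-optimized view by consuming
    # events from the services that own the data.

    read_model = {}  # order_id -> row shaped exactly how the grid needs it

    def on_order_placed(event: dict) -> None:
        read_model[event["order_id"]] = {
            "order_id": event["order_id"],
            "customer": event["customer_name"],
            "total": event["total"],
            "payment_status": "pending",
        }

    def on_payment_collected(event: dict) -> None:
        row = read_model.get(event["order_id"])
        if row:
            row["payment_status"] = "paid"

    def list_orders() -> list:
        # The grid endpoint reads only the local model: no fan-out calls,
        # and indexes/caching can be tuned for this one screen.
        return list(read_model.values())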

Why don't you use the aggregator pattern for this? You pull data from each service individually and, using a composite microservice, combine the data the way the user wants and send it back. Since it is business data, an API gateway wouldn't be the right implementation compared to the aggregator pattern with a composite microservice.

sanjeevsjk
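
A sketch of the synchronous aggregation described in the comment above, assuming Python and the requests library; the service URLs and payload fields are invented:

    # Sketch of a composite/aggregator endpoint that fans out to the owning
    # services and stitches the result together.
    import requests

    def get_order_details(order_id: str) -> dict:
        order = requests.get(f"http://ordering/orders/{order_id}", timeout=2).json()
        payment = requests.get(f"http://payments/payments?order={order_id}", timeout=2).json()
        # Compose the view the client asked for from data each service owns.
        return {
            "order_id": order_id,
            "items": order["items"],
            "total": order["total"],
            "payment_status": payment["status"],
        }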

I really like this approach, but one thing I still have doubts about is the last part - I'm genuinely asking as I'm unclear on the best way to do this; any advice appreciated.

In the checkout flow described at around 10:00, the client has to pass order information to the order service, but also has to pass some of that information to the payment service, which puts the responsibility of that extra step on to the client and thus the developers.

You say the order service can pass the ID to the payment service to mark the order as ready to have payment collected, but rather than passing the ID, wouldn't it be better to pass the order summary information, so that when the client then calls the payment service it already has that locally?

That way the client has to call the order service once in a normal flow, and then the payment service, but doesn't need to have multiple calls to each - easier for everyone? Have I missed something? Advice appreciated, and thanks for all your awesome videos!

CarlSargunar

Thank you! Great content and super clear explanations! Really made it click!
One question I have is regarding the relationships between schemas belonging to different services.
Should relationships like one-to-one or one-to-many still be used across services' schemas in order to use cascading or other handy features?

laias

On the command side, if the client needs to save data to both boundaries, for example on a single button click, how would you handle data inconsistency if the request to the other boundary fails? Handling transactions spanning multiple HTTP requests in the UI, or creating compensating actions in the backend, seems like a pain in this case...

zengouu

Really nice video. Thanks for sharing; I really like the way you have explained it. Wondering what the disadvantages would be of having a single physical instance with a different schema per service, before we move on to a different physical instance and a different schema for each microservice?

cbest

Great video, but I don't think we should dismiss updating all of the microservices at the same time. In that case, ops takes ownership of the data. I think it's a valid approach in some situations.

AlexDresko

Distributed Event Driven Baby, all the good stuff!

seanknowles

"a large system" is subjective

KhaledKimboo

Curious about your example. If instead you had two payment type options with two services which have shared core fields like amount, account, etc., but different "specialized" fields associated with them, like Zelle wants 'email' and credit card wants 'ccv'... would you just agree on a canonical model for the core values and still deploy them separately?

patjohn
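
One hedged way to picture the canonical-core idea raised above, with invented names: both payment services agree on the shared core fields, and each keeps its specialized fields to itself.

    # Sketch of a canonical core payment model plus per-provider specializations.
    # Class and field names are illustrative only.
    from dataclasses import dataclass

    @dataclass
    class PaymentCore:
        amount: int          # shared, canonical fields every payment type agrees on
        account_id: str
        currency: str = "USD"

    @dataclass
    class ZellePayment(PaymentCore):
        email: str = ""      # specialized field owned by the Zelle service

    @dataclass
    class CardPayment(PaymentCore):
        ccv: str = ""        # specialized field owned by the card service

Whether the core model lives in a shared contracts package or is duplicated per service (and kept compatible at the wire level) is a separate deployment trade-off.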

As long as you're sharing data you will always have a shared database. What are the alternatives to sharing tables in a Postgres instance, for example? You can ask a microservice synchronously for the data it owns via RPC, as shown in your first example. This is practically the same as reading from the Postgres server directly: you have to make the same kind of synchronous request to read from a database and I would argue that calling a mature database product is actually much more reliable than some homegrown microservice. If you use RPC for data integration, then the callee microservice will become your unreliable database server. The other option is to use some kind of event-driven paradigm as outlined in your second example. Whether you use Kafka, EventStoreDB or whatever, they are all databases too. Kafka is a log-structured database with limited querying capabilities, but it is still a database. They say that Kafka is asynchronous, but that is also misleading: when you want to read an event from the Kafka log you still have to make a TCP connection to the Kafka cluster to retrieve your event and thus you are synchronously coupled to the uptime of your Kafka cluster (i.e. database server) just the same as with Postgres.

We have established that both shared relational databases and event-carried state transfer over a message bus are integration through a database system. So what is the real difference? The key point is that Kafka's limited querying capabilities _force_ you to implement some kind of CQRS where readers have a materialized view of the event stream. This does two important things: Firstly, it decouples the reader from the availability of the Kafka cluster (i.e. database server) since the reader can continue to read from its materialized view/cache even when the Kafka cluster is down. Secondly, the writer who owns the data also needs a materialized view if he wants to read it. This naturally leads to decoupling of the writer's internal data model and the schema of the integration events; the event schema can evolve independently of the writer's business requirements, so you do not have the problem of needing to redeploy all readers at the same time when a field/column is renamed.

In conclusion, the advantages of event-driven state transfer are more accidental than evidence of a fundamental problem with integration through relational databases. You can achieve almost the same result on top of Postgres: manage access rights to tables so each microservice can only write to the tables it owns; carefully segregate the writer's own internal data model from the schema of the shared "integration table"; use read replicas with materialized views of the "integration table" for each reader. Under the hood, changes to the "integration table" are written as change events to the Postgres write-ahead log and the change events are then propagated to read replicas. We can see that the write-ahead log and asynchronous replication even work the same as event-carried state transfer via Kafka.

rcts
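
A small sketch of the segregation argued for above: the writer maps its internal model explicitly to the published "integration" shape, so internal renames don't ripple out to readers. The table name, field names, and psycopg2-style cursor are assumptions, not from the comment or the video.

    # Sketch: the writer's internal model is mapped explicitly to the shape
    # published in the shared "integration table" (or event), so the internal
    # schema can evolve without breaking readers.

    def to_integration_row(order) -> dict:
        # Internal fields can be renamed or restructured freely; only this
        # mapping has to stay compatible with what readers expect.
        return {
            "order_id": order.id,
            "status": order.fulfilment_state,   # internal name differs on purpose
            "total_cents": order.total.cents,
        }

    def publish(order, cursor) -> None:
        # Assumes a unique constraint on integration.order_summary(order_id).
        row = to_integration_row(order)
        cursor.execute(
            "INSERT INTO integration.order_summary (order_id, status, total_cents) "
            "VALUES (%s, %s, %s) "
            "ON CONFLICT (order_id) DO UPDATE SET status = EXCLUDED.status, "
            "total_cents = EXCLUDED.total_cents",
            (row["order_id"], row["status"], row["total_cents"]),
        )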

A shared database is a clear indication of badly designed service boundaries.

ebuzertahakanat

Great video, as always!
Question: if I'm using CQRS and my service only has permission (a database grant) to execute queries against the shared tables/databases, would it still be a problem?

harryot

Hi, thanks for the great content. At 4:15, what do you do to prevent circular dependencies when a service needs to call another service? I can imagine a case where a service needs data from another service, and that other service in turn calls the first service...

oQu