Multi-Tenant: Database Per Tenant or Shared?

preview_player
Показать описание
If you're building a multi-tenant application, should you use a database per tenant or a shared database? There a lot of pros and cons and the trade-offs will help you decide which is best for your context.

🔗 EventStoreDB

💥 Join this channel to get access to a private Discord Server and any source code in my videos.

🔥 Join via Patreon

✔️ Join via YouTube

0:00 Shared Database
1:38 Leaking Data
2:38 Noisy Neighbors
3:48 DB per Tenant
4:49 Complexity
5:30 Hybrid
Рекомендации по теме
Комментарии
Автор

Our software supports database per tenant, but in production we've yet to use that feature. Instead of adding filter by tenant to all the queries, we created row level security on a MS-SQL database. Each Tenant has its own SQL USER. Each database tabale uses security policy to apply a predicate at the database level. That predicate will check the SQL USER running the query and apply a predicate to filter out any rows that are not for that tenant. This has the advantages that the developers are no longer responsible for applying the tenant filter in the code, and that any SQL injection attacks will automatically be restricted to just that tenant. Obviously this is only a practical solution when you have a low number of tenants.

steve-wright-uk
Автор

For our SaaS product, we're using a hybrid model. Consumers are, writ-large, on one of several regionally-pooled APIs & DBs. For our important enterprise customers, we have dedicated environments for each. This gives them complete isolation and better performance guarantees. We can also move customers from one pool or environment to another in cases where a shared customer is promoted to an enterprise customer - this is where globally unique ids can save you. The enterprise fees are commensurate with the development of both the hybrid model and the resources they're consuming in their environment.

I can tell you one thing for certain: Infrastructure as Code helps a ton if you're doing either pooled DB & API, DB & API per tenant, or both simultaneously. It makes spinning up those extra environments and deploying changes much, much faster and more reliable.

maple.everything
Автор

We have a shared database for all tenants and use query filters. Works well enough. We plan to squeeze as much life out of this setup as we grow before we decide to take on any more complexity.

logantcooper
Автор

If you're going with database-per-tenant, I would just make sure you seriously take the time upfront to build (or buy) proper sync and validation tools between each database. Easy to skip if you're just starting with a handful of databases, but that number creeps up, and executing a schema change in a thousand databases, or running 999 schema comparisons manually isn't a lot of fun.

JoeEnos
Автор

One of the advantages for database per tenant is if you need to do a database restore. This allows a single tenant to be affected.

steve-wright-uk
Автор

It is worth noting that a multi-tenant system can, more or less cheaply, support a single tenant on a new instance, e.g. if you have some tenants with gigantic amounts of data, or expect some with legal restrictions needing an on premise instance. The reverse path can be more complicated, if you finally realize that you can't maintain a lot of instances.

malehernandez
Автор

Very nice topic. Great overview of important considerations and options.

There are so many variations. Database versioning. API versioning. Backwards compatibility. Containerized versus shared resources. Single instances or Cluster. Split per schema on the same host versus different tables in the same schema versus one distict daemon or machine per tenant ... production, staging and testing environments ... split by tenant in the user-facing data center but mirror into a shared data structure ...

ulrichborchers
Автор

Thanks Man! really need this video for a project.

ITManPH
Автор

At my previous job, we tried to do database per tenant for a SaaS product and one issue we ran into was the actual provisioning of database. We tried to create the database “on-demand” - i.e when a user signed up as a new organization. The problem was we did this on azure with mssql databases so naturally it was slow and added complexity to the sign up process (showing the user a “please wait” screen, azure API requests with preconfigured ARM templates, retries on db creation failure, attempting to use a pool of databases and expanding the pool when needed, managing elastic pools, etc). Not to mention the cost aspect and migration rollouts. Should have gone with something like RLS or query filter

ranulpallemulle
Автор

I think the hybrid approach is valid: 1 only database to maintain and a schema dedicated for each tenant. Sometimes it's possible to spread schemas along 2 or more db instance to tackle noisy neighbors or performance issues. Schemas ensure isolation without applying any query filters. it's like to select the right database but without the burden to manage many databases.

bitaligners
Автор

At all the banks I've worked at, and where I'm at now (a data analytics company) we always use database per tenant. It's the right thing for us, we cannot risk a query defect leaking data, or a problem with one tenant affecting others, or a query-heavy tenant affecting the CPU/RAM of other tenants. Also if we have to restore (which we've done once or twice over the years) it only affects a single tenant.

The downside is migrations and cost. But it greatly simplifies our application architecture and compliance audits.

shawn_bullock
Автор

I inherited a db per tenant architecture. With CI/CD and automation in place many of the downsides mentioned are easily addressed other than cost. Our real complexity came from also having tenant-agnostic data that all tenants nevertheless need access to.

Nontv
Автор

Thank you so much this is ultra informative! Im working on a multi tenanted system right now, and the timing is impeccable!

jackwright
Автор

Resources for tenant, including database and potentially API, also have an impact for provisioning new tenants. If you have a single database and API for all your tenants, provisioning a new tenant is pretty trivial. If every tenant gets their own same resources, then you have to provision and potentially seed all those resources every time you deploy new tenant. Just something else to consider

cdarrigo
Автор

Great video, you covered a lot of variations. 
Multi-Tenant with Supabase is great when using their auth and Postgres functions. There is not need to send user or tenant info because it is part of the session and accessible from the function call, they also scale amazingly.

ThingEngineer
Автор

I have multiple tenants per DB based on the query filter method. But I realized there are cases you want it separate. So I could built some infrastructure to opt-out for separate databases based om customer requirements. Provided that they don't want separate instances of the app all together. I have seen that some companies do that because of data requirements and so they don't upgrade all customers to newer versions without them paying. This is kind of the pre-Cloud and pre-SAAS method. I just feel like it must be a hell to manage with branching and patching bugs across specific codebases for each customer.

marna_li
Автор

Also consider data privacy and retention laws vary by country, so you might want to have a shared multi-tenant database for each unique region/country that your clients are in. Reduces complexity from one db per client, but makes dealing with each unique set of privacy laws easier.

ziftytodd
Автор

Single db is our approach and we limit the issue with noisy neighbours by using tenantId and PartitionKey. This way is azure storage when data is split between nodes, each tenant should be on one node, so they only affect this node

qjn
Автор

Split to multiple for better querying, re-indexing.

anhcoder
Автор

We have a database per tenant model. In our use case, the system could be split into micro services, because 1/2 of the business logic and tables are exactly the same for all tenants. However, due to the needs of each tenant, we would have too many variations in the tables, and we need to do quite extensive join operations.

We looked into a hybrid (multi-tenant for the 1/2 that's similar) and tenant per database, but since most of our business deals with joining from the "core" tables to the tenant tables, the test queries we ran were not promising.

In our case, since we know exactly how the complexity runs, we have a git repo that is "core" repo, that every tenant is forked from and upstreams from, so in the few cases that we have to change things in the core tables, we just run the git upstream fetch commands.

One thing to note, is that in our industry, data leakage is 100% not acceptable to any of our clients, and we'd probably lose them as an account (and this would be our best case scenario).

Philinmiami