The architecture of a Geo-Distributed SQL Database
In this webinar we define the architecture of a Distributed SQL database. The requirements can be summarized in five core conditions:
1. #Scale
A distributed SQL database must seamlessly scale in order to mirror the capabilities of cloud environments without introducing operational complexity. Just as we can scale up compute without heavy lifting, the database should be able to scale as well. This includes an ability to evenly distribute data across multiple distributed participants in the database.
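As a rough illustration of what "evenly distribute data across multiple participants" can mean, here is a minimal sketch that assumes the data is an ordered set of keys split into contiguous ranges, with ranges placed on nodes round-robin. The function names, the key format, and the node names are hypothetical, and this is not a description of how any particular database places data.

```python
from bisect import bisect_right


def build_ranges(sorted_keys, max_keys_per_range):
    """Split an ordered key list into contiguous ranges of bounded size."""
    return [
        sorted_keys[i:i + max_keys_per_range]
        for i in range(0, len(sorted_keys), max_keys_per_range)
    ]


def assign_ranges(ranges, nodes):
    """Place ranges on nodes round-robin so each node holds a similar share."""
    placement = {node: [] for node in nodes}
    for index, key_range in enumerate(ranges):
        placement[nodes[index % len(nodes)]].append(key_range)
    return placement


def find_range(ranges, key):
    """Return the index of the range that would hold `key`.

    Uses binary search over the first key of each range; keys smaller
    than every start key fall into the first range.
    """
    starts = [key_range[0] for key_range in ranges]
    return max(bisect_right(starts, key) - 1, 0)


# Hypothetical data: twelve ordered keys, three nodes.
keys = sorted(f"user/{i:04d}" for i in range(12))
ranges = build_ranges(keys, 4)
placement = assign_ranges(ranges, ["node-a", "node-b", "node-c"])
```

Adding a node in this sketch only means calling `assign_ranges` with a longer node list; a real system would move as few ranges as possible rather than recompute the whole placement.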
2. #Consistency
A distributed SQL database must deliver a high level of isolation in a distributed environment. In a cloud-based world where distributed systems and microservices are the default architectures, transactional consistency becomes difficult because multiple operators may be trying to work on the same data. The database should mediate contention and deliver the same level of transaction isolation that we expect from a single-instance database.
3. #Resiliency
A distributed SQL database must naturally deliver the highest level of resiliency without needing any external tooling. The cloud presents an always-on environment for our workloads and the database should have the same properties. With a distributed database we can reduce the time it takes to recover from a failure to near zero and replicate data naturally without any external configuration.
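The resiliency claim can be made concrete with a little arithmetic. The sketch below assumes data is replicated through a majority-quorum consensus protocol (the chapter list mentions consensus and replicas, but the exact rules here are an assumption for illustration); the function names are hypothetical.

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority still remains."""
    return replicas - quorum_size(replicas)


def can_commit(replicas: int, alive: int) -> bool:
    """A write can commit only if the live replicas still form a majority."""
    return alive >= quorum_size(replicas)
```

With three replicas the quorum is two, so one failure is survivable; with five replicas the quorum is three, so two failures are survivable. Under this assumption, an even replica count adds cost without adding fault tolerance.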
4. #GeoReplication
A distributed SQL database should allow data to be distributed throughout a complex, widely dispersed geographic environment. The cloud makes it possible to reach every corner of the globe with an acceptable quality of service, and the database should not restrict your applications from doing so. It should perform to meet your expectations.
5. #SQL
And while these four technical requirements are paramount, there is one key prerequisite above all. The database must speak SQL. It is the language of data and the default for all application logic. We should not have to retrain developers to use the database. They should be able to use the SQL dialect they are already familiar with.
There are a few databases that meet these requirements. But only CockroachDB satisfies the next two requirements:
The devil is six: #DataLocality
Once you live in a distributed world, it becomes apparent that the database itself could actually take care of domiciling data. With participants located in various regions or data centers, it becomes possible to understand the location of each and then tie the data that it stores to a location. Some application architects have implemented this as part of an application but this approach is error-prone and brittle. Using the database to geo-partition data based on some field in a table is a new requirement for Distributed SQL. This allows you to use the database to address data sovereignty concerns.
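To illustrate geo-partitioning "based on some field in a table", here is a minimal sketch that assumes each row carries a `country` field and that a lookup table maps countries to regions. The field name, the region names, and the mapping are all hypothetical; a database with this feature would enforce the placement itself rather than leave it to application code like this.

```python
# Hypothetical mapping from a row's country code to the region that must store it.
REGION_BY_COUNTRY = {"DE": "eu-west", "FR": "eu-west", "US": "us-east"}
DEFAULT_REGION = "us-east"  # assumption: unmapped countries fall back to one region


def home_region(row: dict) -> str:
    """Pick the region for a row from its `country` field."""
    return REGION_BY_COUNTRY.get(row["country"], DEFAULT_REGION)


def partition_rows(rows: list) -> dict:
    """Group rows by home region, mimicking partition-by-field placement."""
    partitions = {}
    for row in rows:
        partitions.setdefault(home_region(row), []).append(row)
    return partitions


rows = [
    {"id": 1, "country": "DE"},
    {"id": 2, "country": "US"},
    {"id": 3, "country": "FR"},
]
partitions = partition_rows(rows)
```

In this sketch the German and French rows land in `eu-west` and never appear in the `us-east` partition, which is the property that data sovereignty rules typically ask for.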
And god is seven: #MultiCloud
A unique trait of a Distributed SQL database is that it has semi-autonomous units that all participate in a larger system. Each unit should be able to be deployed by itself and then join the larger system, the CockroachDB cluster. This is an inherent trait that fuels the first five requirements listed above. However, it can also be used to extend the database to be truly multi-cloud. The database should not rely on a single network to accomplish distribution. It should be divorced from these limits so that a participant can be located anywhere: in any public cloud, a private cloud, or even a single on-premises instance.
Chapters:
00:00 - 02:18 The architecture of a distributed database
02:18 - 04:59 Why do we need another database?
04:59 - 14:19 What is a Distributed SQL database?
14:21 - 19:05 The monolithic ordered key pair table
19:06 - 26:34 Consensus protocol, cluster and replica
26:34 - 30:23 Building a Distributed Database
30:27 - 36:23 Does splitting ranges cause a lot of data movement taking too much compute power?
36:23 - 37:23 Should a leaseholder be geographically closest to the application?
37:32 - 40:02 Transactions in a distributed database
40:02 - 45:43 How a transaction works in Cockroach
45:43 - 49:05 How do you optimize transactions in a distributed system?
49:05 - 50:32 How do you design your tables, keys, any resources to help think in Cockroach design?
50:32 - 52:55 General guidelines for smaller nodes versus fewer bigger nodes
53:03 - 55:13 How backup and restore works in a Distributed Database
55:14 - 56:20 How to get started with Cockroach
---------------------------------------------------------------------------------------------------------------------------