# Control and Data Plane Architecture for Cloud Databases

**Source:** [https://www.youtube.com/watch?v=Ep1QW-wOmgc](https://www.youtube.com/watch?v=Ep1QW-wOmgc)
**Duration:** 00:14:21

## Summary

- The control plane/data-plane distinction is a fundamental design principle for scalable cloud services, influencing everything from routers to Kubernetes-based platforms.
- In a managed database service, user-facing clients interact with the data plane for read/write operations, while administrative actions (e.g., backups, version upgrades) are handled through a control-plane API.
- Automation of resource-intensive tasks, often via Kubernetes operators, executes user intents on the data plane and records actions in a metadata store that supports billing, user management, and other admin functions.
- This separation enhances security and operational isolation, allowing the platform to protect data access while still exposing flexible management capabilities.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=Ep1QW-wOmgc&t=0s) **Control vs Data Plane in Managed Databases** - The speaker explains how control-plane and data-plane architectures underpin scalable cloud services, using a managed PostgreSQL offering to illustrate their roles and trade-offs.
- [00:03:05](https://www.youtube.com/watch?v=Ep1QW-wOmgc&t=185s) **Isolated Metadata for Secure Scalable Services** - The speaker describes how using a dedicated metadata store for backup requests (supporting billing and user management) provides natural network isolation for security and allows independent scaling of control-plane services versus data-plane resources in a hosted managed database, a pattern that extends to other cloud platforms like object storage.
- [00:06:30](https://www.youtube.com/watch?v=Ep1QW-wOmgc&t=390s) **Separating Control and Data Planes for DNS** - The speaker explains how isolating the control plane from the data plane enables infrastructure-specific optimizations and security when designing a DNS hosting service that provisions records via an API and resolves queries at massive scale.
- [00:09:46](https://www.youtube.com/watch?v=Ep1QW-wOmgc&t=586s) **Benefits of Control/Data Plane Separation** - The speaker outlines how applying control-plane and data-plane architecture principles to networking and storage platforms yields security enhancements, independent scalability, and performance gains such as lower latency and higher throughput.
- [00:13:03](https://www.youtube.com/watch?v=Ep1QW-wOmgc&t=783s) **Control vs Data Plane Trade-offs** - The speaker explains how separating control and data planes adds code maintenance and expands the attack surface, weighing the pros and cons and advising when this architecture is appropriate for simple APIs versus complex data-service platforms.

## Full Transcript

0:00 If you're going to be designing a cloud-based data services platform at scale, you're going to want to know about the control plane and data plane system architecture design principles. These software design principles can be found in the construction and design of hardware systems and software systems alike, from routers to software-defined networking platforms to storage-based cloud platforms, and even in Kubernetes, where it's built into the framework itself.

0:32 The control plane and data plane are going to be important themes as we look at 3 different cloud-based platforms. And as we work through each of them, you can see how the advantages and disadvantages come out in each.

0:47 So let's start off today by talking about a hosted managed database platform. Let's say we're offering our customers instances of databases such as Postgres. The user can connect to their database through a database client. Maybe it's their application, maybe they're running a client on their local machine, really whatever they want to do with it. That's up to them.

1:19 You can learn more about managed database services, by the way, in this video over here.

1:27 Now this instance is going to be running on our data plane infrastructure. The user can pull data from it, write data to it. But when they want to interact with the instance itself, the service that we're running on their behalf, they're going to want to interact with an API: a simple interface where they can do more administrative-level tasks, such as upgrading the version of the database, requesting backups, things like that.

1:58 So let's say that the customer wants to create a backup. They're going to hit the API first, and then that API is going to reach into the data plane and create a task for them. Now this is going to be a pretty resource-intensive task.

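
As a rough illustration of the backup request just described (a sketch that is not from the talk), the control-plane API can be modelled as something that authorizes the caller and hands a task to the data plane without touching customer data itself. The names `ControlPlaneAPI`, `BackupTask`, and `request_backup`, and the use of an in-process queue as the hand-off point, are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from queue import Queue
from uuid import uuid4


@dataclass(frozen=True)
class BackupTask:
    """A unit of work the control plane hands to data-plane automation."""
    instance_id: str
    requested_by: str
    task_id: str = field(default_factory=lambda: uuid4().hex)


class ControlPlaneAPI:
    """Hypothetical admin API; it never reads or writes customer data itself."""

    def __init__(self, task_queue: Queue) -> None:
        self._tasks = task_queue
        self._owners = {}  # instance_id -> owning user

    def register_instance(self, instance_id: str, owner: str) -> None:
        self._owners[instance_id] = owner

    def request_backup(self, instance_id: str, user: str) -> BackupTask:
        # Authorize using control-plane state only.
        if self._owners.get(instance_id) != user:
            raise PermissionError(f"{user} does not own {instance_id}")
        task = BackupTask(instance_id=instance_id, requested_by=user)
        self._tasks.put(task)  # hand-off point to the data plane
        return task


tasks: Queue = Queue()
api = ControlPlaneAPI(tasks)
api.register_instance("pg-001", owner="alice")
task = api.request_backup("pg-001", user="alice")
print(task.instance_id, tasks.qsize())  # prints: pg-001 1
```

In a real platform the queue would likely be a durable store or a Kubernetes custom resource rather than an in-memory object; the point here is only that the API records intent and something else does the heavy work.
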
2:20 So let's have a set of automation that's going to execute this task on the user's behalf. This automation might be comprised of Kubernetes operators, for example. That's what we do in cloud databases at IBM. And you can learn more about operators and Kubernetes operators in this video over here.

2:45 So we looked at what happens when the user expresses the intent of creating a backup on their instance. Let's talk about some of the administrative things we need to do on the platform side. So they created a backup. Well, we bill them for that because we're a business, right? We're going to store that backup request in a metadata store, and that metadata store might support a billing system. The metadata store might also support user management.

3:26 You'll find that these administrative components are going to be common themes through a number of the platforms that we talk about today. We've just set up the infrastructure for a hosted managed database service.

3:42 Now, a few advantages are going to start to come to light in this architecture alone. One of them is going to be security. Let's say the user is going to try to reach into their database and use that as an attack vector to pull all of the data about the other customers of your platform. They're going to get contained here because as they try to reach into the billing data or the metadata store, they're going to be prevented by this natural network isolation.

4:16 Similarly, you're going to have the advantage of scaling the infrastructure independently for your data plane versus your control plane. So let's say you're hosting your database service on a virtual machine. If you need to host another one, why not just add another virtual machine?

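
The bookkeeping side of the backup flow described above could look roughly like the following sketch (not from the talk). It assumes a hypothetical `MetadataStore` that holds facts about backups rather than the backups themselves, a `run_backup` function standing in for the data-plane automation, and a made-up per-gigabyte price in cents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupRecord:
    instance_id: str
    owner: str
    size_gb: int


class MetadataStore:
    """Control-plane store: facts about backups, never the backup data."""

    def __init__(self) -> None:
        self._records = []

    def record_backup(self, record: BackupRecord) -> None:
        self._records.append(record)

    def backups_for(self, owner: str) -> list:
        return [r for r in self._records if r.owner == owner]


def run_backup(instance_id: str, owner: str, size_gb: int,
               store: MetadataStore) -> BackupRecord:
    """Stand-in for data-plane automation such as an operator."""
    # A real job would dump the database here; this sketch only
    # pretends the dump produced `size_gb` gigabytes.
    record = BackupRecord(instance_id, owner, size_gb)
    store.record_backup(record)
    return record


def invoice_cents(store: MetadataStore, owner: str, cents_per_gb: int) -> int:
    """Billing reads metadata only, so it needs no route into the data plane."""
    return sum(r.size_gb for r in store.backups_for(owner)) * cents_per_gb


store = MetadataStore()
run_backup("pg-001", "alice", 10, store)
run_backup("pg-002", "alice", 30, store)
run_backup("pg-900", "bob", 5, store)
print(invoice_cents(store, "alice", cents_per_gb=3))  # prints 120
```

Because billing and user management work from this store alone, a compromised database instance has nothing in its own network segment that describes other customers.
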
4:38 Now, you don't have to replicate all of this administrative overhead, running as services in your control plane, in each of the machines that are running your data plane.

4:49 Let's see what we can do with other cloud-based platforms such as object storage. So instead of databases, the instances that we're going to be hosting for our customers are going to be object storage instances where the user is going to be storing buckets. So these buckets are going to contain whatever the customer wants to put in there: videos, text files, blobs of who knows what. It's up to the customer again.

5:18 We're going to see how the data plane can offer us infrastructure optimizations in this case as well. So by having this natural separation, we can now provide a thing like a CDN to our data plane, which will be helpful for the user to read large amounts of data very quickly, and which will utilize caching and localization to improve reads. So a CDN will be something that's going to be very helpful for object storage, where it might not have been as intuitive how it would be applied to a database service.

6:01 We'll also want to consider load balancing. With the object storage platform, you're going to be receiving thousands to hundreds of thousands of times more requests and traffic to your data plane, reading and writing to these object stores, much larger packet sizes, and many more packets coming into this data plane than you would arriving into your control plane.

6:32 So in this case, you can optimize for what infrastructure is present on your data plane. The data plane's load balancer is going to handle 10x, 100x, 1000x the gigabytes per second of the control plane's load balancer.

6:50 So here already, we're starting to see how it's pretty advantageous to have isolation between the data plane and the control plane and optimizations that we can apply to one set of the infrastructure versus the other.

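
To make the caching idea from the object storage example concrete, here is a minimal sketch (not from the talk) of a read-through cache placed in front of an origin store, loosely modelling a single CDN edge node. `ObjectStore`, `EdgeCache`, the least-recently-used eviction policy, and the tiny capacity are assumptions chosen for illustration; real CDNs also handle expiry and invalidation, which this omits.

```python
from collections import OrderedDict


class ObjectStore:
    """Stand-in for origin object storage; counts reads for illustration."""

    def __init__(self) -> None:
        self._objects = {}
        self.origin_reads = 0

    def put(self, bucket: str, key: str, data: bytes) -> None:
        self._objects[(bucket, key)] = data

    def get(self, bucket: str, key: str) -> bytes:
        self.origin_reads += 1
        return self._objects[(bucket, key)]


class EdgeCache:
    """Tiny LRU read-through cache in front of the origin."""

    def __init__(self, origin: ObjectStore, capacity: int = 2) -> None:
        self._origin = origin
        self._capacity = capacity
        self._entries = OrderedDict()

    def get(self, bucket: str, key: str) -> bytes:
        cache_key = (bucket, key)
        if cache_key in self._entries:
            self._entries.move_to_end(cache_key)  # mark as recently used
            return self._entries[cache_key]
        data = self._origin.get(bucket, key)      # cache miss: go to origin
        self._entries[cache_key] = data
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)     # evict least recently used
        return data


origin = ObjectStore()
origin.put("media", "intro.mp4", b"video bytes")
edge = EdgeCache(origin, capacity=2)
for _ in range(3):
    edge.get("media", "intro.mp4")
print(origin.origin_reads)  # prints 1
```

Three reads reach the origin only once. Nothing comparable needs to sit in front of the control-plane API, which is the kind of plane-specific optimization the speaker is pointing at.
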
7:05 Now let's talk about a networking service that we can design using the same infrastructure principles, the control plane and data plane. In this case we're going to be hosting a DNS platform. So we're going to be resolving queries either from within the private network or public network to this DNS service.

7:29 Customers are going to provision DNS instances, and they're going to be creating their own set of DNS records. So the API is going to look a little different from when it was providing the database service and creating and modifying database instances and object storage instances. In this case, we might expose endpoints that will allow the user to create "A" records, "CNAME" records, etc., and then these records are going to be propagated into the data plane where the DNS resolver, the instance that they provisioned on our platform, will then be able to resolve DNS queries according to how the user has configured their instance to do so.

8:16 So with the DNS service, now we can also talk about how the security of this system can be optimized in the data plane versus the control plane, but in a slightly different way from how I described it earlier. We've already covered how there's a natural network isolation between the control plane and data plane infrastructure. But let's think about what we want to support in terms of traffic for the data plane versus the control plane for the DNS platform.

8:48 In this case, we can actually now support UDP. So UDP requests are going to be hitting the DNS resolver, the instance that the customer has provisioned on our platform, and that will optimize for latency and throughput, at the expense of some consistency if you consider the UDP protocol.

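
The record flow described above, where records are created through the API and then propagated to the resolver, can be sketched as follows (this is an illustration, not from the talk). `DnsControlPlane`, `Resolver`, and the explicit `propagate` step are hypothetical names; a real service would push changes over the network and answer queries over UDP, both of which are left out here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Resolver:
    """Data-plane side: answers queries from its local copy of the zone."""
    zone: dict = field(default_factory=dict)

    def resolve(self, name: str, rtype: str = "A") -> Optional[str]:
        return self.zone.get((name, rtype))


class DnsControlPlane:
    """Hypothetical record API: the source of truth for customer records."""

    def __init__(self, resolvers: list) -> None:
        self._records = {}
        self._resolvers = resolvers

    def create_record(self, name: str, rtype: str, value: str) -> None:
        # Consistent write to control-plane state; resolvers are not touched yet.
        self._records[(name, rtype)] = value

    def propagate(self) -> None:
        # Push a snapshot of the records to every data-plane resolver.
        for resolver in self._resolvers:
            resolver.zone = dict(self._records)


resolver = Resolver()
control_plane = DnsControlPlane([resolver])
control_plane.create_record("app.example.com", "A", "203.0.113.10")
print(resolver.resolve("app.example.com"))  # prints None (not propagated yet)
control_plane.propagate()
print(resolver.resolve("app.example.com"))  # prints 203.0.113.10
```

The gap between the two `resolve` calls is the window in which the control plane already holds the new record but the resolver does not.
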
9:12 By contrast, if you look at the control plane and the types of protocols that we want to support there and what we want to be optimizing for, you're going to want to have a higher level of consistency, as opposed to low latency. And you'll sacrifice some latency as well. When you create these A records and you have them propagate into the data plane, it will take a little while because it has to make the network jump and it has to propagate throughout the system.

9:47 So we just worked through 3 different examples on both the networking platform side of things, as well as the storage platform. There are some advantages and disadvantages that are going to arise from applying the control plane / data plane system architecture design principles to your platform, and you might consider these as somewhat of a double-edged sword.

10:12 Let's talk about some of those advantages that we saw as common themes through each of these examples. You're going to have some security gains for sure. We talked about it with DNS. An example is allowing UDP only where you want it, in the data plane rather than the control plane. You can focus on the problems inherent to your control plane infrastructure versus your data plane infrastructure. You can really drill down on those and make sure that your system's locked down.

10:41 We talked about scalability: the ability to independently scale the infrastructure on one side of your system versus the other. It'll simplify things for you, and it'll make you more ready to support that increased demand in the future once you start gaining all of those new customers.

11:01 Then you have performance improvements such as latency.

11:06 And this ties into the kinds of problems that we talked about earlier, how, let's say in the data plane, you want to allow a greater throughput of requests and respond with a greater volume of responses in the same amount of time, versus your control plane.

11:27 And then more of a meta advantage that you get from this is the ability to allocate your engineers, your work resources, your management around these infrastructural planes and the components within them. So on the control plane you might have a control plane team with subject matter experts that focus on things like the API. Your front end guy will focus on that. Or somebody who knows about the billing management system will be your go-to guy for that. Then you might have somebody on your data plane side who will focus on the operators or the set of automations, whether it's Jenkins or Travis, things like that. And lastly, and really, you can slice and dice it however you want, you can have engineers focused on the software that's going to be providing the service itself. So that helps you organize your team.

12:30 There are some disadvantages as well. The other side of the sword. And those will ... well, let's use a different color for that. And those will be your overhead. So two types of infrastructure call for the need to support the complexity of that infrastructure. Bridging these two types of infrastructure calls for more code and maintenance of that code.

13:10 And another thing to consider is the increase to your attack surface. More infrastructure, and more types of infrastructure, give your adversaries, the people that want to take down your platform, the ability to exploit different types of problems that are inherent to each of your two infrastructural planes.

13:35 We covered some of the pros and cons, both sides of this double-edged sword. Let's think about whether this system architecture design principle is applicable for your system. Are you going to be hosting something simple like a pet store API, or are you going to be hosting a platform which will offer your users data services where they can run their own analytics, their own data processing systems? If it's more towards the latter, you're going to want to consider applying the control plane / data plane system architecture design principles to your platform.

14:11 If you liked this video and want to see more like it, please like and subscribe! If you have any questions or want to share your thoughts about this topic, please leave a comment below.