DevOps vs SRE: Complementary Roles

Key Points

  • The “DevOps vs SRE” question isn’t about choosing one over the other; SRE is actually an essential part of a well‑implemented DevOps practice.
  • DevOps is a development methodology that breaks down silos between development, operations, product, sales, and marketing to define *what* should be built and delivered.
  • SRE (Site Reliability Engineering) concentrates on automating deployment, ensuring systems stay up, and providing reliability feedback on the implementations that DevOps creates.
  • While DevOps teams focus on the core design and functionality, SRE teams handle the operational rollout and continuously feed performance insights back to the developers.
  • Together, DevOps and SRE form two sides of the same coin, each further reducing silos and improving the overall delivery and stability of cloud services.

**Source:** [https://www.youtube.com/watch?v=KCzNd3StIoU](https://www.youtube.com/watch?v=KCzNd3StIoU)

**Duration:** 00:08:22

## Sections

- [00:00:00](https://www.youtube.com/watch?v=KCzNd3StIoU&t=0s) **DevOps vs SRE Explained** - Bradley Knapp clarifies that DevOps and SRE aren't opposing approaches, but complementary practices where SRE functions as an essential component of a well-implemented DevOps strategy.
- [00:03:38](https://www.youtube.com/watch?v=KCzNd3StIoU&t=218s) **Embracing Failure with SRE Discipline** - The speaker stresses that failure is inevitable, introduces the error-budget concept, and details how Site Reliability Engineering anticipates, monitors, mitigates, and leads post-incident root-cause analysis.
- [00:07:22](https://www.youtube.com/watch?v=KCzNd3StIoU&t=442s) **Eliminating Silos Between DevOps and SRE** - The speaker emphasizes automating manual tasks, integrating SRE's institutional knowledge with DevOps, and breaking down organizational silos to ensure both disciplines work together effectively.

## Full Transcript

0:01 Hey there, and thanks for stopping by. My name is Bradley Knapp and I'm one of the Product Managers here at IBM Cloud, and the question that we're going to answer today is: what is the difference between DevOps and SRE? This is a question that I hear on a fairly regular basis, not just internally, but from external customers as well. And it's one that we'd like to help you walk through so that you can really figure out what makes sense in your organization, and I think the answer is probably going to surprise you a little bit.

0:24 Before we get into the video, I do want to encourage you to like and subscribe. If you think that you're going to enjoy these things, just click on those buttons; that way you get notified every time we come out with something new.

0:37 So, with that, let's get right into the question, and the question is DevOps versus SRE.

0:53 And so, as we get into this, I think probably the most important thing to understand is this isn't a versus question. You don't have to have one or the other. As a matter of fact, I would argue, and I think that many people would agree, that SRE is actually an essential component of DevOps. And DevOps, a good, properly implemented DevOps method, leads to the necessity of SRE when it comes time to deploy. They are two sides of the same coin.

1:20 And so, that's obviously going to lead to a little bit of confusion, because DevOps is the development methodology, right? It's all about integrating your development teams and your operations teams. It's about knocking down those silos between them. It's about ensuring that everybody is singing off the same songbook, and that's very important. And SRE is in charge of automating all of the things and making sure that you never go down. They are really two parts of the same group, and so let's look at the differences, right, because they do have some differences.

1:54 Probably the first and largest one is that when we think about our DevOps side over here, right, DevOps is about core development. The DevOps guys, particularly your developers, they are doing the core development. They are answering the question "what do we want to do?" They are working with product, they're working with sales, they're working with marketing to develop, design, and deploy. What is it that we do? They're working on the core.

2:24 SRE, on the other hand, they're not working on the core. What they are working on is the implementation of the core, they are working on the deployment, and they are constantly giving feedback back into that core development group to say, "Hey, something that you guys have designed isn't working exactly the way that you think that it is."

2:50 So, if we were to break that down a little bit more, our SRE group is helping the DevOps group to break down even more of those silos. If you want to think about it this way: DevOps is trying to develop the answer to "how do we solve this problem?", and SRE is saying "how do we deploy and maintain and run to solve this problem?" It's the theoretical versus the practical. And ideally they're talking to each other every day, right, because SRE should be logging defects, they should be logging tickets back with development. But probably most importantly, they need to understand that they have the same goals. These groups should never be aligned against one another. And so, they do have to have a common understanding.

3:38 Let's talk about one of the most important parts, right? We're going to talk about failure, because failure is not necessarily failure, it's just a way of life. It doesn't matter what you deploy. It doesn't matter how well it goes, it's going to happen. And so, when we talk about failure, everyone involved needs to understand that there's going to be some level of it, right.

4:02 There is a failure budget, or an error budget, where things are going to go wrong. And what happens when things go wrong, that's what shows whether or not your organization is working. Because your SRE team, when it comes to failure, they're going to anticipate it, they're going to monitor it, they're going to log it, they're going to record everything, and ideally they can identify a failure before it happens. They're going to have predictive analytics that are going to say, "All right, this thing is going to go bad based on what we've seen before." And so, SRE is responsible for mitigating some of those failures through monitoring and logging, and doing the preemptive parts, right. So we'll do the monitors, we'll do the logs.

4:43 SRE is also going to lead all of your actual post-failure incident management, right. They're going to get you through the incident to begin with, and then they're going to hot wash it when it's done. They're going to lead that RCA, that root cause analysis, and after they have that RCA completed, and this is the most important part, they have to take that RCA data and bring it back over into dev and get some tickets open.

5:09 You have to get dev online, because these are the guys who are going to solve the core problem. Some RCAs might be solved by SRE internally, right. They're going to spend 50 percent of their time writing, 50 percent of their time working, and so some of that problem they may be able to fix directly. But sometimes that's not the case, right. Our RCA may have found a problem that only dev can fix, and that's all right, that's not a big deal. They're going to get that over here, dev is going to implement, and then, probably the most important part, right, you're going to get that new feature. Dev is going to get that pulled together.

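The error budget mentioned above can be made concrete with a little arithmetic. The talk does not say how the budget is set; the sketch below assumes one common convention, where the budget is the share of requests that an availability target (here called `slo_target`) allows to fail. The class name, field names, and numbers are illustrative assumptions, not something from the video.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Hypothetical error budget for one service over one measurement window."""

    slo_target: float     # assumed availability target, e.g. 0.999 for 99.9%
    total_requests: int   # requests observed in the window
    failed_requests: int  # requests that failed in the window

    @property
    def allowed_failures(self) -> float:
        # The budget is the fraction of requests the target permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining(self) -> float:
        # Positive: budget left to spend. Zero or negative: budget used up.
        return self.allowed_failures - self.failed_requests

    @property
    def exhausted(self) -> bool:
        return self.remaining <= 0


# Made-up numbers: a 99.9% target over one million requests, 400 failures.
budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
print(f"allowed failures: {budget.allowed_failures:.0f}")  # 1000
print(f"remaining budget: {budget.remaining:.0f}")         # 600
print(f"budget exhausted: {budget.exhausted}")             # False
```

Under these assumptions, a team with budget remaining can keep shipping, while an exhausted budget is a signal to put reliability work first.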
5:50 They're going to get that new feature rolled out, and then they're going to pass that back into SRE and they're going to say, "Hey SRE, that problem that we had, we've got a new feature for you."

6:02 And then our guys on the SRE side, what do they do? They then have to take that feature and they have to figure out how to integrate it into their monitoring and their logging efforts to make sure that we don't get into another RCA for the same kind of a problem.

6:21 So these groups, they are part and parcel of the same bunch. You really can't have one successful organization without the other. And when it comes to figuring out a distinction, it's not something that you should spend a lot of time on. There are different skill sets, right. Core development, DevOps, these are the guys that really love writing software. SRE is a little bit more of an investigative mindset, right. You have to be willing to go and do that analysis, figure out what things have gone wrong, automate all of the things.

6:47 But there's a lot that they have in common. Everyone should be writing automation, everyone should be getting rid of toil as much as possible, because we just don't have the time to be doing manual tasks when we can put the computers in charge of it, right. Computers are not great at thinking on their own, but if you need it to do the same thing over and over and over again in exactly the same way, you can't beat computing for that. And so, automation is key, you just have a slightly different mindset. DevOps is going to automate deployment, they're going to automate tasks, they're going to automate features. SRE is going to automate redundancy, and they're going to automate manual tasks that they can turn into programmatic tasks to keep the stack up.

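As one hedged illustration of turning a manual task into a programmatic one: instead of a person checking each replica by hand and redirecting traffic, a small function can walk a list of replicas and pick the first one that passes a health check. The function name, replica names, and the health check are hypothetical; the talk does not describe any specific tooling.

```python
from typing import Callable, Iterable, Optional


def first_healthy(
    replicas: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first replica that passes the health check, or None."""
    for replica in replicas:
        try:
            if is_healthy(replica):
                return replica
        except Exception:
            # A health check that raises is treated like an unhealthy replica.
            continue
    return None


# Stand-in health data; a real check would probe the service itself.
status = {"app-1": False, "app-2": True, "app-3": True}
print(first_healthy(status, lambda name: status[name]))  # app-2
```

Passing the health check in as a function keeps the sketch independent of any monitoring system, so the same loop works with whatever probe a team already has.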
7:27 And so, you know, when we talk about DevOps versus SRE, that's not the question. The question is: how do we build DevOps, how do we build SRE, and how do we make sure that they are always talking to each other? Because the institutional knowledge that SRE has so much of, if that doesn't get passed back into your DevOps group, you're never going to be successful. You're going to have a silo here, and a silo here, and at the end of the day, the core component of both of these philosophies is getting rid of silos. Freeing ourselves from those silos is what is going to make us all more successful.

8:03 Thank you so much for stopping by the channel today. If you have any questions or comments, please feel free to share them with us below. If you enjoyed this video and you would like to see more like it in the future, please do like the video and subscribe to us so that we'll know to keep creating for you.