speaker-0 (00:07) Welcome back to Adventures in DevOps, where we deep dive with expert guests. Today's adventure focuses on infrastructure as code, which is, I guess, as specific as asking an author if their next novel will be written with words. Our expert has done the systems administration thing, the founder thing, the AWS consulting thing, previously the director of cloud architecture at CBS, but he's here today due to his work maintaining what I think is the world's largest repository of open source IaC modules: the CEO and founder of Cloud Posse, Eric Osterman. Welcome. speaker-1 (00:39) Thanks Warren, glad to be here. speaker-0 (00:41) I mean, that was probably one of the longest intros I've done so far. speaker-1 (00:47) Well, it's a long track record, you know, been at this for a while. I mean, I can't believe that I've been doing this since 2015, just Terraform, Terraform, Terraform, Terraform. But that laser focus on like one thing, I guess, you know, I found enjoyment in going deep. speaker-0 (01:02) Well, I think that laser focus also really helps sell it as, like, this is my niche, I am the foremost expert here. And like you said, it's been over a decade now. And I want to believe almost any engineer who has worked with any sort of infrastructure as code, IaC, knows about your company as well. speaker-1 (01:18) You know, I like to think that, and it keeps me going. But, you know, whenever I've gone to like AWS re:Invent, I get very humbled, because I'll be saying, yeah, I run Cloud Posse. Oh, interesting. What's that? Oh, we do Terraform. What's Terraform? And it's like, you're at an AWS conference, you don't do Terraform? How are you managing your infrastructure? Well, it turns out that actually, I believe, something like the large majority of IaC is actually CloudFormation and not Terraform. But in my bubble, that's... that's sacrilegious. speaker-0 (01:50) Well, I can let you in on the dirty secret here: we also use CloudFormation, but we don't use the CDK.
We generated our own programmatic wrappers to dynamically generate CloudFormation, because doing it straight in JSON or YAML is so atrocious. We did that early on, before Terraform really picked up, and the other ones out there, which I want to ask your opinion about at some point in this episode. speaker-1 (02:19) I would love to go into all of those things. And I was recently posting on even just how, you know, I don't think languages, as it comes to IaC and as it comes to a product, are what make or break the success. There's so much more than the language choice. But I think what you did makes total sense. And there's part of me that has some FOMO on CloudFormation, because it does make some things easier and it has improved over the years. But yeah, writing raw CloudFormation, that was like one of the first contracts I did, like in 2010 or something. Actually, that is part of the origin story of Cloud Posse. I built this system for a company, Lookout Security, and it was an exploration into managing infrastructure as code. In its infancy, CloudFormation had just been announced. So we did kind of like a spike on this, and I called the project the Insane Cloud Posse. And you can see how, five years later, after traveling the world for a few years, I founded Cloud Posse as the company in that name. speaker-0 (03:26) When you're telling that story, I feel like I'm already seeing trauma. When CloudFormation originally came out, there was a dedicated UI for moving components around. I think their goal was you would define your architecture using a graphical user interface. And this interface is literally the worst thing of any service that AWS offers. And it memorizes location and stuff. It was very programmatically done. And the output was just absolute garbage to store in any way. I think very quickly, everyone was like, no, thank you.
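The wrapper approach Warren describes, generating CloudFormation from code rather than hand-writing JSON or YAML, might look roughly like this minimal sketch. The helper names and resource choices here are illustrative, not his team's actual tooling:

```python
import json

def s3_bucket(logical_id: str, bucket_name: str, versioned: bool = True) -> dict:
    """Illustrative helper: emit one CloudFormation S3 bucket resource fragment."""
    props = {"BucketName": bucket_name}
    if versioned:
        props["VersioningConfiguration"] = {"Status": "Enabled"}
    return {logical_id: {"Type": "AWS::S3::Bucket", "Properties": props}}

def template(*resources: dict) -> str:
    """Merge resource fragments into a full template and render it as JSON."""
    merged = {}
    for resource in resources:
        merged.update(resource)
    return json.dumps(
        {"AWSTemplateFormatVersion": "2010-09-09", "Resources": merged},
        indent=2,
    )

print(template(
    s3_bucket("LogsBucket", "example-logs", versioned=False),
    s3_bucket("DataBucket", "example-data"),
))
```

The payoff is that conditionals, loops, and naming conventions live in a real programming language, and the atrocious JSON is only ever a generated artifact.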
speaker-1 (03:56) Yeah, I don't know if we missed that part, because this was something we did in Ruby, which was the language du jour at that time, and then generated the templates. And this is kind of... this is a sore point for me a little bit, because it is the state of IaC a decade later, or 15 years later. I mix it up; 2004 or 2006 is when I started using AWS, and these things weren't around then. And at that point you're using whatever that Python package is to manage cloud. speaker-0 (04:32) I don't even know what that is. I'll tell you, I got my journey started accidentally, when the company I was working for wanted someone to orchestrate the deployment of their thousands of Windows services to individual hardware machines, like not even virtual machines, in their production data centers. They had a list, pretty much a CSV list, of which services this server had running on it at a time, and I had to consume that. And at the exact same moment I was doing this, the worst thing that ever happened was that Puppet came out and said, this is the way we'll do it. And so my company was like, amazing, let's do that. And then soon after, everyone was like, that's the worst thing in the world, we should use Chef instead. But my company kept on using Puppet. So that's its own sort of trauma. speaker-1 (05:24) Yeah, yeah. We're kicking a hornet's nest here. It's actually good that you said that, because... put a pin in that briefly. Where I was going with that is, the interesting thing with IaC is it's foundational. The tooling for it is often pretty unopinionated in that sense, and basic, for doing anything with any of this stuff. The first thing you reach for is a tool of some kind. In your case, you don't want to write walls of YAML, of CloudFormation with conditionals and logic in it. Totally get it. I totally get CDK also, on top of that, if some use it; I know you're not. But yeah, some template engine. So you do that.
And then the next company does it. And every developer, when they approach these things, they go, this is super cool, but I know what I'm going to do. I'm going to build yet another factory around this or something to systematize it. And then sometimes you get these things like you said, Puppet, Chef, and that's the sore point, right? Because those are relics of the past, of where we came from, and there's so much investment in them, and they are no more. And then there's Kubernetes, and then there's... speaker-0 (06:39) Don't say Crossplane. speaker-1 (06:42) Yeah, that's an interesting one, Crossplane. I want to love it, have FOMO, but I never speak to anybody actually using it in production, except for companies who are moving off of it. speaker-0 (06:55) I got one. It did happen. There was one company I was advising that actually was in it, except what had happened was some ex-engineers from Google had joined the company and started spinning up everything in Kubernetes. No one else knew anything about it. And then of course with that, they're like, wouldn't it be even better if Kubernetes deployed all of our infrastructure for us, you know, because it wasn't complicated enough before that. And so all their infrastructure was deployed through Crossplane. And then as a result, they're like, oh, this is boring, I've automated everything, time to leave. And these, like, five engineering teams now had Kubernetes and Crossplane running as their only IaC, and their whole architecture was run like that. And no one had any idea, really, what the heck was going on at that point. speaker-1 (07:39) You know, I could believe it. They were on GKE, you said? speaker-0 (07:44) Well, no, actually. These were the days, and it actually wasn't so long ago. They were running Kubernetes on top of the virtual machines directly. They weren't using GKE. speaker-1 (07:55) Yeah.
And this is kind of, you know... we come at this from an interesting perspective at Cloud Posse, and me in particular. The whole problem space of DevOps has gotten so large, I think it's almost impossible to claim to be a domain expert top to bottom on all of it. We have been laser focused on like day zero: going from a bare AWS account, for example, and then where do you build up from that? And what is the architecture of that? And how do you scale those systems? And how do you manage that with infrastructure as code? And then how do you build the software delivery patterns on top of that? And how do you get the team on board? It just goes on and on, right? And Crossplane to me is really interesting, but everything I just said, Crossplane is almost after that. And that is the conundrum. speaker-0 (08:48) Yeah. speaker-1 (08:52) If it's OPP, other people's problems, to manage everything I just said, absolutely agree, Crossplane sounds really awesome all day long. And if you're running on GKE and you have Autopilot, I guess you're closer to that. But then there's still everything else that has to go on there that is more than just this one little cute thing of deploying apps and infrastructure for your applications, because it's like an iceberg, where the rest of your infrastructure is underneath the surface. speaker-0 (09:21) I love the picture. I think there is this aspect that exists in the industry, especially in the infrastructure as code space, that evolved from: we want to automate this thing because we know it's dangerous, or I have more than one machine I have to SSH into to make a thing happen. And you get to this place where now, okay, what is the right tool of choice? And everything looks better than what you had before. speaker-1 (09:46) It's good to take a step back sometimes and reflect on where we're at. And boy, I don't know when you got started on these things.
I remember when Capistrano was like the silver bullet, the holy grail of automated deployments and things. This was a little tool built in Ruby. I remember doing a contract on this probably in 2008 or 2007. Gosh, that's almost 20 years ago at this point. But yeah, things were simpler back then. And we wonder, what have we dug ourselves into here with all of it? And the one mea culpa that I would say I've had over the last year or two, a softening, is recognizing that there are practical limits to IaC and automation where the ROI is not there. And it's not going to be... there's no universal rule for it. It will be different depending on the organization. For example, in enterprises, what I've seen is, for whatever reason, you know, they have to use ServiceNow. They have to have this way that they provision OUs and accounts and things like that. And it doesn't make sense to introduce IaC. I mean, fundamentally, I think it still makes sense to introduce IaC. I understand why they say that they're not going to. speaker-0 (11:03) I mean, I think the message is it doesn't make sense to try to convince them to use IaC. It should still make sense to use IaC, but, you know, try to have that conversation. That's the challenge. Yeah, obviously. I mean, it's not worth it in any way. Like, they're still paying me to do the consulting or advisory that they've asked for, but that conversation... maybe I have to sacrifice a little bit on my principles, but realistically, at the end of the day, no one's going to care. speaker-1 (11:12) Given up. speaker-0 (11:30) I do need to ask about that, though, because I feel like there is this area, and maybe it's worth going into it, that you've mentioned is one of the cornerstone problems of IaC. I think it's bootstrapping. Historically it was like, how do I even get onto the machine in the first place to do the deployment?
Well, I need to SSH, but that requires a username and password, and I don't want that. So obviously I need my authorized keys there in order to even remote in, but then how do I get the key there before the first time? And then there's a whole bunch of system administration activities and technologies that no one ever wants to talk about ever again, like PXE booting. But the same problem exists in the cloud, right? For your very first AWS account, you still have to sign up and start clicking some things in order to even deploy a deployer from your CI/CD platform, or whatever you're going to utilize, to even grant it the permissions to get in and start doing that. What is the state of the art solution today? Or is it just basically admitting that what we have is not good enough, but there's no alternative? speaker-1 (12:36) I would say, off the shelf from the cloud providers themselves, it's all an exercise left up to the end user. Here's Medium, go read the posts, and you figure it out. And depending on who you ask, you'll get wildly different answers. And I come at this, first and foremost, with major imposter syndrome. That's why we open sourced everything, because I was like, who am I to do these things? Maybe if I open source everything, you know, you'll believe me. But the other one is, now I have the curse of knowledge, because I've done all of this so many times, and we've just been focusing on that. So the example that you bring up here is, I think, how so many organizations approach the problem. Okay, we need to deploy our apps. We need something, a deployer, something that does that: CodeBuild, GitHub Actions or something like that. And okay, so what do we need? We're going to need, I guess, a platform to run our apps. These days we would pick a container thing, so let's do ECS or Kubernetes or something. So many teams pick Kubernetes for the wrong reasons.
And then they spin that up, and you're successful with it, like for the demo that you needed to do, and it's all throwaway. But if this is for an actual organization, an actual business, you've got to ask almost like the five whys, and you've got to go deeper and deeper and deeper and deeper. And it comes down to fundamental questions that we ask, for example: all right, so what regions are we going to be in? What namespace are we going to use to organize our resources? What IP ranges are we going to pick, so that we can grow and expand and not overlap? How are we going to do service discovery? What are the DNS names? We call these design decisions, and these are the things that are not done by teams before they just start to go build out. So most of our engagements are companies that started in a single account that grew and grew and grew, churned and churned and churned. One DevOps engineer left, left it to the next guy, who's keeping track of this thing. Then they realized, we need one more account, because we shouldn't have everything in one account. I'm just going to add a dev account. That's kind of cool, except then they rushed something out, so now dev is kind of in production, and then it's not reproducible, and everything just grows organically. But if you ever built a modern software application, you don't do it this way. It's like, I'm going to build the next billion dollar company with bash. You don't do that. The point is, you go, more importantly, with a framework for how to do these things. You don't go, I'm going to pick up a book on JavaScript and then just go build my billion dollar SaaS company. No, you're going to pick Next.js and deploy it on a modern platform like Vercel or, you know, serverless functions or Kubernetes things. You're going to build on things that exist and not start from scratch. And this takes me back to what we started the conversation with.
CloudFormation is so fundamental and so basic, and it sucks to use. So the first thing everybody does is reach for their templating system or something like it, but they're not reaching for a framework. And that's what everyone else does for every other language. I don't get it. speaker-0 (15:47) I've got to tell you, realistically, from our standpoint, one of the things that caught us very early on, when we have a multi-account setup in AWS, is we see the OU stack set as a strategy, where realistically AWS has a built-in mechanism to automatically deploy across all your AWS accounts in your entire organization, without having to know the account IDs, without needing permissions already set up in some specific way. And you can throw CloudFormation at it, but you can't throw OpenTofu at it. You can't throw Pulumi at it. I don't know if we're calling it Terraform anymore. If System Initiative still exists as a product, you can't throw that at it. speaker-1 (16:26) I think you need to close down. speaker-0 (16:30) Now we can officially talk about that, and I think it really does help some part of the bootstrapping story. It is a little bit of a framework, but it doesn't get you to any real, you know, final point. One of the problems here, though, is I think a lot of people end up in a situation where, as you mentioned, the people who had constructed that architecture of their deployment story are no longer present at the company. And so you're coming into a scenario where some people believe it's all working correctly, and other people know the truth, and those people don't work at the company anymore. speaker-1 (17:05) Exactly. And I've seen this too many times. When the going gets tough... or how do you say it? The going needs to get tough, but they just get going. We'll leave it to the next person. speaker-0 (17:20) Why would you stay? You fixed it. You know it works. speaker-1 (17:24) Yeah, exactly.
And then they write a post about how they scaled it, you know... speaker-0 (17:28) How it was so great, right? How we completely re-architected the deployment story at, you know, insert some giant multi-billion dollar company, and then they get hired because of that story, which, like, was not long-term viable. And maybe I'll ask this question. What is it here that really solves the problem for us? How does an engineer or a team lead or a director who comes into a company and sees the current state, how do they get from that to something that we would consider a viable long-term infrastructure strategy? speaker-1 (18:02) Yeah, that's a really good question. I like to think that we solved it, but let me not answer it from how we approach it, and answer it more generically. I think any conversation on this would be incomplete if we didn't also factor in how the landscape has changed today. And I'm going to get to where I'm going with that: AI. Let's face it. But the key thing about AI is it rewards best practices and brutally punishes bad practices. Now, there is no canonical best practice. There are golden paths or recommendations or conventions. So that's where I would like to take it. So if we go into this organization and they're in this position and they're going to approach this from the ground up, one thing is, what are all the practices that they're missing that mean they can't institute these systems? Things that have done really well for us are architectural decision records along the way. Because oftentimes decisions made were the best at the time, given what was available: the time, budget, money, all these other factors. But the problem we as engineers have when we go into one of these problems is: I know exactly how we solve this. I'm going to do exactly what I did at the last company. But I just left. I didn't stick around to manage it or maintain it or see what it was going to be in one or two years.
But I get to now advise the next company exactly what I'm going to do. And hence we're just recycling advice for more than it's worth, and it gets worse and worse. So what I wanted to say is, seek to understand what the problems are before we seek to fix. In one of these enterprise environments, I think too often they just start: we're just going to build new, we're going to create a new account, where you're going to repeat all the same mistakes all over again, not learning from what we did. Because, as engineers, it's what we do. We engineer. speaker-0 (19:43) Yeah. I mean, I like the business focus. I think it's one of the things that's often missed, and realistically, when you're focusing on this, it is that conversation. It's, why do we think what's currently there is broken? Why do we think the new thing is going to be so much better than what we currently have, and not really worrying about the cost or the transition to get there? I have been coming around to the alternative argument, though. Let me say this differently. I've seen a lot of companies say, we're data-driven. I think that's a dirty thing to say, because it means that basically we don't make any decisions unless we have data, but there are certain things that you'll never have data for. So you're basically saying we're never going to go down that path. And I think this is one of those areas where it's very difficult to prove that an architectural strategy that drives your infrastructure as code, or your overall infrastructure, would be better if it were totally different, without already having that in place. speaker-1 (20:31) Mm-hmm. speaker-0 (20:56) It's just like... oh, I love it when executives do this. We're going to transition to Azure or GCP because we're getting millions or tens of millions of dollars in credits, and it will be so much better, right? The clouds are interchangeable.
And then of course all your enterprise architects stand up and say, no, please don't do that, this is a huge mistake, this is the reason why they're throwing money at you. But, you know, good thing we're using Kubernetes, we can just transplant everything. And then you go through it and you realize, okay, five years later, this was such a mistake. Or maybe not. Maybe it was the best decision ever. But you only have the benefit of that after the fact. So I do want to ask. You said, you know, there's the generic answer, and then maybe there's also the specific answer, because you've been in this advisory position where, of course, you are now the de facto expert, because you've written the open source modules, because your name is attached to them in some capacity. You know, you run the company. Of course, Eric, foremost expert in the area. How do you work with these clients to actually transition them, specifically? I mean, you did talk about the questions you ask, but there's obviously more to it than that. speaker-1 (22:03) You know, the interesting thing about the position that we're in is that we put out a lot of material, a lot of modules, a lot of posts. We have our own IaC framework. So if you did not hear the subtle message here: you need a framework, and we have something called Atmos. It's an open source thing. Go check it out. Okay, so back to this question. The thing is, we find people who self-select, who know the problem, because we communicate that these things are problems so much throughout posts and everything. But the hardest thing, going back to this thing with, you know, ClickOps or Control Tower or managing organizations and things like that: you're not going to convince certain organizations that they need to make that change. And dude, they've got a lot more money than I have. They're doing something right.
So it's not going to make it or break it, even though, you know, fundamentally I disagree. My point is, you have to first accept that you have this problem. Seek to understand what your problem is, then seek solutions and see what's out there. So for customers that approach us, they've already identified, based on the material we put out, that they have these problems. We just nudge them now: all right, so you only have three accounts, and therefore you have no ability to effectively control the boundaries between services. I would say one of the biggest misconceptions is: we just need three accounts, dev, staging and production, right? Yes, you are right. You only need three accounts for an SDLC, a software development life cycle. But that neglects the iceberg underneath the surface, everything else you're managing: audit logs and security products and network management and centralized ingress and egress, and the automation and where that runs, and the list just keeps going and going and going. So for these companies, we paint the picture first. I think one of the things Amazon has done a great job on is they have this framework: it's kind of assess, mobilize, migrate. And it sounds like buzzwords, but I've learned to internalize it. Assess means seek to understand, know what your problems are. And then, I guess I skipped one step, it's modernize: show what the future could look like. Mobilize: show that your stuff can work that way. And now think about how we migrate. But that's dangerous. I've seen so many things where now, you know, we built this shiny new panacea, but it just remains that way, because now you have two problems: you can't move off of the old, and now you have the new one. And by the time you move on to the new one, now there's a new one. And God help you if you've been naming these environments 'new'.
speaker-0 (24:49) Well, there's like, you know, legacy, legacy-2000... I mean, my thing is always just put the date in it. So I do need to ask, though, especially in today's climate, I'm wondering what the field looks like from your perspective now, because I think it's going to be unavoidable for you to be seeing environments that were partially or completely set up by the recommendations from LLMs. speaker-1 (25:12) Yeah, yeah, yeah. And I think this is where there are a lot of valid concerns, but a lot more FUD, fear, uncertainty and doubt. And I like to always bring it back down to applications and application architecture and so forth. You will have a much worse outcome if you are just saying, hey, spit me out some JavaScript and reinvent React for me, and don't use a framework, and just do that and see where that gets you. And what you're hinting at, I think leading at, is that that's what some companies are doing: they approach infrastructure as code as though they were approaching JavaScript. Basic, not even TypeScript, just JavaScript. And that's why there'll be this problem. I don't know what the solution is for fixing that. I know what the solution is to avoid it, which is golden paths, best practices, codified decisions, architectural decision records, requirements. And this is catnip for AI. And when you start with that from the foundation, now every new thing follows those conventions. And in your use case here, you have your own in-house framework, and there's no shame in that. You define the skills, you define the agents around that that know what to do. And that's what's going to ensure a successful contributor process throughout your organization for everyone else. What do I mean when I say agents and skills? Not everyone is familiar with these concepts yet. So Claude, I think, has done the best job evangelizing it. And let's face it, prompt engineering is all Markdown.
And that thing we used to call WikiOps, a pejorative, now is... speaker-0 (26:56) Don't say it. Don't say it. speaker-1 (26:59) But that now is what enables the AI. So an agent is basically an instance of AI running with some instructions, with a prompt, in its own context. And a skill is basically a prompt that you have. So you have a library of prompts. Those prompts have front matter. The front matter is kind of like the YAML at the beginning of the Markdown. And that's what the AI looks at to say, hmm, should I care about this bit of knowledge? Should I care about this bit of knowledge? It finds the skills it needs, it finds the agents that can do the job, and then it executes, and it'll do well. And you can implement this today in your organization, if you haven't already; this is what we're doing at our organization. And it's been amazing, because we have this IaC framework, because we have the Terraform modules. We have all the context. We have the ADRs. We have the requirements. Dude, it's like I'm talking to my digital twin. I don't mean like a cloned environment. It's like I have my assistant here who knows exactly what I want, how I want it, and it implements that. And this is what enterprises need. speaker-0 (28:05) I think you've hit on it very specifically. Actually, a couple episodes ago, we talked with Dan Walleen from Azure about skills specifically. But I think what we actually landed on as sort of a definition... and I think there is an aspect here that a lot of less technical people, or even those that just haven't really explored agents as much, aren't really understanding: what skills are. And after a long time, I think where we landed is this idea that historically you had to write a README file, and that README file for open source repositories was basically a hundred pages long, of how to build and deploy that piece of software to your infrastructure. And often it didn't have handling... What? Right. That's the point.
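The front-matter mechanics Eric describes, a Markdown prompt with a YAML block the agent scans to decide relevance, can be sketched in a few lines. This is a toy parser, and the field names in the example skill are made up for illustration, not Claude's actual schema:

```python
def parse_skill(text: str) -> tuple[dict, str]:
    """Split a skill file into front matter (the key: value block between
    '---' fences) and the prompt body the agent loads on demand."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter: the whole file is the prompt
    meta, i = {}, 1
    while i < len(lines) and lines[i].strip() != "---":
        key, _, value = lines[i].partition(":")
        meta[key.strip()] = value.strip()
        i += 1
    return meta, "\n".join(lines[i + 1:]).strip()

skill = """---
name: deploy-terraform
description: Plan and apply Terraform changes following our conventions
---
Run `terraform plan` first, post the plan for review, then apply."""

meta, body = parse_skill(skill)
print(meta["name"])  # deploy-terraform
```

The point of the split is cheap routing: the agent reads only the small front matter of every skill to pick candidates, and pulls the full prompt body into context only for the skills it actually selects.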
It's like... speaker-1 (28:48) If you were lucky, you had those. speaker-0 (28:50) Oh yeah, it probably had an explanation for how to deploy it with Docker Swarm only. There's no Kubernetes. And if you wanted, you know, a bare virtual machine, there was only Ubuntu Server version 2016, and that's it, because the maintainers decided not to go any further, because there are so many flavors of stuff, and there's going to be a new one all the time, and managing that was just going to be too complicated. And on the flip side, in order to even understand those instructions, you would have to also be an expert. I feel like, historically, in the last few years, the agents would have to know how to handle each of those pieces individually. And when we look at what's available in the training data for agents, at least half of it is definitely garbage. And this is what you're pulling into your organization. And I think this is where you were going, realistically: you don't want to trust what's in the agent specifically, or the LLM, the foundational model, to understand how to deploy everything correctly. You want to use what's in the repository. But that's incomplete. And historically you could write some sort of bash script that had, like, if macOS, if Windows, if Windows Server 2007 or 2012, et cetera, et cetera. You don't do that anymore, because some of that's included in the LLM model, but you also don't want to trust that. So you would install the skill for deploying in your infrastructure, in Kubernetes, in Docker Swarm, whatever you're using, and also the skill from that repository for deploying that specific open source project. And now hopefully those two pieces together would be sufficient for actually deploying. I do like your perspective that if you don't do the ADRs, you don't write any documentation for your internal stuff.
You're pretty much just pulling whatever garbage is in the LLM. And so if you do write ADRs, if you do write READMEs, that's incredibly valuable stuff, because that's exactly going to be the instruction. So you can call it whatever you want. You can call it skills, you can call it prompts, et cetera. But that's going to be the thing that actually causes the right thing to happen when you ask your LLM, hopefully not an internal one, one of the foundational models: how do I write this script to deploy to my infrastructure? speaker-1 (30:52) Yeah, I think that's it. So, you know, garbage in, garbage out. If you're only starting with just a pure LLM... it's amazing the first time you do it. On the surface, it was perfect, just what you asked for. And then you dig in and you see it. And then here's the kicker, and anybody who's been playing with these things knows this: you delete that, you ask the same thing again, you don't get exactly the same outcome. You get something kind of similar. And that's disconcerting. So you need an imperative artifact from this, right? speaker-0 (31:22) I actually don't mind. One thing... I get so triggered when people say it's not deterministic, because, from an engineering background, for me that has a completely different meaning than how people are utilizing it. I get the same prompt, different output, potentially, but there are seeds and temperature you can change and stuff like that that you're just not directly exposed to. So, you know. speaker-1 (31:44) Between my machine and your machine and the AI system and... speaker-0 (31:47) Yeah, for sure. So that's definitely one thing. And I think there's a lot we can potentially talk about there. But the one thing that doesn't line up for me is there are still all these potential pitfalls that exist in IaC development that we just assume go away as soon as we utilize, like, repositories or golden paths and therefore LLMs. But I still feel like they're there.
And the one that comes to my mind most readily is that there is this aspect where a cloud provider or tool that you're utilizing can know that a resource should not be deleted. An example could be a database that currently has open connections, or has been read or written to in the last hour, day, week. And I feel like there are very few protections in place without pulling in something. Someone mentioned Checkov a little while ago; I don't even know if that really does what you would want. And I feel like there are these pitfalls, and I really would like that problem to be solved. It's not like cloud providers offer this level of protection for little mistakes that creep into your pipeline. So maybe I'll take the devil's advocate perspective here: it's so great to have the LLMs auto-generate all the code. I don't care if it's not the same every time, because I'm going to commit it to the repository. But man, it's going to accidentally change one of those parameters that causes a whole piece of infrastructure, the database, to be destroyed and recreated accidentally. speaker-1 (33:10) Yeah. So hearing what you said, I think we are more or less in alignment. I think the key thing here is that you are committing that artifact at a point in time, and that's what you're building on, versus this goldfish approach where we don't even commit it anymore, we just regenerate it: we're just going to have a library of what we need and requirements, and we expect every time we do it to have the same outcome. So you need to have an artifact of what it is you're working with, and you're building on that artifact, you're mutating it, you're improving it. So on the security side, there's no silver bullet; there are going to be layers. Adding a layer that asks, "did you mean to do this thing?" It's good, but it's not a guardrail.
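For what it's worth, the accidental-database-destruction scenario above is one of the few places where Terraform and AWS do ship hard guardrails today. A minimal sketch, with resource names and attribute values invented for illustration, not taken from the conversation:

```hcl
# Hypothetical RDS instance; identifier and sizing are illustrative only.
resource "aws_db_instance" "main" {
  identifier        = "prod-app-db"
  engine            = "postgres"
  instance_class    = "db.t3.medium"
  allocated_storage = 50

  # Provider-side protection: AWS rejects the delete API call itself.
  deletion_protection = true

  lifecycle {
    # Terraform-side protection: any plan that would destroy this resource,
    # including a destroy-and-recreate triggered by changing a force-new
    # attribute, fails at plan time instead of being applied.
    prevent_destroy = true

    # Optionally ignore diffs on attributes a code generator might churn.
    ignore_changes = [tags]
  }
}
```

Note that `prevent_destroy` lives in the same file an LLM might be editing, so it is more of a speed bump than an external guardrail; `deletion_protection` is enforced server-side by AWS regardless of what the generated plan says.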
It's a little bit fungible, a little bit flexible, and it might decide that what you wanted was fine when, no, you didn't want to delete that thing, and the guardrail wasn't there when you actually needed it. But having adversarial agents that kind of counteract each other is working well. speaker-0 (34:18) So are you seeing LLM-as-judge here, where the guardrails are just another LLM that is validating what infrastructure is being changed and using that to make decisions? Or are we really talking about some sort of programmatic, and I'll use the term that I hate, deterministic check that can actually evaluate what's happening and make a real decision about whether or not that change is a good thing to do? speaker-1 (34:39) Going back to this idea, I think it's defense in depth, or layers. I think at the state we are at today with LLMs, using one for a compliance-level guardrail, black or white, I don't think it's 100%. What is, though, is using that same LLM to generate the policy; that is 100%. And that is closer to where we're going, and that's what I've seen a lot of security startups focus on. Then there's the other side of that, which is just the code review piece, and there are a lot of products in this space. The one that we've been using, we would not have been able to get to where we are in the past year if it wasn't for CodeRabbit. Like, CodeRabbit has been amazing. Technically speaking, there's not a lot there, and I'm pretty sure your engineering team could vibe code the same exact thing on a weekend, minus the insane amount of training data and specializations and optimizations they've done to actually give really good reviews, at least from our perspective. So: speeding up the code review process, getting better code into the system, having the policies in place, simplifying the process to create those policies, so you don't have to know OPA
by heart, but you can express it, and then it's effectively applied imperatively through the policy. speaker-0 (36:10) I want to break that down for a moment there. It sounds like what you're actually suggesting is that, on top of whatever IaC you have, it makes sense to write policies validating the changes: maybe what can be changed, what is allowed to be changed from an organizational ownership standpoint. And Open Policy Agent, I think that's what you meant by OPA, is one way that you could potentially do that. Just as far as, you know, should this person even be allowed to make this change, or should the permissions you're granting to a service client be allowed to be granted, and which services get access to which databases. And then on top of that, I actually don't know a lot about CodeRabbit, but I believe it's the one that looks at your pull requests and gives you pull request reviews, basically, and then also suggests changes and stuff like that to improve what you've got. Is that right? speaker-1 (36:54) Yeah, it's like GitHub Copilot. Our experience with GitHub Copilot from a code review standpoint, look, these things change monthly. And thank you, GitHub; I am actually not a hater. I appreciate everything you do for open source and everything we do. But Copilot reviews have been frustrating, because I feel like they're either pedantic or false positives, and they haven't really helped us move faster. speaker-0 (37:03) We gotta throw the cap there.
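The "LLM drafts the policy, the policy engine applies it" idea described above could look something like the following Rego sketch, evaluated against the JSON form of a Terraform plan. The package name and message are invented for illustration:

```rego
# Hypothetical guardrail over a Terraform plan (`terraform show -json plan.out`).
# All names here are illustrative, not from the episode.
package terraform.guardrails

import rego.v1

# Deny any planned action that would delete (or delete-and-recreate)
# a database instance, regardless of who or what generated the change.
deny contains msg if {
    some rc in input.resource_changes
    rc.type == "aws_db_instance"
    "delete" in rc.change.actions
    msg := sprintf("plan would destroy database %s", [rc.address])
}
```

Run with something like `opa eval` or Conftest in CI: the LLM can draft the Rego, but once committed, the policy engine applies it the same way on every plan, which is the "100%" layer being described.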
Now code reviews are like 99% done by the time we say, hey, Andre, can you review this pull request? Now he can look at something that is beautiful. It's been blessed. Now it's just, did this capture the essence of what we're trying to do, versus... speaker-0 (38:02) Like, "looks good to me," you know, just more approvals there. That sort of segues into what I want to ask, and I don't want to turn this episode into an advertisement for CodeRabbit. speaker-1 (38:16) No affiliation, by the way, just a happy customer. speaker-0 (38:20) Is there something surprising that it ends up catching for you that has a lot of impact specifically, or is it just reducing the challenge, or the task, of doing the code review in the first place? speaker-1 (38:34) I'm going to answer you, but I need to answer a little bit indirectly. Go for it. Something else I've been mulling over on my morning walks is this concept of how fast we're moving, and the concept that's evolved for me is this term of vibe years. So we're all vibe coding. And if you're familiar with light-years, it's really interesting, because light is the fastest thing we know, right? And yet it takes millions of years for some of that light to reach us, and during that time the universe itself has been expanding, so you haven't really gotten anywhere. Here we are with vibe coding: it's really awesome, it's really new, we're moving faster than we've ever moved before, but we're comparing ourselves to a Model T, because that's how we were operating before. Now we're operating really fast, but the problem space and the opportunities are expanding just as fast. So I would say these days we're basically doing a year's worth of engineering progress every week. And when you produce that much, the burden to review it is a new problem, because in the problem-solution cycle, every problem you solve leads to a whole new set of problems.
And you're never done. And this is also where I think you need something like CodeRabbit. If you are going to be effective with AI in anything, whether it's your application development, whether it's infrastructure and so forth, you need something that can review it at the scale and the speed that you will be moving at. And that's what CodeRabbit enables: by knowing your PRDs, noticing your requirements, reconciling that with the rest of what you have there, so that what you have to review is manageable. Did that answer it? Or did I just... speaker-0 (40:23) Yeah, I mean, I think that's certainly a huge part of it, and maybe a good look at what your software development life cycle should actually be. And I think maybe this is the moral of the story for the episode, realistically: you have to stop and think about what the long term is actually going to look like. If you are just making individual pull requests, and you use the LLM to generate some of the code in the process, and you're reviewing it yourself before putting it up, then you haven't really changed anything. Maybe the LLM is just a replacement for the internet search you used to do. But if you're going to be generating code, or generating validations or policies dynamically as part of your development process, then this is how you get around the critical bottleneck in your pipeline, which is actually ensuring that the right code is being written and the right feature is being implemented. You need to think of it sort of like a manufacturing process, so that the flow makes sense: however much you put in at the beginning is being managed and worked on in an equivalent amount at all work cells in your process. speaker-1 (41:27) Yeah, validation. speaker-0 (41:29) Yeah, for sure. So. speaker-1 (41:31) So review: kind of solved. Generating code: pretty much solved. Writing requirements, specs: pretty much solved. Validation: there are a lot of attempts at it right now.
Some cool ones I've seen, I forget exactly what their names are; we talked about one on office hours yesterday. But in the end you still probably need a human to look at the recording, or the screenshot, or the something, to validate that that thing is working. And for infrastructure as code, I don't have the silver bullet for that yet. I don't know what it is that actually validates that it is the thing that you think you need. speaker-0 (42:13) You know, this is where I'll quote Corey Quinn, who's like a famous proponent of ClickOps. I sort of mentioned that I think, at the end of the day, what everyone actually wants is the simplest experience possible that allows them to just click one button or type one thing and have exactly the right thing come out. But at the end of the day, there is nothing that can get you exactly 100% accuracy in what you're doing. And even before LLMs, this was still a problem. We just sort of replaced the imperfect human determining it with another imperfect object, but you still had to do the validation at some point. So my question is going to be: obviously everyone wants exactly what they perceive it to be, without any of the hidden pitfalls that they don't know about, right? What's the next iteration here? Because I do feel like we're in this cycle that people are only starting to realize, which used to take a lot of time. There would be a technology or a platform that was built, and then things would be built on top of that, and it would go very slowly, and we would build more and more, and then we'd realize that all this stuff in the middle was wrong. Basically, we built the wrong stuff, the wrong abstraction. Then we delete all of it and replace the fundamental original platform with something just slightly on top of that, which does all the things that we expect correctly. And I think with the usage of LLMs, we're just accelerating this much faster.
A recent example: we talked about MCP a couple of months ago on the podcast, and everyone was like, MCP is the best thing ever. And if you look at the recent, you know, knowledge, like the last couple of weeks, everyone's like, MCP is dead; we're back to just calling APIs and CLIs. speaker-1 (43:51) MCP has died. Its obituary was written on LinkedIn. speaker-0 (43:55) Yeah. So at this point, I think we're going to officially switch over to picks. So Eric, what have you brought for us today? speaker-1 (44:04) So, a little open-ended, not sure exactly if this is it, but two guiding things for me. One has been an epiphany I had when I read a book. It had nothing actually to do with, you know, software, what we do, but it was The 10X Rule. Okay. And I think about it every day. It's on my license plate; it's literally 10X RULE. And the idea is you've got to set your goals ten times higher, and therefore work ten times harder, and then you'll actually achieve the thing you set out to do. And that explains why every software project fails: because we didn't set it 10X. And I'm constantly reminded of this, that everything's going to take more work, but you set the goals high, you get a pretty good outcome. speaker-0 (44:47) Doesn't that create extra stress, though? speaker-1 (44:49) Oh yeah. I don't know if this generalization still holds true, but you have a better quality of life, maybe, in Europe. Here, I slept four hours last night. speaker-0 (45:07) Is that self-inflicted? speaker-1 (45:08) It is, yeah, because I set that 10X rule. speaker-0 (45:12) I think the thing I would take away from this, realistically, is that there is a different way of living for everyone, and it's not worth necessarily following the cookie-cutter pattern that an LLM, I mean, your therapist, recommends to you about how you should live your life. And some people absolutely are adrenaline junkies, or feel like they need to achieve something or build something.
Thinking about how you want to go about and do that, I think, is actually a really challenging topic, like when you're consulting for an organization and ask them to think about who they want to be when they grow up. I mean, you know, why are they even doing the product that they're doing? So are you trying to achieve something specific right now? speaker-1 (45:54) Yeah, well, you know, I have this delusion that I will finally solve, you know, infrastructure and automation and how these things work. And it's what keeps me going. But the reality is, you know, we're never done. But it's doing something I love. There's one other thing that hit me: Wear Sunscreen. It's a weird song from, like, the nineties. It was like a commencement speech. speaker-0 (46:09) Yeah. speaker-1 (46:18) But in this thing, he states these, you know, obvious facts, like wearing sunscreen is good, right? And brush your teeth, floss. But really, "the rest of my advice has no basis more reliable than my own meandering experience," right? I love that song. It's spoken word, an interesting song, and it resonates with you, at least if you're 40-plus. speaker-0 (46:41) Wait, you said commencement speech. Was it at a commencement, or is it just part of the... speaker-1 (46:46) It's recorded, at least it is, but then it was re-recorded with a soundtrack and a guy playing guitar. I've been listening to it. "Everybody's Free to Wear Sunscreen," I think, is the song. speaker-0 (47:02) You know, this segment didn't start off as a challenge for the host to find very esoteric links to things that guests have referenced, but I think you've just hit a new high here, where if I can't find this, I'll be doing a disservice to all of our listeners who now want to listen to this thing. speaker-1 (47:26) I'll share the Spotify link. speaker-0 (47:28) Okay, okay.
So this is an easily found thing, not like you have to buy a DVD from Australia. speaker-1 (47:33) It is, yeah. It's interesting: I was on Peloton one day, and it was the closing song they played, and I was like, wow, that's deep. speaker-0 (47:39) Okay. Okay, you know, I had this weird flashback. There was a song I was listening to in a taxi one time, on the way home from an airport when I was living in Wisconsin, and I don't know to this day what it was. And you say this and I'm wondering if it was that. So I'm going to have to go check this out, and you may have just saved me, or this may be a new opportunity. So is that related to the 10X, or is it just completely separate? speaker-1 (48:15) Polar opposites. speaker-0 (48:17) I see: protect yourself, but also chase after your dream. speaker-1 (48:21) It's just a very succinct summary of life, you know. You're in this race, but in the end it's only with yourself, and, you know, sometimes you're ahead, sometimes you're behind. Maybe you'll dance the funky chicken on your 75th birthday, or, you know, wedding anniversary, or maybe your divorce. You don't know what you're going to do right now at 21, but, you know, plenty of people at age 40 still don't know what they're going to do. The race is long, and you just keep going. And so I love these kinds of motivational, inspirational things. There's no one way, and you can't just rah-rah-rah 10X; I didn't mean it to sound that way either. But also, if you don't get the outcomes you want, the basis is too low, the angle is too low: increase the angle. And, you know, it's not going to be exactly that linear, but it's going to be higher. speaker-0 (49:13) You've got to make it mathematical. Okay. Well, you know, that's something that will definitely resonate with quite a few people, I think, especially those that are aspiring for something more than where they are right now.
I like, you know, "the second best time to plant a tree is today." So you can take that and invest it in yourself. Sorry, it's from a book, the 10X... yeah. Okay. Okay. I just want to make sure we have that on the recording. Okay. I love it. speaker-1 (49:35) The 10X Rule by Grant Cardone. speaker-0 (49:42) Yeah, so I guess I also brought a book. It's going around, because I think it was just released on video, or at the movies, wherever you are: Project Hail Mary, based on the book by Andy Weir. And I think a previous pick of mine was The Martian book; I only recently saw The Martian movie, and it's just nowhere near as good as the book. So definitely read the book, Project Hail Mary. And I worried about bringing this up on the podcast. Eric, have you seen the movie or read the book? So, you know, luckily for me, I don't have to discuss it, because I was trying so hard, like, how do I even explain what it is without completely spoiling it in every way? Please do not watch trailers. Please do not read synopses or descriptions of this in any way. Just jump in and read it. It's science fiction, so if you don't like that, whatever. But if you do like it, like everything, don't spoil it for yourself. I will say that it is about a main character that wakes up in a suspiciously constructed spaceship, and of course there are some of the common cliches, like he's got amnesia, but it turns out that there's actually a great reason for that. So if you like science fiction, if you like The Martian, or his second book, Artemis, I think, which is set on the moon. Or he wrote some fan fiction about Alice in Wonderland forever ago. Actually, there is a really great short story that he wrote, that I didn't know was his, from forever ago, called The Egg, which I also did a pick on. That's also pretty great; that was a pick in a previous episode. I don't know.
I don't read that much, but when I do, I somehow stumble upon stuff that I absolutely love, and this is one of them. speaker-1 (51:16) Sounds like you're reading definitely for pleasure, which is a good thing. speaker-0 (51:19) Well, it's interesting. I used to do this thing where I swapped back and forth between one science fiction or fantasy book and one sort of nonfiction book about technology or leadership in some way. But I found that over time I was collecting a lot of notes on what I should do, or could do, in hypothetical situations that I was never in at that moment, and it was just getting too monotonous for me. Yeah. speaker-1 (51:43) You don't have to finish a book, but there's a book for every problem. And that's how I look at it. speaker-0 (51:49) No, I totally agree. I think that's a great way of putting it, realistically. As a leader, the thing that I've learned is that it's not about having the right answer to the problem; it's about knowing how to find the right answer once you've identified the problem. So it's about problem identification more so than actually prescribing the solution. Well, thank you so much, Eric, for joining us on this episode of Adventures in DevOps. It's been absolutely fantastic. speaker-1 (52:12) Likewise, Warren, it's been awesome. I really enjoyed the conversation. You ask a lot of good questions; you keep it going. speaker-0 (52:17) Oh, well, thank you so much. And if anyone else wants to be the target of the onslaught of interrogation on my behalf, I'm happy to give you your own episode. But until then, thanks to all the listeners for tuning in this week, and hopefully we'll see everyone back again next week.