speaker-0 (00:07.928)
Welcome back to Adventures in DevOps, where apparently we're on a spree of roasting CI/CD platforms. Historically, we've found guests who shine talking about why all build systems struggle. Last week, Cassidy Williams was on to unofficially defend GitHub. It was a great episode, by the way. And this week I'm bringing in Chris Farris, AWS Security Hero, cloud security evangelist, and famously quoted as saying, if there are 15 ways to do CI/CD, my company is doing 21 of them. Welcome to the show.

speaker-1 (00:36.364)
No, thank you. And yes, apparently they've expanded since I said that in 2017, and now I think they're up to 316. But they've also become four companies, or changed ownership four times. So, you know, that's what happened.

speaker-0 (00:51.374)
How do you know what the count is? Like, do you have some sort of crawler that goes out there on the internet and finds all the companies suggesting that they can solve CI/CD as a platform or as a product, just to keep yourself up to date?

speaker-1 (01:03.532)
Like any good AI these days, I make up the number. But it goes back to a thing that I had dealt with early on in my cloud security career, which was: hey, everybody is doing something differently, because we all kind of evolved our way to the cloud rather than planned a migration to the cloud. And so, yes, I had one team over in this one brand that was doing something with, you know, maybe CircleCI, and another one was building with Jenkins, and another one was, I don't remember all of them. This was before GitHub Actions was a thing.

And so each team had kind of evolved to do its own thing. And then I was brought in as the first cloud security person and told, okay, everybody should be doing CI/CD, get them all to do it securely. And I'm like, you're doing it that way? I can't.

So I started trying to explain it to leadership and to others that, you know, if there's 15 ways to do CI/CD, we're doing 21 of them. And then that company merged with three other companies and there were more ways that I was responsible for. And then I left to join another company, which was doing much the same thing. Now those two companies have come together. And so, yes, it does seem like over time there were just more and more ways to solve the same problem.

And large organizations never really forced teams to get on board with doing it one particular way.

speaker-0 (02:28.13)
Why do you think that is? Is it that all of them are just bad? And so it's this swamp where you get thrown into the deep end right away and people are just pulling the first thing that they can find.

speaker-1 (02:39.224)
So it's not a strategic priority for management to invest in getting teams to convert to, you know, the one true way. Assuming they could even agree on the one true way, there's no way that management is going to be like, we could either launch this new application or we could retool this team to go use the company's CI/CD pipeline. So it became just organizational tech debt: hey, this team used, you know, this one tooling and this team used another tooling.

You know, I've since jumped around a number of companies since I came up with that quote in around 2017. And I've seen a couple of companies where there is typically the company's one true way of doing it, and most everybody's on board with it. But almost every company these days has some mergers and acquisitions that go on.

And so we bring in this company, and well, the main company's an AWS shop, but they're in GCP, they're doing things differently. Again, no strong motivation for management and the board to say, migrate off of Google into Amazon. No, it's just like, this is working and we want to integrate this product with this product because that's how we're going to make money. And so that's what they do. And so you've got half the company on GitHub, half the company on GitLab.

You know, some random weirdos using Bitbucket on-prem. And as a security person, I have to defend that, not the practice, but the actual infrastructure. It becomes painful and frustrating, especially when you're one person doing that. So if I sound bitter and if I sound angry at folks, it's not that I'm angry at you. I'm angry at the situation, and maybe angry at your managers for not prioritizing that we do something about our raging problem of tech debt.

Because I think tech debt is leading to, and is going to lead to, a really big disaster.

speaker-0 (04:40.47)
I feel like it's always the case. It's the thing that is sitting in many organizations. It doesn't matter if it shows up. Like, it's always showing up in some regard, in some area, in every single team. And I think one of the reasons it shows up everywhere is because it's not well defined what that thing actually is. Yeah. I feel like there's a lot of definitions out there, but it's hard to describe it usually, but you know it when you

speaker-1 (04:57.079)
What tech debt is?

speaker-1 (05:05.752)
See it, yeah. So it's the potholes in the paved roads, right? You know, if you've got a really well paved road, you're going at Autobahn speed, great. But you start having potholes in the Autobahn and you're no longer going, you know, 200 kilometers an hour. And so that's what tech debt is. Tech debt is those speed bumps, those potholes that exist in your infrastructure that prevent you from going as fast as you want.

speaker-0 (05:29.752)
For sure, for sure. So I think it's interesting that you bring this up in connection with continuous integration platforms, because historically that's been the mechanism to go even faster. And so on one side, it seems like the thing that should be prioritized in every single organization. And for a lot of years, I mean, before the DORA report came out, we saw people doing whatever. And then after the DORA report, we still saw people doing whatever, but they sort of understood that there were organizations out there that were moving faster than they were using one of these strategies.

And I think while you could point to a leadership approach of, this mindset has to be applied throughout the organization because clearly there are some benefits, I found it to be an incredibly challenging conversation to have with leaders to explain that they do have tech debt in some capacity. And it's not just about going fast necessarily; just even building a product that works tends to be a core challenge.

speaker-1 (06:26.284)
Yeah. So I've been doing this since, well, I started my career about the time that Linus was creating Linux. So I have a broad perspective of what this looked like. And before we had DevOps, we had gray-bearded sysadmins who would walk into a room and take a machine out of a box and shove it in a rack and cut their fingers on those rack screws and everything else. And then they'd stick in a CD, and then we got really fancy and we were able to PXE boot and Kickstart a server.

But, you know, that was time. And even before the sysadmin was taking the thing out of the box, there were the finance people approving the budgets and then the procurement people ordering the stuff and the UPS driver delivering the servers. Right. So we suddenly were able to shrink all of that down into, hey, let me run a terraform apply and I suddenly have a network and a bunch of machines and a database and a firewall and all of that.

And so managers were like, wow, this is so much better. They got from horse and buggy, you know, 25-mile-an-hour kind of speeds up to, okay, well, now we're going 55 and this seems really cool. But some teams want to go faster; you know, some teams have more of a lead foot. So they haven't invested to go and get that incremental amount of performance out of the pipelines by cleaning up the infrastructure and everything else. And so that's probably why we've gotten to a point where it's good, but it could be better.

speaker-0 (08:04.62)
So I don't think it's a stretch of the imagination to say that technically minded people are very much behind the idea of paying down tech debt, because it's easy for them to see, I think, how much of a problem it is for delivering stuff effectively. But I think you're on to something when you say there's a difficult conversation to be had with the people making the decisions on what to work on, because it's not so easy to just explain the impact before actually doing it.

One thing I see as a struggle in a lot of organizations, and this happened a few years ago when I was advising one company: they would say they're very data driven. And for them, what data driven meant was that you needed to have the data to justify a decision before you actually made it. The problem with that is, for a lot of things in engineering, you don't have that data because you haven't been allowed to perform the experiment to collect that data in the first place. So saying we're data driven is just code for: I disagree, we're gonna do it my way and you can't do anything about it.

speaker-1 (09:06.54)
Yes, I would say that can be a different way of weaponizing the disagree-and-commit sort of thing: yeah, you disagree, but I'm in charge, so now we commit. Shoving observability into the people processes of things is definitely a challenge if it wasn't baked in there to start with. Even just the simple matter of me trying to figure out where my team is spending time and being like, well, I could look at Jira tickets, but that's not the fly-in help that somebody gives, you know, answering a question in Slack, or the time necessary to read up on, hey, what's happening with this latest Vercel breach and are we impacted or not?

So the level of what you can measure doesn't necessarily always reflect what the reality is. And so, right, I think my challenge to everybody in an individual contributor role is: find the small places where you can just solve a little bit of tech debt at a time. And even that helps clean it up.

speaker-0 (10:13.45)
I sort of want to get into that, because I feel like historically we can draw a connection between sysadmins early on basically micromanaging some pristine garden that they had of the scripts and servers they had in place, and how to stand them up and manage

speaker-1 (10:31.502)
Having been a sysadmin, I will tell you that it was never a pristine garden. That was a lie. We were fooling you. Go ahead.

speaker-0 (10:39.394)
Well, I think the thing is that there was a disconnect between the work that had to be done and the time needed to do it. I feel like sysadmins would be like, you know what, it's better if we wait until the weekend to do the upgrade, and then they would come in at ungodly hours to make that happen. But that meant that during the rest of the week, they may not have had to dedicate themselves to doing every activity. The expectation of being in the office from nine to five thirty, or whatever your work hours were, didn't necessarily align. And that sort of lent us the capability of doing the right thing at the right time and not worrying about the amount that we were getting done, because there was always more work.

But I worry that we're making this transition, or we've already made this transition, in every aspect of the job and the management, going from a garden that we failed to curate so well to pushing out as much stuff as possible as fast as we possibly can. And that means that we no longer have the capacity to curate stuff. And so my hypothesis is that we're only going to be creating more and more tech debt. Unproven; I have no evidence for this whatsoever. But I think it's being accelerated by the LLMs that are just creating stuff, where we're now not only expected to do more, but no longer have the time to actually go back and want to clean it up.

speaker-1 (11:56.03)
Absolutely. So I was talking with a mentor of mine who actually was one of the people running about 12 of those 17 CI/CD things. And he was commenting that, you know, at this point, we're getting to where the cost of writing code is approaching zero. And, you know, it was true. I started experimenting with Claude. I turned on --dangerously-skip-permissions in a dev container. And I'm like, Claude, go figure this out.

And it had access to a sandbox AWS account. It had access to my code. And it had access to the internet. And it solved a problem for me that, you know, would have taken me like three days. It built me a website based on a Hugo template I downloaded and a GitHub repo of an open source project. And it figured it all out. It put that together. It put together all the reasoning and the rationale behind it, all the documentation. Claude did that one Saturday afternoon while I was walking along the beach.

And so, you know, it is a lot easier to produce the content. But now the question is, right, I'm trying to remember the exact quote, but your developers were so interested in whether or not they could create this code, they did not consider whether they should have created this code.

And I like to hearken it back to, again, being now a graybeard, Microsoft Word from the 90s and 2000s, how Word had features that some random enterprise customer insisted on: we will only buy Microsoft if it does this. And so you had this massively painful piece of bloated software that everybody hated, but everybody had to use because it was the only thing that was, you know, compatible.

I see a lot of products where you've got the idea that a product manager or product owner wants to go into Linear and create a ticket and sketch out something, and then LLMs will, you know, push it into staging and/or production. That's gonna lead to a lot of products starting to look a lot like Word of the 90s: features that they thought were a great idea, but they didn't build in the observability, you know, they didn't instrument: do customers want this?

speaker-1 (14:20.652)
Are customers using this? There's this whole feature set, but is anybody actually clicking in? Has anybody even found it to know that, hey, that you can do all of these cool things in your product? All of that is going to be an entirely new area to suss out as the cost of writing code and the cost of adding features goes to zero. Of course, the cost of adding code goes to zero, but the cost of maintaining that code over time adds up.
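A minimal sketch of the kind of instrumentation being described here, assuming usage is logged as simple (user, feature) click events. The event shape, feature names, and function names are all hypothetical and only illustrate the question "is anybody actually clicking on this?":

```python
from collections import defaultdict


def adoption_by_feature(events, total_users):
    """Fraction of all users who used each feature at least once."""
    users = defaultdict(set)
    for user_id, feature in events:
        users[feature].add(user_id)
    return {feature: len(ids) / total_users for feature, ids in users.items()}


def unused_features(shipped, events):
    """Shipped features that never appear in the event log."""
    used = {feature for _, feature in events}
    return sorted(set(shipped) - used)


# Hypothetical sample data: (user_id, feature) click events.
EVENTS = [
    ("u1", "export_pdf"),
    ("u2", "export_pdf"),
    ("u1", "export_pdf"),
    ("u3", "dark_mode"),
]
SHIPPED = ["export_pdf", "dark_mode", "mail_merge"]

print(adoption_by_feature(EVENTS, total_users=4))
print(unused_features(SHIPPED, EVENTS))
```

Repeat clicks by the same user count once, so the number reflects reach rather than volume. A feature that shows up in `unused_features` is a candidate for the "did anybody even find it" conversation.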

speaker-0 (14:50.606)
I think that we're in the age of disposable software, is how I'll put it. But you bring up an interesting point about everything tending to go in the direction of feature bloat, and then the consequences of having those extra features. And so I think my question from there has to be: how many features can a product have before the onus of having those features becomes a critical burden or obstacle to releasing, or maybe managing, the product that you actually have available?

speaker-1 (15:24.2)
Folk song from the 60s? You know, that's a good question, because we're gonna patch over a lot of that operational pain by just throwing more tokens at it. So it's like, okay, we've added a lot of features, now the product gets slow. Well, now we ask the LLM, how do we make the product faster? So I think we're gonna put more plaster on this hole, even though we haven't actually fixed the framing behind the hole.

speaker-0 (15:50.904)
So I think a problem here is, my theory is that it's impossible to want to review AI-generated code. Yeah. So I have this idea of a singularity. My thought is that every organization has its technological singularity, although what it means here is that it's the moment where they are no longer producing or managing their software development lifecycle with humans and are having AI do it.

speaker-1 (15:59.598)
Absolutely, yes.

speaker-0 (16:19.266)
But as soon as they do it, they can no longer produce good software after that. So you get to decide which part of your product or service you are totally fine with, that you love, and that you don't really want to change, but you need to make micro changes or improvements to, which you can have an LLM do. So I guess where I'm going with that is that there are these organizations that vibe code out of the gate. They're AI-first companies, but for me that means they're generating tech debt from day one in an unsustainable way. That means the future of their product is, you know, a failure that's immediately going to happen.

speaker-1 (16:56.76)
So I don't know if there's any vibe-coded companies that have really taken off to the extent that they are now operating at an AWS or Salesforce scale, right? Even some of the companies that are kind of leading edge in the AI space, the Vercels, you know, they started off as actual humans writing code. So there's probably still, unless they've been laid off, some humans who recall some of the architectural decisions and have some of that institutional memory of, yeah, what happened that time when, you know, the data center caught fire and we had to go fix things and all of that.

You know, that's the big fear that a lot of us in the cloud community have around the massive layoffs of high-level people at AWS: all of that operational experience is walking out the door. And so then when things happen completely out of the blue that have never happened before, like missiles landing on a region, taking out multiple availability zones, right? The folks who thought about, hmm, what would happen if we lost two of our three availability zones, how would that respond? Those people aren't there anymore. It's not like you can just go and replace an AWS region. It takes them years to build one, takes minutes to burn down.

But even figuring out, okay, how do we best support our customers during this horrifying time, they don't have the right people to do that anymore. So I feel like, yes, the new companies that are fully AI first, vibe coded from the get-go, I don't think we have enough history to know what they're gonna look like. I think we do know about companies who started off with humans writing code and are now transitioning into this AI world.

I think we have an idea what that's gonna look like. And that's gonna look like a lot of micro decisions made without the broader context of experience. Cause, right, that experience isn't necessarily in the LLM's training data. And, you know, the context windows aren't big enough to go back and look at every Slack message from every incident that a human

speaker-1 (19:18.216)
you know, an L7 kind of human, would be able to do. So yeah, I think we'll see what that looks like. And then I think part of that is going back to the tech debt, right? The more tech debt you get rid of now, the easier your models will have it. Because it's not like they're looking at, there's 16 EC2 instances running in this account, they must all be important. No, they're probably not.

Right. There might be five of them that the LLM needs to worry about, and eleven of them that are off here that were spun up by Bob, and it's like Bob's ping test. And by the way, that's open to the world and running Ubuntu 2012 since 2012. And, you know, so that's probably cost you eight thousand dollars just in idle compute capacity since this was done, you know, 24 years, or 14 years ago.

But it's still something that the LLM has to put into its context window. And the bigger that problem set gets, context windows, you know, are getting bigger, but I don't think that they can scale to the level of cruft and dirt and grime that you find in people's old cloud accounts.
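A minimal sketch of the cleanup check implied here: flag anything that has been running longer than a threshold and estimate what it has cost while idling. The `Instance` record, the instance names, the hourly rates, and the fixed "now" date are all assumptions made for illustration, not an AWS API or real pricing:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Instance:
    """Hypothetical inventory record; not an AWS API type."""
    name: str
    launched_at: datetime
    hourly_cost_usd: float  # assumed always-on hourly rate


def stale_instances(instances, now, max_age=timedelta(days=7)):
    """Return instances running longer than max_age, oldest first."""
    old = [i for i in instances if now - i.launched_at > max_age]
    return sorted(old, key=lambda i: i.launched_at)


def idle_cost_usd(instance, now):
    """Estimate what an always-on instance has cost since launch."""
    hours = (now - instance.launched_at).total_seconds() / 3600
    return hours * instance.hourly_cost_usd


# Assumed reference date and sample inventory for the sketch.
NOW = datetime(2026, 1, 1, tzinfo=timezone.utc)
INVENTORY = [
    Instance("checkout-api", NOW - timedelta(days=2), 0.10),
    Instance("bobs-ping-test", datetime(2012, 1, 1, tzinfo=timezone.utc), 0.065),
]

for inst in stale_instances(INVENTORY, NOW):
    print(f"{inst.name}: running since {inst.launched_at:%Y-%m-%d}, "
          f"about ${idle_cost_usd(inst, NOW):,.0f} in compute")
```

With the assumed rate of 6.5 cents an hour, fourteen years of uptime lands close to the eight thousand dollar figure mentioned above. The point of the sketch is the shape of the check, not the numbers.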

speaker-0 (20:35.532)
Yeah. No, so I actually brought this up as a pick in one of the previous episodes, where there's something known as the maximum effective context window. Basically there is this problem with LLMs where a longer context window doesn't actually help solve problems, because we're already seeing context rot within a single set of tokens for a single prompt. Basically, when there's too much noise there, how do you figure out where the value is? And I feel like humans have the same problem. I write blog posts that are sometimes quoted as being 16 minutes long. And at that point it's like, well, where is the most important thing that's in that post? And sometimes I get people saying, you should have cut that into multiple pieces because I couldn't read it on my way to work. It was too much. And I'm like, yeah, but I actually thought about this, and, you know, there's not a good place to just cut it in half.

And I feel like that then becomes the work. The context that we're giving to humans is incredibly important, and therefore so is the context we're giving to LLMs when we need to generate something. And I feel like if the only context available is a lot of terrible software code that was written by another LLM, it doesn't really have a lot of intelligent stuff to go on.

speaker-1 (21:44.108)
Yeah, or terrible software code that was, you know, written by humans. Because let's face it, I can look at any year and the number of CVEs is just going up, up, up. And that's not from AI-generated code. That's from human-generated code. So we're not doing a great job of writing secure code, and we're training everything on the insecure code. I don't even know how Mythos was able to actually learn how to do this stuff and what was right, because I don't know where right is.

speaker-0 (22:15.648)
So I think there's something to be said here. And I like that you brought up Mythos, or whatever it's called, because we've known for a while what the vulnerabilities look like. And I don't think it's so much that we're not good at creating secure code. I think we're just incredibly bad at removing insecure code from our environments. And I think this gets back to the thing you started with, which is that it's not really a priority if it's quote unquote working and the errors and omissions or liability insurance covers it. Why not leave it as vulnerable? I mean, who is that really, really hurting?

And so I think it's totally believable that we can automate the process of finding these vulnerabilities, which we already know exist. We've known they've existed forever. We know how to exploit them. I mean, I think one thing the LLMs do, and I think what Mythos is really doing, is putting together individual steps, which hypothetically in the past you always knew could be combined, and actually stringing them together in the right way.

speaker-1 (23:20.578)
Yeah, building the attack chain is what Mythos is apparently particularly good at. And whether Mythos is hype or it's the end of the world, whether it's Mythos or Mythos the next generation, you know, at some point we're gonna start seeing this flood of attacks. But even before the flood of attacks, I think with Mythos and with Anthropic's Project Glasswing, which is their limited release, hey, come check out Mythos, that they're giving out to 50 highly trusted companies, including Amazon and Microsoft and Apple and Google and, you know, a bunch of other companies that make software, we're gonna see this influx in CVEs.

So, right, one of the things that everybody needs to start focusing on is how quickly can we patch, and how hard is patching? And how many places do we have that Ubuntu 18.04 instance still running that isn't gonna get the patches? Because that then becomes your dangerous attack surface. Once you actually get to, you know, regular patching, where nothing in the cloud lives for more than a few days, you've got to find all of the things that are older than a few days. And most of those are going to be that tech debt: Bob's ping test machine, or that one MySQL 5 database that nobody's really sure who's responsible for, and nobody's really sure they want to put their career on the line by saying, hey, yeah, I'm gonna go turn that off now.

speaker-0 (24:54.546)
I'm smiling because, you know, I don't know if you are, but I'm just gonna assume you are a huge serverless advocate. It sounds like we were right all along. Serverless is the way to go, because if you have to manage individual versions of packages or operating systems, it's a losing battle against what is going to be an onslaught of attacks coming in, because we're basically talking about it being free to exploit something.

speaker-1 (25:22.862)
Claude, tell me how many Python 3.9 Lambdas I have running in the Prime Harbor environment right now. By the way, Claude isn't listening and I haven't hooked it up to have full access to everything, but I guarantee you it's gonna come back with a number that's in the triple digits. Because, right, you know, it's like, I'll spin something up and it's a proof of concept, and then it's sitting in an AWS account somewhere. The nice thing about serverless is it tends to not have much of an attack surface, especially with the event-driven serverless stuff that I write that's not, you know, fronted by an ALB or an API Gateway.
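A minimal sketch of the runtime audit being asked for here, written against plain dictionaries rather than a live account. The records are shaped like the entries AWS Lambda returns when listing functions (`FunctionName`, `Runtime`), but the sample data, the function names, and the choice of which runtimes count as outdated are assumptions for illustration:

```python
from collections import Counter

# Runtimes treated as outdated for this sketch; the list is an assumption.
OUTDATED_RUNTIMES = {"python3.8", "python3.9"}


def count_by_runtime(functions):
    """Count function records by their 'Runtime' field.

    Records without a 'Runtime' (for example container-image
    functions) are counted as 'unknown'.
    """
    return Counter(f.get("Runtime", "unknown") for f in functions)


def outdated(functions, runtimes=OUTDATED_RUNTIMES):
    """Names of functions whose runtime is in the outdated set."""
    return sorted(
        f["FunctionName"] for f in functions if f.get("Runtime") in runtimes
    )


# Hypothetical sample records standing in for a real inventory.
SAMPLE = [
    {"FunctionName": "poc-slack-bot", "Runtime": "python3.9"},
    {"FunctionName": "inventory-sync", "Runtime": "python3.12"},
    {"FunctionName": "old-report", "Runtime": "python3.9"},
]

print(count_by_runtime(SAMPLE))
print(outdated(SAMPLE))
```

In a real account the records would come from paging through the Lambda function listing in each region; keeping the counting logic separate from the API calls makes it easy to test against fixed data first.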

speaker-0 (25:57.048)
Yeah, no, I mean, I'm totally behind that. Actually, it's interesting you bring that up, because Lambda runs on Firecracker, and I think one of the first CVEs was just released for Firecracker, a container escape, this or last week or so. Which, I mean, obviously an LLM found it. I say an LLM found it; a human found it using a set of tools that was available to them at the time. But yeah, it just seems inevitable for those sets of technology. And limiting your attack surface to the area which you can actually secure seems like the thing that everyone should have been doing all along, but now is a good reminder that if you're not actively paying attention to what that interface is, you have a huge vulnerability there.

speaker-1 (26:39.916)
Yeah. Well, and it's the things that are, it's just working, right? Yeah, I don't need to change it, it's just working. Well, now, even though it's just working, it's still running Python 3.9, so you need to change it. You need to change it for the sake of changing it. And that's I think where engineering management and folks in security and folks on the front lines maybe have a different set of opinions and expectations.

So, you know, I think the primary takeaway from all of this Mythos and Project Glasswing is: prepare to do a lot more patching. And expect that that's gonna become a bigger part of not even a security professional's time, but a DevOps professional's, an SRE's time. We're gonna end up in a situation where security isn't handing you spreadsheets anymore. Security is saying, this thing has been running for a little bit too long. Make sure that it's got everything up to date. But not too up to date, because there's supply chain attacks. And so you never want to grab the latest. You always want to grab the slightly less latest, unless, of course, there's a zero day out there. And so, you know, I mean, how could you not win at this?

speaker-0 (27:59.902)
See, I think that's one of the fool's errands of current security advice: it's not a good thing to tell people, but it is right now a little bit valuable. Delaying the installation of new packages for one week or two weeks is only helpful as long as other people are installing those packages right away. So you're just utilizing this delay and finding value in it. But as soon as everyone agrees, yeah, you know what, vulnerabilities only happen in newly released versions, let's just wait a couple of weeks, then everyone will be waiting, which means that no one will find those vulnerabilities until that moment happens.

speaker-1 (28:36.174)
Unless of course those vulnerabilities are being actively exploited, in which case the folks who will find them are in fact going to be, yeah. I think there is value in pausing and not grabbing the latest thing that was uploaded ten minutes ago. But I don't think it's more than 72 hours, and that's just to give you enough time to actually enjoy your weekend before, you know, something like that drops. And then you need to have that kind of emergency switch where it's like, yes, the general rule is wait 72 hours before using it. But in this case, go grab this particular version right now, even if it's younger than 72 hours.
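A minimal sketch of the rule described here: hold new releases for 72 hours, with an explicit emergency override for a specific version. The function name, the package name, and the idea of passing the publish time in directly are assumptions for illustration; no particular package manager's API is implied:

```python
from datetime import datetime, timedelta, timezone

# The 72-hour window comes from the conversation; treat it as a tunable.
COOLDOWN = timedelta(hours=72)


def may_install(package, version, published_at, now, emergency_allow=frozenset()):
    """Allow a release once it is older than COOLDOWN, or if fast-tracked.

    emergency_allow is a set of (package, version) pairs that someone
    has explicitly approved, e.g. a fix for an actively exploited bug.
    """
    if (package, version) in emergency_allow:
        return True
    return now - published_at >= COOLDOWN


# Hypothetical timestamps and package name for the sketch.
NOW = datetime(2026, 1, 10, 12, 0, tzinfo=timezone.utc)
FIVE_HOURS_OLD = NOW - timedelta(hours=5)
FOUR_DAYS_OLD = NOW - timedelta(days=4)

print(may_install("examplelib", "3.6.1", FIVE_HOURS_OLD, NOW))
print(may_install("examplelib", "3.6.0", FOUR_DAYS_OLD, NOW))
print(may_install("examplelib", "3.7.0", FIVE_HOURS_OLD, NOW,
                  emergency_allow={("examplelib", "3.7.0")}))
```

Keeping the override as an explicit list of pinned versions, rather than a global "skip the cooldown" flag, means the exception is scoped to the one release that needs it.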

speaker-0 (29:16.302)
Yeah. No, agreed. I think it's a bad proxy for what should be looked at, which is trust. How much do we trust this version of this package? And time can be helpful because it's correlated with the number of, say, zero days or discovered exploits. They're not zero days at that point. However, that's not actually the metric that you probably want to be using.

speaker-1 (29:38.488)
There's a good number of companies out there that are scanning the latest and looking for weird things. So, you know, we're not finding these hacks based on the fact that somebody's getting breached from these supply chain things. We really are hearing about these hacks from the vendor community, from the researchers who are looking at things, and even just the project maintainer who may wake up and be like, I didn't release 3.6.1 last night. And then being like, shit, pulling that and warning everybody: if you installed this, don't, you know, go get 3.7.

So there is some value in some level of pause, but you certainly don't want to race to it, and you don't want to rely on your fellow companies that are leveraging this to find this stuff. Cause yeah, we're not good at finding this stuff. That's what the security vendor community is for. They are working together, and they're doing a reasonably good job of tracking this threat activity.

speaker-0 (30:44.824)
So hopefully a couple of weeks from now, we should have someone coming on to talk a little bit more about supply chain security on the developer side, as well as what's getting released. There's a lot of interesting products in the marketplace and on the internet that claim to do something. And hopefully we'll see a deeper understanding of what exactly is happening there. One thing I do want to probe you on a little bit is the sorts of activities that you're doing and what you're seeing at some of the companies that you're supporting. When you're going in today, what is the number one area that you seem to be focusing a lot on?

speaker-1 (31:21.112)
So it's two things. One, it's: gosh, we have to do AI, and how do we do AI, how do we do it quickly? And then I'm like, well, yeah, but you're gonna generate risk here if you do that. And then the other thing is: hey, help me figure out, you know, how to eat these security vegetables that have been sitting rotting on my plate for a bunch of years. And those tend to be the two things, and they're actually related, right?

Because, again, I'll say you shouldn't be doing fully agentic autonomous AI stuff if you've got a lot of tech debt and you haven't actually built your environment up to support that. So I'm working with one client on, okay, so what does fully agentic mean? And what do they have to do to be able to get there? And then, you know, just other companies with, help me figure out, you know, do I need this security tooling or can I turn it off? Right.

Like, cost and security and operations are really three elements of this cloud governance triad that I talked about many, many years ago, probably while I was making sourdough during the pandemic. And so, right, it's like cost matters, security matters, you know, the operations element matters. And all three generally are in alignment, because security and finance can get together and say, look, this thing is a security risk and it's costing you money. And so if you turn it off, you've solved two problems at once.

speaker-0 (32:51.334)
I guess part of it is you're coming in and helping companies answer the question of implementing AI or agentic solutions with their current organizational structure and their current technology. Where are they defining the goalposts to actually be? That feels very vague to me. I hear this a lot, and I think it comes up on the podcast: we need to have AI in our software development lifecycle, we need to have it within the actual product that we're providing.

But I feel like a lot of times the people that are talking about this don't fully understand the capabilities there. So I guess my question is, how are you helping them to navigate that? Or is it that they've already decided where the goalpost is and now you're like, how can we still make sure this is secure?

speaker-1 (33:35.678)
If they could have given me some goalposts, that would have made my job a heck of a lot easier. What it started with is, we should use some AI. So let's get some chatbot in to help and do code assist. And so we start working on, okay, what are the procurement, privacy, all of that stuff to do code assist. Meanwhile, there's a couple folks over here who are off doing Cursor agents and MCPs and all this fancy stuff. And then everybody's suddenly like, ooh, we want to do what they're doing.

And so then the engineering and the initiative effort pivots to that. But we still haven't actually solved some of the things around, like, what's the proper governance for MCPs? And they're off like, MCPs are dead, everything's command line now. And so it's like, okay, yeah. And I get it, right? It took me a while before I finally built the harness so I could let AI be AI.

And the moment I did that, the moment I fired up the Trail of Bits dev container and gave it some AWS credentials and a code base and said, go fix this thing for me, it did it, and it iterated on itself. And I had slash remote control turned on, so I was shopping for socks at the mall while I was on my phone watching it figure out what it needed to do, with me giving it occasional prompting.

You know, that was the aha moment for me. That was the same level of aha moment that I had back in 2014 when I went to re:Invent and saw somebody replace the data center engineer who took six months to get fiber from one end of a room to the other because, we don't have the fiber, or we bought the wrong optics, or, well, I was dealing with an outage. Suddenly I went to a CloudFormation session on a Friday morning.

And they were like, and so here's how VPC peering looks in CloudFormation. I'm like, my God, 16 lines of JSON will replace that engineer. Then a few months later, they made YAML available in CloudFormation, and then it was like four lines. That was my cloud aha moment. And then this was the AI aha moment, where it's not code assist. Yeah, code assist is helpful, but it's like, here's a problem to solve, and here's your constrained.

speaker-1 (35:56.246)
Your constrained environment in which to do it: go do it and show me what you got. That was the beautiful moment. So that is really what I'm now trying to sell to customers who are like, what should we do about AI? You should get to the point where, for certain discrete, concrete tasks, you can prompt it and it will do it and it will test it and it will give you a commit. And then you'll spend some time looking at it and being like,

Yep, this seems like a reasonable solution. And you'll merge it into production and you'll get something out. But you need to have a really good sandbox environment that the LLM can work in. You need to have a good staging environment where you can test what the LLM did long before it gets to production. And if your engineers have access to production and staging and dev, and they're just sharing their .aws directory with the models.

Well, now your model has access to production. You might not have told it to go do anything in production, but these things are really determined to figure it out. And they will go ahead and delete a database or terminate your, you know, Ruby application because, well, I was asked to upgrade this to Ruby 311. So easiest thing to do is terraform destroy, and then I can go recreate it. And it was like, that was running the company.
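One way to picture "don't share your .aws directory with the model" is to launch the agent with an environment that points only at a sandbox profile. The sketch below is an assumption-laden illustration: `sandbox_env`, the `agent-sandbox` profile name, and the `/tmp/agent-home` path are hypothetical, while `AWS_PROFILE`, `AWS_SHARED_CREDENTIALS_FILE`, and `AWS_CONFIG_FILE` are the standard variables the AWS CLI and SDKs read.

```python
import os


def sandbox_env(base_env, sandbox_home, profile="agent-sandbox"):
    """Build an environment for an agent process that only sees sandbox AWS config.

    Every inherited AWS_* variable is dropped (keys, session tokens, profiles),
    then the AWS config locations are pointed at a separate, sandbox-only home.
    The profile name and paths are hypothetical placeholders.
    """
    env = {k: v for k, v in base_env.items() if not k.startswith("AWS_")}
    env["HOME"] = sandbox_home
    env["AWS_PROFILE"] = profile
    env["AWS_SHARED_CREDENTIALS_FILE"] = os.path.join(sandbox_home, ".aws", "credentials")
    env["AWS_CONFIG_FILE"] = os.path.join(sandbox_home, ".aws", "config")
    return env


# Example: the engineer's shell has production credentials loaded.
engineer_env = {
    "PATH": "/usr/bin",
    "AWS_PROFILE": "prod-admin",
    "AWS_ACCESS_KEY_ID": "example-key-id",
    "AWS_SECRET_ACCESS_KEY": "example-secret",
}
agent_env = sandbox_env(engineer_env, "/tmp/agent-home")
print(sorted(agent_env))
```

The resulting dictionary would be passed as `env=` when starting the agent with `subprocess.run`. On its own this does not stop a process from reading the real `~/.aws` on disk, so it only makes sense inside a container or other filesystem sandbox, with the sandbox profile scoped to a non-production account.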

speaker-0 (37:22.424)
Have you run into any horror moments with the rollout at any of your customers, and seeing

speaker-1 (37:28.586)
I have not had the, you know, Meta AI safety expert, my god, running to unplug the Mac mini because their OpenClaw started deleting everything. I have seen Claude be helpful and go ahead and accept AWS Marketplace terms and conditions on my behalf, for Claude. So, Anthropic Claude enabling Claude in Bedrock. And it was like, hey, you just signed a legal agreement. Is that legal? How does that work?

Right, when the LLMs are clicking on terms of service, has anybody actually signed that? I am really interested to see where the first legal case of that goes.

speaker-0 (38:07.892)
If we let the executives at large organizations have their way, I think they would be saying that whoever owns that LLM is accountable for the decisions that LLM makes. But I think in practice it's gotta be the other way around. It's nearly impossible to get engineers to review hundreds of thousands of lines of generated code that contains mostly garbage, so there's no way you can expect them to somehow know,

magically, all the things that an LLM is doing. I think it's really unfair to hold them accountable for that.

speaker-1 (38:42.784)
Well, and that's it. I had this discussion with a client where we were writing an acceptable usage policy. And one of the things was, you're responsible for the output of your code. And I'm like, that made sense six months ago. Now I don't think so. I think at this point, this whole

"you're accountable for the output of the LLMs," this whole idea of human in the loop, I think is quickly going to become a fiction. And I think by the end of the year, we're very much not going to have most things where somebody is actually line by line reviewing and understanding what it is. Yes, I may take my output from Claude and hand it off to Codex and say, hey, Codex, is this slop or is this thing good? Or

leverage multiple agents that have coding standards and architectural standards and everything else to review that. But I think at the end of the day, we've got very specific and tragic examples right now of where human in the loop has failed and has led to, yes, the disasters that have happened in recent conflicts. So if

life-and-death decisions don't have a human in the loop, pushing this color change feature and breaking the CSS is probably not gonna have a human in the loop, right?

speaker-0 (40:06.976)
I think for the longest time the engineers, the organizations that were on the cutting edge of how to lead an organization effectively were very on top of the idea of blameless postmortems, which is sort of this concept of, we did all of the things that were highly likely and recommended in the moment, given the lack of information we had at the time. Right. Like, we did the best job we could knowing what we knew. Is that the case? And usually that's the case.

And so I feel like it's gotta be these organizations that don't have a concept of blameless postmortems that are trying to transition the accountability model from individuals, which they felt like they could fire, to LLMs, where they feel like they don't have the ability to manage on an individual level. I don't know where I'm going with that.

speaker-1 (40:49.844)
That accountability is at the level of wanting to terminate something or somebody. Let me hear from you. But the accountability is, right, you broke it, you fix it. The accountability is, hey, Frank made this mistake, let's all learn from it, kind of thing. And so I also want to actually take a step back and disagree slightly with the idea that

Small and medium-sized businesses can model their practices off of what Amazon, Google, Microsoft, and Salesforce are doing. These are trillion dollar companies. They've got thousand-person engineering teams, or tens of thousands of engineers, thousands of security professionals working there. Most of us are working at much smaller organizations, maybe a hundred engineers, maybe three to four security folks.

What Steve Schmidt can tell me I should be doing in a re:Invent leadership session and what I can actually practically do are drastically different. And so the first of those leadership sessions I went to, I was like, wow, this is so inspiring. And then I kind of was like, you know what? Screw you. You've got a thousand people working for you and you've got a trillion dollar budget to do this in. I've got three people.

And a company that's, you know, twenty to forty years in tech debt, and I can't accomplish that. So you telling me that I should be doing this, it's like, eat your security vegetables? It's like, eat a field of corn. No, I can eat one thing. My partner here can eat two. So we're gonna get this done. That's it. We're not gonna be able to do all of what you can do.

speaker-0 (42:35.532)
I think there is definitely a miss there in communicating all of the relevant facts about utilizing LLMs effectively. I think even if they are capable of, say, solving a particular problem that's been advertised, they very rarely, if ever, get into conveying what the necessary infrastructure was to make that happen. And one of the things they leave off is the dollar signs, or euro signs or pounds or whatever. I see a lot of the hyperscalers saying, look what we accomplished with

just a couple of very senior engineers, distinguished scientists or whatever, their L sevens, L eights, L nines, who using an LLM were able to bring this whole product up in one month. And I'm like, okay, great. I sort of believe you, but if a person outside of your company went to try to use your models, how much would you charge them to actually utilize it? What plan would they pick? And I think if you evaluate that, you're probably paying millions to accomplish that same thing. And I'm just like, that's not

speaker-1 (43:34.41)
Half of that just to get an L eight from one of those companies. Right. So you've got a million dollar budget, half of that is going to the one L eight that's gonna come in and run the model, not to mention then the cost of the model.

speaker-0 (43:48.096)
Even if you're paying through a third party, even if you were to pay, say, Anthropic for the Opus four point seven model to just run in cycles to complete your product for you, how much would that actually cost? It's just an ungodly amount of money, realistically, that I think just completely gets ignored when Anthropic comes out and says, we created this new product, we did it using our model. You don't have access to the cost equation that they do.

speaker-1 (44:15.81)
There is another element of that too, in addition to cost, which is time, right? The LLMs don't sleep. So once you take the human out of the loop, the LLM can do a hell of a lot more while the human is sleeping. For sure. I do want to say that when we talked a little bit about Mythos, right? One of the main bugs that it found was this 20 year old thing in OpenBSD and whatever. And I'm told that it was like a mid five figure token cost to find.

So right, you know, I'm not gonna go and, for my little open source projects, drop twenty to thirty thousand dollars to have it run the most frontier of frontier models to look for my problems.

speaker-0 (44:55.414)
Yeah. Well, one of the problems with that is I'm pretty sure they ran the exact same practice on thousands of repositories out there in the world that are also all claimed to be highly secure. And then when they found one, they just reported on that one instance because that is news making, and not the fact that, yeah, actually we burned hundreds of thousands on this and didn't find anything. So we're just not gonna talk about that.

speaker-1 (45:22.956)
Yeah. I think I heard somewhere that the Project Glasswing token budget, where Anthropic is giving out tokens, is about a hundred million in tokens. And that's to find the vulnerabilities in the core critical infrastructure that makes up most of

modern civilization. So I imagine that, yeah, there are a number of passes that Mythos is making over the Linux kernel, over, you know, Java JDK runtimes and all of that.

speaker-0 (46:00.216)
I think that makes sense. My worry is that we are already getting to the place where the companies that have the most money are standing to get the benefit of collaborations and partnerships. And the companies who don't have the resources to protect their technology have no ability to fight back against the incoming attacks that are going to be there.

speaker-1 (46:22.604)
So right before I joined this, I was reading this article on 404 Media about something called Malus, M-A-L-U-S. And it is basically an LLM-based clean room. And they wrote it as a proof of concept of like, hey, is there an LGPL library that you really want to use in your product, but don't want to be tainted by the LGPL? Well, feed it into Malus, and the one

half of Malus will completely deconstruct it into a spec, and then the other half of Malus will completely rewrite it from spec, in what was, you know, a 1970s kind of clean room way of doing it, where they were reverse engineering the original IBM BIOS stuff. And so these were researchers who, trying to make a point, created a company to create a product to

do this. And so, right, I think a lot of the moats are gonna go away very quickly. I was discussing this because I was irritated with AWS, because they were blocking me from spinning up some resources for weird fraud reasons. And I was like, I bet you in the time it will take AWS's decimated support teams to respond to my ticket, I could probably have ported this entire thing to GCP and started running it there instead. And so

Even companies like AWS, which have strong data gravity, and generally I'm still an AWS fan, but it was like, I could do this, right? Even their moat is getting smaller. If I can just pivot my architecture from AWS to Google, then the only thing that's keeping me in AWS is my pricing agreements and data gravity.

And eventually, right, the fact that AWS has fallen down on customer obsession and everything else may be a win for the other providers. I think it'll be interesting to see here in Europe, where there's the idea of sovereign cloud and not being beholden to tech giants that are beholden to potentially adversarial government agencies, how easy is it to pivot to a local provider?

speaker-1 (48:35.616)
Most of the local providers are effectively VPS as a service. You get the basic EC2, or you know, VMs, maybe a managed database, maybe some object storage. There's a couple that are getting into the ideas of serverless, but their idea of serverless is, yeah, we'll run Kubernetes for you.

speaker-0 (48:56.36)
There's just so much here. I do want to get your perspective on, just to maybe loop it back around to something we started talking about at the beginning of the episode, the implications of the tech debt that we have in organizations and the LLMs generating it in a way. I think it's inevitable that organizations will find themselves in a situation where some of their more sophisticated

engineers or technology members are utilizing LLMs outside their control, the most notable being some sort of claw, OpenClaw, micro claw, whatever they're called. There are so many of them now.

speaker-1 (49:32.984)
Okay, I didn't know the lobsters were proliferating, but I'll take it even one step back. We don't even have to talk about OpenClaw. I'll give you two examples where a critical engineer is the source of the breach. The first one was LastPass back in twenty twenty two, I think. So, a very advanced threat actor, because, hey, if I want to get something, I'm gonna go after everybody's password manager.

They compromise a Plex server in an engineer's house, then use that to network pivot to their laptop, and then use that to get into the password manager to get the credentials to AWS and the decryption keys to the Commvault backups, to be able to get at everybody's LastPass thing. And so that was an engineer working from home, as they're likely to do, and a very senior level one, because not every engineer is going to have the decryption keys to the backups, who got popped from their home Plex server. And it seems like

I know of a couple of incidents that involve Plex servers, but that's the one I can talk about. And then I think, you know, yesterday, the day before today, it was Vercel, but it was the thing that they... Context. Thank you. Yes. So an example from today is Vercel got compromised because one of their engineers was using a product called Context AI on their work machine, and they shared.

speaker-0 (50:37.035)
Context AI.

speaker-1 (50:55.354)
an OAuth token with Context AI. Context AI got popped in a way that I don't think has been disclosed yet. But so now, because one engineer was using Context AI and Context AI got popped, you've got companies that are scrambling to rotate access keys because they were Vercel customers. Software bills of materials are there to figure out everything between the Linux kernel and your application.

I think we're gonna start needing to have, like, and we do it in the data privacy space: if you're gonna process personal data, GDPR requires you to disclose all your subprocessors. I think that's gonna apply whether or not you're holding data. You give me a list of all of your vendors, and then you have to give me a list of all of their vendors, all the way down until we hit, I guess, everybody using AWS.
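The "list of all of their vendors, all the way down" idea amounts to walking a vendor graph transitively. A minimal sketch follows, assuming each company publishes its direct vendors as a simple mapping. The function name `all_vendors` and every company name in the data are made up for illustration.

```python
from collections import deque


def all_vendors(company, declared):
    """Return every direct and indirect vendor of `company`, sorted by name.

    declared: dict mapping a company to the vendors it discloses (hypothetical data).
    A breadth-first walk with a `seen` set keeps circular relationships from looping.
    """
    seen = set()
    queue = deque(declared.get(company, ()))
    while queue:
        vendor = queue.popleft()
        if vendor in seen or vendor == company:
            continue
        seen.add(vendor)
        queue.extend(declared.get(vendor, ()))
    return sorted(seen)


# Hypothetical disclosures: one engineer's AI tool quietly adds two more parties.
DECLARED = {
    "example-app": ["hosting-platform", "ai-notes-tool"],
    "ai-notes-tool": ["identity-provider", "cloud-provider"],
    "hosting-platform": ["cloud-provider"],
}

print(all_vendors("example-app", DECLARED))
```

The walk is only as complete as the disclosures feeding it: a vendor that lists nothing looks like a leaf even when it is not.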

speaker-0 (51:44.366)
So I hate SBOMs. I think they're completely useless. I think there is a value theoretically, but in practice... This is a good one: it is known that Context AI was vetted for their security posture by Delve, which has its own scandals that have been running for the last year or so. And I don't want to get into it because I don't have all the details. But at this point I would be suspicious of anyone using Delve to audit them for

SOC 2. And they then passed that attestation off to Vercel as proof. And what had happened within Vercel was apparently the engineer that started using Context AI, which was not used by Vercel, approved access to their Google Workspace account via the admin permissions through the OAuth 2 flow, which is just honestly quite a

challenge to deal with, because you aren't logged in as a user that has limited permissions. You're logged in as your identity, which often has access to do things in Google Workspace. And so when a random OAuth 2 window pops up, because thank you Google, it's such a great experience, I'm just going to click approve most of the time. And that means getting access to vital resources within your GCP account.

speaker-1 (52:59.18)
And go find the setting in admin.google.com where you can disable and prevent your employees from doing that. And if you find it, let me know because it means they finally have released the thing that I've been asking for for years.

speaker-0 (53:11.916)
You can set permissions and products that should not be OAuth 2 enabled, that would leak too much access to your account. But the permissions in Google are not fine-grained enough, so often it means basically just turning off logging in with every single product out there. That's because most products don't do granular permissions or, there's a name for it,

where you basically step up access requests. Yeah. Well, it's not just that. The global recommendation is, yeah, no, just put them all on the initial login because you may need them at some point, and then users aren't paying attention because it's just too much. This is just a disaster waiting to happen. I just can't imagine the number of third-party products out there that offer proxies as their core service.

speaker-1 (54:01.474)
Jira for HR tickets. Yeah. So they want to turn on an MCP that says, hey, I want this MCP to be able to read my engineering Jira tickets. And I point out, there's no fine-grained access control in this. If you have access to see the engineering tickets and the HR tickets, there's no way to filter that. There is no way to say

I am Chris and I have all of these powers, but I only want Chris's LLM to have this subset of access. It's all or nothing. You know, the agent is Chris with all of the powers he has, even though Chris does not want him to have those powers. That's the fundamental problem I have with a lot of the agentic tooling that they're building right now, is they don't give me the ability to say, I want my agent to have this, but no more.
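What is being asked for here, "I want my agent to have this, but no more," can be sketched as scoped delegation: the agent gets an explicit subset of the user's permissions and can never be granted more than the user has. This is a hypothetical model, not how Jira or any MCP server works today. The scope strings and the names `delegate` and `agent_can` are invented for the example.

```python
def delegate(user_scopes, requested):
    """Return the scopes granted to an agent, refusing anything the user lacks.

    The grant is exactly what was requested, so the agent never silently
    inherits the rest of the user's powers.
    """
    user_scopes = frozenset(user_scopes)
    requested = frozenset(requested)
    extra = requested - user_scopes
    if extra:
        raise PermissionError(f"user cannot delegate: {sorted(extra)}")
    return requested


def agent_can(agent_scopes, action):
    """Check a single action against the agent's own grant, not the user's."""
    return action in agent_scopes


# Hypothetical scopes: Chris can read both projects, the agent only one.
CHRIS = {"jira:engineering:read", "jira:engineering:write", "jira:hr:read"}
AGENT = delegate(CHRIS, {"jira:engineering:read"})

print(agent_can(AGENT, "jira:engineering:read"))
print(agent_can(AGENT, "jira:hr:read"))
```

The design point is that authorization checks run against the agent's grant. In the all-or-nothing tooling described above, the check effectively runs against `CHRIS` instead.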

speaker-0 (54:49.474)
Yeah. And even if you trust the LLMs in a way and trust what they do, the problem is twofold, right? I mean, obviously there are fundamental issues with just what it will access, but then there's that tool having its own attack surface that can be compromised and expose the credentials that it then has. So I think this is just turtles all the way down.

speaker-1 (55:13.462)
I would agree. And then you brought up OpenClaw, and one of the biggest concerns I have with OpenClaw is actually the supply chain around the OpenClaw community: how many of the things that are out there tend to be malicious.

speaker-0 (55:27.02)
I think the one thing I'll say there is you can't just give it its own identity, because eventually its identity will contain all of your data, because you're using it for everything and there's not a good way to sequester access.

speaker-1 (55:41.334)
I don't know about that. If you look at executives who have executive assistants, they can delegate their calendars, they can give access to certain emails, right? If you actually think of your OpenClaw as an executive assistant, that is another completely separate identity. That's the way to look at an OpenClaw tool like that, right? You know, it's like, I want you to propose trades, but you're not going to have access to my password vault

to be able to log in and make them. You can update my calendar, but you're updating my calendar as Chris's OpenClaw, not as Chris. There are ways to do that. But fundamentally, OpenClaw still lacks the three things that an executive assistant has, right? Conscience, they know what's right and wrong. Consequence, they know what's gonna happen to them if they do steal from the executive. And

common sense, because the only common sense the LLMs have is what they've been trained on, which is the internet. So we can say that their common sense is probably nil.

speaker-0 (56:46.2)
I think that's probably a good point to switch over to picks for the episode. So Chris, what did you bring for the audience today?

speaker-1 (56:54.336)
So I think what I would suggest is, and the consequence, conscience and common sense line actually came from a newsletter and a podcast that I really, really like. It's called Risky Business. And three times a week I get a summary of all kinds of things that are happening in the security space, whether it's vulnerabilities.

Influence operations, what nation state actors are up to, what criminal cyber syndicates are up to, and then a little bit of industry news. It is the one email newsletter that I get three times a week that I'm guaranteed to read in the morning while sipping my coffee.

speaker-0 (57:38.4)
I actually read it as well. It's absolutely fantastic. My favorite part, when it's there, is that there's usually a section dedicated to three reasons to be happy this week. Yeah. That's the one thing where it's like, there's so much negativity in and around the world, and specifically in the security space, that I appreciate someone taking the time to be like, here's why we can actually be happy. Yeah.

My pick is gonna be non-technical. I have taken the opportunity, which is likely a mistake, to rewatch Rick and Morty, because it's been suggested to me so many times. And I've gotta say, some parts have not aged well for me, so that I'm almost embarrassed recommending it. But I will say seasons two and three are absolutely fantastic. And if you like any sort of science fiction fantasy stuff, it is absolutely great. I love it.

speaker-1 (58:28.53)
I would often leverage Rick and Morty memes in presentations when I was working at the company that produced Rick and Morty. But I tried to watch it, and there were a few episodes that were good, and then there were a few episodes where it was just like, this is the same bad joke over and over again.

speaker-0 (58:45.334)
Yeah, yeah, that's accurate. And I'm sure I'm gonna get some hate for saying this, but Morty is my favorite character. And I'll say the smartest too.

speaker-1 (58:54.008)
Definitely the smartest, yeah.

speaker-0 (58:56.184)
So thank you, Chris, for joining us for this episode. I have thoroughly enjoyed it.

speaker-1 (59:00.12)
Cool, thank you for having me and yeah, can't wait to catch

speaker-0 (59:03.288)
More and thanks to the audience for coming back for another episode of Adventures in DevOps. Hopefully we'll see everyone back again next.