Warren (00:00) Hello and welcome back to another episode of Adventures in DevOps. I've brought on the co-founder and CTO of Moderne, previously a Senior Product Manager at Pivotal after being a Principal Software Engineer at Dell, Olga Kundzich. Welcome.
Olga (00:15) Thank you, pleasure to be here.
Warren (00:16) You know, I was looking at your profile on LinkedIn and I noticed that you were at Dell for 12 years. One way or another, I feel like I've been using Dell products for as long as I can remember; I absolutely love the Dell XPS line of laptops. But I never really considered that there was a huge software engineering department at Dell. I'm curious, what was going on during that time?
Olga (00:40) So I came to Dell through the EMC acquisition; that's where I spent all of that time. I worked in enterprise data protection, working with large enterprises and with everything they had in their environments: Oracle, Postgres, DB2, storage arrays, networking, moving data to the cloud. There were lots of different product lines and lots of different integrations, and it was super interesting how that whole ecosystem evolved together.
Warren (01:13) And you were on the technology side for quite a long time. And then, if I got this right, you moved over to the company that pretty much turned Spinnaker into a real product for the enterprise.
Olga (01:25) So there was a large community of people around Spinnaker when we worked with it. Google and Netflix ran that community for a time, with contributions from Armory. Pivotal was the third largest contributor to Spinnaker when we worked on it, and we did put a lot of enterprise features, like authentication and authorization, into it. But what we heard from customers and community users was that people were very interested in the advanced capabilities of Spinnaker, like canary analysis.
But very often we heard, time and time again, "talk to me in a year, when I'm done migrating Spring Boot 1 to 2, or fixing this Log4Shell vulnerability; then I will finally find the time." And you know, that time never comes, because then it's Spring Boot 2 to 3, then 3 to 4; it repeats every year. We heard exactly the same thing from Pivotal customers at the same time. It dawned on us that we used to have technical debt that was our own: it was in our own applications, a developer making a mistake or choosing a wrong pattern, and now you have an application that struggles with technical debt. But a lot of what we have called technical debt is actually the system of software moving from under you. A developer who makes a perfect choice today, the latest framework, the best libraries, the best architectural patterns, the fashion of the day, builds an application, and six months later it's struggling with technical debt. We had a meetup at one point and we polled developers: how long will your applications continue to function if you're not allowed to touch their source code? And the answer that came back was six months, which is shocking. In six months, your perfect application accrues technical debt to the point of stopping. But then you think about it, and it's the real cadence that we see in the industry. The Kubernetes community makes deprecations and makes releases every quarter; they deprecate something in one release, they remove it in a follow-up release. Spring Boot, the same thing. So if you don't touch the application for six months, that's what happens.
Warren (03:34) You know, it's really interesting that you asked that. I'm surprised at the six month answer almost from the other perspective, because I feel like a lot of companies today are trying to move so quickly.
They're not actually considering the cost of the technology or software that they're putting out there.
Olga (03:49) We got an alert the other day that an Azure Function stopped functioning and couldn't renew some sort of certificate. We looked at it: we developed this function like nine months ago, it was a perfect Azure Function, and now it no longer works. And I think it's the second time in the history of this company that we've had Azure Functions stop working on us.
Warren (04:08) Yeah, no, I can totally see that. I'm curious, so what is the right timeframe, then, that you would expect?
Olga (04:16) I think it depends on what kind of software we are talking about. Think about applications that we developed, I don't know, 10 or 20 years ago, the ones that we deployed on-prem and that ran on our own servers. Those were self-contained and isolated. If you don't touch their code, they don't magically accrue any features, but they don't stop working either. And then there are the cloud native applications, which are really impacted. The applications you mentioned, like satellites, and the applications that we deployed on-prem, are self-contained, whereas the cloud native applications are the ones that are the most impacted. 80 to 90 percent of their code is actually open source and third-party dependencies. It's glue code that constantly needs to be restitched, otherwise it fails to function.
Warren (05:01) What's special about the cloud native environments that causes this?
Olga (05:04) It's third-party dependencies. We know that cloud native applications are 80 to 90 percent third-party dependencies: open source frameworks, vendor APIs that are changing from underneath you if you don't keep up, Kubernetes infrastructure making breaking changes every six months. That's the key.
Warren (05:23) What happens if organizations just stay on that first version of Spring Boot, or version 14 or 16, and never upgrade?
Olga (05:32) So there are two sides to this. One is vulnerabilities, right? Spring Boot right now no longer supports older versions; I don't know, 3.3 is the latest. If you fall behind and a vulnerability is fixed in a library version that is no longer supported in the open source, you need a patch for that, and you're going to pay millions of dollars as an enterprise to a vendor. And the second part, I think, is that businesses want modern applications. Applications built on frameworks that were popular a decade ago look dated, and developers don't want to work with them. So you get into a state where you have legacy applications that are the most valuable applications in your portfolio, the ones that created your legacy as a business, but you cannot add anything to them because they are old and no one wants to work on them. I think the other one is performance. We know the cloud cost savings from moving from Java 8 to Java 25 are in the 30 percent range, because more optimizations have gone into the Java runtime and it's more efficient. So it's being able to evolve your business applications faster and bring user experiences online, plus the security vulnerabilities.
Warren (06:43) That's interesting, I hadn't heard that perspective before. The reason to be upgrading the technology in your stack relies on the fact that the engineers you have only want to work on the latest technologies available to you. Or, maybe more realistically, if you are growing and you need to hire outside of your company, what are you going to put on those job listings? Are you going to put React version 4 or, as you said, Spring Boot version 1 or Java version 8? If you do, who is going to want to come work on those things?
That's an interesting point. I know from working in healthcare and aerospace and e-commerce, honestly, I don't think that the type of company that you are working with, or even the technology, has a huge impact on the generation of tech debt.
Olga (07:29) I would be working primarily with business-critical applications. I think there are different kinds of software, embedded applications, hardware, operating systems, and those I believe would be different. But my experience was primarily with this type of software.
Warren (07:46) I think there is this aspect of: is it this elusive, hypothetical problem that people just point to when they can't describe an actual scenario? Or...
Olga (07:55) I don't think so. I'm passionate about technical debt, and really passionate about being able to develop software faster. On the business side, we are really constrained by how much software developers can create. I think we are failing to update the infrastructure of our society; it still runs on COBOL, there are a lot of vulnerabilities in the stack, et cetera.
Warren (08:02) Hmm.
Olga (08:19) When we talk about the ROI of OpenRewrite and Moderne to our customers, it's not that you save this much effort by using automation to remediate vulnerabilities; it's that we return engineering capacity back to the business. Right now we know engineers spend 30 to 40 percent of their time on technical debt.
Warren (08:38) I think this is a mistake that a lot of inexperienced engineers make, where they just point to the word tech debt. One of the challenges that I've been wrestling with: how do you deal with the fact that as you create more, there is more to have to deal with? I don't think that you can just always stack more on. I have to wonder, is there some maximum amount? As a society, as we create more software, are we fundamentally always going to get to some limit?
Olga (09:03) To visualize the problem that we have right now, just consider how much source code we have. One of our customers at the time had 500 million lines of code.
So we said, if you take these 500 million lines of code, print them in books, and put those books side by side, not in a tall bookcase, but just in one line, how long will that line stretch? And it will stretch from Miami to Montreal. That is just how much code they have under management right now. And all the work with code maintenance right now is very manual: it's a developer pulling these repositories into their IDE, doing something to them, checking them back into GitHub. The amount of code we have does not fit into this workflow.
Warren (09:47) I want to ask you about that, but first I want to get my bearings set. How common is 500 million lines of code versus 5 billion lines? Is that a lot? Is that average? What do you normally expect when you look at, say, a company that just came out of series A? It was interesting you bring this up, because our guest from last week, John Papa from Developer Relations at Microsoft, was actually sharing that when you're at the dinner table or you're meeting some new colleagues for the first time and they ask what you do, a lot of people say, oh yeah, I write code. And he's like, no, no, we read code. That's our job.
Olga (10:24) Yeah.
Warren (10:25) I feel like now that's something that can't even be done effectively. So, you know, my concern is that more and more companies will start producing this unreadable amount of code.
Olga (10:36) I think we should have less code, not more, right? But what we have is what we have, and no one understands what's in it. And think about the trends with AI. AI is not good at optimizing or refactoring; it's just good at creating more similar-looking stuff. So now with AI, the developers create more code but refactor less.
Warren (10:58) Interesting. I can believe that. Go ahead.
Olga (11:00) The proliferation of similar code is worse with AI than it used to be before.
Warren (11:06) I want to come back to that. First I want to ask about what caused you to create the cornerstone of your business, OpenRewrite. What exactly is that? As I understand it, it's sort of what Roslyn is for C#: it evaluates your source code and converts it to ASTs, in order to make not regex-style changes, but changes based on actually understanding what the source code is doing from a structural perspective.
Olga (11:34) If we talk about the history of OpenRewrite and how it came about: my co-founder started OpenRewrite when he worked in Netflix Engineering Tools. In that organization there was freedom and responsibility, and a central team couldn't break the build and say, by this date you have to migrate X or Y or remove this logging library. And so people kept telling him, "If you do it for me, I'll accept the change, but otherwise I have other things to do." He heard it enough times that he said, I'm going to try to do it for them, and he ran into limitations almost immediately. He looked at the tools around, and all of them were based on abstract syntax trees, which capture just the syntax. That was already not sufficient: one of the first migrations they wanted to do was to replace a homegrown logging library, which they regretted starting, with a standard library for logging. They wanted to standardize and they couldn't; which Netflix engineer would want to come to work and replace logging statements one for one? And as we talked to a lot of enterprises at Pivotal and heard time and time again, "I need to migrate Spring Boot 1 to 2, talk to me in a year," we felt this technology was ripe for repositioning for these types of migrations, and for building a catalog that's highly repeatable across enterprises. So unlike Roslyn, which works on a single repository in the IDE, OpenRewrite was developed to run outside of IDEs and to accumulate different transformation steps for migrations.
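[Editor's note] As a sketch of what those accumulated transformation steps look like: OpenRewrite recipes can be composed declaratively in YAML. The recipe name and the targeted types below are hypothetical, invented for the logging-migration example Olga describes, but the structure follows OpenRewrite's declarative recipe format, and `ChangeType` and `ChangeMethodName` are real built-in recipes.

```yaml
# Hypothetical composite recipe for migrating a homegrown logger to SLF4J.
# The name and the com.example types are illustrative assumptions.
type: specs.openrewrite.org/v1beta/recipe
name: com.example.MigrateHomegrownLogging
displayName: Migrate homegrown logging to SLF4J
recipeList:
  - org.openrewrite.java.ChangeType:
      oldFullyQualifiedTypeName: com.example.logging.Logger
      newFullyQualifiedTypeName: org.slf4j.Logger
  - org.openrewrite.java.ChangeMethodName:
      methodPattern: org.slf4j.Logger warning(String)
      newMethodName: warn
```

Because each step in `recipeList` operates on the type-attributed tree rather than on text, only calls whose receiver actually has the matching type are rewritten.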
With Moderne, we actually have a technology that serializes these LSTs that we produce for repositories, so we can study them and work with them in a horizontally scalable manner. OpenRewrite allows the developer to consume a framework migration, or a Java 8 to 25 migration, on a single repository. They can polish it, work with it, understand where maybe some architectural changes are necessary alongside it. But with Moderne we can study code bases at scale with this catalog of recipes, which are units of transformation that can be as small as a method rename or as large as a Spring Boot migration.
Warren (13:50) I think you're on a really interesting path here, because it seems like a foregone conclusion: we already have too much source code in the world that's riddled with changes that need to be made. I'll not use the word tech debt so I don't get any angry letters, but let's just assume there are changes that we want to make to our services: upgrades, patches, removing vulnerabilities, changing versions, or swapping out libraries. We have a whole list of things we want to do. And at the same time, we're now using LLMs, which are generating an immense amount of garbage code: duplicated, unnecessary, wrong in some way. The ability to even consume that and understand what's going on is problematic, and yet we know there is concrete value associated with making these changes. How can we even do that? Which brings us to the conclusion that there must be improvements to our tool chain in order to actually make those changes automatically. Why would anyone go into their IDE and do even a regex search-and-replace on a string when you can write something programmatically?
Olga (14:51) Yeah. Even logging statement replacement very quickly failed to be done with regex. Just like you mentioned: you see logger dot something, but which logger are you looking at? You don't know.
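[Editor's note] Olga's "which logger are you looking at?" point is easy to demonstrate. A purely textual replacement has no type information, so any receiver whose name merely contains `logger` gets rewritten too. The class and method names below are hypothetical, invented for illustration:

```java
import java.util.regex.Pattern;

public class RegexMigrationPitfall {
    // Naive textual rename: logger.info(...) -> logger.logInfo(...)
    // (a hypothetical migration away from a homegrown logging API)
    static String naiveRename(String source) {
        return source.replaceAll(Pattern.quote("logger.info("), "logger.logInfo(");
    }

    public static void main(String[] args) {
        // Intended hit: a field of the homegrown logger type.
        System.out.println(naiveRename("logger.info(\"started\");"));
        // prints: logger.logInfo("started");

        // False positive: a different variable whose name merely ends in
        // "logger". The regex has no type information, so it rewrites
        // this call too, even though the receiver is another class.
        System.out.println(naiveRename("mylogger.info(\"started\");"));
        // prints: mylogger.logInfo("started");
    }
}
```

A type-attributed tree avoids this by matching on the receiver's declared type rather than on its variable name.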
Warren (15:01) This is like your second open source tool: there was the leadership at Pivotal with Spinnaker, and now OpenRewrite at Moderne. It seems like you absolutely prefer the open source route. Has it been all sunshine and rainbows?
Olga (15:18) So with open source, we knew that the underlying framework has to be open source, just because we have so many third-party and open source libraries and dependencies. In order to scale this as an ecosystem, we need engagement from a lot of framework authors to help create refactoring recipes that move their consumers forward. So the core framework is open source, and we work with a number of framework authors, Quarkus, Micronaut, and many others, who contribute recipes: whenever they make a breaking change to their library, they create a recipe that migrates their consumers. That's the unit economics of change, and it's the best of both worlds. The framework authors can make changes to their framework and adopt the best patterns, so if they change their mind between versions, they can make the change and not lose all of their consumers; and the consumers get upgraded at the time they get the new library. Unfortunately, not all framework authors made such changes. Some went the path of backpatching and charging their consumers millions of dollars for private fixes. And then what also happened is that two years ago Amazon Q Code Transformer announced a migration assistant that was based on OpenRewrite. There was an IBM assistant for migrations that was also based on OpenRewrite, and Microsoft Copilot and Broadcom Application Advisor are also based on OpenRewrite.
Warren (16:46) It's interesting you bring up that other open source maintainers would actually vie for the opportunity to create a recipe that allows their dependents to migrate between versions. Usually you just get a changelog, or even a migration document, but it's very high level and doesn't really help you in any way.
Olga (17:05) I think it depends on the people and how popular their framework is. It's such a benefit to their consumers. For example, we now have a recipe that migrates from Spring to Quarkus, so you not only can migrate between versions of one framework, you can move people from one framework to another. And Quarkus is contributing, providing their consumers with recipes for migrations, both between Spring and Quarkus and between versions of Quarkus. We've seen very different behavior from different people. Honestly, think about the unit economics of a framework author making one API change in one place.
Warren (17:43) So one of the challenges I want to ask you about is the trust you put in your own tool, especially the OpenRewrite ecosystem. As someone who in the past has done extensive software development, my concern is always: can I trust the change?
Olga (17:57) Yeah, it's a very common question that we get as people try to adopt the tool. And I think we distinguish between two types of changes, and we start people with very small, simple changes. For example, Log4Shell remediation. It's not optional to remediate it or not, and the timelines are very tight: you do it now. And the fix is like two lines of code inserted surgically into the application. With a rule-based system, unlike an AI assistant, if it does the right thing in one place, you know that it will make the same change across the code base. With a manual change, or an AI-assisted change, you have to review every single occurrence of it. With a rule-based system, at some point you test it, you prove it to be right, and now you know. Things like Gradle wrapper upgrades, or minor and patch version upgrades, we do with automation and just mass PR them out.
But then there are, like you said, the difficult framework migrations. Because the open source ecosystem is so deep, and it so depends on what you use from that ecosystem, you may find that the open source recipe makes only 80 percent of the changes that you need. Then you look at what changes are left over; you may decide to write more recipes to cover that, or maybe you make the remaining changes by hand. And then you do need to test it. So those are the pull-based changes that developers need to pull onto their workstations and test out. These are also not optional, because Spring Boot 1 has vulnerabilities and you don't want to pay millions of dollars to a vendor, but the timelines are different: you do it maybe in between sprints, or you plan with your business owner or product manager, saying, we will do this here. It will make the application look modern and we'll build more features faster, but I need this downtime in this period. So right now we support Java and infrastructure as code: Kubernetes manifest remediation, Terraform, CI/CD pipelines, Docker images, things like that. Because infrastructure as code is copy-paste drift, it's about being able to see across the repositories what you have there and being able to uplift it all together. We just announced JavaScript support, and Python and C# are under development. So we will become a sort of universal platform for code maintenance, modernization, and evolution.
Warren (20:28) Wow. Sorry, I have to think about that for a moment. My question is, the biggest challenge must be not only understanding how one would write code in those languages, but what is idiomatic, and more than that, what is the actual structure of the language in order to correctly parse it?
You're writing a language parser, but getting to the LSTs that you mentioned seems like a huge challenge, for some languages more so than others.
Olga (20:50) Yeah. It's interesting: we discovered what is called in academia the C language family, which is C, C#, Java, JavaScript, Python. They look very different as source code, but their abstract syntax trees are very similar: the for loops, the if-else, the method calls, et cetera. So we are actually able to reuse the underlying Java-based implementation and extend it for additional languages, and we get reuse of the recipe catalog on day one. Building the LST is very hard. We actually invoke the compiler for each language and guide it through the first two stages, where it produces the abstract syntax tree and the semantic information about the code, and then we extract that out of the compiler. The compiler doesn't care about this data representation; it wants to start writing machine code, but we stop it there and create this lossless semantic tree, which is serializable and so on, to be able to operate on it for refactoring the source code. So it's very, very deep IP.
Warren (22:00) What is involved in doing the work to integrate with those compilers?
Olga (22:03) You need to figure out how to invoke the compiler, look at the compiler's internal data structures, and extract the data. The interesting part is that language parser development is very tedious for developers and highly repetitive, because you take a look at those data structures, and you obviously need to make decisions about how you do it, but once you decide...
Warren (22:11) So.
Olga (22:25) You need to move from those data structures to OpenRewrite data structures and map things. Recipe development is also very similar. And the interesting part is that the coding assistants are very capable of doing this repetitive work.
And that's where we see a lot of acceleration, in parser development as well as in recipe catalog growth, which happens very quickly these days. So the cost of custom recipe development went close to zero, and parser development significantly accelerated.
Warren (22:56) But every open source library out there would potentially still need to write their own recipes, right?
Olga (23:02) So you can bootstrap the recipe development for a library. If it has a good description of what it is they're changing, you give it to Claude Code and it starts developing recipes. OpenRewrite has a very declarative test framework where, you know, you insert the unit test, the before and the after, and it's hard for the model to cheat on those tests. So that was a great investment that we made.
Warren (23:29) The model that you're expecting the open source maintainers to be utilizing, is that a foundation model that you've developed or fine-tuned, or at this point is it just a matter of the available LLM providers out there?
Olga (23:43) We actually worked with all of them, Anthropic, Gemini, OpenAI, via the API, and tested a variety of different models. They all worked similarly, with no significant differentiation between them. So we feel this space is good enough that we don't need to fine-tune or train our own models; we just allow customers to bring their own model.
Warren (24:07) In a way it avoids your own concern and allows you to push that down to where it's actually necessary, right?
Olga (24:14) Yeah. And customers very quickly start writing their own recipes internally, because in addition to consuming open source, they usually also have some sort of internal framework on top of which a lot of business applications are built. So they need to create the coverage for that part of the stack.
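[Editor's note] The before/after testing idea Olga describes can be sketched without any OpenRewrite dependency: treat a recipe as a source-to-source function and pin the exact expected output, so a model (or a human) can't "cheat" with a fuzzy match. In real OpenRewrite this is the `RewriteTest`/`rewriteRun` API; the version below is a simplified stand-in with hypothetical names:

```java
import java.util.function.UnaryOperator;

// Simplified stand-in for a declarative before/after recipe test:
// a "recipe" is just a source-to-source function, and a test pins
// the exact expected output. All names here are hypothetical.
public class BeforeAfterTest {
    static void assertRewrite(UnaryOperator<String> recipe, String before, String expectedAfter) {
        String actual = recipe.apply(before);
        if (!actual.equals(expectedAfter)) {
            throw new AssertionError("expected: " + expectedAfter + "\nbut got:  " + actual);
        }
    }

    public static void main(String[] args) {
        // Toy recipe: rename a logging method call.
        UnaryOperator<String> renameInfo = s -> s.replace("logger.info(", "logger.logInfo(");
        assertRewrite(renameInfo,
                "logger.info(\"started\");",
                "logger.logInfo(\"started\");");
        System.out.println("recipe test passed");
    }
}
```

Pinning the complete "after" source, rather than asserting on fragments, is what makes this kind of test hard for a code-generating model to game.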
And they can bring whatever tool developers already use for writing more recipes. All of our experience with the models today, and we do write a lot of recipes and a lot of other code with coding assistants of various kinds, points us in the direction that these agents are really not autonomous. They are amazing, they do great stuff for us, but the developers need to be closely involved in what they're doing to make them go in the right direction. A funny thing happened yesterday: in our fun Slack channel, a developer screenshotted what the model told him. It said, "I'm getting confused here. Would you like me to continue, or would you like to debug this for me?" They don't have access to a debugger right now; it's not one of the tools they have. And he said, this is the first attempt of AI using humans as MCP tools.
Warren (25:27) It's interesting you bring that up, because when we were talking with Incident.io, they brought up the challenge of a production incident where they actually want to suggest a pull request to fix the problem, something simple like a null reference exception or something else. To generate that pull request, they need to understand the source code, and their strategy has been to run the customer source code in a protected, secure virtual machine to actually do that debugging. I feel like in a way you have quite an interesting alternative here: if you are generating LSTs, you actually don't need a runtime debugging session, because you can fully understand what the source code is supposed to be doing intentionally. So relying on an MCP call out to a real tool, or out to a human, to perform stuff...
Olga (26:11) Yeah.
Warren (26:17) I think we should be careful, because I totally see a bunch of companies jumping on that, especially for asynchronous work.
Olga (26:17) Yeah.
So...
Warren (26:23) I think what you've built here is actually really clever, because it provides a very deep technical solution to understanding the complexity of a code base at scale. Especially when we look at things like LLMs, they're always going to be limited by small context windows; we know from the research that large context windows don't solve the problem. Small context windows mean utilizing tools that are actually able to consume a whole repository, or realistically understand what the code is doing at a technical level, without having to read every individual piece. It seems like one of the critical components for actually allowing LLMs, or our usage of LLMs through agents or some other complex asynchronous processing, to function at a higher level.
Olga (26:55) Yeah. I think LLMs are data hungry, and they want right-sized data as well. If you give them the whole repository as text to read, they, like you said, lose attention and cannot find things. It's like giving a human a book versus giving a human a paragraph: where can they find the context they need better? It's in the paragraph.
Warren (27:30) It's interesting you bring up that analogy, because I think it's something we're going to continue to see over and over again: the constructs that we've created to help our human societies advance, in tech industries and non-tech alike, are being rediscovered through the creation and improvement of LLMs. Every single time, I feel like a company jumps up and down and says, look, we figured out a really important thing, and then we can point to like five other examples. The one that came up recently for me was, if we use the example of a book, there's usually a table of contents and, in the back, some sort of index.
And it's like, we should have an llms.txt file or an AGENTS.md file that, you know, explains the different things that could happen. And I'm like, yes, of course. We always knew that was the case. That's why books have these things: terms and definitions at the back, or references, and at the front a good overview, because we know that for very intelligent entities and organisms, we need those things. So there's no way that an LLM would be able to make progress without also having those exact same things. And where we've discovered complicated technical processes or tools that we've developed for ourselves, we've seen in a lot of tools the idea of attribute-based programming or reflection be a real thing, and there's no reason why that would be excluded from LLMs. You've provided the capability to actually make that happen by exposing it.
Olga (28:54) I think the LLM paradigm of tool calling was really game changing.
Warren (28:59) I agree with you. I think the interesting thing here is that a lot of companies stand up and say, no, this is the new best thing ever, this is the only thing we need. What we keep seeing in practice is that all of the tools together are important. If you have a toolbox with a bunch of tools in it, you likely still need the instruction manuals for how to use those tools, or a list of what those tools are and how they're being utilized. But then there's a whole bunch of other scenarios where you do want the literal recipes, say for cooking a particular dish in your kitchen, right? That's not listed on any of the tools that are available. Your tools are your blender or your stand mixer or spoons. Yes, you need all of those to work effectively, but where is the recipe? And your catalog of recipes, those things still need to exist.
I think the only mistake here is either assuming we've solved everything, or that there is one new innovation just around the corner that will.
Olga (29:55) Yeah, historically we've just built higher-level abstractions, and not just in software but everywhere else as well.
Warren (30:02) I think that leads us in a lot of different possible directions, and some of those topics we've explored on other episodes of the show. Yeah, so at this point we will move on to picks. So Olga, what did you bring for us today?
Olga (30:15) I bring an observation. We talked about LLMs and AI, where it's going and what's going to be next. My observation over the last few weeks is, like I mentioned, at Moderne we work a lot with Claude Code, and we noticed that first there is the Sonnet group of models and then there is the Opus group of models, and we've seen Opus 4.1, which was the most capable reasoning model, marked as legacy. And I just wonder why this is happening. Is this the cost of running this model for Anthropic? Is the cost going to go down eventually, and will these capabilities return? I don't know. We're living in very uncertain times. Every morning I wake up and look at what's going to be new and exciting in this space, and I hope we get to a point where we see the return of these capabilities.
Warren (31:11) Yeah, I think that's what everyone's waiting for. There is this expectation that they'll go in a particular direction, and I find that we don't have that certainty.
Olga (31:19) I think the technology is amazing, but it's still too expensive. So at what point does it drop in cost? We see a lot of build-out of data center capacity, as well as of the energy needed to power it. We'll see.
Warren (31:35) Oh yeah, we've gone into the energy costs associated with that in extensive previous episodes, so we'll leave that out of this one. And I guess I'll share my pick for today. I had something different planned...
But since you reminded me about Dell, I brought in my favorite computer, which is the XPS.
Olga (31:40) Yeah.
Warren (31:57) It's years old at this point, I think, but I absolutely love this laptop; it is fantastic. I'm a little bit disappointed that Dell decided to stop their XPS line, so I've been recommending it to everyone that's thinking about getting a new laptop. I don't know what it is, I just love it.
Olga (31:59) Mmm. Mm. It's unfortunate when the things we love get discontinued.
Warren (32:16) Yeah, and I actually don't fully understand it. I don't know if it was a matter of them merging the lines together. I'm still on this old laptop, which is just not quite up to date anymore, but I can still open two instances of my IDE on it, and that's as much as I need. So thank you, Olga, so much for coming and sharing with us all about OpenRewrite.
Olga (32:36) It was a pleasure, I really enjoyed the conversation.
Warren (32:38) I'm glad to hear it. Thanks again to all our listeners for showing up for today's episode. See you all again, hopefully next week.