Warren Parad (00:07) Welcome back to Adventures in DevOps, where every episode's a deep dive into a specific topic with an expert guest. Today's adventure focuses on writing documentation and feature flags. As the expert, we've got someone with quite the unfamiliar title. Previously, she's been a software developer and API integration engineer, and now she's the documentation lead at Unleash: Melinda Fekete.

Melinda Fekete (00:28) Hi Warren, thank you so much for having me.

Warren Parad (00:30) Yeah, and I'm really excited. What I'll say is that we try to limit who shows up on the podcast based on their titles. And you let me in on a little secret before we started recording that you don't believe in job titles. So what's that about?

Melinda Fekete (00:44) Yeah, you know, I work at a very small company. So I did build the documentation website and I maintain it, and I also do all of the technical content that's on there, but that's just a part of my role. I do other DevRel things like conferences and meetups and developer education and talks and workshops, and a bit of marketing, a bit of this and that. So what would you say is a good job title for that? I don't know.

Warren Parad (01:14) You know, I think this is where, if you're at a small enough company and still around the startup phase, there's this idea of a founding engineer. But then what do you call the second founding engineer, and the third? At some point you feel like you should start applying titles or roles. I think what I would say is that labels are often helpful, but they're all wrong in a way. I do want to share, though: a long time ago when I was at university, I had a lot of professors who would say, "We want engineers who can do at least one other thing. That's the most common feedback we get from industry." And this was, wow, almost 20 years ago. And I didn't really understand it at the time.
And I think the longer I've been in my career, the more I've come to terms with the idea that doing just one specific thing, where your engineering job ends as soon as the code gets deployed to production, misses the most critical aspect.

Melinda Fekete (02:08) Yeah, it's what they sometimes refer to as T-shaped, right? Being really good in one vertical area, going really deep and having that expertise, but also dipping your toes into a couple of other things and trying things out. And I think it also helps with job satisfaction; if you just do the same thing every single day, it's going to get boring quite quickly. Of course there are engineers who only like to code and that's all they want to do. But even back when I was an engineer, if you had just told me to code all day and do nothing else, I would probably have quit, because I really need that variety of being involved in, I don't know, interviews. It's nice to have a couple of different responsibilities that you also care about, where you can experiment and try new things.

Warren Parad (02:57) How did you make that shift into an area you felt comfortable with that wasn't your primary remit when you started? You didn't just wake up one day and decide, you know what, I only want to write documentation from now on.

Melinda Fekete (03:11) For me, the first time I felt that spark and excitement for documentation was when I was working at this coffee roasting company. I was integrating with different APIs; we built this IoT espresso machine. A big part of my role was just trying to figure out these different integrations. I was looking at a lot of API docs, and some of them were terrible. You really had to spend days and days of trial and error trying to figure out what the heck was going on. And that really inspired me to ask: how can you do this better?
And I started looking at some of the companies who I thought were doing it really well. Ever since, I haven't looked back. I think it's been three or four years now that I mostly do documentation, with some small things on the side, and I love it.

Warren Parad (04:06) So there's this old joke that no engineer comes into a company, looks over the current stack and all the source code, and says, "Wow, this last guy wrote the most perfect code ever. I don't need to change anything." But I actually think that's true for documentation. I don't remember looking at any portal and thinking, wow, the docs for this software product are fantastic. So you mentioned that there are some where you thought, we should do better, and looked to some pinnacle ones out there that should be modeled for any software product. What are those in your mind?

Melinda Fekete (04:42) Unleash does a pretty good job.

Warren Parad (04:44) Naming your own company, okay, because you're working on it. Everyone's going to have to go look at that after this episode now.

Melinda Fekete (04:52) Yeah, I mean, some of the big open source projects like GitLab do an amazing job. Actually, the landscape has changed significantly over the last couple of months, even. LLMs are very good at writing documentation, right? So you can produce a lot of good quality content with very minimal input. So the name of the game has become more about the experience you can build around it, things like AI search. You also have to make your documentation usable for LLMs, because about 50% of all documentation visitors for a site like ours are now AI tools. So you have to find a sweet balance between what's working for humans and what's working for LLMs.
Warren Parad (05:41) Did it shift your job responsibility from historically having to write docs for humans to now having to write docs for LLMs?

Melinda Fekete (05:49) There are a couple of fun things you can use. The tooling that I use allows you to distinguish between what you want to expose to a human and what you want to expose to an LLM. So you have different code blocks, and you can show some of it to humans and some of it to LLMs. There's a good overlap, but LLMs typically do well when the content is just plain markdown. You put a lot of work into the fancy tables and accordions and all the things that make the content readable for humans, but then you have to strip away a lot of those fancy UI features for LLMs. And there are things like llms.txt, I don't know if you've heard of it, which is something you'd typically expose for all of your pages so that when LLMs come and look at your docs, they get a very clean, simple, easy-to-understand structure.

Warren Parad (06:53) Yeah, I can imagine that realistically there are two paths there. There's exposing the docs you have to the training processes of the companies building models, so that the information ends up in LLMs directly, so that when your users or prospective customers query an LLM or prompt it for information, it can actually return the results. And then there's basically a more complex search aspect, where at runtime it can answer questions from the web that historically couldn't be answered. Realistically, you may have gotten away with some high-level JSON blocks at the top of some web pages before, but now LLMs are directly consuming the data on individual article pages and summarizing it or providing a useful answer. And that has a whole second nature to it.
And I think one of the problems is that a lot of the tools out there don't do a great job of removing the visual elements that have been added. If you just look at the generated HTML, or even the markdown in some cases, it's very difficult to get useful, optimized input for LLMs. What's the strategy there? Do you just write the docs twice?

Melinda Fekete (08:09) I think a lot of the modern documentation platforms do it behind the scenes for you. The way they generate the llms.txt file, if they're doing it well, they'll automatically strip some of those out. We write markdown with custom React components, which are things like the tables and the dropdowns and the buttons and whatnot, and all of that gets stripped out of the llms.txt. I think that alone is already a big win. And I'm experimenting with including certain explanations; if you strip out a table but you still want to tell the LLM what it was all about, you can probably explain it in a different way. I've just started experimenting with some of it, and I have the data on what pages got viewed by what percentage of humans versus LLMs. So I still need to do a lot of digging, but there are some useful features out there.

Warren Parad (09:05) For sure. So just for context, we've been using Docusaurus for a while now, and the llms.txt plugin is atrocious in every way. So it's always something that we're sort of looking at, especially offering a very technical product; you can be sure it's more likely to get picked up in some way.
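To make the stripping step concrete, here's a minimal sketch in Python. Real documentation platforms work on the parsed MDX tree rather than regexes, and the component names in the example (Accordion, Tabs) are hypothetical; this just illustrates the idea of reducing component-laden markdown to the plain markdown an llms.txt export wants.

```python
import re


def strip_mdx_components(markdown: str) -> str:
    """Reduce MDX-style markdown to plain markdown for an llms.txt-like export.

    A rough sketch only: it drops top-level import/export lines and unwraps
    capitalized custom components (e.g. <Accordion>, <Tabs>), keeping their
    inner content.
    """
    # Drop the import/export statements MDX allows at the top level.
    text = re.sub(r"^(import|export)\s.*$", "", markdown, flags=re.MULTILINE)
    # Remove opening/closing tags of capitalized custom components,
    # keeping whatever plain markdown they wrapped.
    text = re.sub(r"</?[A-Z][A-Za-z0-9]*[^>]*>", "", text)
    # Collapse the blank lines left behind.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

Running it on a snippet like `<Accordion title="Details">\n\nHello **world**\n\n</Accordion>` yields just `Hello **world**`, which is the kind of clean, structure-free text an LLM consumes well.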
And I think one of the biggest problems there is when you have custom React components or Vue components, or just anything you wrote yourself that isn't pure markdown, getting embedded links to work has been a huge struggle for us. If you're using a custom React component that has something clickable in it to go to a different page, is whatever process you're using to publish your documentation going to be smart enough to pull that out and list it appropriately?

Melinda Fekete (09:51) Yeah, I've not looked at it extensively yet. We recently migrated off of Docusaurus, for exactly some of the reasons you mentioned; the plugins and the entire ecosystem were giving me nightmares. The tool we have now is called Fern, and we started using it as of last Monday. So it's very new for me, and I'm still just trying to learn and experiment and understand what's happening behind the scenes with things like the llms.txt conversion. So ask me again in six months and we can have a chat about it.

Warren Parad (10:32) Well, if you do the research and write it down, then we'll definitely want those stats. So Fern's the one that does the SDK generation but also contains the docs portal, right? Was the motivating factor mostly on the docs side or the automatic SDK generation?

Melinda Fekete (10:40) Yeah, exactly. We don't use them for the SDK stuff, just the docs. I wanted a platform that was a little bit easier for me to maintain, and I think their out-of-the-box components are a bit sexier than the Docusaurus ones. And overall the team was super nice, so I loved working with them throughout the evaluation process.

Warren Parad (10:49) Interesting.

Melinda Fekete (11:13) The AI search capabilities, and also this other AI tooling like the llms.txt generation, were not something we had before.
So for me to get all of that out of the box, with as little involvement from my side as possible, was a win. You know, doing the platform and the content, and also this bunch of other things around developer relations, can be time consuming.

Warren Parad (11:43) So I really want to dive into that a little bit, because one of the things that has come up, I'd say the biggest learning for me, was: who am I writing these docs for? And you're nodding your head, so I'm sure you have some opinions here. My first question is, when you're thinking about writing the docs, are we primarily talking about end users, like through Fern, who are technical users? Are they how-tos? Are they guided tours?

Melinda Fekete (12:11) Well, I would say about 80 to 90% of our audience is developers who are either just getting started with feature management, or maybe they've already got Unleash and they're trying to figure out how to integrate their SDK. And maybe 10 to 15% are business decision makers who are looking to evaluate what platform to buy. We categorize the content into three or four main types. We have getting-started content, which is very developer focused. Then we have the tutorials and guides, which are more step-by-step, hand-holding, very detailed, across all the different SDKs. Then we have the API documentation, which is one of the largest categories, and then the SDK documentation and release notes. So I would say those categories are pretty classic in terms of what you'll find on a typical documentation site. We typically monitor which SDKs our customers are using and try to focus on those languages when we do examples or guides and tutorials.

Warren Parad (13:20) That absolutely makes sense. I guess one of the things I realized early on is that the message I heard when I was writing documentation in wiki docs was: who is your audience?
And I think all my English teachers from my entire academic career tried to get this point across, and I'll say I never understood what it actually meant. But a few years ago I stumbled upon a website, diataxis.fr, which I think really makes the point here. You brought up whether they're how-to guides or release notes, but really the point is: who is this for? Why am I even writing this?

Melinda Fekete (13:59) I think developers are notoriously hard to track as well, so sometimes you don't get all of the data you wish you had. But I do have information on what people are searching for in the search bar. I get information on what people are asking in the AI search and what percentage of their questions are being answered correctly. Then I also get feedback on every specific page: did this page help you or not? Or feedback on specific code examples. And inside the product, we send out a survey to users on, I don't know, I think a three-to-six-month basis; you get a survey request, some people fill it out, and there are questions about the documentation in there. So I try to rely on all of those different feedback points to figure out: is this page working well, and is it solving the problem I think it should be solving?

Warren Parad (15:01) And you're using the information to decide what to write next, or to change pages, to improve docs in certain areas? Or is there something else there?

Melinda Fekete (15:10) Yeah, I think one of the most useful ones actually is the AI search. The dashboard will give you an indication of what some of your content gaps are, or what questions people struggle with and don't get an answer to in your documentation. That is one of the main ways I prioritize what to work on next, besides the new features and developments from the engineering teams.

Warren Parad (15:36) I have a question that's going to be controversial.
Internal documentation for your architecture, for your services, for other developers: does it live inside a Git repository, committed alongside the source code? Or does it go in some sort of open platform that enables anyone to update, edit, and write it as much as they want?

Melinda Fekete (16:00) Well, we're an open source company, so almost everything that we do is in the open source repo. We also document our architectural decision records and things like that, because we have quite a lot of contributors who help with developing the SDKs, and it's very useful for them to understand the evolution of the product. So it's either in the docs or inside the open source repo; the documentation is open source as well. We have a few things internally, around the cloud architecture and things like that, or runbooks, but I would say not a ton. We try to really talk about everything publicly, and our customer success team and our support function rely very heavily on the documentation. So we don't have a ton of things that would only live in Slack or only live on a private site, and I think that's quite good.

Warren Parad (17:02) Yeah, I think one of the arguments is that there's a huge overhead to getting documentation into a Git repository; the whole pull request review process and merging is non-trivial for someone without technical capabilities. Whereas on the other side, it's: I want it to be accurate, and I want it to be reviewed before it goes public or becomes the source of truth. So it sounds like you're more on the side of, let's have it in source control. So my question is going to be: what do you do for the non-technical writers? Would you say, well, too bad, technical writers should always have a technical background and be able to use Git in order to write documentation?

Melinda Fekete (17:47) I think it's simpler than it initially sounds, learning Git and working in Markdown.
I would say that if you understand the technology you're working on well enough, you're probably not going to have a lot of trouble learning docs-as-code. But a lot of the documentation platforms, like Fern, also offer a no-code editor. I personally haven't used it a ton, but I have given access to this editor to, for example, my boss, who hasn't really got the time to learn how to download the repository and work with it. I'm sure he could do it; he's just not got the time. So he's got access to this no-code platform, and I think it works like Notion: you can drag and drop elements around, or fix just a small typo, and it will open a pull request on your behalf.

Warren Parad (18:39) So, all the benefits of a WYSIWYG, Notion-like or Confluence-like documentation portal, but backed by some sort of Git repository that gets updated and is auditable and trackable and reviewable straight away. I do think, though, you're in an area where you're forced to have a disciplined approach, both from a tool usage standpoint and in how documentation rolls out, because the primary aspect of your business, I feel, is well aligned with this sort of mentality, and that's feature flags.

Melinda Fekete (19:15) Yes, that is true. What's been a little bit scary for me is that this past year, 2025, was kind of a massive year for cloud outages, right? You probably remember the AWS one, the GCP one, the Cloudflare one. In my mind, some of these teams are world-class engineering teams who basically wrote the books on reliability, and we look up to them for their practices. To see some of these things fall apart was a bit scary. I never worked at Google, but I assume they have world-class CI/CD pipelines and a super sophisticated setup, and they're amazing at DevOps and getting code into production.
But even companies like that are not good at staying in control of that code once it's in production. That massive GCP outage, which happened in June I think, was because of a single-line policy change in Google's IAM. They merged the code, and it was live for a couple of weeks, I think, before it suddenly got activated, and then half of the internet was down. I looked at the incident report in quite a lot of detail to see what was going on. They were able to identify the root cause in about 10 minutes, which I think is pretty good, and they prepared the rollback and redeployed within 40 minutes, also okay. But the whole outage still lasted about four hours due to all these systemic recovery delays and backlog clearing. That's the real scary part: even if you have perfect DevOps and you're really good at getting code into production, DevOps cannot bring your code back up fast. This is where it comes in. I've been a long-time advocate for feature flags, but these stories from the past year really reinforce that you need that added runtime control, where your rollback is seconds rather than four hours. And if I remember correctly, both the Cloudflare and the Google incident reports, in the action items, the summary of the incident, say: we wish this had been behind a feature flag. So I think it's a good one to look at, and if you have time to read through those incident reports, I would recommend it. I would guess that Google has built their feature flagging platform internally, and so have some of the bigger companies like Amazon. But something is going on there. Maybe it's the fact that they're treating some of these backend changes, like configuration updates or policy changes, as if they don't need a flag. A lot of times people think of flags as UI changes or these more cosmetic things.
But we've seen that it's also very useful for some of these more behind-the-scenes, hidden changes, because they can also lead to real outages that are then difficult to roll back.

Warren Parad (22:07) You bring up a really interesting point there. Historically, the outages at, say, AWS, which is the one I've been tracking, are usually due to very complex sets of root causes, not just one thing: it was a race condition plus a number of other things. Not like the time they brought down the whole internet because some engineer switched off, like, the S3 database, which did happen. I don't even account for Azure or GCP outages anymore, because they always seem like simple things that happen. They put multiple different availability zones in the same data center; there's that one, and there was a flood, and so the region was offline. But yeah, I think the Cloudflare one was definitely a case of: we didn't treat changing a feature flag with the same attention that we would give to merging a pull request. Configuration changes are just as critical and can have widespread effects, especially if your code is behind a feature flag; then you should really be paying the same attention, or even more, because that's where the critical path is at that moment. I think it's an interesting point you bring up, especially for these hyperscalers, where you want to trust that they're much more rigorous about reliability. As you pointed out, they literally wrote the docs on it: Cloudflare talks about edge workers, GCP has the SRE book, and AWS has a whole portal dedicated to high-reliability stuff. And yet obviously they're down all the time. So yeah, there's a question there: if they can't get it right, how is anyone else supposed to?
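To make the "backend changes need flags too" point concrete, here's a minimal sketch of gating a risky policy-evaluation path behind a flag. The client class is a hypothetical in-memory stand-in (real SDKs, such as Unleash's, sync flag state from a server and expose a similar enabled-check), and the flag name and policy functions are invented for illustration.

```python
class FlagClient:
    """Hypothetical in-memory stand-in for a feature flag SDK client."""

    def __init__(self) -> None:
        self._flags: dict = {}

    def set_flag(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_enabled(self, name: str, default: bool = False) -> bool:
        return self._flags.get(name, default)


def legacy_policy_check(request: dict) -> bool:
    # Stand-in for the old, battle-tested code path.
    return request.get("user") is not None


def new_policy_check(request: dict) -> bool:
    # Stand-in for the risky new code path being rolled out.
    return request.get("user") is not None and request.get("role") == "admin"


def evaluate_policy(client: FlagClient, request: dict) -> bool:
    # The new path ships dark: turning the flag off is a rollback that
    # takes effect on the next flag sync, in seconds, not a redeploy.
    if client.is_enabled("new-iam-policy-engine"):
        return new_policy_check(request)
    return legacy_policy_check(request)
```

The point of the sketch is the shape of `evaluate_policy`: the new behavior can be merged and deployed weeks before it is activated, and deactivating it again is a runtime toggle rather than a four-hour recovery.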
Melinda Fekete (23:47) Well, we've definitely tried to build similar capabilities into the platform as you have on a pull request in GitHub: a kind of diff and review of what's changing in this feature flag, and the same way to add any number of required approvals. We really try to lock down our production environment so that at least two developers need to approve a change to a feature flag in production, and I think that helps. But another thing that's helped us at Unleash is that we love boring technology. We love to try new things, but when it comes to the products we're building for our customers, we really try to prioritize the tried-and-tested stuff. In feature flags, latency is a big thing, right? When you toggle a flag, you want it to take effect as quickly as possible. So architecturally, you can decide: do you want to do streaming or polling between your SDKs and the feature flag server? Streaming is very sexy, with instant updates, but we can see that when there's an outage, things take longer to propagate. So we take an approach where, for example, we do polling in our SDKs; you configure the polling interval, and it's very reliable. If you really care about instant latency, we do offer streaming, but it always falls back to polling.

Warren Parad (25:10) Yeah, I can definitely see that there's a spectrum here, right, for companies in why they're introducing the flags. I think early on, they're maybe not really specific about what their goal is, and then throw the same approach at every single instance. Whereas, like anything, even if you have a provider backing you, you need to pay attention to what the right way to approach it is. Because on one side, I would say latency is good, actually, here: you prefer to be slow to get the reliability.
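A rough sketch of the polling model described above, with a configurable interval and graceful degradation when the flag server is unreachable. `fetch_flags_from_server` is a placeholder for what would be an HTTP call in a real SDK, and the interval default and flag name are illustrative assumptions.

```python
import time


def fetch_flags_from_server() -> dict:
    """Placeholder for an HTTP GET against the flag server's client API;
    a real SDK would handle auth, ETags, and retry backoff."""
    return {"new-checkout": True}


class PollingFlagClient:
    """Minimal sketch of a polling flag SDK: refresh a local cache on a
    fixed interval so flag evaluation is always an in-memory lookup, and
    keep serving the last known state if the server is down."""

    def __init__(self, poll_interval_seconds: float = 15.0) -> None:
        self.poll_interval = poll_interval_seconds
        self._cache: dict = {}

    def refresh_once(self) -> None:
        try:
            self._cache = fetch_flags_from_server()
        except Exception:
            pass  # degrade gracefully: last known flag state stays in effect

    def run(self, iterations: int) -> None:
        # In a real SDK this loop runs forever on a background thread.
        for _ in range(iterations):
            self.refresh_once()
            time.sleep(self.poll_interval)

    def is_enabled(self, name: str, default: bool = False) -> bool:
        return self._cache.get(name, default)
```

The design trade-off is exactly the one discussed: a toggle takes up to one polling interval to propagate, but evaluation never blocks on the network, and an outage of the flag server freezes flag state rather than breaking the application.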
But having real-time switches, I know, is what the marketing department wants. Anyone who's done anything with high-reliability systems knows that the trade-off is cache invalidation or extra network requests, right? Very fast polling, or streaming, where you have extra connections open. So you're paying the cost somewhere else, which isn't necessarily a good thing. In one of the previous episodes, we were discussing with a guest how great it was that they could avoid needing anything faster than, say, one second or one minute of it being wrong. It's okay to wait 60 seconds before a change gets rolled out; it's not that critical that everyone who comes to the website sees the updated version two seconds from now. You don't need that level of precision when the trade-off is a high risk to your product, or, you know, to your production environment. And I think risk is a huge aspect here, because the research from DORA, which we actually talked a lot about in the 2025 report episode, says that more untested code is actually getting into production because of AI.

Melinda Fekete (26:50) More code; velocity's up, but stability is down.

Warren Parad (26:54) Yeah, quality is down. The interesting thing is that people feel like they're being more productive, quote unquote, whatever that means. But the actual quality metrics show us that solutions are getting worse for end users, for customers, for clients. And so there's this question: if you're using feature flags, there can be a tendency to throw the work over the wall and rely on just enabling the flag. How do you get teams to be disciplined about making sure stuff is tested before actually deploying it to production, even though it's behind a flag?

Melinda Fekete (27:31) What we actually do at Unleash is breakathons with every new feature.
So we enable the feature in production only for ourselves, then get together on a Google Meet call, all of us, and try to break it. We spend an hour together trying to find all the things that are wrong with it. And it's very fun, actually; maybe not so much for the person who built the thing, but for everyone else it's very fun. Once you're happy with it internally, you start rolling it out, say to five or 10%. You can put automation in place that says: if the error rates stay below this threshold for, let's say, 12 hours, then we can progress to the next stage, which may be only 10%, or it could be 50%, or maybe it's a segment of your customers that you think, or know, are more experimental and ready to try new things, rather than the ones where you really need that stability. Then you progress through these stages, and you can go away from your laptop, you can go to sleep or whatever, knowing that if those metrics spike, your rollout can be paused automatically, or go back to the previous stage, or whatever you define.

Warren Parad (28:37) So I think we're probably at a good point to move over to picks for the episode. Nice. So I'll ask you, Melinda: what did you bring for the audience today?

Melinda Fekete (28:46) So I brought a game. It's called Wavelength. Do you know it? No? It's, how do I describe it, a collaborative communication game, and it's something that I've played with five-year-olds and my friends and family, but also my co-workers. We actually love to play this game; we're a remote team and we have a team hour every Friday, and we always pick a game, and this is something we played

Warren Parad (28:52) I don't.

Melinda Fekete (29:14) recently and had tons of fun with. It's a board game, but there's a digital version, an app version, that is quite good.
You get a spectrum with two extremes at the ends, and it changes every round. Let's say in one round you get a scale that goes from good pizza topping to bad pizza topping. You get a random point on the scale, and only you see that point. So if it was off-center towards bad pizza topping, you have to come up with a clue for your team to help them identify where that point on the scale is. So I would say maybe, like, pineapple, and the team have to...

Warren Parad (29:47) Going straight for controversy right there. She decided before the episode: you know what, it's going to be pineapple on pizza, that's going to be my example.

Melinda Fekete (29:54) Just trying to help you with the YouTube comments, you know? And so the team debate and discuss what pineapple must mean and try to identify that point on the scale, and then you score points based on that. It's so much fun, so I really recommend it if you're looking for something to play with your team or at...

Warren Parad (30:15) Yeah, really embodying that Italian philosophy there. So my pick is actually a television show this time. It's called Bosch; it's an LA detective procedural. Have you heard of it? I started watching in December and I've binged all 10-plus seasons of it, because it's just so great.

Melinda Fekete (30:18) I've seen it, yeah. I love it as well. I love detective shows and hospital dramas and stuff.

Warren Parad (30:41) I don't do the hospital dramas, but I think the main actor, Titus Welliver, is just absolutely fantastic. It reminded me a lot of Law and Order, which I watched a lot when I was younger, and it's so much better. Honestly, this may be one of the best procedurals I've ever seen. And it does these nice skips during the show to get rid of downtime that you would otherwise have to deal with. You never know what's going on.
You get dropped into the middle of a situation, and I'm still trying to figure out: is there going to be a crime, or what's going on in these people's lives at this moment? It's like always getting into a new show, which I find they really captured well. It doesn't feel like every season is just a continuation of the one before it; it does feel new every time you watch it.

Melinda Fekete (31:32) Yeah, plus one for that. Go watch it.

Warren Parad (31:34) Well, thank you so much, Melinda, for coming on today's episode. It's been absolutely fantastic: feature flags and how to use them correctly, and most importantly, what's next in documentation. And thanks to all the listeners and viewers for joining today's episode; I hope we'll see everyone back next week.

Melinda Fekete (31:45) Thank you for having me.