1
00:00:07,854 --> 00:00:09,654
Welcome back to Adventures in DevOps.

2
00:00:09,654 --> 00:00:13,596
Every episode is a deep dive into a specific topic with an expert guest.

3
00:00:13,596 --> 00:00:17,137
Normally this isn't a show where we ask for feedback about using specific tools.

4
00:00:17,137 --> 00:00:26,059
However, I think having at least a few different in-depth perspectives are critical to
understanding if and how the software industry is changing.

5
00:00:26,159 --> 00:00:36,460
And from our episode where we tore the DORA 2025 report apart, we know how
poorly using LLMs can go.

6
00:00:36,460 --> 00:00:43,000
So for the expert this week, we brought in long time UX director, principal designer, and
now UX consultant, Matt Edmonds.

7
00:00:43,000 --> 00:00:44,142
Welcome to the show.

8
00:00:44,142 --> 00:00:44,642
Thanks.

9
00:00:44,642 --> 00:00:46,269
Thanks, it's nice to be here.

10
00:00:46,269 --> 00:00:47,276
Thanks for having me.

11
00:00:47,276 --> 00:00:52,566
Yeah, it was really hard to narrow down. Guest titles can be quite challenging sometimes.

12
00:00:52,566 --> 00:00:55,291
I am a very varied person.

13
00:00:55,291 --> 00:00:58,376
So I've had a lot of different titles over my career, to be fair.

14
00:00:58,376 --> 00:01:00,498
It's not easy.

15
00:01:00,840 --> 00:01:02,462
It's not an easy challenge.

16
00:01:02,784 --> 00:01:10,101
That's a new title and role I haven't heard before. Uh, LinkedIn profile, right at
the top there, your headline: varied person.

17
00:01:10,862 --> 00:01:12,983
Actually, maybe I'll change it.

18
00:01:13,144 --> 00:01:23,042
I think that there's a lot of people that, you know, um, what is it, they puff their
chests out, so to speak, and I've never been that kind of person anyway. Um, so I think, I think

19
00:01:23,042 --> 00:01:27,639
that goes back to my hiring, when I was managing a bunch of teams back in the day.
You know, it's always about diversity.

20
00:01:27,639 --> 00:01:35,923
It's always about, you know, how do we get the most varied group of people that I can?
Because I think that's where you build the best teams. And it wasn't about, like, I want to

21
00:01:35,923 --> 00:01:38,305
have, you know, 12 experts.

22
00:01:38,305 --> 00:01:39,470
No, I want to have people that

23
00:01:39,470 --> 00:01:41,770
that have different interests, that have different experiences.

24
00:01:41,770 --> 00:01:49,750
So, to be fair, a varied person to me is way more
interesting than a non-varied person.

25
00:01:50,130 --> 00:01:59,890
You know, in the changing landscape that is, you know, how we treat people and handle
people, I think it's challenging, because

26
00:01:59,890 --> 00:02:04,030
it's easy from a hiring profile to typecast somebody because then you can put them in a
bucket.

27
00:02:04,030 --> 00:02:07,030
And the reality is people aren't bucketable in that sense, right?

28
00:02:07,274 --> 00:02:10,276
We try because we're like, how do I do this?

29
00:02:10,276 --> 00:02:10,997
How do I do that?

30
00:02:10,997 --> 00:02:12,286
And I think it's a real challenge.

31
00:02:12,286 --> 00:02:21,514
I think, honestly, hiring is one of the biggest challenges that businesses have:
hiring well, you know, having the rubric.

32
00:02:21,514 --> 00:02:30,621
And I think that's where, you know, from an AI perspective, the first space where I
started seeing this, even three or four years ago, was, like, how do I parse someone's

33
00:02:30,621 --> 00:02:31,001
resume?

34
00:02:31,001 --> 00:02:35,170
Because rather than having a human do it, because humans are just, you know, fallible, and...

35
00:02:35,170 --> 00:02:38,633
You know, I can't trust a human to value someone's performance.

36
00:02:38,633 --> 00:02:39,493
Exactly.

37
00:02:39,493 --> 00:02:44,597
Like I might as well just give it to a machine because the machine obviously is just going
to do it perfectly.

38
00:02:44,597 --> 00:02:48,790
And you're like, well, I don't, I don't know if that's true.

39
00:02:48,790 --> 00:02:57,690
And I think in a lot of cases, it also allows the human in that process to not
clearly articulate what it is they're looking for.

40
00:02:57,690 --> 00:02:59,789
Cause they're like, Oh, I'll, I'll find it when I see it.

41
00:02:59,789 --> 00:03:00,129
Right.

42
00:03:00,129 --> 00:03:01,940
You know, the amount of times that I did that...

43
00:03:01,966 --> 00:03:04,626
When I was working with other groups trying to hire people, right?

44
00:03:04,626 --> 00:03:06,906
And you're trying to figure out like, okay, how do you build out a team?

45
00:03:06,906 --> 00:03:13,146
How does a startup go, okay, what's the first UX person, or what's the first kind of
design thinking person you have?

46
00:03:13,146 --> 00:03:14,646
Is it a CPO?

47
00:03:14,826 --> 00:03:18,746
Is it just kind of a, you know, a UX person that's moonlighting?

48
00:03:18,746 --> 00:03:20,606
Is it, you know... what are you doing?

49
00:03:21,526 --> 00:03:24,626
And in reality, most businesses don't necessarily know what they need yet.

50
00:03:24,626 --> 00:03:27,606
And it's this kind of like, oh, well, I'll know when I see it.

51
00:03:27,606 --> 00:03:28,726
That doesn't work with AI.

52
00:03:28,726 --> 00:03:31,746
That doesn't work with something because the AI has now got

53
00:03:31,746 --> 00:03:39,900
this varied set of directives, and all the people that are probably the diamonds in the
rough, that are the varied person, that are the person you probably actually want in that

54
00:03:39,900 --> 00:03:51,698
situation are going to get culled out, because they don't have that, like,
Einstein-da Vinci perfect 10 out of 10, you know, and they cost no money.

55
00:03:51,698 --> 00:03:56,131
And you're like, okay, that's like, there's always going to be some sort of trade off.

56
00:03:56,131 --> 00:03:58,382
The AI is always trying to find a perfect solution.

57
00:03:58,382 --> 00:03:59,340
No, that's a good point.

58
00:03:59,340 --> 00:04:02,099
And what you end up with is just something that's just... not.

59
00:04:03,030 --> 00:04:12,082
Yeah, so obviously it came into play with the applicant tracking systems, the ATS, where
you didn't want to look at the paper resumes, or whatever was uploaded as a

60
00:04:12,082 --> 00:04:13,933
PDF, but rather extract that data.

61
00:04:13,933 --> 00:04:17,614
And then that very quickly got into not only can we automatically extract it.

62
00:04:17,614 --> 00:04:23,776
And of course the OCR process, or whatever you're using to pull out the labels, was
always atrocious to begin with.

63
00:04:23,796 --> 00:04:27,247
But, you know, you bring up a really good point about the people that you want to hire.

64
00:04:27,247 --> 00:04:32,334
It's almost like, if you use a system that's computationally bound,

65
00:04:32,334 --> 00:04:42,678
uh, where you're very deterministically figuring out which candidates make sense based
off of just what you already believe, you're only ever going to get people that exactly match

66
00:04:42,678 --> 00:04:54,293
your template, which means you're losing out on the unknown quality in these systems,
because that's exactly the area where, uh, LLMs just completely fail in a lot

67
00:04:54,293 --> 00:04:54,573
of ways.

68
00:04:54,573 --> 00:04:56,064
There's no creativity there.

69
00:04:56,064 --> 00:05:01,774
uh And I think the hiring question is really interesting, and maybe we'll get back to this
later in the episode.

70
00:05:01,774 --> 00:05:09,765
ah But one of the reasons I really wanted to get you in here is because you don't have
a lot of hands-on software engineering background.

71
00:05:09,765 --> 00:05:13,770
If I understand correctly, like, maybe one internship 20 years ago or something.

72
00:05:14,346 --> 00:05:24,074
I did. Back in the jQuery days, um, I did a little bit of JavaScript and had some jQuery
experience, back when jQuery was, like, the cool hotness, right?

73
00:05:24,074 --> 00:05:34,402
um And then did some other kind of template-based language stuff, building out some things
from a SaaS provider perspective, basically just to make changes to the

74
00:05:34,402 --> 00:05:37,165
configuration and visual style of some things we were driving on.

75
00:05:37,165 --> 00:05:39,767
But that's really my development background.

76
00:05:39,767 --> 00:05:43,990
Beyond that, it was CSS and building websites in the 90s and things like that, when you're,

77
00:05:44,186 --> 00:05:45,807
You know, you're moving up in the world.

78
00:05:45,807 --> 00:05:55,992
oh But as far as like actual programming languages beyond playing with some open source
projects like Drupal, which I did for a long time and kind of learning some PHP from that.

79
00:05:55,992 --> 00:06:04,937
um It's mostly been hobbyist and just generally technically aware, which has allowed me to
work with development teams because I know kind of what's going on.

80
00:06:04,937 --> 00:06:10,133
But I've never... if you gave me a blank piece of paper and said, hey, go do this, or go
do this code challenge,

81
00:06:10,133 --> 00:06:10,951
I'd be like, yeah, good luck.

82
00:06:10,951 --> 00:06:13,222
You know... yeah, that's not, you know.

83
00:06:13,622 --> 00:06:14,593
It's not my strong suit.

84
00:06:14,593 --> 00:06:16,326
That's not the game I'm trying to play.

85
00:06:17,189 --> 00:06:19,792
Very clearly, like that's not where I want to be.

86
00:06:20,494 --> 00:06:32,934
So one of the things that really interested me in getting you on for this episode is you
had spent a recent period of time actually invested in, I'll call it vibe coding.

87
00:06:34,254 --> 00:06:39,774
And you said, mm-hmm, like that's exactly how you would describe your activity there.

88
00:06:39,934 --> 00:06:44,846
And one of the things that keeps coming to my mind is, who is best at

89
00:06:44,846 --> 00:06:49,006
who is best at utilizing LLMs to generate code?

90
00:06:49,006 --> 00:06:59,066
Because so far, what I hear, especially from senior, staff-plus, and principal engineers,
is that it's not software developers that are getting the most value out of this.

91
00:06:59,066 --> 00:07:02,866
And there's a question of like, okay, if it's not them, like why is that?

92
00:07:02,866 --> 00:07:04,926
And who would benefit the most?

93
00:07:04,926 --> 00:07:06,706
Who does get the most value out of it?

94
00:07:06,706 --> 00:07:10,688
And one of the things that comes to my mind is, is it someone in more of the product
space?

95
00:07:10,688 --> 00:07:13,034
Or in the UX space, that would be...

96
00:07:13,034 --> 00:07:19,658
And so I really would like to get your perspective of like first thoughts about it or what
you've been doing and how that's turned out so far.

97
00:07:19,714 --> 00:07:20,945
Yeah, I think so.

98
00:07:20,945 --> 00:07:28,136
So I started doing this, um, granted, to be fair, I was playing around with generative AI
stuff and some LLM stuff several years ago, right?

99
00:07:28,136 --> 00:07:30,641
You know, OpenAI and that kind of thing.

100
00:07:30,641 --> 00:07:36,124
And then some, like, LM Studio, like some local stuff, you know, back when it was like,
you know, three tokens a day.

101
00:07:36,124 --> 00:07:41,398
And you're like, okay, this is not moving at the speed at which I have any interest
in trying to cultivate.

102
00:07:41,398 --> 00:07:45,070
And it was generally speaking, playing around with different, um

103
00:07:45,070 --> 00:07:49,410
different stuff from a generative AI perspective, just to kind of see, like, okay, what
other things can I create?

104
00:07:49,410 --> 00:07:54,550
I'm an artist anyway, but I've always taken an approach that like these things are tools
to me and how do I ideate?

105
00:07:54,550 --> 00:07:56,450
How do I come up with something that's interesting to me?

106
00:07:56,450 --> 00:08:00,890
It's not gonna take away the joy I get from making my own art, right?

107
00:08:00,890 --> 00:08:02,310
You know, that's never how I've seen it.

108
00:08:02,310 --> 00:08:05,550
I've never seen it as, like, a stealing of somebody else's work, you know.

109
00:08:05,550 --> 00:08:07,550
I'm doing these things for my own purposes.

110
00:08:07,550 --> 00:08:10,850
If somebody else wants to get joy out of doing it some other way, I'm not gonna take that
from them.

111
00:08:10,850 --> 00:08:11,830
Like that's fine.

112
00:08:11,830 --> 00:08:14,574
And in a lot of ways, I think from an AI,

113
00:08:14,574 --> 00:08:22,874
kind of LLM vibe coding perspective, the new kind of world in the last 18 months has been
this idea that, like, you've got the Spider-Man meme, right?

114
00:08:22,874 --> 00:08:26,234
Everybody's pointing at everybody else like, it does their job better.

115
00:08:26,254 --> 00:08:34,894
And I think most of the time when people come to saying, hey, there's other people that
are getting value out of this other than software engineers is because software engineers

116
00:08:34,894 --> 00:08:40,974
can look at the code that's being generated, and it doesn't rise to their level of
standards, right?

117
00:08:41,014 --> 00:08:43,425
But at the same time, a software engineer

118
00:08:43,425 --> 00:08:46,306
can use it to generate a marketing website, right?

119
00:08:46,306 --> 00:08:47,786
And it's good enough for them.

120
00:08:47,786 --> 00:08:53,918
But the reality is, for a marketing person generating a marketing website, it's not going
to be good enough for them, because their standards are different, right?

121
00:08:53,918 --> 00:08:58,070
So you end up with this world where like everybody's like, yeah, I don't like it to do
this.

122
00:08:58,070 --> 00:08:59,290
I don't like it to do that.

123
00:08:59,290 --> 00:09:08,343
A good example to me is, I want to say a couple of months ago, um, I'm trying to remember
Matt's full name, but the guy that runs The Oatmeal, it's a web comic.

124
00:09:08,343 --> 00:09:09,463
He's done a bunch of different stuff.

125
00:09:09,463 --> 00:09:11,854
And he came up with a big post that

126
00:09:11,854 --> 00:09:19,994
you know, frankly, scrolls forever, which, to be fair, has really solid points in it, but
is basically saying, like, I don't ever want to use AI for anything I do.

127
00:09:19,994 --> 00:09:21,234
I don't like it for art.

128
00:09:21,234 --> 00:09:22,594
I don't like it for all these other things.

129
00:09:22,594 --> 00:09:26,494
And then towards the end of the post, he kind of says, yeah, but I use AI every day to do
X, Y, and Z.

130
00:09:26,494 --> 00:09:33,274
And I'm like, that feels hypocritical to me because you're saying it's not good enough for
your role, which is fine.

131
00:09:33,274 --> 00:09:36,394
But you know more about your role and what you're trying to accomplish.

132
00:09:36,514 --> 00:09:39,134
But if you're saying it's not good enough for somebody else's role.

133
00:09:39,170 --> 00:09:39,910
You're not in that role.

134
00:09:39,910 --> 00:09:40,900
You don't know that person.

135
00:09:40,900 --> 00:09:42,281
You don't know what that person's doing.

136
00:09:42,281 --> 00:09:43,771
You don't know what those roles are.

137
00:09:43,771 --> 00:09:49,493
So to me, from a vibe coding perspective, I started doing this before even knowing
what the term vibe coding was.

138
00:09:49,493 --> 00:09:56,785
And then I kind of fell into that last summer, like, wait, this is what people are
talking about when they do vibe coding, which is basically just trying to one-shot from a single

139
00:09:56,785 --> 00:09:57,155
prompt.

140
00:09:57,155 --> 00:10:03,777
That's literally a paragraph long and think they're going to get the perfect desired
result.

141
00:10:04,177 --> 00:10:06,058
It the reality is doesn't work like that.

142
00:10:06,058 --> 00:10:07,662
But I think that.

143
00:10:07,662 --> 00:10:16,822
What I've learned is been fascinating because I've gone through this process of like
trying to figure out like, okay, which of the big foundational models work in certain

144
00:10:16,822 --> 00:10:17,622
ways, right?

145
00:10:17,622 --> 00:10:27,722
Like, how does Codex work differently from a coding perspective, or OpenAI, you know,
from a coding perspective, than ChatGPT 5.1, 5.2, 5.3 now? It keeps going.

146
00:10:28,162 --> 00:10:30,282
Versus what is Anthropic doing, right?

147
00:10:30,282 --> 00:10:31,882
Versus what's Gemini doing?

148
00:10:32,262 --> 00:10:37,190
How the different context models work, you know, how the different reasoning models work,
and what you can do differently with each.

149
00:10:37,822 --> 00:10:47,445
Um, and what I've landed on is, I like the way that Anthropic does some things, as far as
how the agentic model reasons, because I'm a plan person, right?

150
00:10:47,445 --> 00:10:51,646
And I think that comes back to like how I've always built software with teams.

151
00:10:51,686 --> 00:10:59,928
I'm never just, like, go do a thing, you know, unless I'm trying to explore, in which case
I think that's where all the AI models actually, frankly, do a really nice job.

152
00:10:59,928 --> 00:11:02,388
It's like, I have a general idea.

153
00:11:02,569 --> 00:11:03,889
What can I do with it?

154
00:11:03,889 --> 00:11:04,259
Right.

155
00:11:04,259 --> 00:11:06,610
And, and, and is this even accomplishable?

156
00:11:06,610 --> 00:11:07,754
Is this even doable?

157
00:11:07,754 --> 00:11:10,776
And I think it opens up knowledge to people.

158
00:11:10,776 --> 00:11:13,258
And I think that's the thing that is most interesting to me.

159
00:11:13,519 --> 00:11:23,667
A week ago, I had an issue with my mic monitoring situation, you know, and was talking to
Claude, and was like, hey, can we build a low-level virtual audio driver?

160
00:11:23,667 --> 00:11:26,049
Because I can't get one for my Apple silicon Mac.

161
00:11:26,049 --> 00:11:27,710
And I want to be able to do this one problem.

162
00:11:27,710 --> 00:11:33,195
And I don't want to spend a hundred bucks on some software that does way more than I need
it to do. In 30 minutes,

163
00:11:33,195 --> 00:11:37,558
I had a backend daemon running that uses five megs of RAM

164
00:11:37,592 --> 00:11:40,767
that is an on-demand virtual driver that does exactly what I need it to do.

165
00:11:40,767 --> 00:11:43,771
I never would have written a hardware driver in my life.

166
00:11:43,772 --> 00:11:44,353
Right?

167
00:11:44,353 --> 00:11:47,307
I still frankly didn't, but I solved my problem.

168
00:11:47,307 --> 00:11:48,078
Right?

169
00:11:48,302 --> 00:11:50,253
How did you... let's, let's walk through that.

170
00:11:50,253 --> 00:11:53,137
I think that would be interesting, like, ah, what model were you using?

171
00:11:53,137 --> 00:11:54,138
How did you prompt it?

172
00:11:54,138 --> 00:11:56,820
How were you actually testing and validating it?

173
00:11:57,154 --> 00:11:57,394
Yeah.

174
00:11:57,394 --> 00:12:07,240
So, so the problem that I had, and the problem I have right now, is back during COVID,
I decided to buy a nice mic, and decided to go with, you know, an XLR situation.

175
00:12:07,240 --> 00:12:15,345
And back in the day, when you're buying stuff off of the, you know, COVID fire sale, you
get what you can get.

176
00:12:15,345 --> 00:12:20,629
And at the time I bought a Focusrite Scarlett, which is kind of like a regular mic
interface with an XLR input.

177
00:12:20,629 --> 00:12:23,550
But the one that I bought had two inputs, right?

178
00:12:23,566 --> 00:12:30,986
And it just so happens that on the Mac, I think also on the PC, but the way that this one
handles it, the first input goes to left input.

179
00:12:30,986 --> 00:12:32,906
The second input goes to the right input.

180
00:12:33,606 --> 00:12:41,706
Most zoom, you know, chats, Google meets or whatever will, will duplicate that input and
know that, okay, this left input really should go to both.

181
00:12:41,706 --> 00:12:45,506
So you hear the audio on both sides of, you know, whatever you're
recording.

182
00:12:45,506 --> 00:12:50,726
If you're doing local recording, now the local recording just says, oh, all right, I've
only got audio coming out of the left input.

183
00:12:50,726 --> 00:12:52,326
I'm just going to record the left input.

184
00:12:52,472 --> 00:12:58,907
I'm like, but it's a mic... but this audio interface doesn't have the
ability to just set this to mono, to duplicate that channel.

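[Editor's note: conceptually, the fix Matt goes on to describe is tiny. Here is a minimal sketch in Python, our own illustration rather than his actual C driver: the interface delivers stereo frames with the mic signal only on the left channel, and each frame's left sample is copied over the right one.]

```python
def duplicate_left_channel(frames):
    """frames: list of (left, right) sample pairs from the audio interface.
    The mic is wired to input 1, so only the left sample carries signal;
    copy it over the right so both channels are heard in a local recording."""
    return [(left, left) for left, _right in frames]

# Mic signal on the left channel, silence on the right:
stereo = [(0.5, 0.0), (-0.25, 0.0), (0.125, 0.0)]
print(duplicate_left_channel(stereo))  # [(0.5, 0.5), (-0.25, -0.25), (0.125, 0.125)]
```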
185
00:12:58,907 --> 00:12:59,257
Right.

186
00:12:59,257 --> 00:13:01,499
And you can get some virtual software.

187
00:13:01,499 --> 00:13:02,380
There's a long story.

188
00:13:02,380 --> 00:13:07,404
Like, you know, there's some other stuff that goes back to Intel-based Macs, that existed
from an open source perspective.

189
00:13:07,404 --> 00:13:08,485
That just doesn't exist anymore.

190
00:13:08,485 --> 00:13:13,449
Somebody decided, hey, I'm not going to put the effort into recompiling this for Apple
silicon.

191
00:13:13,449 --> 00:13:14,900
I don't really want to deal with it.

192
00:13:14,900 --> 00:13:16,742
And there's a couple other software solutions that are out there.

193
00:13:16,742 --> 00:13:22,030
um And there's a couple other kind of like higher level systems that run in the
background.

194
00:13:22,030 --> 00:13:28,030
Um, but I was like, why am I going to run off and figure out how to script stuff out, or
spend a hundred bucks on this,

195
00:13:28,030 --> 00:13:35,250
if all I need to do is this one thing, so I can get local recording when I'm working
with clients? For example, when I'm recording something, I don't want to have to go into

196
00:13:35,250 --> 00:13:39,790
Adobe Premiere, or whatever my editing software is, and, you know, duplicate the audio.

197
00:13:39,790 --> 00:13:46,630
Um, I don't want to have to run OBS or some other streaming software, um, to just do this
thing.

198
00:13:46,630 --> 00:13:49,050
I just want to be able to just natively record it.

199
00:13:49,050 --> 00:13:50,496
One-shot the recording,

200
00:13:50,496 --> 00:13:53,128
re-record it if it screws up, or just kind of keep going, right?

201
00:13:53,128 --> 00:13:55,909
You know, flow of thought, you know, kind of flow state stuff.

202
00:13:55,990 --> 00:14:02,254
And I was like, wait a second, why don't I just ask Claude if I can do this?

203
00:14:02,254 --> 00:14:11,160
And here's, here's my pitch. The prompt was basically, and what I always start most of my
prompts with, with any sort of AI, is I take a question model, right?

204
00:14:11,160 --> 00:14:13,642
I don't tell it exactly what I want to do.

205
00:14:13,642 --> 00:14:19,756
I kind of ask it what it thinks is possible and don't necessarily show my hand.

206
00:14:19,756 --> 00:14:26,208
Because I don't want to influence the model into agreeing with me, because they're all
overconfident.

207
00:14:26,208 --> 00:14:31,267
If you ask it, if it can create anything, it'll be like, yeah, totally.

208
00:14:31,267 --> 00:14:32,259
I'm like, how long will it take?

209
00:14:32,259 --> 00:14:39,231
Like, well, it'll take 752 days. Also, most of the models' concept of time is
ridiculous to me.

210
00:14:39,231 --> 00:14:40,192
It's hilarious.

211
00:14:40,192 --> 00:14:40,612
Right.

212
00:14:40,612 --> 00:14:43,783
This one phase will take four days and you say, okay, let's go do that.

213
00:14:43,783 --> 00:14:49,734
And then it takes 25 minutes, and it's done those four days of work, because it has no
idea how to handle that.

214
00:14:49,830 --> 00:14:56,516
But it comes back with a driver that I can run, and it looks good.

215
00:14:56,516 --> 00:14:58,988
My biggest issue was, like, all right, is there a memory leak, right?

216
00:14:58,988 --> 00:15:00,219
How long can I run this thing?

217
00:15:00,219 --> 00:15:09,416
Is it, you know, going to continue to stay at this five megs? Or all of a sudden, in a
day, if it's just sitting here, is it, like, you know, 42 gigs? And I'm like, wait a second.

218
00:15:09,416 --> 00:15:10,947
This is a problem, right?

219
00:15:11,508 --> 00:15:15,971
It's written in C, which I can read a little bit, but not well.

220
00:15:15,971 --> 00:15:19,382
Um, and to be fair, it works.

221
00:15:19,382 --> 00:15:21,193
And so I was like, that's enough for me.

222
00:15:21,193 --> 00:15:21,934
It's running locally.

223
00:15:21,934 --> 00:15:24,025
I'm not worried about security from that perspective.

224
00:15:24,025 --> 00:15:32,630
Most of the things that I've been vibe coding, um, have been things where I can at least
check the result and read it.

225
00:15:32,630 --> 00:15:35,982
So most of it's been in web based languages.

226
00:15:35,982 --> 00:15:43,396
I've been doing a lot of things with, like, um, Electron or Tauri wrappers, basically
taking, like, web-based code and then running it locally.

227
00:15:43,396 --> 00:15:46,914
Because one of the things I don't want to deal with is the security side of stuff.

228
00:15:46,914 --> 00:15:54,618
You know, I don't want to be responsible for someone else's auth or handling their PII or
any of that kind of stuff or what other personal data they have.

229
00:15:54,618 --> 00:16:02,562
And a lot of the little things I've been working on have mostly been just solving my own
kind of problems, along with trying to build some things that I think are interesting,

230
00:16:02,562 --> 00:16:09,165
solving some of my problems, but also that other people might have some interest in. But
it's mostly been local models or builds.

231
00:16:09,964 --> 00:16:17,587
No, if it's, like, local stuff... I think really the turnaround here is that this goes
back to the expert perspective.

232
00:16:17,587 --> 00:16:21,653
It's that if it's not your job and you don't care about...

233
00:16:21,654 --> 00:16:25,416
If it's not your job, then you for sure don't care about the level of quality in a way.

234
00:16:25,416 --> 00:16:27,186
You have a very specific problem that needs to be solved.

235
00:16:27,186 --> 00:16:37,040
And I think this is where the fallacy comes in where, yeah, I as an expert in area A think
it can be used to solve area B because you don't understand the critical nature of what

236
00:16:37,040 --> 00:16:40,022
those other roles in say your company or another company are doing.

237
00:16:40,022 --> 00:16:44,503
But when it comes to personal software, you know, just solve whatever problem you have.

238
00:16:44,503 --> 00:16:51,278
But it sounds like, in this scenario, did you manage to get something that basically
one-shot it, out of...

239
00:16:51,278 --> 00:16:52,532
Um, Claude?

240
00:16:52,532 --> 00:17:01,002
Yeah, so what I will say is, this was Opus 4.6. This is the one that was released about a
week and a half ago, or maybe two weeks ago. I don't know when this is airing,

241
00:17:01,002 --> 00:17:05,619
a couple weeks ago. Um, and that model is very agentic.

242
00:17:05,619 --> 00:17:08,901
It's just trying to, like, load a bunch of different things. It's a little out there.

243
00:17:08,901 --> 00:17:17,035
I love and hate it at the same time. Um, I think Opus 4.5 was the sweet spot for me. It
was smart enough to kind of handle some things, but also kind of check back with you more

244
00:17:17,035 --> 00:17:18,776
And at least give you some more updates.

245
00:17:18,776 --> 00:17:21,018
But yeah, this was one thing where

246
00:17:21,018 --> 00:17:27,523
I found that if I give it a small enough structured problem and then allow it to ask me
questions, this is what I've always done.

247
00:17:27,523 --> 00:17:34,527
And I started doing that with ChatGPT, um, about two years ago. Somebody said, hey, you
know what you should really be doing? Rather than telling it to do something,

248
00:17:34,527 --> 00:17:40,011
You should give it some context and then you should basically say, Hey, ask me whatever
questions you have to gain more context.

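[Editor's note: the "ask me questions first" pattern described here can be sketched as a reusable prompt template. This is our own minimal illustration; the function name and wording are invented, not an actual prompt from the episode.]

```python
def question_first_prompt(context: str) -> str:
    """Give the model some context, then ask it to interrogate the user
    before it proposes a solution, instead of ordering it to build one."""
    return (
        "Here is my situation:\n"
        f"{context}\n\n"
        "Before proposing a solution or writing any code, ask me whatever "
        "questions you need to fully understand the problem."
    )

print(question_first_prompt(
    "I need a mono-duplicating virtual audio driver on an Apple silicon Mac."))
```

The point of the pattern, as Matt explains next, is that the model's questions reveal how well it actually understands the problem before any code is generated.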
249
00:17:40,011 --> 00:17:48,717
And, and what I've always found is that's been a really successful way for me to work
with any of these models, because it allows me to gauge what their knowledge is and what

250
00:17:48,717 --> 00:17:50,572
their understanding of the problem is.

251
00:17:50,572 --> 00:17:53,684
And I can correct it and say, Hey, no, this is not at all what I'm talking about.

252
00:17:53,684 --> 00:17:57,667
I want to talk about this. Or, wow, it's kind of getting it.

253
00:17:57,667 --> 00:18:01,500
You know, I know this is, you know, not actual reasoning.

254
00:18:01,500 --> 00:18:07,234
It doesn't actually get it, but it's putting the pieces together from pattern perspective
to understand what it should be outputting.

255
00:18:07,234 --> 00:18:13,298
And to me, the way that I've approached this, be it vibe coding, which I look at as, like,
never looking at the code, right,

256
00:18:13,298 --> 00:18:19,182
or never understanding what it is you're kind of, like, trying to actually do, or what the
technical reasons for why you're doing something are,

257
00:18:19,182 --> 00:18:22,722
I've always tried to take a different approach, saying, hey, I want to be able to
look at the code.

258
00:18:22,722 --> 00:18:23,982
I want to be able to understand it.

259
00:18:23,982 --> 00:18:31,802
I've learned more about how to do certain things and how to code certain things as a way
to kind of start the process rather than following a tutorial or like following somebody

260
00:18:31,802 --> 00:18:32,842
else's video.

261
00:18:32,842 --> 00:18:34,102
I'm an experiential learner that way.

262
00:18:34,102 --> 00:18:41,462
And I've always kind of done that before, but I've learned that I'm picking up way more by
saying, Hey, show me what this looks like.

263
00:18:41,462 --> 00:18:42,782
And it comes up with the result.

264
00:18:42,782 --> 00:18:45,322
Now it might not be perfect to be fair, right?

265
00:18:45,322 --> 00:18:45,706
But

266
00:18:45,706 --> 00:18:47,297
it's getting the right output.

267
00:18:47,297 --> 00:18:50,259
It's getting the thing that I want or the outcome that I want, right?

268
00:18:50,259 --> 00:18:52,411
Which is, does this driver work?

269
00:18:52,411 --> 00:18:53,972
Does it duplicate the audio?

270
00:18:53,972 --> 00:18:58,155
Does it do it in a lean and mean way that I can have this thing run and not worry about
it, right?

271
00:18:58,155 --> 00:19:02,599
um Or is it taking up a billion resources and not quite working, right?

272
00:19:02,599 --> 00:19:13,226
So I think that, to me, is where, if you give it a really scoped problem, and then you
have it ask you questions, and you correct it as you go, I think you can get

273
00:19:13,248 --> 00:19:14,479
a pretty good result.

274
00:19:14,479 --> 00:19:18,662
And I think that the challenge is getting it the rest of the way, right?

275
00:19:18,662 --> 00:19:21,554
Like if I were to put this out there, this would be like free for somebody else.

276
00:19:21,554 --> 00:19:22,791
Hey, you've got a Focusrite Scarlett.

277
00:19:22,791 --> 00:19:23,462
Do you want to use this?

278
00:19:23,462 --> 00:19:26,788
Do you want to do, you know, do you want to run something some other way?

279
00:19:26,828 --> 00:19:37,736
Go, you know, do this, and here you go. Versus, you know, people that are, I think, trying
to frankly sell a lot of these little things that I don't think are necessarily either worth

280
00:19:37,736 --> 00:19:40,430
what somebody thinks they are or

281
00:19:40,430 --> 00:19:42,730
or somebody else's perspective on it.

282
00:19:42,730 --> 00:19:49,530
Because a lot of the things that are out there, from an AI-slop app perspective, that are
taking over in a bunch of places are solving individual problems.

283
00:19:49,530 --> 00:19:57,390
The users, the developers, the pseudo developers, let's call them, that have kind of vibe
coded an idea are fixing a specific problem.

284
00:19:57,390 --> 00:19:59,850
Like I'm fixing a specific issue.

285
00:19:59,850 --> 00:20:05,210
I'm not looking to have a complex interface that does all these other things.

286
00:20:05,210 --> 00:20:07,790
No, find this specific audio interface.

287
00:20:07,790 --> 00:20:09,950
If it's specific to this model,

288
00:20:10,114 --> 00:20:11,715
duplicate this channel.

289
00:20:11,715 --> 00:20:14,046
It's not even handling the right channel right now.

290
00:20:14,046 --> 00:20:17,639
If you plug something into the second channel, it will not work the way that it's supposed
to.

291
00:20:17,639 --> 00:20:18,329
And that's a gap.

292
00:20:18,329 --> 00:20:21,991
And I don't care because plugging it in here, it's solving my problem.

293
00:20:21,991 --> 00:20:23,192
It doesn't matter.

294
00:20:23,192 --> 00:20:30,826
So what I'm hearing actually, and one of the things that I feel like keeps coming up, is
that business SaaS is dead.

295
00:20:30,826 --> 00:20:37,129
And I feel like that's the wrong statement because there is a reliability necessary to run
the business there.

296
00:20:37,129 --> 00:20:38,574
Actually what I'm hearing.

297
00:20:38,574 --> 00:20:49,714
What I think I'm hearing actually is: small little apps made by someone that are open
sourced or even put online, or mobile apps that charge a few bucks in order to

298
00:20:49,714 --> 00:20:50,194
install.

299
00:20:50,194 --> 00:20:55,874
Those are dead because realistically when you're going at those, you're trying to solve a
very specific problem.

300
00:20:55,874 --> 00:20:59,594
And now you never need them. Just like, I think it was like last month or something.

301
00:20:59,594 --> 00:21:00,134
I don't know.

302
00:21:00,134 --> 00:21:00,654
It was last month.

303
00:21:00,654 --> 00:21:07,776
I don't know if it was on a yearly or monthly basis, but the number of hits Stack Overflow
was getting was down to like 3,800 or something.

304
00:21:07,776 --> 00:21:08,707
yeah, it's dropped.

305
00:21:08,707 --> 00:21:09,927
It's dropped significantly.

306
00:21:09,927 --> 00:21:11,508
I saw something on this.

307
00:21:11,508 --> 00:21:21,640
I want to say November, December, discussing it and the amount of drop-off that it's had
since creation. I think it's still getting hits from an LLM perspective, right?

308
00:21:21,640 --> 00:21:23,815
It's still being sourced for a lot of content.

309
00:21:23,815 --> 00:21:26,346
And to be fair, they're doing the same thing with Reddit.

310
00:21:26,346 --> 00:21:30,019
Reddit's selling effectively access to Reddit for LLMs.

311
00:21:30,019 --> 00:21:32,720
Like, that's one of the ways that Reddit's making money right now.

312
00:21:32,720 --> 00:21:36,312
oh Because they're like, hey, we've got a lot of information here.

313
00:21:36,312 --> 00:21:39,864
We've got a lot of content and this is useful for context.

314
00:21:39,864 --> 00:21:48,567
It may not be useful from a coding perspective to say, hey, this is exactly the right way
you solve this problem, but it at least describes what problems people might be having or how

315
00:21:48,567 --> 00:21:55,330
they may have tried to approach the problem and what success rate they might've had from
up votes or whatever to kind of gauge that interest.

316
00:21:55,330 --> 00:22:01,712
To me, the challenge I think you see is, like, there are dark patterns in any of this stuff,
right?

317
00:22:01,712 --> 00:22:05,708
And I think the challenge with AI is that there's a new dark pattern.

318
00:22:05,708 --> 00:22:12,764
which is I'm taking AI's word for something to be validated as me doing the research.

319
00:22:12,965 --> 00:22:19,471
And that's not like, it's the same as sitting in a room and saying, Hey, we've got a bunch
of assumptions.

320
00:22:19,478 --> 00:22:19,658
Okay.

321
00:22:19,658 --> 00:22:20,893
Well, what data do we have?

322
00:22:20,893 --> 00:22:22,995
Well, I've got nothing to date and I've got no data on that.

323
00:22:22,995 --> 00:22:23,175
Okay.

324
00:22:23,175 --> 00:22:23,897
Let's put in some metrics.

325
00:22:23,897 --> 00:22:25,518
Let's do some observability.

326
00:22:25,518 --> 00:22:30,292
Let's see what quantitative data we have that can back this up, to ask good qualitative
questions, right?

327
00:22:30,292 --> 00:22:39,077
replacing both of those things with, well, AI probably looked at Stack Overflow from 18
years ago and told me this is the right way to do it.

328
00:22:39,979 --> 00:22:43,858
That's not rising to the level, the right level, of what we need for something shippable.

329
00:22:43,858 --> 00:22:45,559
Yeah, totally agree.

330
00:22:45,559 --> 00:22:55,053
I think one of the problems here, though, is, if you put this in a philosophical, you know,
discussion: we've been trusting what's available in the pixels

331
00:22:55,053 --> 00:23:04,296
on our screen for a very long time now, and even before LLMs there was a trained behavior
where we would see something and we would just trust it. Now maybe you trusted it

332
00:23:04,296 --> 00:23:13,442
because it showed up on a quote-unquote, you know, reputable site, and for a while that was
said to be Wikipedia, because they got their sources from real places. But then

333
00:23:13,442 --> 00:23:22,629
media started producing content which referenced Wikipedia, or the references in Wikipedia,
as the source, which were other sources that had referenced Wikipedia before.

334
00:23:22,629 --> 00:23:25,891
And we lost that even before LLMs existed.

335
00:23:25,891 --> 00:23:28,512
And then there's this problem now with even Reddit.

336
00:23:29,073 --> 00:23:32,395
I feel like Stack Overflow is better because the content was being curated.

337
00:23:32,395 --> 00:23:34,437
And I think that was a critical component.

338
00:23:34,437 --> 00:23:39,901
Whereas with Reddit, the curation was not, is this accurate?

339
00:23:39,901 --> 00:23:42,092
But rather, is this good content?

340
00:23:42,092 --> 00:23:44,884
Yeah, there's a difference between moderation and curation.

341
00:23:45,605 --> 00:23:56,733
Like, you know, Reddit is moderating to make sure that nobody's being improperly unkind to
somebody, or, um, yelling at them or whatever.

342
00:23:56,733 --> 00:23:57,027
Right.

343
00:23:57,027 --> 00:23:57,544
You know,

344
00:23:57,544 --> 00:24:00,210
Be careful depending on which sub you're talking about.

345
00:24:00,210 --> 00:24:03,830
Well, all of Reddit, to be fair. I'll put that out there as a statement.

346
00:24:03,830 --> 00:24:09,250
I think all of Reddit and all of Twitter, as a general statement, are pretty divisive places.

347
00:24:10,070 --> 00:24:11,810
That's not to mean I don't use them, right?

348
00:24:11,810 --> 00:24:20,250
I, you know, to be fair, use Reddit more than anything, but, like, I think there's a lot of
interesting conversations happening and I can get a pulse on what people

349
00:24:20,250 --> 00:24:20,950
feel about stuff.

350
00:24:20,950 --> 00:24:26,850
So for example, like, you know, one of the iOS dev subreddits, r/iOSDev I think, is what I
was looking at recently.

351
00:24:27,394 --> 00:24:37,522
And there's a lot of developers in there that have been doing smaller apps who are, for one,
perceiving that the amount of time it takes for their app to be approved has gone up

352
00:24:37,522 --> 00:24:38,443
significantly.

353
00:24:38,443 --> 00:24:40,909
And a bunch of other people being like, it ebbs and flows.

354
00:24:40,909 --> 00:24:42,767
I don't think it's an AI slop thing.

355
00:24:42,767 --> 00:24:45,138
I think it just, sometimes it takes a week to get my app approved.

356
00:24:45,138 --> 00:24:48,401
Sometimes it takes a day, you know, it's just what people are doing.

357
00:24:48,401 --> 00:24:52,264
And then other people that are like, yeah, but this is changing X, Y, and Z.

358
00:24:52,264 --> 00:24:53,926
And I'm like, yeah, it's going to change that stuff.

359
00:24:53,926 --> 00:24:56,888
the people there were pointing out that

360
00:24:56,888 --> 00:25:07,484
there were slop apps in both app stores well before AI came along, because it was so cheap
to make an app from other geographies that you could spin up a billion people and just

361
00:25:07,484 --> 00:25:17,770
push out a bunch of crappy games and just load them with ads and just do some dark
patterns to get people that may or may not be children to click on stuff to then make a

362
00:25:17,770 --> 00:25:19,351
bunch of money and go away.

363
00:25:19,351 --> 00:25:19,991
Right.

364
00:25:19,991 --> 00:25:21,425
Someone's always going to try to find a shortcut.

365
00:25:21,425 --> 00:25:21,893
Right.

366
00:25:21,893 --> 00:25:24,794
And I think that's the challenge, is that

367
00:25:25,162 --> 00:25:28,005
AI is being seen as the ultimate shortcut right now.

368
00:25:28,005 --> 00:25:28,406
Right.

369
00:25:28,406 --> 00:25:29,847
I don't have to do the work.

370
00:25:29,847 --> 00:25:31,659
I can just have it figured out for me.

371
00:25:31,659 --> 00:25:33,078
It's not going to figure out everything.

372
00:25:33,078 --> 00:25:36,100
Yeah, no, I mean, for sure.

373
00:25:36,100 --> 00:25:45,905
And I think that goes back to sort of the question of, when you're engaging with the
models, my concern would be poisoning the context.

374
00:25:45,925 --> 00:25:56,011
And I fear that with every single word I let a model generate, there's a risk for it to say
something that will immediately ruin the conversation, and that I then have to, you

375
00:25:56,011 --> 00:26:00,183
know, change its response so that it's not included in there because it's not accurate in
some way.

376
00:26:00,183 --> 00:26:01,754
And it's very difficult for.

377
00:26:02,095 --> 00:26:10,928
me to figure out how to get it out. Like, there's no way, I feel, to remove stuff from the
context once it's in there. So, yeah, there must be something that can be

378
00:26:10,928 --> 00:26:17,922
done in a way. And if you're going at the question-based approach, do you ever feel like
the model gets stuck?

379
00:26:17,922 --> 00:26:19,362
All the models get stuck.

380
00:26:19,863 --> 00:26:23,175
To be fair, they do, and you have to handle them differently.

381
00:26:23,175 --> 00:26:24,005
Right.

382
00:26:24,005 --> 00:26:34,441
So one of the things that I learned early on when I was playing around with this stuff is,
and this is Gemini to be fair, this is Gemini 2.5 pro, not three.

383
00:26:34,441 --> 00:26:36,632
I haven't played with three as much as I've played with the others.

384
00:26:36,632 --> 00:26:46,697
Um, but 2.5 Pro, for example, had a really big context window comparatively, you know,
before Anthropic had pushed out a 1 million token context window.

385
00:26:46,697 --> 00:26:47,820
Um,

386
00:26:47,820 --> 00:26:50,472
you know, Codex was still, I think, around a hundred thousand.

387
00:26:50,472 --> 00:26:58,459
Similarly, I think Anthropic was like, uh, you know, 250,000; most of their base models are
250,000-token context windows.

388
00:26:58,459 --> 00:27:06,345
And what that allows for is like, you can have a decent conversation, but the deeper you
go into like changing something or adjusting something, the deeper it's going to be like,

389
00:27:06,345 --> 00:27:08,207
okay, wait, I can't figure out what's going on anymore.

390
00:27:08,207 --> 00:27:10,390
I don't know where we started this conversation.

391
00:27:10,390 --> 00:27:11,219
Am I a balloon?

392
00:27:11,219 --> 00:27:12,530
And you're like, Whoa, okay, hold on.

393
00:27:12,530 --> 00:27:17,594
um, but Gemini would get into situations where.

394
00:27:17,750 --> 00:27:19,451
It would just go off the rails.

395
00:27:19,451 --> 00:27:23,472
And I would start a new conversation with the exact same prompt, the exact same
information.

396
00:27:23,472 --> 00:27:25,813
And sometimes it would just hallucinate values.

397
00:27:25,813 --> 00:27:31,116
It would hallucinate things because it decided, and what I started calling it was
guess-driven development.

398
00:27:31,116 --> 00:27:34,637
It decided that it had a better name for something that it already named.

399
00:27:34,797 --> 00:27:40,279
So rather than go look that up, it's deciding that this is now going to be this function.

400
00:27:40,460 --> 00:27:42,920
And it's like, well, no, but that's not what the function's called.

401
00:27:42,920 --> 00:27:46,942
You already named the function something else earlier.
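The failure just described, where the model calls a function under a new name it never defined, can be caught with a crude static pass over the generated code before it burns more of the context window. This is a minimal illustrative sketch, not any real tool's API; the function name and sample code are made up.

```javascript
// Flag calls to functions that were never declared in the generated code.
// A sketch only: it matches `function foo(` declarations and `foo(` call
// sites with regexes, which is deliberately crude but enough to catch a
// model that silently "renames" its own function.
function findUndefinedCalls(code) {
  const defined = new Set(
    [...code.matchAll(/function\s+([A-Za-z_$][\w$]*)/g)].map(m => m[1])
  );
  const called = [...code.matchAll(/([A-Za-z_$][\w$]*)\s*\(/g)].map(m => m[1]);
  return [...new Set(called)].filter(name => !defined.has(name));
}

// Hypothetical model output reproducing the failure from the conversation:
const generated = `
function duplicateAudioChannel(buf) { return [buf, buf]; }
// the model later "renames" the function without looking it up:
const out = mirrorAudioChannel(input);
`;

console.log(findUndefinedCalls(generated)); // flags mirrorAudioChannel
```

A real linter or type checker does this properly, but even a check this small would stop the "six lines in, it changed the name again" loop early.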

402
00:27:47,272 --> 00:28:01,931
And I could tell, literally like six lines into it generating some code, um, because it
would continue to change a specific thing when it would screw up, and I'd just stop it.

403
00:28:01,931 --> 00:28:04,612
And in order to fix that, I would load a new context window.

404
00:28:04,612 --> 00:28:09,114
And early on, I want to say I had like a four out of 10 hit rate.

405
00:28:09,215 --> 00:28:12,797
So four out of every 10 windows would do it correctly.

406
00:28:12,797 --> 00:28:15,008
Six of them would just go off the rails.

407
00:28:15,182 --> 00:28:24,562
And when I'd get that, like one of the four, I would ride that horse as far as it would let
me ride it, because it was doing

408
00:28:24,562 --> 00:28:26,122
the right things.

409
00:28:26,162 --> 00:28:31,022
But because every single time it spins up, every single time it gets this, it's in a
different kind of mindset.

410
00:28:31,022 --> 00:28:38,542
So to speak. It was just, I was burning time, and I was getting so frustrated.

411
00:28:38,542 --> 00:28:43,224
And then it would just go off the rails, and be like, okay, now I've got to play the
roulette game again

412
00:28:43,224 --> 00:28:49,225
to see if I can get this model back on track to finishing this one feature that I need to
finish, because I want it to work a certain way.

413
00:28:49,658 --> 00:28:53,239
And yeah, I was just saying that I was actually comparing.

414
00:28:53,239 --> 00:29:00,281
Gemini 2.5 is interesting, because I think I was comparing, early on, the free version, but
for different accounts.

415
00:29:00,301 --> 00:29:07,813
And I noticed that for sure different accounts would get like different flavors of the
model.

416
00:29:07,813 --> 00:29:14,335
And you've got to imagine they are, for sure, training the model selection on how users are
engaging with it.

417
00:29:14,335 --> 00:29:18,231
Like, there is a lot of A/B testing that is happening there, and

418
00:29:18,231 --> 00:29:19,943
you have no idea what you're gonna get.

419
00:29:20,238 --> 00:29:22,158
The models shift and change.

420
00:29:22,198 --> 00:29:29,598
I think there's some, to be fair, there's some speculation on a lot of people's parts on
what exactly is happening because nobody actually knows.

421
00:29:29,998 --> 00:29:37,598
For a lot of people it's become a bit of a meme, even as those same people still believe
it, that, you know, these models get dumber, so to speak.

422
00:29:37,598 --> 00:29:39,058
And it's because they're A-B testing and things.

423
00:29:39,058 --> 00:29:40,238
It's because they're adjusting things.

424
00:29:40,238 --> 00:29:47,898
And people on different levels of different plans are, frankly, getting throttled in
different places to basically handle this stuff.

425
00:29:47,898 --> 00:29:49,486
Because if...

426
00:29:49,486 --> 00:29:53,306
If you're paying for a subscription plan versus an API plan, right?

427
00:29:53,746 --> 00:29:58,926
Most of these businesses are taking a huge loss because the amount of tokens you could
generate, right?

428
00:29:58,926 --> 00:30:00,406
It's the old shared hosting model.

429
00:30:00,406 --> 00:30:02,066
I used to describe it that way.

430
00:30:02,066 --> 00:30:11,746
You had a whole bunch of, you know, cheap budget shared-hosting providers that popped up
around the dot-com bubble before it burst, in a similar way.

431
00:30:11,746 --> 00:30:13,926
Um, that were like, Hey, wait a second.

432
00:30:13,926 --> 00:30:18,126
I can just put 3000 static websites on this one server.

433
00:30:18,126 --> 00:30:26,646
that somebody else was putting a hundred on before, because the reality is that 2,999 of
them aren't going to use it as much as this one other person is going to use it.

434
00:30:26,646 --> 00:30:27,966
And I'm going to make a bunch of money.

435
00:30:27,966 --> 00:30:31,886
And then that started falling over when more and more people were using it and you started
squeezing them more.

436
00:30:31,886 --> 00:30:34,526
like the equation just got off, right?

437
00:30:34,726 --> 00:30:36,786
They're doing the same thing with this.

438
00:30:36,786 --> 00:30:42,386
You know, you're allowing free models to exist as part of your cost to acquire a customer,
right?

439
00:30:42,386 --> 00:30:45,986
Um, because someone's gonna be like, oh, I had a really good experience with OpenAI doing
X, Y, and Z.

440
00:30:45,986 --> 00:30:46,966
Oh, I should try it.

441
00:30:46,966 --> 00:30:47,488
Right.

442
00:30:47,488 --> 00:30:49,299
Or, I want to try it. You get the word-of-mouth thing.

443
00:30:49,299 --> 00:30:57,416
But somebody paying 20 bucks a month on OpenAI is getting a very different experience, to
me, than someone paying, you know, 200, or using the API plan, in the same way.

444
00:30:57,416 --> 00:31:00,449
I think the same thing exists for Anthropic and also for Google now.

445
00:31:00,449 --> 00:31:04,653
Now, Google has been way more open about how much usage they allow.

446
00:31:04,653 --> 00:31:09,637
And they've slowly started to throttle it down now, in the last, I'd say, three months or
so.

447
00:31:09,637 --> 00:31:11,362
Um, I've had a.

448
00:31:11,362 --> 00:31:15,259
business plan with them for years and it includes some Gemini stuff.

449
00:31:15,259 --> 00:31:16,467
So that's honestly how we started playing with it.

450
00:31:16,467 --> 00:31:20,659
It was like, okay, well, if I'm getting this for free, so to speak, I might as well just
see what it's doing.

451
00:31:20,659 --> 00:31:22,502
um

452
00:31:22,830 --> 00:31:30,070
Sorry, I'm laughing because we have the same perspective too, but now Google keeps
increasing the price and now I feel like, no, it's not free.

453
00:31:30,070 --> 00:31:39,430
Actually, the whole thing you're paying to Google is basically for Gemini, so you need to
revisit how you consider the tools you're getting and what you're really paying for.

454
00:31:39,430 --> 00:31:48,330
Now your money is just going through a bunch of subsidiaries or shells before it gets to
the actual model provider and you're not really getting anything out of the box there.

455
00:31:48,330 --> 00:31:51,392
And if it's for a personal benefit, maybe consider...

456
00:31:51,392 --> 00:32:00,150
using a local model or something to develop and get the actual solution you're trying to
achieve in the

457
00:32:00,150 --> 00:32:09,577
I think, on what's most performant from a codebase perspective, speaking engineer to
engineer, it's: which models are you using, and

458
00:32:09,577 --> 00:32:11,078
how are you using them efficiently?

459
00:32:11,078 --> 00:32:12,659
Because, and then where are you getting the value?

460
00:32:12,659 --> 00:32:13,139
Right?

461
00:32:13,139 --> 00:32:20,364
Because, to me, it's that kind of testing, documentation, planning: build that context and
then do the code.

462
00:32:20,364 --> 00:32:22,165
All the models are capable of doing that.

463
00:32:22,165 --> 00:32:23,178
They're going to do it differently.

464
00:32:23,178 --> 00:32:26,488
They're going to have different, you know, training data that they're going to be working
off of.

465
00:32:26,488 --> 00:32:29,048
A lot of them are going to want to default to.

466
00:32:29,048 --> 00:32:30,850
things that they've seen before: React, right?

467
00:32:30,850 --> 00:32:35,786
Unless you tell it something different, you know, Next.js; unless you tell it something
different, they're going to want to use Tailwind, because Tailwind's everywhere.

468
00:32:35,786 --> 00:32:39,470
And that's a whole other conversation around, you know, Tailwind versus other stuff,
right?

469
00:32:39,470 --> 00:32:48,650
But that's really critical, and I think it's worth calling out: one of the experiments I
had done early on is, don't tell the model what programming language or what

470
00:32:48,650 --> 00:32:50,490
technology stack you want to use to solve the issue.

471
00:32:50,490 --> 00:33:00,690
And I think this is where vibe coders have an advantage, because they don't know enough
about picking the "right tool," quote unquote, to actually drive the

472
00:33:00,690 --> 00:33:01,610
conversation in that way.

473
00:33:01,610 --> 00:33:09,446
But the flip side is if you're not constraining the model in the context by what
technology to use, it's going to use the option that is

474
00:33:09,580 --> 00:33:12,412
let's say, most correct for it, or maybe that it understands the best.

475
00:33:12,412 --> 00:33:18,156
And so if you ask it a question and it automatically spits out Python, first of all, I'm
so sorry for you.

476
00:33:18,156 --> 00:33:29,144
uh, the second part is that it's the best solution in that language, rather than trying to
force it to switch to another language, where it will more likely hallucinate.

477
00:33:29,144 --> 00:33:33,917
uh The more you constrain a model, the more likely it's going to hallucinate in a way.

478
00:33:33,917 --> 00:33:36,729
There's no other option available there.

479
00:33:37,622 --> 00:33:41,346
Arguably everything it's doing is a hallucination in some regard.

480
00:33:41,346 --> 00:33:45,020
If you're doing something nuanced and for the first time, you want it to hallucinate.

481
00:33:45,020 --> 00:33:48,754
You don't want it to be like, you know what, I know what you asked for and I could give
you that.

482
00:33:48,754 --> 00:33:56,138
But instead I'm just gonna repeat this code that someone else had to solve a completely
different problem because that's what I have.

483
00:33:56,138 --> 00:33:59,320
And that's where I started calling it guess-driven development.

484
00:33:59,320 --> 00:34:03,523
Because guess-driven development, to me, is the early stages of LLM AI coding.

485
00:34:03,523 --> 00:34:07,446
When it doesn't have enough data. I think the other challenge...

486
00:34:07,446 --> 00:34:09,388
So you've got two problems, right?

487
00:34:09,388 --> 00:34:14,991
One is you have a blank GitHub repo, and you say, hey, I want to do something.

488
00:34:14,991 --> 00:34:18,193
And the LLM says, OK, there's nothing here.

489
00:34:18,494 --> 00:34:21,696
The user hasn't specified what they actually want to do this in.

490
00:34:21,696 --> 00:34:24,758
Let me go figure out the best way to do this.

491
00:34:24,811 --> 00:34:27,242
I've got a lot of examples that do it this way.

492
00:34:27,242 --> 00:34:31,382
So I'm going to do it that way because the user is telling me they don't particularly
care.

493
00:34:31,382 --> 00:34:35,706
They haven't specifically determined that, or at least I've determined they don't
specifically want to know.

494
00:34:35,706 --> 00:34:42,078
And you end up with something that is pulling in NPM packages that have CVEs on them
already.

495
00:34:42,078 --> 00:34:42,949
You know what I mean?

496
00:34:42,949 --> 00:34:47,700
Or something, or it, it pulls in a typoed version, which has happened a couple of times
now.

497
00:34:47,700 --> 00:34:48,891
Yeah.

498
00:34:48,891 --> 00:34:54,373
The flip side is you have a giant code base that already does something a certain way.

499
00:34:54,932 --> 00:35:06,770
And I think the AI does a nice job from that perspective of saying, hey, I've got a bunch
of examples that, if you can document and systematize, it can work off of. The problem

500
00:35:06,770 --> 00:35:11,172
there is the context is so big that it can't read whole files.

501
00:35:11,172 --> 00:35:14,524
It's grepping things and it's basically doing what I call persistence of vision, right?

502
00:35:14,524 --> 00:35:17,757
It's the same reason that you and I perceive motion.

503
00:35:17,757 --> 00:35:18,737
We don't perceive motion.

504
00:35:18,737 --> 00:35:20,538
We perceive stills.

505
00:35:20,654 --> 00:35:25,374
in a very specific way that your brain is processing a billion times a second.

506
00:35:25,374 --> 00:35:27,154
And that creates motion.

507
00:35:27,254 --> 00:35:30,634
But you're also, your visual cortex also isn't seeing everything at the same time.

508
00:35:30,634 --> 00:35:35,974
It's not taking in every single possible thing, even though it feels like you are, cause
you're looking around and you're like, I can see everything.

509
00:35:36,434 --> 00:35:39,454
It's replacing those things all the time.

510
00:35:39,454 --> 00:35:41,794
And in a lot of ways, an LLM is working in a similar way.

511
00:35:41,794 --> 00:35:45,474
I can't read this 10,000 line monolithic file.

512
00:35:45,708 --> 00:35:50,899
I'm just going to guess at what I think things are called, see if I can grep those, find
those little pieces.

513
00:35:50,939 --> 00:36:00,402
And then I'm going to grab some more stuff and I'm going to see if I can piece it together
specifically on a train of thought or a train of functions that are going to be what I

514
00:36:00,402 --> 00:36:01,942
think I need to work on.

515
00:36:02,162 --> 00:36:10,045
And the challenge with that is I think you can get to such big context windows that it
falls over unless it's documenting those things, unless it knows where to find something,

516
00:36:10,045 --> 00:36:12,745
unless you've been structurally able to document it.
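That grep-then-expand retrieval over a big file can be sketched in a few lines of JavaScript: guess an identifier, grep for it, and pull back only a small window of surrounding lines instead of reading the whole thing. The function name, shape, and sample file here are made up for illustration, not any real agent's internals.

```javascript
// Return small windows of context around each line matching an identifier,
// instead of the entire (potentially 10,000-line) file.
function grepWithContext(source, identifier, contextLines = 2) {
  const lines = source.split("\n");
  const snippets = [];
  lines.forEach((line, i) => {
    if (line.includes(identifier)) {
      const start = Math.max(0, i - contextLines);
      const end = Math.min(lines.length, i + contextLines + 1);
      snippets.push({ line: i + 1, text: lines.slice(start, end).join("\n") });
    }
  });
  return snippets;
}

// A stand-in for a large source file.
const bigFile = [
  "function duplicateChannel(input) {",
  "  return [input, input];",
  "}",
  "",
  "function findAudioInterface(name) {",
  "  return devices.find(d => d.name === name);",
  "}",
].join("\n");

// Only the few lines around the match come back, not the whole file.
console.log(grepWithContext(bigFile, "duplicateChannel"));
```

This is why naming and documentation help so much: if the identifier the model guesses doesn't exist, the grep returns nothing and it starts guessing again.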

517
00:36:12,745 --> 00:36:15,726
And that's one of the things I found early on when I was doing it.

518
00:36:15,726 --> 00:36:18,646
I didn't give it a specific framework to follow.

519
00:36:18,686 --> 00:36:21,486
I literally started out with saying, Hey, I have an idea for something.

520
00:36:21,486 --> 00:36:22,306
Let's do this.

521
00:36:22,306 --> 00:36:23,166
And it came back with something.

522
00:36:23,166 --> 00:36:27,026
I was like, this is actually better than I thought I was going to get.

523
00:36:27,026 --> 00:36:27,406
Okay.

524
00:36:27,406 --> 00:36:28,646
What if we do this?

525
00:36:28,646 --> 00:36:30,386
And it's like, Oh yeah, let me, let me set that up.

526
00:36:30,386 --> 00:36:30,786
Let me do this.

527
00:36:30,786 --> 00:36:31,186
Let me do that.

528
00:36:31,186 --> 00:36:32,086
Let's do this.

529
00:36:32,105 --> 00:36:35,006
And I was like, this is actually, this is not bad.

530
00:36:35,126 --> 00:36:37,846
And at one point it was just vanilla JS, right?

531
00:36:37,846 --> 00:36:39,666
And I didn't have a linter.

532
00:36:39,666 --> 00:36:41,226
I didn't have,

533
00:36:41,324 --> 00:36:43,405
you know, any sort of CSS preprocessing.

534
00:36:43,405 --> 00:36:44,985
I didn't have... I wasn't running, like, Headless UI.

535
00:36:44,985 --> 00:36:45,805
I wasn't running Radix.

536
00:36:45,805 --> 00:36:47,036
Wasn't running any other kind of things.

537
00:36:47,036 --> 00:36:49,176
Wasn't running, like, React or anything like that.

538
00:36:49,176 --> 00:36:52,997
And I was kind of doing that on purpose, because I was like, let me just see what this is
capable of doing.

539
00:36:52,997 --> 00:36:54,458
And I was having a lot of fun with that.

540
00:36:54,458 --> 00:36:59,639
And then it got to a point where I'm like, is this really...? This has gotten so big.

541
00:36:59,639 --> 00:37:00,929
I've added so many things to it.

542
00:37:00,929 --> 00:37:04,520
And the idea has frankly expanded, because the capabilities have expanded.

543
00:37:04,741 --> 00:37:06,271
And now I'm like, okay, wait a second.

544
00:37:06,271 --> 00:37:08,102
I do need to have X, Y, and Z.

545
00:37:08,102 --> 00:37:09,986
I do need to run a different, um,

546
00:37:09,986 --> 00:37:10,796
you know, build package.

547
00:37:10,796 --> 00:37:12,537
I want to go to Vite because of these other reasons.

548
00:37:12,537 --> 00:37:13,038
I want to do this.

549
00:37:13,038 --> 00:37:14,148
I want to do that.

550
00:37:14,148 --> 00:37:25,865
And now I have a build process, and I've got it hooked up with, you know, GitHub Actions,
and GitHub Actions is doing all the stuff and signing the code, and I'm hooked up

551
00:37:25,865 --> 00:37:27,125
to Azure to do all that stuff.

552
00:37:27,125 --> 00:37:33,399
It's like, I've figured all these other pieces out that I never would have ever, ever
experienced or had questions about.

553
00:37:33,399 --> 00:37:36,561
And it's helped me kind of move me along in that direction to do that stuff.

554
00:37:36,561 --> 00:37:39,310
Then I'm like, okay, well now let me figure out.

555
00:37:39,310 --> 00:37:46,470
the new world of CSS. Because when I started doing this, and I'm going to date myself, 20
years ago, I was just writing CSS, right?

556
00:37:46,470 --> 00:37:52,270
And then, you know, you had like the SCSSes of the world and the Sasses of the world and
all these other kinds of preprocessors.

557
00:37:52,270 --> 00:37:58,590
But when I started doing more consultancy stuff about three years ago, I stopped doing as
much front end stuff.

558
00:37:58,590 --> 00:38:02,070
Even frankly, when I was building teams, I wasn't doing as much actual UI design.

559
00:38:02,070 --> 00:38:05,350
I wasn't, you know, I wasn't in CSS as much in the last 15 years of my life.

560
00:38:05,350 --> 00:38:07,970
So I started like digging in like, okay, well, wait a second.

561
00:38:07,970 --> 00:38:09,422
What are all the pieces now?

562
00:38:09,422 --> 00:38:09,822
Right.

563
00:38:09,822 --> 00:38:14,422
I mean, I need to educate myself so that I can kind of figure out where I want to take
some of the things I'm working on.

564
00:38:14,422 --> 00:38:21,582
And that's been really fun because it's allowed me to figure out like, okay, one, I
started from a perspective that this is the way I want to go for those exact reasons.

565
00:38:21,582 --> 00:38:29,582
Two, maybe it does make sense for me to run like, you know, TanStack Table, for example,
to run my tables versus me having my own table set up.

566
00:38:29,582 --> 00:38:32,246
And what was cool is we swapped it out.

567
00:38:32,246 --> 00:38:38,100
and all the different hooks and all the different things that I was previously using and
the UI that I was using on the renderer side is all exactly the same.

568
00:38:38,100 --> 00:38:46,456
All my Playwright tests that I had the AI write for me to do actual in-browser testing on
every single possible thing, of which I have like a thousand, and like a whole bunch

569
00:38:46,456 --> 00:38:53,031
of unit tests on every, this is the most well-tested piece of code I've ever done in my
life, much less me doing code.

570
00:38:53,031 --> 00:39:00,782
Like, it's running, it's running, um, 10 concurrent runners.

571
00:39:00,782 --> 00:39:02,882
Because that's just what I've decided that I'm going to do.

572
00:39:02,882 --> 00:39:09,142
And it still takes 20 minutes to run through every single possible permutation of every
single possible thing you can think of.
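
(The "10 concurrent runners" mentioned here map to Playwright's `workers` setting. A minimal sketch of such a config, assuming the standard Playwright test runner; the test directory and reporter choice are illustrative.)

```typescript
// playwright.config.ts -- hypothetical sketch of the setup described.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './tests', // illustrative location of the in-browser tests
  workers: 10,        // the "10 concurrent runners"
  retries: 0,         // fail loudly so regressions surface immediately
  reporter: 'list',
});
```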

573
00:39:09,142 --> 00:39:12,802
So I can be like, okay, this is working or wait, we've got a couple of new failures.

574
00:39:12,802 --> 00:39:15,342
We've got a regression, go fix it, go figure it out.

575
00:39:15,342 --> 00:39:20,682
Because I wouldn't be able to do that, I wouldn't be able to move at the speed I'm trying
to move to get some of these things done.

576
00:39:20,682 --> 00:39:23,462
If I had to go look at every single possible thing.

577
00:39:23,462 --> 00:39:25,062
And I think some of that's a trade off.

578
00:39:25,062 --> 00:39:28,842
So in some respects, I'm vibe coding more now than I was.

579
00:39:28,980 --> 00:39:32,160
six months ago because some of these projects were way bigger.

580
00:39:32,160 --> 00:39:37,018
Are you using Codex or Claude Code or Antigravity?

581
00:39:37,646 --> 00:39:40,806
I'm using, I'm using Claude Code for, for most of this.

582
00:39:40,806 --> 00:39:49,966
Um, and how I started doing it is interesting. I was originally going to try to run
Claude Code directly, um, on my computer, but I couldn't get it to install properly.

583
00:39:49,966 --> 00:39:52,066
I was doing, you know, I've done Homebrew stuff.

584
00:39:52,066 --> 00:39:56,206
I've done all the other, I just, and it was like, I was looking at the docs and just
nothing was working.

585
00:39:56,206 --> 00:40:02,226
And a couple of people had said, I think on Reddit in a couple of places, like, well,
there's some other weird edge cases that based on this, this, and this you might run into.

586
00:40:02,526 --> 00:40:05,910
So for a while I was basically running it via the web.

587
00:40:05,910 --> 00:40:10,952
And I was basically just copying and pasting code over and doing my own kind of copy and
replace.

588
00:40:10,952 --> 00:40:17,735
And so while it was slow, I ended up knowing a lot of the code because I was like, okay,
this is what I want.

589
00:40:17,735 --> 00:40:21,417
wait a second, what I'm pasting in here doesn't look right.

590
00:40:21,417 --> 00:40:24,718
And I'd go back and be like, Hey, we missed something or what else is going on?

591
00:40:24,718 --> 00:40:30,521
And I want to say September, October timeframe, they finally fixed some of the install
stuff from a Claude Code perspective.

592
00:40:30,521 --> 00:40:32,910
And I ran it in the terminal and have run it in there ever since.

593
00:40:32,910 --> 00:40:42,650
And at that point it was, okay, I need to make sure that I have a couple of proper things
because I don't want to have a situation where it's just assuming stuff and it goes and

594
00:40:42,650 --> 00:40:43,610
overwrites something.

595
00:40:43,610 --> 00:40:47,830
And I was, to be fair, I was running in, you know, I was running in git repos for a while
anyway.

596
00:40:47,830 --> 00:40:53,090
So, but I had a couple of incidents where it had decided to make a change and I said,
wait, why did we make that change?

597
00:40:53,090 --> 00:40:58,630
And it goes and tries to revert a commit that doesn't exist and basically wipes out all
the changes that it just made.

598
00:40:58,630 --> 00:41:00,030
Cause it was like, oops, my bad.

599
00:41:00,030 --> 00:41:01,170
It didn't use the right stash command.

600
00:41:01,170 --> 00:41:02,574
And I'm like, cool.

601
00:41:02,574 --> 00:41:03,074
That's great.

602
00:41:03,074 --> 00:41:07,694
Cause now we just lost like six hours of things we were working on because I didn't make a
commit either to be fair.

603
00:41:07,694 --> 00:41:14,194
It was, you know, on me, but at the same time I was kind of like, I've been very
purposely making commits on things that I think are good checkpoints.

604
00:41:14,194 --> 00:41:20,734
So I know what's changed in the code and I know that the code's at least stable and at
least like as error free as I can make it.
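
(The checkpoint-commit habit described here can be sketched in a few commands; everything below is illustrative, not from the episode. The point is that a botched agent revert or stash can only ever lose the work since the last known-good commit.)

```shell
# Illustrative checkpoint workflow in a throwaway repo.
set -e
git init -q checkpoint-demo
cd checkpoint-demo
echo "stable feature" > app.txt
git add app.txt
# Commit at every known-good, as-error-free-as-possible state.
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "checkpoint: stable, tests passing"
echo "agent work in progress" >> app.txt
# Even if an agent mishandles a stash, the tracked state stays recoverable.
git stash -q
git stash pop -q
git log --oneline
```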

605
00:41:20,734 --> 00:41:27,534
So that I'm not just introducing more random crap and forgetting about it as I go,
because I am a one-man kind of operation as I go through this.

606
00:41:27,534 --> 00:41:30,414
And a lot of it was coming up with, how do I do a test harness?

607
00:41:30,414 --> 00:41:31,726
How do I, how do I do.

608
00:41:31,726 --> 00:41:33,806
How do we make sure that the backend is working?

609
00:41:33,806 --> 00:41:36,426
How do I make sure the database is functioning correctly based on what we built?

610
00:41:36,426 --> 00:41:46,266
How do I make sure that the UI, the renderer side of this app is working and clicking in
correctly with all the other stuff?

611
00:41:46,266 --> 00:41:54,886
And I realized pretty quickly, like for me to manually smoke test this for all the things
we're adding, I don't want to sit here for 12 hours clicking on buttons.

612
00:41:54,886 --> 00:41:55,766
Like, no.

613
00:41:55,766 --> 00:41:58,914
And I'm like, can we do this a better way?

614
00:41:58,914 --> 00:42:00,405
And I started researching it myself.

615
00:42:00,405 --> 00:42:03,376
I'm like, okay, what test harnesses are out there that I could do this with?

616
00:42:03,376 --> 00:42:13,102
And like, what things do I feel reasonably familiar enough with that I could ask to
understand which ones the AI feels the most competent in, and then test a couple of them.

617
00:42:13,102 --> 00:42:19,245
And I ended up with Playwright, even though Playwright has some more experimental kind of
Electron, you know, connections.

618
00:42:19,245 --> 00:42:23,418
It handles a Chrome browser, which is basically what Electron is, just fine.

619
00:42:23,418 --> 00:42:27,010
And it has a good way to focus on some of that stuff.

620
00:42:27,010 --> 00:42:28,044
And so.

621
00:42:28,044 --> 00:42:36,078
What I've also learned is that the more good examples you have for any of these models to
attach onto, the better you're going to be.

622
00:42:36,078 --> 00:42:47,363
The more mistakes you have in the context windows of the project, you know, I did it this
way, this way, or I was thinking about this way before, the more it gets itself tripped up.

623
00:42:47,363 --> 00:42:48,664
So I started documenting more.

624
00:42:48,664 --> 00:42:54,226
I started making, having documentation in the repo of

625
00:42:54,232 --> 00:43:02,458
how we're doing things, not just what we're planning on doing, but like how things are
being constructed, how to build out other functions, how to do the component stuff that I

626
00:43:02,458 --> 00:43:06,431
was doing, how to do this kind of different leveled structure that I wanted to make.

627
00:43:06,952 --> 00:43:15,858
And I kind of had my own ability to say, hey, there's more good documentation, there's
more good examples in this repo than there are bad examples.

628
00:43:16,158 --> 00:43:21,420
And then it started to hallucinate less on some of that stuff because

629
00:43:21,420 --> 00:43:24,414
the good information outweighed the bad information, if that makes sense.

630
00:43:24,414 --> 00:43:28,928
And a lot of that was because I had set up some test harnesses for the database.

631
00:43:29,274 --> 00:43:32,866
What does that mean in practice here?

632
00:43:32,866 --> 00:43:42,182
So I get the whole, uh, adding documentation that can be thrown into every single context
whenever you're generating new code, that it somehow has to reference and pulls that in to

633
00:43:42,182 --> 00:43:51,307
make sure that the code style makes sense or whatever else, or hooks or something that
execute automatically after every single prompt to maybe make a commit or whatnot.

634
00:43:51,307 --> 00:43:56,622
I think what you sort of alluded to here is that there is a risk, especially with

635
00:43:56,622 --> 00:44:05,468
uh, removing the loop of you copying and pasting the code from the GUI somewhere, of
knowing what the code is doing, but somehow you're testing it, and like, how are you building up those

636
00:44:05,468 --> 00:44:05,748
tests?

637
00:44:05,748 --> 00:44:08,650
Like, what is the process to make sure that those are included?

638
00:44:08,686 --> 00:44:13,286
So one of the issues I had early on was Claude.

639
00:44:13,286 --> 00:44:16,266
Well, OpenAI, I tried throwing OpenAI at my database.

640
00:44:16,266 --> 00:44:23,986
And OpenAI decided that every single equal sign that was completely legitimate SQLite
syntax was a code error.

641
00:44:23,986 --> 00:44:26,626
So it decided there was like 60,000 code errors or something like that.

642
00:44:26,626 --> 00:44:28,906
I was like, OK, this isn't going to work.

643
00:44:28,906 --> 00:44:37,624
for some reason, like, ChatGPT just wasn't playing nice with some of the SQLite syntax

644
00:44:37,624 --> 00:44:39,776
that we were using.

645
00:44:39,776 --> 00:44:41,987
It was happy to make a Postgres database.

646
00:44:41,987 --> 00:44:43,766
It did not really want to make a SQLite database.

647
00:44:43,766 --> 00:44:44,740
It was fascinating.

648
00:44:44,740 --> 00:44:52,045
Anyway, um, what I was realizing was Claude Code, for example, kept hallucinating field
names.

649
00:44:52,045 --> 00:44:56,288
So it kept saying like, I think it's called inventory blah, blah, blah, blah.

650
00:44:56,288 --> 00:44:57,499
And it's like, no, it's not.

651
00:44:57,499 --> 00:44:58,270
It's not what it is.

652
00:44:58,270 --> 00:45:04,294
And what I did is I basically had like, OK, if I provide documentation of the schema
definition in another file,

653
00:45:04,332 --> 00:45:08,144
And then I provide a test file that's specific to the schema documentation.

654
00:45:08,145 --> 00:45:15,970
It started referencing the test file as the source of truth over what the database file
actually had for the schema.

655
00:45:15,970 --> 00:45:19,813
So it would be like, wait, the schema test actually has this.

656
00:45:19,813 --> 00:45:22,415
This is, I need to do, I am wrong.

657
00:45:22,415 --> 00:45:23,435
I need to go change this.

658
00:45:23,435 --> 00:45:31,321
So by having it document a couple of different places, all of a sudden I had three places
where the right thing was.

659
00:45:31,321 --> 00:45:33,326
So if it did one thing wrong.

660
00:45:33,326 --> 00:45:38,566
It could look at the other two places that it saw as a source of truth and say, wait a
second, I'm wrong and correct itself.

661
00:45:38,606 --> 00:45:45,986
Whereas by only having one place where the actual information existed, it would sometimes
not believe that that information was accurate.

662
00:45:46,326 --> 00:45:47,386
It wouldn't believe itself.

663
00:45:47,386 --> 00:45:50,466
So to speak, even though like that's, that's the production code.

664
00:45:50,466 --> 00:45:51,346
What are you doing?

665
00:45:51,346 --> 00:45:54,206
It's like, oh yeah, I guess it is the production code.

666
00:45:54,206 --> 00:45:56,086
Whereas you put it in a test file.

667
00:45:56,086 --> 00:45:59,246
It's like, oh, well this, this is obviously what we're testing against.

668
00:45:59,246 --> 00:46:00,946
This has to be correct.

669
00:46:01,324 --> 00:46:08,162
So the test was proven to be more correct to it than the actual code that it was testing
against.

670
00:46:08,162 --> 00:46:09,156
It's not even TDD.

671
00:46:09,156 --> 00:46:10,742
It's like test driven development.

672
00:46:10,742 --> 00:46:14,008
It's just using it as context that it somehow treats as.

673
00:46:14,008 --> 00:46:22,982
So I started doing test-driven development in that sense, but once I had it working, it
started treating that as not just test-driven development, but to your point, as the

674
00:46:22,982 --> 00:46:23,802
context.

675
00:46:23,802 --> 00:46:29,885
It was like this code that is the test is the way it's supposed to work and is accurate.

676
00:46:29,885 --> 00:46:31,806
It's not assuming that the test is wrong.

677
00:46:31,806 --> 00:46:33,186
It's assuming the test is correct.

678
00:46:33,186 --> 00:46:38,499
And so I extended that and I have a bunch of integration tests, a bunch of database tests.

679
00:46:38,499 --> 00:46:39,769
I do a bunch of CRUD operations.

680
00:46:39,769 --> 00:46:41,868
I do a bunch of different things behind the scenes.

681
00:46:41,868 --> 00:46:49,309
some important export stuff to make sure other things are working, for the pieces where I
have to make sure that the calculations are running, all these other things I'm doing

682
00:46:49,309 --> 00:46:56,038
before I ever get into the renderer, you know, the Electron kind of Chrome side, React
side, whatever you want to use for your rendering side.

683
00:46:56,622 --> 00:47:08,117
The thing I keep coming back to, and this is like a whole, there's a whole movement right
now all about spec-driven development, ah, where, and I'm just, I don't

684
00:47:08,117 --> 00:47:18,192
think I'll ever be on that train, but it actually does sound like we're very close to just
saying Agile does not work for LLM vibe-coded development.

685
00:47:18,192 --> 00:47:23,456
We have to prefer documentation and tests over...

686
00:47:23,456 --> 00:47:28,706
working software because the only way we're gonna end up with working software is to have
the documentation and the tests.

687
00:47:28,706 --> 00:47:29,446
Yeah.

688
00:47:29,586 --> 00:47:33,760
And that's what I've kind of learned, and that's why I keep joking about guess-driven
development.

689
00:47:33,760 --> 00:47:36,966
Like stage one to me was like, it's guessing through all these problems.

690
00:47:36,966 --> 00:47:37,112
Yeah.

691
00:47:37,112 --> 00:47:40,414
And then it gets to a point where it's like, okay, I'm hallucinating too much.

692
00:47:40,414 --> 00:47:44,437
I'm coming up with different variable names because I think these are the names that it
should be.

693
00:47:44,457 --> 00:47:46,569
Even though I've picked other names in the past, right?

694
00:47:46,569 --> 00:47:54,024
Even though we've just chosen other things or I've chosen other things for a specific
reason, it's deciding in its little head, this is a better name for this thing.

695
00:47:54,024 --> 00:47:56,846
And it's like, well, but that's not what we named it, you know?

696
00:47:56,846 --> 00:48:02,806
So then you get into this context window conversation, like, okay, how do you start to
allow it to make better fundamental decisions?

697
00:48:02,806 --> 00:48:08,666
And if you have it document itself and document in a way that it trusts the information,
it will follow that.

698
00:48:08,666 --> 00:48:12,486
And then it won't, it basically stopped hallucinating.

699
00:48:12,486 --> 00:48:16,366
Like it will, it will not do certain functions a hundred percent of the way.

700
00:48:16,366 --> 00:48:20,606
I have to kind of poke it like, okay, we need to still do this or this doesn't quite work
or this is.

701
00:48:20,606 --> 00:48:26,954
But one of the things I've noticed with the later Opus 4.5, Opus 4.6, especially 4.6,
is.

702
00:48:26,954 --> 00:48:28,665
It was like, yeah, let me go.

703
00:48:28,665 --> 00:48:33,727
I mentioned that this phase of this project wasn't fully completed because I knew there
were a couple of bugs in it.

704
00:48:33,727 --> 00:48:35,438
I hadn't traced them down yet.

705
00:48:35,438 --> 00:48:41,621
I couldn't really fully describe what was happening, but I was watching these test files
run and I was watching some of these go and I'm like, this isn't working the way it should be

706
00:48:41,621 --> 00:48:44,662
given our test data and given what should be showing up there.

707
00:48:44,782 --> 00:48:45,903
Something's not quite right.

708
00:48:45,903 --> 00:48:51,325
But again, these tests are flying by my face and I'm like, I don't know what's exactly
happening.

709
00:48:51,325 --> 00:48:53,304
And I couldn't describe it to

710
00:48:53,304 --> 00:48:54,674
Claude, and I couldn't figure it out myself.

711
00:48:54,674 --> 00:48:57,415
So that was like a known issue that I had for like, you know, six weeks.

712
00:48:57,415 --> 00:49:02,264
And I happened to mention it to Claude when it was like, we want to move these plan files
to completed.

713
00:49:02,264 --> 00:49:03,277
I'm like, well, they're not completed yet.

714
00:49:03,277 --> 00:49:04,877
There's still some issues with them.

715
00:49:04,957 --> 00:49:11,079
And it's like, okay, I'm going to take that as a directive to look through these different
pieces.

716
00:49:11,079 --> 00:49:13,020
I didn't tell it what files they were in.

717
00:49:13,040 --> 00:49:16,471
It went and figured out what files that feature was about.

718
00:49:16,471 --> 00:49:20,716
And it went and figured out like the six different things that were wrong.

719
00:49:20,716 --> 00:49:24,368
And that was not something that 4.1 or 4.5 were capable of doing.

720
00:49:24,368 --> 00:49:28,611
4.6 was like, let me just run some agents at it and poke at it a bunch.

721
00:49:28,611 --> 00:49:30,011
And like, wait.

722
00:49:30,011 --> 00:49:33,926
And then I come to find out two of those things were basically like false positives.

723
00:49:33,926 --> 00:49:36,085
Like, yes, it was doing that, but there was a reason it was doing that.

724
00:49:36,085 --> 00:49:38,076
And that wasn't really the issue of the error, right?

725
00:49:38,076 --> 00:49:38,816
Which is fine.

726
00:49:38,816 --> 00:49:42,859
Cause I'd rather have it come up with false positives than like not find the issue.

727
00:49:42,859 --> 00:49:45,440
And by doing that, it fixed the problem.

728
00:49:45,440 --> 00:49:48,994
And that only happened because I had enough test coverage.

729
00:49:48,994 --> 00:49:52,110
to be able

730
00:49:52,110 --> 00:50:03,573
Do you feel like what you have now, or what you're building up in one of these projects,
is more production quality or production ready for other people to consume?

731
00:50:03,918 --> 00:50:04,358
Yeah.

732
00:50:04,358 --> 00:50:06,738
You know, I, I've shipped a couple of website things.

733
00:50:06,738 --> 00:50:12,738
I've shipped a couple kind of, you know, side project kind of fun things, you know,
whether it's been some hobbyist projects that I'm part of and some other things that I'm

734
00:50:12,738 --> 00:50:15,318
like, hey, what do I want to, you know, how do I want to do this?

735
00:50:15,318 --> 00:50:23,378
Like it was interesting to me because again, I don't have that much experience in actually
doing it.

736
00:50:23,378 --> 00:50:24,058
Right.

737
00:50:24,058 --> 00:50:25,058
I know how to do it.

738
00:50:25,058 --> 00:50:29,178
I know what it is, but then having somebody walk through it and being like, okay, I've
done this for the first time.

739
00:50:29,178 --> 00:50:29,958
This is actually pretty cool.

740
00:50:29,958 --> 00:50:30,638
I can do this again.

741
00:50:30,638 --> 00:50:31,738
I can repeat this process.

742
00:50:31,738 --> 00:50:33,538
I've documented it to myself.

743
00:50:33,755 --> 00:50:35,018
I have a perspective here.

744
00:50:35,018 --> 00:50:42,278
I think maybe you don't let a software engineering background stand in your way of
getting development done.

745
00:50:42,574 --> 00:50:52,627
Yeah, I mean, to be fair, the way I've always approached anything, whether it's been UX
or art and design or anything, has been driven by curiosity.

746
00:50:52,627 --> 00:50:59,340
And I think that's always what's made me, frankly, a successful UX person, but also just
allowed me to work with other people.

747
00:50:59,340 --> 00:51:06,762
Because I genuinely do care about somebody else's challenges, whether it's in development
and what the actual issue is or what the tech debt is.

748
00:51:06,762 --> 00:51:11,066
or why we can't do something and unblocking them as much as unblocking the users.

749
00:51:11,066 --> 00:51:18,953
Because the more you can unblock for that stuff, the more you can figure it out, the
faster you can get to what everybody really wants and the happier everybody is.

750
00:51:18,953 --> 00:51:23,947
And then it's just a win because like, okay, the business people are happy.

751
00:51:23,947 --> 00:51:26,089
The people building the software are happy.

752
00:51:26,089 --> 00:51:28,562
The people designing the software are happy.

753
00:51:28,562 --> 00:51:29,983
The users are happy.

754
00:51:29,983 --> 00:51:32,204
Like what's the downside?

755
00:51:33,806 --> 00:51:42,818
I think maybe the downside is, and you can obviously correct me if I'm wrong, it's that uh
you're paying lots of money to the LLM providers to provide you

756
00:51:42,818 --> 00:51:46,238
We're not speaking about just the positives on the LLMs.

757
00:51:46,238 --> 00:51:47,118
sorry.

758
00:51:47,420 --> 00:51:48,731
But I think that's a challenge, right?

759
00:51:48,731 --> 00:51:58,864
I think that the amount of things we've said and taken from a perspective of like, rather
than using this as a tool for ideation, a tool for exploration, a tool to be able to allow

760
00:51:58,864 --> 00:52:09,690
people to explore different things, which I think it's frankly the most capable of doing,
we're using it for helping us write an amicus brief for a legal case and not validating

761
00:52:09,690 --> 00:52:10,498
the...

762
00:52:10,498 --> 00:52:12,939
the case law or not validating what it's linking to.

763
00:52:12,939 --> 00:52:21,053
We're connecting it to code or pulling an NPM package we've never had any intention of
ever looking at before, so we don't really know what's there.

764
00:52:21,053 --> 00:52:27,036
We're making decisions that I don't think we're cognizant of, and we're paying somebody
else for that privilege.

765
00:52:27,036 --> 00:52:34,930
The challenge is that all that stuff, for as much as people are paying, and even though
people are paying a lot from an API perspective, they're still losing a crap load of

766
00:52:34,930 --> 00:52:35,566
money.

767
00:52:35,566 --> 00:52:38,538
Yeah, that's the part that just doesn't make any sense to me.

768
00:52:38,538 --> 00:52:42,967
It's never, it's never going to work out to that level.

769
00:52:42,967 --> 00:52:44,419
And I think everybody's aware of that.

770
00:52:44,419 --> 00:52:45,230
Right.

771
00:52:46,168 --> 00:52:48,671
I don't think a lot of people are actually aware of it.

772
00:52:48,671 --> 00:52:51,275
There's only two possible directions.

773
00:52:51,275 --> 00:52:54,920
So a company loses a lot of money to gain market share.

774
00:52:54,920 --> 00:53:02,126
And the question is, at some point in the future, will they have a monopoly so that they
can jack up the prices to recoup the losses?

775
00:53:02,126 --> 00:53:06,386
Yeah, I think there's a lot of, again, going back to the Spider-Man meme, right?

776
00:53:06,386 --> 00:53:10,746
I think there's a lot of people that are pointing at each other expecting them to come up
with the answer.

777
00:53:10,746 --> 00:53:21,406
So I think that like in a lot of ways, OpenAI and Anthropic from an efficiency perspective
are also expecting the hardware to improve to a point where these models get more

778
00:53:21,406 --> 00:53:23,406
efficient because they can run more tokens.

779
00:53:23,406 --> 00:53:27,086
It's not necessarily that the models get more efficient; it's that the hardware they're
running on gets more efficient.

780
00:53:27,086 --> 00:53:30,158
So then that puts a lot of pressure on the fabs.

781
00:53:30,158 --> 00:53:30,538
right?

782
00:53:30,538 --> 00:53:36,978
Whether it's the, you know, the NVIDIAs, the AMDs, the Intels of the world to continue to
make, you know, innovative progress.

783
00:53:36,978 --> 00:53:42,618
And now all of a sudden we're having, you know, conversations where they're all just
running out of capacity for the year.

784
00:53:42,618 --> 00:53:51,718
Like Western Digital, for example, has already said we have used up our platter and all
storage, like basically we've already sold our allotment at this point in the year for the

785
00:53:51,718 --> 00:53:53,218
rest of 2026.

786
00:53:53,798 --> 00:53:58,138
So, so now the RAM shortage you have is going to come for storage.

787
00:53:58,138 --> 00:53:59,438
It's going to come for physical storage.

788
00:53:59,438 --> 00:53:59,970
That's a better way to put it.

789
00:53:59,970 --> 00:54:03,493
There's going to continue to be more pressure on those places.

790
00:54:03,493 --> 00:54:16,281
The bet is that like, well, Microsoft has Azure, you know, Amazon has AWS, Google has,
you know, Google Cloud, but they're all relying on these other providers right now, to be

791
00:54:16,281 --> 00:54:18,463
fair, Amazon has some of its own silicon.

792
00:54:18,463 --> 00:54:19,954
Apple has some of its own silicon.

793
00:54:19,954 --> 00:54:24,027
Google potentially has some of its own stuff, but who's making that?

794
00:54:24,027 --> 00:54:25,358
None of them have their own fabs.

795
00:54:25,358 --> 00:54:28,990
None of them. By fabs, I mean fabrication facilities to basically make the

796
00:54:28,990 --> 00:54:37,508
wafer, right, to make the chip. So there's always a bottleneck somewhere. And if the
bottleneck is saying, hey, we're tapped out for the next two years, well, I don't know how you

797
00:54:37,508 --> 00:54:42,732
can magically spin up a non-bottleneck if your bottleneck is bottlenecked, right? Like...

798
00:54:42,732 --> 00:54:49,102
With that, I guess I'll have to ask: Matt, what did you bring for us, the audience, today
as your pick?

799
00:54:49,102 --> 00:54:56,674
So as I said to you before, asking an ADHD-type person that likes a bunch of different
things, you end up with a whole bunch of different lists of, like, here's my number one

800
00:54:56,674 --> 00:54:57,642
thing, all these other things.

801
00:54:57,642 --> 00:55:01,546
But I promised you I was going to have one thing, and I do have one thing.

802
00:55:01,666 --> 00:55:07,608
So one of the books I keep going back to that I've read a couple of times, which I
realized last night actually has an extended edition now, which I'm going to go buy after

803
00:55:07,608 --> 00:55:11,989
this because I'm like, there's more chapters, is a book from Ed Catmull called Creativity,
Inc.

804
00:55:11,989 --> 00:55:13,689
And it's about the early days of Pixar.

805
00:55:13,689 --> 00:55:15,758
So Ed Catmull, who's an engineer.

806
00:55:15,758 --> 00:55:19,960
who started as an engineer, uh basically became one of the co-founders of Pixar.

807
00:55:19,960 --> 00:55:24,343
And it's an interesting telling of the early days of Pixar and basically how they created
movies.

808
00:55:24,343 --> 00:55:32,006
So as an engineer, he became a studio head effectively and started making movies and
worked alongside Steve Jobs.

809
00:55:32,006 --> 00:55:39,014
And he was what I'd call a Steve Jobs whisperer, because Steve Jobs would come in and out
and do kind of the classic, um, what I call the swoop and poop, right?

810
00:55:39,014 --> 00:55:44,454
You know, coming in and coming up with an idea and then leaving and going off to his day
job, because Pixar wasn't his day job.

811
00:55:44,512 --> 00:55:51,768
And it's a great book that I think articulates what the design process is like and frankly
how everybody can be a part of the design process.

812
00:55:51,768 --> 00:55:58,123
And what I particularly like about Pixar being 3D animation: I have a proclivity for and
just love 3D animation.

813
00:55:58,123 --> 00:55:59,795
I went to school originally for that stuff anyway.

814
00:55:59,795 --> 00:56:09,332
But it's an amazingly good book at breaking down how people think, and it uses good
stories from real-life examples.

815
00:56:09,834 --> 00:56:11,854
Yeah, I mean, I've only heard a few things in there.

816
00:56:11,854 --> 00:56:13,794
I've heard quite a few of the stories about Pixar.

817
00:56:13,794 --> 00:56:21,694
I think that's super interesting, especially with the, you know, role changes and how
people need to redefine, you know, what their job should be.

818
00:56:21,694 --> 00:56:24,794
For my pick, I actually brought something similar.

819
00:56:24,794 --> 00:56:27,734
I was thinking about like what the context for this episode is going to be.

820
00:56:27,734 --> 00:56:29,814
And I also pulled a book.

821
00:56:29,814 --> 00:56:33,094
It's Start with Why by Simon Sinek.

822
00:56:33,154 --> 00:56:34,234
He's a great author.

823
00:56:34,234 --> 00:56:35,114
Yeah.

824
00:56:35,468 --> 00:56:37,399
His LinkedIn personality is all over the place.

825
00:56:37,399 --> 00:56:38,641
You might have seen some stuff.

826
00:56:38,641 --> 00:56:40,472
I think the book was his first one.

827
00:56:40,472 --> 00:56:41,283
It's pretty great.

828
00:56:41,283 --> 00:56:45,327
You know, especially with people coming on saying that coding is no longer the
bottleneck.

829
00:56:45,327 --> 00:56:49,660
And I think that anyone who says that has never done any engineering professionally.

830
00:56:49,660 --> 00:56:57,437
Because if you did, you'd know that it's not about being limited by the amount of code
that was getting produced, really.

831
00:56:57,437 --> 00:57:02,151
And I think the biggest problem, and you sort of brought this up, is: what is it you're
actually trying to achieve?

832
00:57:02,151 --> 00:57:03,040
And I think

833
00:57:03,040 --> 00:57:04,591
what the book really gets at is: why?

834
00:57:04,591 --> 00:57:05,911
What is the whole point?

835
00:57:05,911 --> 00:57:06,852
Where are you going with this?

836
00:57:06,852 --> 00:57:17,496
And I think the shift, potentially back to having a test-first and waterfall-based
approach, is really about having to define the why of what you're trying to achieve and not

837
00:57:17,496 --> 00:57:23,758
fixating even on the software language or tech stack that should be utilized.

838
00:57:23,979 --> 00:57:32,770
I think especially if we start to unravel what it is to be a software engineer in the
world, it's going to really come back to more and more of

839
00:57:32,770 --> 00:57:34,733
like what is the purpose of your job?

840
00:57:34,733 --> 00:57:47,150
Thank you so much, Matt, for coming on and sharing what you've achieved with vibe coding,
and really the perspectives behind it as an expert who isn't a hands-on software developer first.

841
00:57:47,150 --> 00:57:48,362
Thanks, this was a pleasure.

842
00:57:48,362 --> 00:57:48,906
This is awesome.

843
00:57:48,906 --> 00:57:50,200
And I appreciate it.

844
00:57:50,200 --> 00:57:52,689
Appreciate the time and just being able to talk about it, because I found it...

845
00:57:52,689 --> 00:57:54,673
I found the whole thing fascinating so far.

846
00:57:54,858 --> 00:57:55,168
Great.

847
00:57:55,168 --> 00:57:56,140
And you know what?

848
00:57:56,140 --> 00:58:04,721
You reminded me that I'm supposed to do a better job getting people to follow the
podcast.

849
00:58:04,721 --> 00:58:09,438
So OpenClaw, if you're listening, like and subscribe to the Adventures in DevOps podcast.

850
00:58:09,438 --> 00:58:12,801
And I hope to see everyone back again next week.

