1
00:00:07,832 --> 00:00:09,633
Welcome back to Adventures in DevOps.

2
00:00:09,633 --> 00:00:13,384
Every episode is a deep dive into a specific topic with an expert guest.

3
00:00:13,384 --> 00:00:15,145
Maybe you've heard, it's not DNS.

4
00:00:15,145 --> 00:00:16,625
There's no way it's DNS.

5
00:00:16,625 --> 00:00:17,705
It was DNS.

6
00:00:17,705 --> 00:00:24,968
So for this week's expert, we're going behind the scenes of DNS with an expert: a 13-year
veteran and CTO at DNSimple.

7
00:00:24,968 --> 00:00:28,459
And according to his LinkedIn, a very passionate programmer, Simone Carletti.

8
00:00:28,459 --> 00:00:29,562
Welcome to the show.

9
00:00:29,562 --> 00:00:30,304
Hi.

10
00:00:30,304 --> 00:00:31,223
for sure, dnice.

11
00:00:31,223 --> 00:00:32,995
Thanks for reminding me.

12
00:00:35,117 --> 00:00:40,563
Last week, we had a deep review of the current state of IPv6 for everything commercial,
business, and consumer.

13
00:00:40,563 --> 00:00:46,062
I thought it was only fitting to jump over and have a DNS specific episode, and I think
right on schedule.

14
00:00:46,062 --> 00:00:46,572
Awesome, awesome.

15
00:00:46,572 --> 00:00:52,730
I'm so excited to be here and share and talk about the wonderful world of DNS.

16
00:00:53,131 --> 00:00:58,097
So, you know, I think it's one of these areas that we often joke about it never being
down.

17
00:00:58,097 --> 00:01:03,744
And I feel like from my own experience, it's not really the first place you go to
investigate when there is a problem.

18
00:01:03,744 --> 00:01:05,688
Any thoughts on like why that may be?

19
00:01:05,688 --> 00:01:14,275
It's, you know, one of the main problems is the inner complexities, the number of layers
in the DNS protocol and the DNS infrastructure.

20
00:01:14,275 --> 00:01:22,512
So it's something, it's actually an issue that we have sometimes inside DNSimple
when we need to investigate an issue.

21
00:01:22,792 --> 00:01:28,617
The number of layers and the complexity of the system make it so that you don't even know
sometimes where to start, right?

22
00:01:28,617 --> 00:01:30,559
Because there are so many variables.

23
00:01:30,559 --> 00:01:34,474
There are so many layers in between caches.

24
00:01:34,474 --> 00:01:40,239
and configurations that sometimes we think that the problem starts somewhere, where it's
somewhere else.

25
00:01:40,239 --> 00:01:50,988
Or you think that the problem is not related to DNS because in your brain DNS is maybe just
that particular provider, but you don't realize that your ISP is in fact a DNS provider.

26
00:01:50,988 --> 00:01:53,510
Your computer is a DNS provider.

27
00:01:53,510 --> 00:02:02,758
And so maybe you have some configuration locally that is changing that, and in your brain that
is not DNS, that is, I don't know, a local configuration.

28
00:02:02,758 --> 00:02:03,618
So.

29
00:02:03,734 --> 00:02:04,735
It is complex.

30
00:02:04,735 --> 00:02:11,584
It's an amazing environment where you have so many players working together.

31
00:02:11,584 --> 00:02:19,286
But when you need to figure out where things are going south, it may take a little
bit.

32
00:02:19,286 --> 00:02:22,508
Yeah, I think it's really one of those areas that's also heavily abstracted away.

33
00:02:22,508 --> 00:02:28,630
Like if there's like a first level application problem, you're going to get something
straight in your logs where the stack trace makes a lot of sense.

34
00:02:28,630 --> 00:02:34,973
But then you start getting down the layers like, is it like a part of the HTTP request or
TCP or UDP request that's problematic?

35
00:02:34,973 --> 00:02:37,795
In which case you're getting a little bit into the protocol land.

36
00:02:37,795 --> 00:02:41,807
But when it comes to DNS, you're really talking about the URL domain resolution there.

37
00:02:41,807 --> 00:02:49,200
And so at this point, if you're using containers or some other technology stack,
serverless or even Kubernetes, Docker Swarm,

38
00:02:49,200 --> 00:02:52,771
That may be completely abstracted away from what your application is even doing.

39
00:02:52,771 --> 00:03:01,244
So I can totally understand it not only being complicated and convoluted, and maybe
there's lots of problems with it, but also hidden a lot of times.

40
00:03:01,244 --> 00:03:09,686
And so that means that the ways those problems are exposed to your application
become unexpected.

41
00:03:10,068 --> 00:03:11,128
Absolutely, absolutely.

42
00:03:11,128 --> 00:03:11,848
100%.

43
00:03:11,848 --> 00:03:21,913
I come from the software engineering environment and I've been working a lot on
programming and developing systems, more than operating them, honestly, or operating at

44
00:03:21,913 --> 00:03:22,753
the network level.

45
00:03:22,753 --> 00:03:28,686
I mean, I've been involved at network level for many years, but I started from the
software engineering side.

46
00:03:28,746 --> 00:03:39,340
And if I compare the two worlds, I definitely see how the network stack, the
network environment, got abstraction earlier

47
00:03:39,542 --> 00:03:41,283
than the software development.

48
00:03:41,283 --> 00:03:46,865
I mean, if you think about frameworks, like web frameworks, like now we use Rails, for
example.

49
00:03:46,865 --> 00:03:48,135
So Rails is super popular.

50
00:03:48,135 --> 00:03:50,266
And then there were many others in the past.

51
00:03:50,266 --> 00:03:52,717
You can go Django in Python.

52
00:03:52,717 --> 00:03:54,398
You can go Symfony in PHP.

53
00:03:54,398 --> 00:03:58,529
But really, that is what, 10 years, 15 years?

54
00:03:58,650 --> 00:03:59,980
Sure, it's a long time.

55
00:03:59,980 --> 00:04:09,324
But if you go back to the history of the DNS protocol and the network stuff, we're talking
about a third or a fourth of the length of that technology.

56
00:04:09,324 --> 00:04:09,754
Right?

57
00:04:09,754 --> 00:04:21,297
And so in fact, today we're starting to see issues in the software development similar to
what we're talking about in the abstraction of the DNS protocol, where people are maybe

58
00:04:21,297 --> 00:04:27,829
building a web app, throwing that out, and then when the bug comes in, it's like, where is
this happening?

59
00:04:27,829 --> 00:04:32,140
And you don't realize that you built a very thin layer on top of a framework.

60
00:04:32,140 --> 00:04:38,818
And when you start narrowing down into the framework, then you are confused because
you don't know what the framework is actually doing.

61
00:04:38,818 --> 00:04:39,038
Right?

62
00:04:39,038 --> 00:04:45,921
So every time that there is an abstraction and the longer the abstraction there, then the
more complex it becomes.

63
00:04:45,921 --> 00:04:56,216
And it's just that software developers and engineers have not
been used to having abstractions in the software domain, I think, for as long as we had in

64
00:04:56,216 --> 00:04:57,426
the network stack.

65
00:04:57,442 --> 00:05:04,966
Yeah, I actually think, you know, as far as innovations go in a lot of ways, everything
related to networking was automated around the time of the cloud providers.

66
00:05:04,966 --> 00:05:07,597
And before that, you know, it was hardware related.

67
00:05:07,597 --> 00:05:10,858
And so when there was a problem in this space, there would always be a hardware concern.

68
00:05:10,858 --> 00:05:19,012
And I, I feel like a lot of software engineering comes around to the fact of like, if
there's a problem, you really pray that it's not a hardware problem because good luck

69
00:05:19,012 --> 00:05:21,203
trying to investigate what's going on there.

70
00:05:21,203 --> 00:05:27,536
And I think historically, for something that was part of the network stack, things
like switches and routers, which are like almost pure

71
00:05:27,536 --> 00:05:32,220
hardware-related components, you know, it's just like, please let it not be there.

72
00:05:33,880 --> 00:05:41,911
Well, these days we have these kind of issues, but these days it's more about the time
required to provision a piece of hardware.

73
00:05:41,911 --> 00:05:50,695
We also got used to having things in the cloud where you can spin up a new environment in a
few seconds and you get up and running.

74
00:05:50,695 --> 00:05:57,816
But then in fact, I'm bringing up this topic because there's an interesting story about
part of the architecture of DNSimple.

75
00:05:58,197 --> 00:06:04,128
For all our authoritative name servers, we're not using cloud

76
00:06:04,218 --> 00:06:04,918
providers.

77
00:06:04,918 --> 00:06:10,392
We are using bare-metal machines and so we configured everything from scratch.

78
00:06:10,392 --> 00:06:14,484
And historically this is exactly because of abstraction.

79
00:06:14,484 --> 00:06:22,218
We run our own authoritative name server software. It's a name server built on
Erlang and released as open source.

80
00:06:22,218 --> 00:06:23,609
It's called erldns.

81
00:06:23,609 --> 00:06:30,093
And so when we build that software, we don't use things like, for example, PowerDNS or
BIND.

82
00:06:30,193 --> 00:06:32,406
And so software...

83
00:06:32,406 --> 00:06:38,810
running the virtual machine, the Erlang virtual machine, in a containerized environment
would just simply not work.

84
00:06:38,810 --> 00:06:40,741
We tried that in the very early days.

85
00:06:40,741 --> 00:06:42,633
I mean, it was many, many years ago.

86
00:06:42,633 --> 00:06:45,074
I think about 10 or 14 years ago.

87
00:06:45,074 --> 00:06:53,269
And the level of abstraction that containers had back in the day simply didn't work for
us because we needed control over the network.

88
00:06:53,269 --> 00:06:54,080
Just think about it.

89
00:06:54,080 --> 00:06:56,622
We are literally provisioning a network service.

90
00:06:56,622 --> 00:06:57,454
And so...

91
00:06:57,454 --> 00:07:04,780
being able to have that level of control over the whole network stack was absolutely,
absolutely essential.

92
00:07:04,780 --> 00:07:08,283
And we couldn't do that with any of the containers that we had back in the day.

93
00:07:08,283 --> 00:07:17,681
So we still run this kind of environment for the name servers, also for performance
reasons, obviously, for fine tuning and all of it.

94
00:07:17,681 --> 00:07:26,434
And now these days, the problem of the hardware is really like, if it's a problem with the
hardware, we hope it's not, just because if we have to replace a hard drive.

95
00:07:26,434 --> 00:07:29,787
then it physically takes someone to go there and change it.

96
00:07:29,787 --> 00:07:30,671
It's not that we can.

97
00:07:30,671 --> 00:07:38,488
And so the time to recover is longer than the time to recover that it would take instead for
some kind of software issue.

98
00:07:38,488 --> 00:07:40,139
I think there's something to be said here.

99
00:07:40,139 --> 00:07:51,378
There's a lot of companies that are now gaining steam or repeatedly post on LinkedIn how
great it is to reclaim so much capital expenditure by moving off of the cloud and

100
00:07:51,378 --> 00:08:01,056
buying the hardware for bare metal or even paying a bare metal as a service provider
rather than paying one of the hyperscalers because it's quote unquote cheaper.

101
00:08:01,056 --> 00:08:04,414
And a lot of those companies, they're like offering apps

102
00:08:04,414 --> 00:08:06,236
or SaaSes where that doesn't really make sense.

103
00:08:06,236 --> 00:08:15,763
But when you're really providing a fundamental backbone or infrastructure at the level
that you're providing with DNS, that infrastructure is actually a competitive advantage.

104
00:08:15,844 --> 00:08:26,352
And utilizing a third party to sort of sit on top of it creates this sort of circular
dependency in a way where I remember early on, there was always a question for me, like,

105
00:08:26,352 --> 00:08:30,235
how can ISPs run their infrastructure on top of AWS?

106
00:08:30,235 --> 00:08:33,857
Like, isn't there like a circular dependency failure mode there where like if

107
00:08:33,857 --> 00:08:42,478
part of AWS goes down, it will crash the ISP, then the whole network availability zone will
go down for AWS because of a small little change.

108
00:08:42,478 --> 00:08:44,172
That's something you would want to avoid.

109
00:08:44,172 --> 00:08:45,993
And it's not just, you're totally right.

110
00:08:45,993 --> 00:08:50,647
It's not just because of competitive advantage.

111
00:08:50,647 --> 00:08:51,878
It's also about requirement.

112
00:08:51,878 --> 00:09:03,447
I shared some of the requirements that were about the performance and the configuration,
but there's actually another requirement, like the ability to control the BGP protocol

113
00:09:03,447 --> 00:09:07,861
and the BGP configuration, and no cloud provider will ever agree to that.

114
00:09:07,861 --> 00:09:09,662
Because just imagine...

115
00:09:09,662 --> 00:09:19,878
You going to AWS and saying, by the way, I want to be able to control the BGP routing of your
whole data center because I need to inject my own, you know... No, it's not going to happen

116
00:09:19,878 --> 00:09:27,523
because you have the potential and the possibility to take down an entire region if the
configuration is misplaced.

117
00:09:27,523 --> 00:09:35,554
So on the other side, I totally agree also in the fact that, you know, we have seen cases
like, for example, a few infrastructure.

118
00:09:35,554 --> 00:09:40,861
providers, those services that sit on top of AWS or GCP.

119
00:09:41,243 --> 00:09:44,154
Do you mind if I stop for a second because the dog entered?

120
00:09:44,154 --> 00:09:46,972
I don't know how the dog opened the door.

121
00:09:46,972 --> 00:09:50,981
uh

122
00:09:50,981 --> 00:09:54,538
We'll cut that out unless you want us to leave it in.

123
00:09:58,432 --> 00:09:59,212
I totally get it.

124
00:09:59,212 --> 00:10:05,886
It's twofold right there because realistically, and this is an aspect of DNS that I still
don't understand to this day.

125
00:10:05,886 --> 00:10:07,176
I understand ASNs.

126
00:10:07,176 --> 00:10:14,289
I understand that if you want to issue IP addresses, you have to be part of the global
board and get it approved.

127
00:10:14,289 --> 00:10:19,022
The BGP routing is the aspect of VPNs that I just never fully understood.

128
00:10:19,022 --> 00:10:26,105
I do know, maybe long story short, there was an incident a few years ago where Facebook locked themselves
out of their own data centers.

129
00:10:26,105 --> 00:10:28,346
And the way they did this is because Facebook

130
00:10:28,346 --> 00:10:32,779
data centers had digital ID verification to get access.

131
00:10:32,779 --> 00:10:43,596
But in order to do that, the physical devices on the data centers had to go to the
internet and verify that the device that is the badge actually did have access.

132
00:10:43,596 --> 00:10:47,528
But in order to resolve that, it had to resolve the DNS for the domain.

133
00:10:47,528 --> 00:10:51,101
But Facebook owned the DNS resolvers.

134
00:10:51,101 --> 00:10:52,081
They were in the data center.

135
00:10:52,081 --> 00:10:56,684
And so when they broke the resolver, when they published the wrong BGP routing,

136
00:10:56,684 --> 00:11:00,126
then the access to the data centers was broken as well.

137
00:11:00,126 --> 00:11:05,399
So they couldn't actually fix it and they couldn't get into the data center to actually
reset the system.

138
00:11:05,759 --> 00:11:14,885
So basically the story is they were breaking down the wall with heavy machinery in order
to get into the data center to bypass all their security procedures to deal with this

139
00:11:14,885 --> 00:11:15,135
problem.

140
00:11:15,135 --> 00:11:17,356
So like this is not a small issue.

141
00:11:17,454 --> 00:11:18,614
Absolutely.

142
00:11:18,614 --> 00:11:27,314
You don't really need, though, as a company, you actually don't really need to be someone
hacking on BGP to have these kind of problems.

143
00:11:27,314 --> 00:11:34,514
I mean, this is a common problem truly for most of the infrastructure companies today.

144
00:11:34,514 --> 00:11:37,774
We had a similar incident a few years back.

145
00:11:37,774 --> 00:11:39,134
It was actually 2014.

146
00:11:39,134 --> 00:11:40,266
It was one of the...

147
00:11:40,266 --> 00:11:41,847
largest incidents that we ever had.

148
00:11:41,847 --> 00:11:44,670
We had a massive DDoS attack back in the day.

149
00:11:44,670 --> 00:11:46,602
We didn't have the infrastructure that we have today.

150
00:11:46,602 --> 00:11:57,482
We didn't have the DDoS layers in front. In fact, all the investment and all the
development and research that we've done after were originating from the incident.

151
00:11:57,482 --> 00:12:06,958
And so what happened is that back in the days we used to have, we were using a service for
collecting all the logs.

152
00:12:06,958 --> 00:12:15,198
Back in the day, it was actually very early, 2014, we didn't have services like Datadog or
monitoring services like that one.

153
00:12:15,198 --> 00:12:19,838
We used a service called Papertrail, and Papertrail was a customer of ours.

154
00:12:19,958 --> 00:12:29,498
So during that incident, we were trying to log into the system, into the log aggregation
system to figure out what was going on.

155
00:12:29,538 --> 00:12:37,058
And we couldn't, because we couldn't access that system, because that system was relying
on DNSimple, and DNSimple was completely down.

156
00:12:37,274 --> 00:12:46,357
So we tried to access our machines and we were using an internal name like
dnsimple dot an extension to connect to our machines.

157
00:12:46,357 --> 00:12:51,789
But because that name was resolved through DNSimple, DNSimple was down and we couldn't
even log into the machines.

158
00:12:51,789 --> 00:12:57,720
We had to start figuring out what were the IPs of the individual servers all over.
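
The failure mode described here (an internal hostname that can only be resolved through the service that is currently down) can be softened with a local fallback. The following is a minimal sketch, not DNSimple's actual tooling: the inventory contents, the hostnames, and the `resolve_with_fallback` helper are all hypothetical, and the addresses come from the reserved documentation range.

```python
import socket
from typing import Callable, Mapping

# Hypothetical inventory, e.g. exported from an infrastructure-as-code repository.
# Names use the reserved .example TLD; addresses are from the TEST-NET-1 range.
STATIC_INVENTORY = {
    "ns1.internal.example": "192.0.2.10",
    "logs.internal.example": "192.0.2.20",
}


def resolve_with_fallback(
    hostname: str,
    inventory: Mapping[str, str] = STATIC_INVENTORY,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> str:
    """Try normal DNS resolution first; fall back to a local inventory on failure."""
    try:
        return resolver(hostname)
    except OSError:
        # socket.gaierror is a subclass of OSError, so lookup failures land here.
        if hostname in inventory:
            return inventory[hostname]
        raise
```

The resolver is injectable so the outage path can be exercised without a real outage. The design choice worth noting is that the fallback data lives outside DNS entirely, which is what breaks the dependency loop.
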

159
00:12:57,781 --> 00:12:59,961
That was actually very interesting.

160
00:13:00,301 --> 00:13:02,574
You know, there's a lot to learn from incidents.

161
00:13:02,574 --> 00:13:12,274
And I'm sure that all the people like listening to this podcast and all the people that
have had to deal with operations and issue management would probably agree with that.

162
00:13:12,294 --> 00:13:22,434
They would agree how frustrating it is to deal with an incident, but how insightful if
then after that you evaluate and you assess what happened, how insightful it could be and

163
00:13:22,434 --> 00:13:26,474
how important it is to learn from that.

164
00:13:26,654 --> 00:13:29,708
And so we had so many...

165
00:13:29,708 --> 00:13:34,021
chicken-and-egg problems where we were trying to use certain services and they were not
available.

166
00:13:34,021 --> 00:13:48,119
And after that, that became one of the critical policies inside DNSimple, where
certain pieces of the infrastructure need to rely on a backup plan in case, you know, as a

167
00:13:48,119 --> 00:13:54,792
service, ideally you would use a service that doesn't use DNSimple directly or you
need a backup plan for that.

168
00:13:54,833 --> 00:13:57,098
And lately,

169
00:13:57,098 --> 00:14:03,682
A few months ago, we decided, no, actually last year, we decided to re-engineer the overlay
network.

170
00:14:03,682 --> 00:14:09,225
And we started doing due diligence about the various providers that are available right
now.

171
00:14:09,225 --> 00:14:19,571
I don't want to name them, but we actually had to rule out two that were pretty
competitive opportunities because they were using DNSimple.

172
00:14:19,571 --> 00:14:24,834
This is actually something that when you are an infrastructure company, when you are
taking decisions there,

173
00:14:24,834 --> 00:14:28,195
You actually have to start realizing it's a nice problem to have.

174
00:14:28,335 --> 00:14:32,074
But you have seen problems where Cloudflare is offline.

175
00:14:32,074 --> 00:14:36,178
I absolutely don't want to blame them, but they're just one of the largest providers,
right?

176
00:14:36,178 --> 00:14:37,439
AWS is down.

177
00:14:37,439 --> 00:14:41,140
And in cascade, a bunch of other services, as you were pointing out, are down.

178
00:14:41,140 --> 00:14:46,602
This is absolutely something that an infrastructure provider must account for.
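
The vendor due-diligence rule described here (rule out a provider if it depends, directly or indirectly, on your own service) amounts to checking a dependency graph for a cycle through yourself. A small sketch under stated assumptions: the graph contents and service names below are invented for illustration, and `find_dependency_cycle` is a hypothetical helper.

```python
from typing import Dict, List, Optional


def find_dependency_cycle(
    deps: Dict[str, List[str]], start: str
) -> Optional[List[str]]:
    """Return a path start -> ... -> start if `start` transitively depends on itself."""
    path: List[str] = []
    visited = set()

    def visit(node: str) -> bool:
        path.append(node)
        for dep in deps.get(node, []):
            if dep == start:
                # Found a route back to where we began: a circular dependency.
                path.append(dep)
                return True
            if dep not in visited:
                visited.add(dep)
                if visit(dep):
                    return True
        path.pop()
        return False

    return path if visit(start) else None


# Hypothetical example: a DNS provider evaluating two overlay-network vendors.
with_vendor_a = {
    "dns-provider": ["overlay-vendor-a"],
    "overlay-vendor-a": ["dns-provider"],  # vendor A resolves its names through us
}
with_vendor_b = {
    "dns-provider": ["overlay-vendor-b"],
    "overlay-vendor-b": ["other-dns"],
}
```

With these inputs, the first graph yields a cycle path and the second yields `None`, which mirrors the decision to rule out otherwise competitive vendors.
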

179
00:14:46,976 --> 00:14:56,184
I want to ask about how you ended up actually identifying what the IP addresses were that
should have been at the end of the resolvers if you weren't able to access the systems

180
00:14:56,184 --> 00:14:57,936
that told you that information.

181
00:14:58,062 --> 00:15:09,925
So the solution to this problem is actually one of the main reasons why you and I got
to connect together, because it ties back to the topic of infrastructure as code.

182
00:15:09,966 --> 00:15:21,879
DNSimple has been using infrastructure as code to provision the whole architecture,
the whole system, every single piece, every component, from the web app to the whole

183
00:15:21,879 --> 00:15:23,990
infrastructure since day one.

184
00:15:23,990 --> 00:15:25,420
We started with that.

185
00:15:25,420 --> 00:15:34,239
We have never had a single machine or a single component of our infrastructure that was
not provisioned through some sort of infrastructure as code.

186
00:15:34,239 --> 00:15:35,881
We started with Chef.

187
00:15:35,881 --> 00:15:38,564
We are still using Chef these days.

188
00:15:38,564 --> 00:15:44,860
We're also using way more than Chef because eventually we had to deal with other types of
complexity.

189
00:15:44,860 --> 00:15:48,302
Also, Chef is no longer going in the direction where we want to go.

190
00:15:48,302 --> 00:15:52,522
You had real problems that you had to deal with.

191
00:15:54,171 --> 00:15:55,151
Exactly.

192
00:15:55,151 --> 00:15:58,452
And at that point, everything is there.

193
00:15:58,452 --> 00:16:05,996
That's the beauty of it: the moment that you need to pull down some
information, it is codified somewhere.

194
00:16:05,996 --> 00:16:17,631
Even more so because we pair infrastructure as code with Git repositories, which allows us
to have offline access where basically any of the team members can have the whole

195
00:16:17,631 --> 00:16:21,270
repository copied locally and they work on a local copy.

196
00:16:21,270 --> 00:16:27,653
And if you think about it, it's just insane thinking that you have an entire
infrastructure on your local computer.

197
00:16:27,653 --> 00:16:28,773
But that's the way it is.

198
00:16:28,773 --> 00:16:33,975
And so the beauty of it is that you have codified all that kind of information there.

199
00:16:33,975 --> 00:16:46,470
And sure, maybe you are unable to get them through the normal tooling because the normal
tooling may connect with the central orchestration tool, but you can still grab the code

200
00:16:46,470 --> 00:16:48,851
base and figure out where the IPs are, pretty much.

201
00:16:48,851 --> 00:16:50,962
And that's actually what we did.
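
Recovering addresses from a local checkout, as described here, can be as simple as scanning the files for anything that parses as an IP address. This is a rough sketch and not the process DNSimple actually followed: the sample text is made up, `extract_ipv4` is a hypothetical helper, and a real repository would also contain dotted version strings that look like addresses and need filtering by hand.

```python
import ipaddress
import re
from typing import List

# Loose pattern for dotted quads; strict validation happens below.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_ipv4(text: str) -> List[str]:
    """Return the distinct, valid IPv4 addresses found in text, in order of appearance."""
    found: List[str] = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)  # rejects octets above 255
        except ValueError:
            continue
        if candidate not in found:
            found.append(candidate)
    return found


# Made-up configuration snippet using documentation-range addresses.
SAMPLE = """
ns1: 192.0.2.10
ns2: 192.0.2.11
comment: 999.1.1.1 is not a real address
ns1-again: 192.0.2.10
"""
```

Because the scan only needs the files on disk, it keeps working when the central orchestration tool, or DNS itself, is unreachable.
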

202
00:16:51,148 --> 00:16:51,818
Interesting.

203
00:16:51,818 --> 00:16:57,832
I feel like for me though, like early on, I stopped worrying about DNS and learned to
love it.

204
00:16:57,832 --> 00:17:05,966
Even in the enterprise world, historically, I think I was in companies that wanted to use
something like HashiCorp Consul, which I always considered a nightmare to do like

205
00:17:05,966 --> 00:17:08,027
registration for leader selection.

206
00:17:08,027 --> 00:17:12,690
And there is this weird aspect where DNS feels like this public thing.

207
00:17:12,690 --> 00:17:17,442
But from my standpoint, it's always been the most reliable piece of the picture.

208
00:17:17,442 --> 00:17:20,094
And so if there's an incident there, you know...

209
00:17:20,224 --> 00:17:24,897
ridiculous things are going down and of course there are people working to get it back up.

210
00:17:24,897 --> 00:17:30,059
But I feel like there's been this mentality where it's not the appropriate solution for
everything.

211
00:17:30,059 --> 00:17:40,645
But when I look at the alternative solutions out there for handling things like,
we need to do leader selection for databases or we need to do round robin routing for

212
00:17:40,645 --> 00:17:42,050
requests coming in.

213
00:17:42,050 --> 00:17:45,748
DNS feels so much more natural even if you're running stuff privately.

214
00:17:45,748 --> 00:17:49,526
And I don't know if it's just me and my experiences where I see this disconnect.

215
00:17:49,526 --> 00:17:58,968
Any thoughts? Like, have you ever seen that companies are more hesitant to use the
public domain name server registration architecture infrastructure out there that really

216
00:17:58,968 --> 00:18:05,736
is designed to be super reliable and instead they want to use an off the shelf product
where they can plug it into their ecosystem.

217
00:18:05,940 --> 00:18:18,641
Yeah, I've seen examples of companies trying to stretch the boundaries of what you can do
with DNS to levels that sometimes are really just driving me nuts.

218
00:18:18,742 --> 00:18:28,010
I recall in the first few years when we built the system, it was very complex to be able
to squeeze into...

219
00:18:28,256 --> 00:18:36,520
our product, especially because we built the name server, we built the validation system
and all of that, like trying to deal around with all the RFCs and all the possible

220
00:18:36,520 --> 00:18:39,992
combinations, sometimes conflicting specifications.

221
00:18:40,133 --> 00:18:51,499
So, you know, eventually some records, some incorrect, malformed record will go
through the platform and will start like producing odd results.

222
00:18:51,499 --> 00:18:57,838
And occasionally I've been seeing certain records where I'm like, why are you doing
that?

223
00:18:57,838 --> 00:18:59,278
Why are you trying to do that?

224
00:18:59,278 --> 00:19:01,758
That didn't even cross my mind.

225
00:19:01,798 --> 00:19:14,798
I recall there was a point, actually, to be fair, that was research, but there was a
point where someone started to try to use the TXT records as a database.

226
00:19:14,798 --> 00:19:17,418
Oh, that's Corey Quinn 101, right?

227
00:19:17,938 --> 00:19:20,658
Like, Route 53 is a database.

228
00:19:20,834 --> 00:19:21,374
Exactly.

229
00:19:21,374 --> 00:19:24,917
So there were some weird things going on.

230
00:19:24,917 --> 00:19:27,288
Admittedly, it's an interesting experiment.
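
One reason "TXT records as a database" is a stretch: per RFC 1035, each character-string inside a TXT record holds at most 255 octets, so any larger value has to be split and reassembled by the client. A minimal sketch of that chunking follows; the helper names are hypothetical and this is not taken from any particular DNS library.

```python
from typing import List

MAX_CHARACTER_STRING = 255  # RFC 1035: a <character-string> holds at most 255 octets


def to_txt_chunks(value: bytes) -> List[bytes]:
    """Split a payload into pieces that each fit one TXT character-string."""
    if not value:
        return [b""]
    return [
        value[i:i + MAX_CHARACTER_STRING]
        for i in range(0, len(value), MAX_CHARACTER_STRING)
    ]


def from_txt_chunks(chunks: List[bytes]) -> bytes:
    """Reassemble the payload by concatenating the strings in order."""
    return b"".join(chunks)
```

A 600-byte value, for example, becomes three strings of 255, 255, and 90 bytes, and that is before considering overall message size limits or caching behavior.
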

231
00:19:27,288 --> 00:19:29,319
The reality is that you're totally right.

232
00:19:29,319 --> 00:19:41,806
DNS has been, despite being so often pointed at as being the problem because it's in so
many places, so statistically speaking, it's everywhere.

233
00:19:41,806 --> 00:19:45,649
But the truth is that it's an incredibly stable protocol.

234
00:19:45,649 --> 00:19:48,460
It's been around for years, really.

235
00:19:48,478 --> 00:19:52,660
So many years that in the IT field, it's almost centuries, right?

236
00:19:52,660 --> 00:20:01,444
And also the way, I mean, it has changed a little bit in the last few years, but for
many, many years, it has gone unchanged.

237
00:20:01,444 --> 00:20:06,286
Not a lot of new record types, not a lot of changes.

238
00:20:06,286 --> 00:20:12,788
It actually started to change a little bit with the rise of the various SaaS services.

239
00:20:12,788 --> 00:20:16,686
For example, you know, when we started to see

240
00:20:16,686 --> 00:20:26,606
a number of services where they were giving you a CNAME that you needed to point to
because they didn't want you to point to an IP, but they wanted you to point to a name

241
00:20:26,606 --> 00:20:27,626
that could change.

242
00:20:27,626 --> 00:20:31,086
Now we started to see needs that didn't exist before.

243
00:20:31,086 --> 00:20:34,272
You mean, like the alias record at the apex domain.

244
00:20:34,310 --> 00:20:34,920
Exactly.

245
00:20:34,920 --> 00:20:35,301
Exactly.

246
00:20:35,301 --> 00:20:47,719
That was one of the very, probably one of the first use cases, like innovative use cases
that I could think of that's really bringing the protocol to a different level in terms of

247
00:20:47,719 --> 00:20:48,460
utilization.

248
00:20:48,460 --> 00:20:56,745
For many, many years, the protocol has been used in a pretty standard, and I would say to
an extent boring way, but it's there.

249
00:20:56,745 --> 00:20:57,986
It's very reliable.

250
00:20:57,986 --> 00:21:03,660
When you talk about reliability, if you think about it, I don't know if you know this, but
no one would provide you a 100%

251
00:21:03,660 --> 00:21:04,471
SLA.

252
00:21:04,471 --> 00:21:14,940
But actually, if you start looking online, you will see that this is, I wouldn't say
common, but this is normal in the DNS space.

253
00:21:14,940 --> 00:21:26,861
Because generally SLA, especially if you have a distributed system that uses, for example,
anycast and BGP and not unicast, then being completely offline means pretty much a

254
00:21:26,861 --> 00:21:30,656
catastrophic event that brings down everything.

255
00:21:30,656 --> 00:21:34,209
across multiple data centers, across multiple regions.

256
00:21:34,209 --> 00:21:37,261
And generally speaking, that is not a DNS problem.

257
00:21:37,261 --> 00:21:46,859
Like all the catastrophic or semi-catastrophic events that we had were about provisioning,
configurations, or software.

258
00:21:46,859 --> 00:21:48,373
Not about the protocol, right?

259
00:21:48,373 --> 00:21:51,892
And so it's about the whole orchestrations around it.

260
00:21:51,892 --> 00:21:56,866
It's about the services that you use to provide.

261
00:21:57,048 --> 00:22:05,004
It's not about the protocol per se because there are so many layers, there's so much
redundancy baked into the protocol itself.

262
00:22:05,004 --> 00:22:16,342
Yeah, it's interesting that you bring up the pushing DNS even further in some regard
because like the quote about using Route 53 or DNS servers, name servers as a database.

263
00:22:16,522 --> 00:22:27,109
You know, there is something here because the AWS incident with DynamoDB, their solution
actually required storing pointer records for the IP addresses for DynamoDB resolution

264
00:22:27,109 --> 00:22:27,950
somewhere.

265
00:22:27,950 --> 00:22:34,310
And inside AWS, the normal place to store key value pairs is in DynamoDB itself.

266
00:22:34,310 --> 00:22:43,570
And obviously, you don't want to use the same service to depend on itself for
resolution, because you're more likely to have a catastrophic failure, they were already

267
00:22:43,570 --> 00:22:48,170
utilizing their Route 53 solution, their DNS solution for storing data.

268
00:22:48,170 --> 00:22:56,386
So actually, DynamoDB uses their DNS server as a database for its own needs, for locking
and unlocking records.

269
00:22:56,386 --> 00:23:01,060
to determine what the canonical correct resolution is to hit the database.

270
00:23:01,060 --> 00:23:04,693
So, you know, it's interesting where people are like, no, you should never do that.

271
00:23:04,693 --> 00:23:09,647
And then you actually see real world implementations where it is the right answer to deal
with.

272
00:23:09,647 --> 00:23:17,664
And I sort of want to flip this upside down because you clearly have experience going
through all of the RFCs related to DNS probably multiple times.

273
00:23:17,924 --> 00:23:23,890
My question is going to be, are there any parts of the DNS structure, maybe specific kinds
of records, where you're just like: PTR records?

274
00:23:23,890 --> 00:23:24,738
I hate them.

275
00:23:24,738 --> 00:23:26,849
They should have never been invented in the first place.

276
00:23:26,849 --> 00:23:28,378
You know, what's your pick here?

277
00:23:28,378 --> 00:23:29,160
Oh!

278
00:23:29,160 --> 00:23:30,050
I have an answer.

279
00:23:30,050 --> 00:23:39,034
It's definitely the newest one that we implemented a few weeks ago, which is the
SVCB/HTTPS record.

280
00:23:39,055 --> 00:23:42,756
It's two records, but the RFC goes together.

281
00:23:42,756 --> 00:23:46,598
And they are the follow-up to the ALIAS records.

282
00:23:46,598 --> 00:23:55,302
So the ALIAS, or the apex, or the ANAME: it comes under different names and in different flavors
exactly because it was never formalized.

283
00:23:55,520 --> 00:23:57,111
It's all very complicated.

284
00:23:57,111 --> 00:24:08,366
I mean, just reading through that and implementing all the different variants and the
parameters and the settings, I had to go through that multiple, multiple times.

285
00:24:08,366 --> 00:24:14,389
One of the reasons is because we use different languages for the different components.

286
00:24:14,389 --> 00:24:20,884
So we actually have three main components talking together, the web application with the
storage, and then...

287
00:24:20,884 --> 00:24:22,815
a distribution system that is in Go.

288
00:24:22,815 --> 00:24:27,576
So: the web application, the distribution system that is in Go, and the name servers
that are in Erlang.

289
00:24:27,576 --> 00:24:36,268
So I had to go through that implementation in three different languages, three times,
multiplied by the number of times that I have to go through those sections again and again

290
00:24:36,268 --> 00:24:47,682
and again, especially the parsing of the various parameters, the fact that you can write them
in the generic keyNNNNN form, and then for parameters from one to eight, you also have

291
00:24:47,682 --> 00:24:49,602
convenient aliases.

292
00:24:49,612 --> 00:24:59,485
And every time I read "convenient" attached to "alias", I'm like, man, when you have three
ways to do the same thing, my brain just cannot compute.
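As a rough illustration of the aliasing described here: in the SVCB/HTTPS specification (RFC 9460), every service parameter has a numeric code, can be written in the generic keyNNNNN form, and the registered ones also have a mnemonic name. The Python sketch below normalizes both spellings to one canonical form. It assumes only the seven keys registered in RFC 9460 itself, and the function name `canonical_svc_param_key` is hypothetical, not DNSimple's implementation.

```python
import re

# SvcParamKey names and numeric codes registered in RFC 9460.
SVC_PARAM_KEYS = {
    "mandatory": 0,
    "alpn": 1,
    "no-default-alpn": 2,
    "port": 3,
    "ipv4hint": 4,
    "ech": 5,
    "ipv6hint": 6,
}
_NAMES_BY_CODE = {code: name for name, code in SVC_PARAM_KEYS.items()}

# Generic form: "key" followed by a decimal number without leading zeros.
_GENERIC_KEY = re.compile(r"^key(0|[1-9][0-9]{0,4})$")


def canonical_svc_param_key(key: str) -> str:
    """Map a mnemonic or generic keyNNNNN spelling to one canonical name."""
    key = key.lower()
    if key in SVC_PARAM_KEYS:
        return key
    match = _GENERIC_KEY.match(key)
    if match is None:
        raise ValueError(f"not a valid SvcParamKey: {key!r}")
    code = int(match.group(1))
    if code > 65535:
        raise ValueError(f"SvcParamKey code out of range: {code}")
    # Unregistered codes stay in their generic form.
    return _NAMES_BY_CODE.get(code, f"key{code}")


if __name__ == "__main__":
    for spelling in ("alpn", "key1", "key3", "key667"):
        print(spelling, "->", canonical_svc_param_key(spelling))
```

Normalizing at the boundary means the storage, distribution, and name server layers only ever see one spelling per parameter.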

293
00:24:59,485 --> 00:25:00,565
I'm a simple human.

294
00:25:00,565 --> 00:25:05,626
when I read, I've been working for years in Ruby, right?

295
00:25:05,626 --> 00:25:13,809
And Ruby has a great focus on readability, on the beauty of the code.

296
00:25:13,809 --> 00:25:19,404
And so, especially in the very early days, there was an abuse of the fact that you could
alias

297
00:25:19,404 --> 00:25:28,529
the names of methods and really have a single function be called in many
different ways.

298
00:25:28,529 --> 00:25:37,183
And you can still see that, like, there are libraries where you would call .length, which
is the equivalent of .size, which is the equivalent of .count.

299
00:25:37,404 --> 00:25:38,445
Unless it's not.

300
00:25:38,445 --> 00:25:46,689
And so in some libraries, size may read from the cached value while count
always triggers a database query.

301
00:25:46,689 --> 00:25:48,728
So in my brain it's like...

302
00:25:48,728 --> 00:25:51,640
Why do we need three ways to do the same thing?

303
00:25:51,640 --> 00:25:59,426
Especially because then when you have a large code base and you need to search for
something, you need to make a change, you have a bug and you wanna see how many other

304
00:25:59,426 --> 00:26:07,661
instances of that bug exist, then if you have three ways of calling the same method, it's
just like, you know, very complex to deal with.

305
00:26:07,661 --> 00:26:17,038
So, admittedly, we went back and forth multiple times trying to figure out the best way
to implement that.

306
00:26:17,038 --> 00:26:26,298
But also the best way to provide that to the customers, because we have a lot of very
highly technical customers, but there are certain topics that just because you're

307
00:26:26,298 --> 00:26:34,998
technical doesn't mean that I can take for granted that you don't have a... That was also
the story with SSL/TLS certificates.

308
00:26:34,998 --> 00:26:46,754
Like I recall how complex it was in the very early days trying to figure out a way to
serve the certificate chain.

309
00:26:46,754 --> 00:26:54,522
the server certificate, the root certificate in a bundle in a way that was convenient for
the customer.

310
00:26:54,522 --> 00:27:01,450
Because many of them did not realize that if you were using NGINX, you had to sort them
from parent to server.

311
00:27:01,450 --> 00:27:04,563
If you had Apache, you had to go the other way around.

312
00:27:04,563 --> 00:27:05,408
No.

313
00:27:05,408 --> 00:27:06,698
I don't want to know that.

314
00:27:06,698 --> 00:27:07,322
Yeah.

315
00:27:07,322 --> 00:27:18,665
And so, you know, these are the kinds of things that when you implement that and you spend
maybe days, like hours, days, sometimes even months on a particular topic, then you become

316
00:27:18,665 --> 00:27:27,978
an expert there, but you don't realize how easy it is to then just build something that
is, it works, but it's unusable because you expect the other people to understand at the

317
00:27:27,978 --> 00:27:29,788
level that you implemented it.

318
00:27:29,788 --> 00:27:35,240
And so the SVCB and the HTTPS, I think it's a super powerful record.

319
00:27:35,240 --> 00:27:36,750
It's designed to be...

320
00:27:36,750 --> 00:27:40,640
probably the replacement for the ALIAS and many other configurations.

321
00:27:40,640 --> 00:27:43,436
It's just very tough to digest.

322
00:27:43,470 --> 00:27:45,550
No, totally get it.

323
00:27:45,550 --> 00:27:47,390
I mean, the need is clearly there.

324
00:27:47,390 --> 00:27:58,530
There's a question of obviously your company has some capability of going in and making
good suggestions given that you are a huge player in the DNS space.

325
00:27:58,530 --> 00:28:01,210
But I know how the IETF groups work.

326
00:28:01,210 --> 00:28:07,270
I know how the global groups work: even if you're sitting there like, you know, we
have customers, we know how they're going to use it,

327
00:28:07,270 --> 00:28:13,424
It's hard to convince some of the people that are going out and designing these things
that their use case is not necessarily

328
00:28:13,424 --> 00:28:17,107
one that speaks for everyone in every single scenario.

329
00:28:17,107 --> 00:28:24,933
And we can see this actually by just if we look at what the cloud providers offer, not
just in the DNS space, but across the board when it comes to protocols and standards,

330
00:28:24,933 --> 00:28:28,075
they're not always up to date with the things that have been released.

331
00:28:28,075 --> 00:28:29,826
Like this is my opportunity to plug.

332
00:28:29,826 --> 00:28:37,042
I absolutely love the QUERY verb that just showed up recently in HTTP, as part of REST, as
a valid method.

333
00:28:37,042 --> 00:28:43,406
And I think it's still going to be years before we see the cloud providers implement
caching based on QUERY.

334
00:28:43,406 --> 00:28:52,306
I think the same thing goes for IPv6, which, you know, has been around forever, but
realistically, cloud providers still don't provide a strategy that works across the board.

335
00:28:52,306 --> 00:28:55,946
We already know about IPv4 address space exhaustion.

336
00:28:55,946 --> 00:29:00,866
We know in the DNS space that most of the cloud providers don't support some of the
records that exist.

337
00:29:00,866 --> 00:29:04,806
The one that comes to mind that I always sort of want to work is DNAME.

338
00:29:05,086 --> 00:29:11,406
It's sort of this thing that, if you don't know it, you know what CNAME is; DNAME is
for a whole subdomain, not just a single record.

339
00:29:11,406 --> 00:29:12,384
And I mean,

340
00:29:12,384 --> 00:29:15,477
It seems like the thing that is a foot gun in a lot of ways.

341
00:29:15,477 --> 00:29:20,982
So I understand why they haven't implemented it, but at the same time, it seems
like something that would be super useful.
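To make the CNAME versus DNAME difference concrete, here is a minimal Python sketch of the name substitution a DNAME performs (per RFC 6672): any name strictly below the DNAME owner is rewritten under the target, while the owner name itself is not redirected. The function name and the example domains are illustrative assumptions.

```python
from typing import List, Optional


def _labels(name: str) -> List[str]:
    """Split a domain name into lowercase labels, ignoring the trailing dot."""
    return [label.lower() for label in name.rstrip(".").split(".") if label]


def apply_dname(qname: str, owner: str, target: str) -> Optional[str]:
    """Return the rewritten name if qname is strictly below owner, else None."""
    q, o = _labels(qname), _labels(owner)
    # DNAME only covers descendants of the owner, never the owner itself.
    if len(q) <= len(o) or q[len(q) - len(o):] != o:
        return None
    prefix = q[: len(q) - len(o)]
    return ".".join(prefix + _labels(target)) + "."


if __name__ == "__main__":
    # A CNAME would redirect exactly one name; a DNAME redirects the whole subtree.
    print(apply_dname("www.old.example.", "old.example.", "new.example."))
    print(apply_dname("old.example.", "old.example.", "new.example."))
```

The second call returning `None` is one reason DNAME can feel like a foot gun: the apex of the redirected subtree still needs its own records.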

342
00:29:20,982 --> 00:29:29,968
As someone who has been working in the open source community for many, many years, I know how
easy it is to fall into the trap of saying yes to everything.

343
00:29:29,968 --> 00:29:41,937
And in fact, I have huge respect for all the people working on RFCs, you know, the
IETF working groups, because of the amount of complexity in designing something that

344
00:29:41,937 --> 00:29:48,491
potentially will take a long time to adopt, but also any mistake will be paid for years.

345
00:29:48,491 --> 00:29:48,811
Right?

346
00:29:48,811 --> 00:29:50,242
So you really need to...

347
00:29:50,242 --> 00:29:57,925
to think upfront and try to minimize the risk of making a choice that then you will regret
later.

348
00:29:57,925 --> 00:30:09,789
Which is by the way, something that all of us experience, all of us that have leadership
positions experience in some shape or form when building a software, implementing an

349
00:30:10,029 --> 00:30:12,882
infrastructure. Like, you pick that database.

350
00:30:12,882 --> 00:30:16,854
Because it seems a great idea in that moment.

351
00:30:16,854 --> 00:30:21,848
And then a couple of years later, oh man, why did I pick this NoSQL thing?

352
00:30:23,552 --> 00:30:24,672
Yeah, I mean, you're so on.

353
00:30:24,672 --> 00:30:32,984
It's like when you're in a company that you're running for multiple years, it's the sort
of thing where you have that long-term opportunity to both see forward, but unfortunately

354
00:30:32,984 --> 00:30:35,338
also get to regret all your past mistakes.

355
00:30:35,338 --> 00:30:41,238
And some of them are very difficult to see without having many, many cycles iterate.

356
00:30:41,238 --> 00:30:43,659
It's a very complex task.

357
00:30:43,659 --> 00:30:49,550
As I said, I also think that there are certain people that are best at that.

358
00:30:49,550 --> 00:30:51,041
It's not for everyone.

359
00:30:51,041 --> 00:31:02,644
You really need to have such a fast forward thinking mind, that extends to also a very
open mind to be able to understand the need of an industry, for example.

360
00:31:03,064 --> 00:31:09,976
But on the other side, you need to be ready to say no and to put some name because we all
have seen the...

361
00:31:09,976 --> 00:31:13,747
how easy a system could become overly complex.

362
00:31:13,747 --> 00:31:16,328
I often say that also to the team.

363
00:31:16,328 --> 00:31:19,769
Everything that we build today is a liability for the future.

364
00:31:19,769 --> 00:31:28,772
And so you often don't realize until later on, after you build that contract, after you sign
that contract, that you became liable for supporting it.

365
00:31:28,772 --> 00:31:32,273
You just can't remove an API endpoint after you build that.

366
00:31:32,273 --> 00:31:36,524
You can't remove a piece of infrastructure after you have...

367
00:31:36,524 --> 00:31:41,356
You build that, even if you think that nobody's using it, or even just a couple of people
are using it.

368
00:31:41,356 --> 00:31:42,456
Because guess what?

369
00:31:42,456 --> 00:31:52,500
Those are going to be the two people that are going to be pissed off and speak very
vocally about how dare you, you took down that service that they were really passionate

370
00:31:52,500 --> 00:31:53,441
about.

371
00:31:53,901 --> 00:31:57,032
So it's a very complex job.

372
00:31:57,032 --> 00:32:06,400
The one that has to decide and think long-term about solving a problem now, but in a way,
which is not going to become your problem.

373
00:32:06,400 --> 00:32:08,418
or another type of problem tomorrow.

374
00:32:08,418 --> 00:32:09,438
I mean, I'm totally with you.

375
00:32:09,438 --> 00:32:19,942
We see this all the time in our own product for identity and access management where
someone wants to add unnecessary claims into a JWT to be used for authentication or

376
00:32:19,942 --> 00:32:23,463
another API to handle attribute-based access control or something like that.

377
00:32:23,463 --> 00:32:32,586
And if you dive into the use case and you really try to have a customer focus and really
solve their business problem, you realize they're not even interested in doing it.

378
00:32:32,586 --> 00:32:41,924
in a reasonable way. They just have it in their mind that there's an approach, and if you
try to support it, it's going to cause a long-term problem. And I think this happens more

379
00:32:41,924 --> 00:32:52,102
and more when you're in the infrastructure space and you're providing such a critical
piece of infra for customers that an issue here can very rapidly get out of control.

380
00:32:52,102 --> 00:32:56,945
You know, at the lower layers, everything balloons out. And so what I really want to
ask you is:

381
00:32:57,927 --> 00:33:02,490
How does one run a DNS business in the first place and why I guess?

382
00:33:02,828 --> 00:33:04,259
It's a great question.

383
00:33:04,259 --> 00:33:06,141
When I find the answer, I'm going to tell you.

384
00:33:06,141 --> 00:33:07,062
No kidding.

385
00:33:07,062 --> 00:33:16,079
I have to say that if someone had asked me, you know, 20 years ago, about starting a DNS
business, that would never have crossed my mind.

386
00:33:16,079 --> 00:33:16,319
Right.

387
00:33:16,319 --> 00:33:19,051
It just happened, to be fair.

388
00:33:19,212 --> 00:33:24,416
I actually joined from the domain industry.

389
00:33:24,416 --> 00:33:30,371
When DNSimple was born, DNSimple acquired my company.

390
00:33:30,371 --> 00:33:31,370
I was starting a...

391
00:33:31,370 --> 00:33:34,331
I had a business related to domain management.

392
00:33:34,331 --> 00:33:38,192
So really the two faces of DNSimple are the DNS side and the domain side.

393
00:33:38,192 --> 00:33:42,433
So as I say, when I joined, it was the very early days.

394
00:33:42,893 --> 00:33:50,255
I actually joined as the first official employee, back in the days when DNSimple acquired my
company. It was still a pet project.

395
00:33:50,275 --> 00:33:59,718
So in fact, I helped to shape and build all that DNSimple is today, but I was coming
from the domain space, in fact.

396
00:33:59,718 --> 00:34:00,678
To me,

397
00:34:01,036 --> 00:34:06,487
DNS was, to an extent, what it is for many other people.

398
00:34:06,487 --> 00:34:12,309
It was a commodity on top of the domain space, the domain industry.

399
00:34:12,469 --> 00:34:18,590
The interesting piece is that DNS really connects with the whole domain industry very
closely.

400
00:34:18,590 --> 00:34:23,531
So it was an effortless transition for me to enter that space.

401
00:34:23,531 --> 00:34:27,632
And I somehow found the space very compelling.

402
00:34:27,632 --> 00:34:29,873
That's why also I...

403
00:34:30,060 --> 00:34:37,314
I was so excited to help build, you know, DNSimple and evolve the product, because
you never get bored.

404
00:34:37,314 --> 00:34:37,735
Really.

405
00:34:37,735 --> 00:34:49,101
There are so many challenges, so many R&D opportunities, so many new things.

406
00:34:49,442 --> 00:34:56,536
I don't have enough life to live in order to learn everything about this topic and this
industry.

407
00:34:56,536 --> 00:34:57,454
And so...

408
00:34:57,454 --> 00:35:09,674
At the same time, for someone that is really passionate about performance and algorithms, this
is a field, an industry, that really lets you work and

409
00:35:09,674 --> 00:35:19,074
playing with it because you are working on a level of scale in terms of, you know,
requests, data that is not comparable because you are literally at the backbone of the

410
00:35:19,074 --> 00:35:20,834
vast majority of the services.

411
00:35:20,834 --> 00:35:27,498
At minimum, you need to have the same, well, that's not true because you have caching, but I
would say, just to simplify it, at minimum you need to have

412
00:35:27,498 --> 00:35:32,342
one-to-one between the number of web requests and DNS requests, right?

413
00:35:32,342 --> 00:35:42,031
It means that if you pick any website and you are dealing with a challenge that is the
amount of traffic hitting that page, then certainly you have an equivalent problem in the

414
00:35:42,031 --> 00:35:45,634
form of DNS queries, if not more, right?

415
00:35:45,634 --> 00:35:50,038
Because then DNS is not just about the web traffic, but it's about mail traffic, it's
about everything else.

416
00:35:50,038 --> 00:35:54,638
And sometimes for a single web request, there are a dozen

417
00:35:54,638 --> 00:35:59,702
DNS requests, because maybe you are checking for the other types of records.

418
00:35:59,702 --> 00:36:01,420
You're checking for all the other services.

419
00:36:01,420 --> 00:36:05,946
You're checking for A, quadruple-A, NS, [unclear] records, all of that, right?

420
00:36:05,946 --> 00:36:10,399
So it's a very fascinating industry.

421
00:36:10,399 --> 00:36:13,551
And as I said, you are never short on challenges.

422
00:36:13,551 --> 00:36:16,773
And by the way, what's new about DNS?

423
00:36:16,794 --> 00:36:18,255
And that's true.

424
00:36:18,255 --> 00:36:23,084
I mean, as I was saying before, DNS has been the same for many, many years.

425
00:36:23,084 --> 00:36:32,982
What is constantly changing though, is not necessarily the underlying protocol, it's the
way people, systems, services interact with that.

426
00:36:33,162 --> 00:36:38,947
And that is constantly evolving because the needs of the ecosystem are constantly
evolving.

427
00:36:39,067 --> 00:36:49,316
DNS over HTTPS, DNS over TLS, DNS over other transports, all the various services, DNSSEC,
which is a whole different beast.

428
00:36:49,316 --> 00:36:52,588
All of those are coming from needs.

429
00:36:52,588 --> 00:36:56,501
that the DNS protocol didn't have years, decades ago.

430
00:36:56,501 --> 00:37:07,838
And so, yes, the fundamental topic, the underlying topic is the same, but the way that
people, the services interact with that changes and keeps changing.

431
00:37:07,838 --> 00:37:13,171
And so that is also part of the never-get-bored challenge that I was talking about.

432
00:37:13,171 --> 00:37:17,734
There's always a new way to use or abuse the protocol.

433
00:37:17,768 --> 00:37:23,851
You hit one of the items on my bingo card, which was DNSSEC versus DNS over HTTPS.

434
00:37:23,851 --> 00:37:31,565
And I'm wondering if you have a strong opinion on a direction here, like whether or not
companies should be implementing one or the other or both.

435
00:37:31,565 --> 00:37:41,549
And I think we've seen a non-trivial number of times where there were the
Salesforces or the Slacks that tried to roll out, especially, DNSSEC.

436
00:37:41,549 --> 00:37:47,682
And that works sort of, but then they decide to roll back and that causes a huge problem
where they're

437
00:37:47,682 --> 00:37:49,230
not signing things correctly.

438
00:37:49,230 --> 00:37:50,030
Right.

439
00:37:50,030 --> 00:37:52,771
I think they are not mutually exclusive.

440
00:37:52,771 --> 00:37:58,474
They serve different purposes and they also serve different types of use cases.

441
00:37:59,154 --> 00:38:08,178
If you look at the DNSSEC side, then you're really talking about signing and guaranteeing
that the data itself is unchanged.

442
00:38:08,178 --> 00:38:10,629
And so it's more at the low level.

443
00:38:10,629 --> 00:38:16,321
It's more about the backbone, the baseline for building things on top of it.

444
00:38:16,321 --> 00:38:18,674
For example, records like

445
00:38:18,674 --> 00:38:29,792
TLSA, DANE, all those kinds of implementations rely on making sure that you have a way to
guarantee that the record that you're receiving is the record that you're expecting to

446
00:38:29,792 --> 00:38:30,522
receive.

447
00:38:30,522 --> 00:38:35,536
And so really DNSSEC exists to solve certain types of use cases.

448
00:38:35,536 --> 00:38:47,982
On the other side, if you look at the various ways to communicate or to transport DNS, such as
DNS over HTTPS, I consider that more on the consumer side

449
00:38:47,982 --> 00:38:49,402
of the use cases, right?

450
00:38:49,402 --> 00:38:53,542
Like, different users will have different needs.

451
00:38:53,542 --> 00:39:04,502
And maybe they need to encrypt that data, not sign it, but encrypt it, because they
are in a region or an area of the world where even the payload of the DNS queries

452
00:39:04,502 --> 00:39:08,762
that for the vast majority is in plain text, in the clear in the wild, right?

453
00:39:08,762 --> 00:39:12,102
Even with DNSSEC, you're not really encrypting the data itself.

454
00:39:12,102 --> 00:39:13,202
Data is still visible.

455
00:39:13,202 --> 00:39:17,388
It just provides a way to guarantee that the data has not been altered.

456
00:39:17,388 --> 00:39:20,480
Those are, in my opinion, serving different purposes.

457
00:39:20,480 --> 00:39:22,131
They are not mutually exclusive.
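To illustrate the transport side of this distinction, the sketch below builds a plain DNS query in wire format and encodes it the way a DNS over HTTPS GET request carries it (RFC 8484: base64url without padding, in a `dns` query parameter). Nothing here is signed: DoH wraps the same message in HTTPS, whereas DNSSEC adds signatures to the answers and leaves them readable. The helper names are assumptions, and no network request is made.

```python
import base64
import struct


def build_dns_query(name: str, qtype: int = 1) -> bytes:
    """Build a minimal DNS query (default QTYPE 1 = A, QCLASS 1 = IN)."""
    # Header: ID=0, flags=0x0100 (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid label length in {name!r}")
        qname += bytes([len(raw)]) + raw
    qname += b"\x00"  # root label terminates the name
    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_parameter(message: bytes) -> str:
    """Encode a DNS message as the value of the `dns` parameter of a DoH GET."""
    return base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")


if __name__ == "__main__":
    query = build_dns_query("www.example.com")
    # prints AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB
    print(doh_get_parameter(query))
```

The encoded value is the same bytes a classic UDP resolver would send in the clear; only the transport around it changes.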

458
00:39:22,131 --> 00:39:34,968
They have a problem in common, which is that adoption has been extremely slow because there
hasn't been enough push from major providers, from the industry as a whole, the same as with IPv6,

459
00:39:34,968 --> 00:39:35,549
right?

460
00:39:35,549 --> 00:39:40,872
Because it's an industry that is very scared, rightfully so, of breaking things.

461
00:39:40,872 --> 00:39:43,113
So it's moving very, very slowly.

462
00:39:43,113 --> 00:39:44,648
And so there hasn't been...

463
00:39:44,648 --> 00:39:56,603
huge adoption, or the adoption generally follows whichever large provider comes first
and draws a line or clears the path, right?

464
00:39:56,603 --> 00:40:06,768
Like, the adoption of certain ways of resolving DNS through alternative protocols has
increased thanks to browsers implementing them.

465
00:40:06,768 --> 00:40:10,759
But it's also such a technical topic that then what do you expect?

466
00:40:10,759 --> 00:40:12,504
Like the normal consumer?

467
00:40:12,504 --> 00:40:16,042
to just go there and select which protocol to use.

468
00:40:16,525 --> 00:40:21,135
It's not the kind of thing that an end user would care about.

469
00:40:22,299 --> 00:40:28,544
And even if they do, the problem is that they're not going to be technical enough to drive
forward the right sort of implementation.

470
00:40:28,544 --> 00:40:31,638
Where does "we need privacy in DNS" come from?

471
00:40:31,638 --> 00:40:34,450
It usually comes from a consumer concern.

472
00:40:34,450 --> 00:40:40,425
But if the implementation is only on the other side, there are all these layers in
between that just don't care enough.

473
00:40:40,425 --> 00:40:47,722
And so I think you're stuck in this loop where it would be good for everyone, but we just
don't get there because there's not enough like say money behind it.

474
00:40:47,854 --> 00:40:48,214
Right.

475
00:40:48,214 --> 00:40:55,434
In my opinion, this is a place where the industry should step in and the industry leaders
should set some requirements.

476
00:40:55,434 --> 00:40:56,774
Just think about it.

477
00:40:56,774 --> 00:41:03,034
Do you think the customers or the consumers would have cared about having the HTTPS in the
browser?

478
00:41:04,414 --> 00:41:05,714
Why did that happen?

479
00:41:05,714 --> 00:41:10,874
It happened because the browsers got there and said, you know what?

480
00:41:10,874 --> 00:41:12,414
I'm going to give you one year.

481
00:41:12,414 --> 00:41:14,134
I'm going to give you three years.

482
00:41:14,134 --> 00:41:17,026
In three years from now, we're not going to open

483
00:41:17,026 --> 00:41:22,230
by default, any URL that is not HTTPS powered, period.

484
00:41:22,230 --> 00:41:28,834
Same story is happening now with the decrease of the duration, the lifespan of SSL
certificates.

485
00:41:28,834 --> 00:41:39,221
That's where Let's Encrypt came in and really disrupted the market, among other things, by
saying, by default, I'm gonna give you 90 days when, recall, the whole industry was about

486
00:41:39,221 --> 00:41:39,886
three years.

487
00:41:39,886 --> 00:41:42,423
I'm gonna give you 90 days, that's it.

488
00:41:42,423 --> 00:41:45,065
No extension, no excuse, nothing else.

489
00:41:45,065 --> 00:41:46,702
You have to automate.

490
00:41:46,702 --> 00:41:56,122
You have to stop having individual users changing, like logging into a system and manually
replacing the SSL certificate.

491
00:41:56,122 --> 00:41:56,462
Right?

492
00:41:56,462 --> 00:41:59,022
So it was three years, then it became one.

493
00:41:59,022 --> 00:42:04,122
And now in the next three years, it's going to be all the way down to 47 days.

494
00:42:04,322 --> 00:42:06,902
I don't recall because it's one of those odd numbers.

495
00:42:06,902 --> 00:42:08,242
Why would you pick 47?

496
00:42:08,242 --> 00:42:10,002
I actually have no idea.

497
00:42:10,525 --> 00:42:12,190
I'm sure there is a reason.

498
00:42:12,190 --> 00:42:13,875
Logic, right?

499
00:42:13,875 --> 00:42:14,926
I don't know what it is.

500
00:42:14,926 --> 00:42:16,106
I'm sure he's right.

501
00:42:16,106 --> 00:42:17,306
But you see the point.

502
00:42:17,306 --> 00:42:24,386
My point is people should not have to care about turning on DNSSEC for their domain.

503
00:42:24,386 --> 00:42:32,506
If, as an industry, we think that this is an improvement in the ecosystem, then let's make
that required.

504
00:42:32,506 --> 00:42:39,566
And let's have the industry players work to figure out how to provide the right
infrastructure for the consumers.

505
00:42:39,566 --> 00:42:42,822
And in fact, there are a few TLDs that actually did it.

506
00:42:42,822 --> 00:42:58,786
If you think about it, I don't know if you know it, but there are a couple of gTLDs
like .bank and .insurance that enforce the use of DNSSEC for every user.

507
00:42:58,786 --> 00:43:09,230
So at the registry level, if you are serving your DNS without DNSSEC, so if you are not
signing your zone, the domain is disabled.

508
00:43:09,230 --> 00:43:17,750
at registry level, they turn down delegation if they see that you're not using DNSSEC, so
if the zone is not signed.

509
00:43:17,750 --> 00:43:23,150
So in order to turn that on, you actually need to sign the zone and publish the DS record.

510
00:43:23,230 --> 00:43:28,610
There are also a couple of other TLDs. Sorry, those were gTLDs.

511
00:43:28,610 --> 00:43:38,350
There are a few ccTLDs, if I recall correctly, in the Northern European area, I believe it's
.nl and .ac.

512
00:43:38,350 --> 00:43:47,790
and a couple of others that are not enforcing it in the sense that they're not making
full, like if you don't have it, it's off.

513
00:43:47,990 --> 00:43:56,650
But they're doing some kind of campaign or some kind of promotion, some kind of
communication that really pushes for it, to the point that, and I don't have the stats in

514
00:43:56,650 --> 00:44:06,770
front of me, but if you look at the stats, I believe that for those
TLDs the level of DNSSEC adoption is above 90% of the domain space.

515
00:44:06,770 --> 00:44:07,968
It's super high.

516
00:44:07,968 --> 00:44:08,681
Yeah.

517
00:44:08,681 --> 00:44:13,318
For the others, sometimes it's not even close to 5%.

518
00:44:13,442 --> 00:44:14,127
for sure.

519
00:44:14,127 --> 00:44:15,775
That seems high to me even.

520
00:44:16,012 --> 00:44:17,943
That's the point.

521
00:44:17,943 --> 00:44:24,858
If we need to get that out, then industry leaders must make that required.

522
00:44:24,858 --> 00:44:28,270
So we cannot expect the consumer to just come and say, you know what?

523
00:44:28,270 --> 00:44:30,672
I really want DNSSEC today.

524
00:44:30,712 --> 00:44:35,015
If we were going to invent DNS today, it would probably be signed.

525
00:44:35,015 --> 00:44:43,741
Because DNS was invented in an epoch where we were all hippies and fair people, and we
didn't think that, you know...

526
00:44:43,741 --> 00:44:45,154
uh

527
00:44:45,154 --> 00:44:47,655
the world was not flowers and roses.

528
00:44:47,715 --> 00:44:53,137
Therefore, if we were going to do that today, most likely many choices would be different.

529
00:44:53,137 --> 00:45:01,662
So as we transition towards that, let's just make it super easy for someone to adopt that
technology as if it was the default.

530
00:45:01,784 --> 00:45:02,465
For sure, for sure.

531
00:45:02,465 --> 00:45:12,184
You know, it's still surprising to me that we live in a world where things like GPS aren't
signed, that spoofing GPS locations is still something that can happen.

532
00:45:12,184 --> 00:45:20,433
But instead of going down that tangent, I think maybe this would be a good opportunity to
start to close out the episode and switch over to picks.

533
00:45:20,433 --> 00:45:20,943
Should we do it?

534
00:45:20,943 --> 00:45:24,126
Absolutely.

535
00:45:24,126 --> 00:45:24,697
Yes, let's go.

536
00:45:24,697 --> 00:45:25,806
What did you bring for us?

537
00:45:25,806 --> 00:45:26,486
All right.

538
00:45:26,486 --> 00:45:32,946
So I'm switching away from technology because we have talked about technology for a while.

539
00:45:32,946 --> 00:45:34,686
So, I'm a scuba diver.

540
00:45:34,686 --> 00:45:35,906
I'm an avid scuba diver.

541
00:45:35,906 --> 00:45:37,046
I really enjoy scuba diving.

542
00:45:37,046 --> 00:45:47,506
I think it's one of those things that to me was absolutely fantastic because it was a way
for me to disconnect from everything around.

543
00:45:47,606 --> 00:45:51,606
Like literally, because I couldn't bring my phone, at least back in the days, I couldn't
bring my phone anywhere.

544
00:45:51,606 --> 00:45:52,286
Now you can.

545
00:45:52,286 --> 00:45:54,226
You also have waterproof cases.

546
00:45:54,936 --> 00:45:56,386
There's no signal in the water.

547
00:45:56,386 --> 00:46:05,789
And so I just want to suggest, I just want to throw the idea out to anyone out there: take
a break, do something out in the wild.

548
00:46:05,789 --> 00:46:10,170
And if you have never tried scuba diving, go and try that.

549
00:46:10,170 --> 00:46:15,112
There are many places; there's a solution, there's an opportunity for everyone.

550
00:46:15,112 --> 00:46:23,654
If you like rocks, if you like ancient history, if you like fish, if you like nature,
if you like...

551
00:46:23,654 --> 00:46:24,534
uh

552
00:46:24,534 --> 00:46:26,005
whales or sharks.

553
00:46:26,005 --> 00:46:27,456
I do love sharks.

554
00:46:27,456 --> 00:46:29,488
There's opportunity for any of that.

555
00:46:29,488 --> 00:46:34,012
And I live in Italy, and off the coast of Italy, we have plenty of that.

556
00:46:34,012 --> 00:46:41,568
Unfortunately, we have a lot of wrecks here because of the Second World War and First
World War, which I love to explore.

557
00:46:41,568 --> 00:46:43,469
I can share some of the links.

558
00:46:43,469 --> 00:46:45,450
There's also ancient history.

559
00:46:45,571 --> 00:46:51,936
We have seen a lot of places where there are amphoras and artifacts from the Roman epoch.

560
00:46:52,417 --> 00:46:54,618
We don't have as many fish as

561
00:46:54,667 --> 00:46:57,973
the Maldives or the Red Sea, at least big ones.

562
00:46:58,195 --> 00:47:03,356
But that's the point, just take a break, go out and try scuba diving if you haven't.

563
00:47:03,650 --> 00:47:05,411
I actually do it myself.

564
00:47:05,411 --> 00:47:13,456
I haven't gone since the pandemic, unfortunately, but it's actually really easy to just
get a certification in a lot of places to start the process.

565
00:47:13,456 --> 00:47:16,190
There's shops everywhere where they'll train you.

566
00:47:16,190 --> 00:47:20,835
It's like a week of written stuff and then a week in the water in a pool.

567
00:47:20,835 --> 00:47:27,861
Then you go on vacation somewhere and finish the diving certification wherever you go on
vacation and whatever they have there.

568
00:47:27,861 --> 00:47:30,203
I'm not, unfortunately, I'm not a fan of wrecks so much.

569
00:47:30,203 --> 00:47:32,986
Like natural reefs are my thing.

570
00:47:33,106 --> 00:47:34,548
But they're everywhere too.

571
00:47:34,548 --> 00:47:37,511
And honestly, there's just nothing that compares to it.

572
00:47:37,511 --> 00:47:38,551
Snorkeling is okay.

573
00:47:38,551 --> 00:47:41,514
It's just not the same as going diving.

574
00:47:41,662 --> 00:47:42,723
Absolutely 100%.

575
00:47:42,723 --> 00:47:44,294
Snorkeling is an effort.

576
00:47:44,294 --> 00:47:51,521
It feels like an effort, especially what I keep saying, if you enjoy it, you want to spend
as much time as you can underwater.

577
00:47:51,521 --> 00:47:54,653
Snorkeling, you are limited by your breath capability.

578
00:47:54,653 --> 00:48:01,459
Generally, for most people, it's very short or by the wave or by the effort of swimming on
the surface.

579
00:48:02,400 --> 00:48:09,466
If you can go down with a tank and just spend an hour underwater, it's going to change
your day really.

580
00:48:09,466 --> 00:48:13,127
You get 40 minutes underwater and then you're usually forced to resurface.

581
00:48:13,127 --> 00:48:14,708
But yeah, no, I'm totally with you.

582
00:48:14,708 --> 00:48:22,310
The snorkeling is like swimming, whereas diving is sort of like meditation underwater is
really the analogy that I'll make.

583
00:48:22,310 --> 00:48:23,430
It is great, honestly.

584
00:48:23,430 --> 00:48:24,091
I love it too.

585
00:48:24,091 --> 00:48:24,831
I love that pick.

586
00:48:24,831 --> 00:48:35,404
Maybe as far as something for the audience, if there's a particular link to a dive site or
location that you would recommend, I think we'll have that link in the description.

587
00:48:35,446 --> 00:48:36,508
Absolutely, absolutely.

588
00:48:36,508 --> 00:48:39,218
I will make sure to share something with the audience.

589
00:48:39,402 --> 00:48:40,262
Great, great.

590
00:48:40,262 --> 00:48:40,903
I love it.

591
00:48:40,903 --> 00:48:42,303
Okay, then I guess I'll share mine.

592
00:48:42,303 --> 00:48:53,446
I had a different pick, but your analogy that DNS infrastructure is sort of like
digital electricity made me remember a book that Will actually had shared a lot of

593
00:48:53,446 --> 00:48:55,767
episodes ago called One Second After.

594
00:48:55,767 --> 00:48:58,287
It talks about the...

595
00:48:59,208 --> 00:49:06,990
I'll spare the incoming details about the book, but it's really just about sort of a
cataclysmic event, sort of a...

596
00:49:06,990 --> 00:49:15,185
post-apocalyptic event that happens in the United States where you basically lose
electricity, and how people survive.

597
00:49:15,185 --> 00:49:18,758
And you were listing out the things that immediately cause problems if you lose
electricity.

598
00:49:18,758 --> 00:49:25,904
And one of the things that the book brings up that never really occurred to me is that a
lot of medicine needs refrigeration.

599
00:49:25,904 --> 00:49:30,067
And so there's just a huge impact to humanity.

600
00:49:30,067 --> 00:49:34,894
Without the electricity, you think you can get by by either riding a bike

601
00:49:34,894 --> 00:49:41,574
to power a fridge temporarily, but realistically, there's a lot of things that actually
require it in order for us to continue going forward.

602
00:49:41,574 --> 00:49:49,634
I found it to take a really sort of interesting and in-depth approach to understanding what
those sort of connected complexities are.

603
00:49:49,634 --> 00:49:53,754
And I think there's a lot of similarities to running the digital infrastructure of the
internet.

604
00:49:53,848 --> 00:49:59,629
Well, hopefully we're not going to end up in a position where electricity is unavailable
for a very long time.

605
00:49:59,771 --> 00:50:05,842
And you're going to have to, I don't know, ride a bike in order to either distribute DNS
packets around.

606
00:50:05,995 --> 00:50:12,320
I think at that point we may sort of graduate back to the Avian flight protocol.

607
00:50:13,477 --> 00:50:15,302
I don't think I've heard about it.

608
00:50:15,302 --> 00:50:20,568
The RFC for sending digital packets over carriers.

609
00:50:20,568 --> 00:50:25,964
Yeah, the Easter egg, the April Fools' joke, wasn't it?

610
00:50:25,964 --> 00:50:29,304
Yeah, actually, there's a whole bunch of April Fool's jokes written into RFCs.

611
00:50:29,304 --> 00:50:33,395
There's like 10 or so of them that you can go through and find them randomly.

612
00:50:33,882 --> 00:50:37,148
The one about HTTP is the teapot status code.

613
00:50:37,148 --> 00:50:38,262
Isn't that one?

614
00:50:38,262 --> 00:50:45,487
So yeah, there's the one about HTTP status code 418, the "I'm a teapot" error response.

615
00:50:45,487 --> 00:50:54,792
So I know reading RFCs and long articles on stuff isn't everyone's idea of a good time
or bedtime reading.

616
00:50:54,792 --> 00:50:56,493
There are some good jokes in there.

617
00:50:56,493 --> 00:51:04,428
And honestly, if you're in an industry where you're implementing any sort of thing that
there has been a standard around it, it really helps to know what that standard was and

618
00:51:04,428 --> 00:51:07,041
sort of the motivation, even, that's encapsulated in it.

619
00:51:07,041 --> 00:51:09,578
So you know the direction that you can go with your technology.

620
00:51:09,578 --> 00:51:10,158
Absolutely.

621
00:51:10,158 --> 00:51:21,553
If you have never, I think it's an exercise, honestly, as much as I was joking about it, but
I think that anyone that has been in this industry for more than probably a year at some

622
00:51:21,553 --> 00:51:23,394
point needs to read an RFC.

623
00:51:23,394 --> 00:51:28,026
It's really part of your experience learning and growing.

624
00:51:28,086 --> 00:51:34,569
Reading it is sometimes challenging, but at the same time it is a learning experience.

625
00:51:34,569 --> 00:51:36,443
So go out and read one.

626
00:51:36,443 --> 00:51:38,768
Honestly, I'd say that they're actually pretty simple.

627
00:51:38,768 --> 00:51:45,066
They're not like legal... like, if you've ever read a legal contract, it's so much simpler to read
an RFC compared to that.

628
00:51:45,066 --> 00:51:46,924
Absolutely, absolutely.

629
00:51:46,978 --> 00:51:53,788
Well, I'll say thank you, Simone, for coming on and talking about everything related to DNS.

630
00:51:53,788 --> 00:51:55,544
It's been absolutely great.

631
00:51:55,544 --> 00:51:56,517
Thanks for having me, Warren.

632
00:51:56,517 --> 00:51:58,963
It was really a pleasure to be here.

633
00:51:59,028 --> 00:51:59,619
Of course.

634
00:51:59,619 --> 00:52:05,304
And thanks to the audience for showing up for this episode and hopefully we'll see
everyone back next week.

