Inside Claude Code From the Engineers Who Built It
By Every
Summary
## Key takeaways - **Claude Code: Full Terminal Access**: Claude Code's core innovation is its direct access to everything an engineer does at the terminal, eliminating the need for a separate text editor and enabling a new paradigm for AI-assisted coding. [02:00] - **Accidental Innovation: The Birth of Claude Code**: Claude Code wasn't intentionally designed as a new paradigm; it emerged from engineers prototyping API access via the terminal and discovering the model's strong inclination to use tools like bash. [02:35], [04:31] - **Dogfooding Drives Product Development**: Anthropic 'antfoods' Claude Code, with 70-80% of technical employees using it daily, generating rapid feedback on new features through a dedicated channel that allows for quick iteration and bug fixing. [08:08:11], [08:36:38] - **Latent Demand Fuels Product Growth**: Claude Code's development strategy relies on 'latent demand,' building a hackable and open-ended product that allows users to discover novel use cases, which then informs future development. [34:37:40], [35:48:49] - **Dual-Use Tools Simplify AI Interaction**: Designing tools to be usable by both humans and the AI model simplifies understanding and integration, as the principle 'everything you can do, Claude Code can do' applies. [10:56:58], [11:17:19] - **Compounding Engineering for Easier Development**: Every's 'compounding engineering' paradigm focuses on making each new feature easier to build by codifying learnings into prompts and sub-agents, enabling developers to be productive even in unfamiliar codebases. [30:09:12], [30:49:51]
Topics Covered
- The CLI is the Future of Coding?
- Latent Demand: Build for Unforeseen Use Cases
- AI Models Eagerly Adopt Tools
- Dual-Use Tools Benefit Humans and AI
- Embrace "Compound Engineering" for Easier Development
Full Transcript
What made it work really well is that
quad code has access to everything that
an engineer does at the terminal.
Everything you can do, quad code can do.
There's nothing in between.
>> There's actually an increasing number of
people internally at anthropic that are
using like a lot of credits like
spending like over a,000 bucks every
month. We see this like power user
behavior. This is something that they
teach in YC. If you can solve your own
problem, it's much more likely you're
solving the problem for others. There's
this like really old idea in product
called latent demand. You build a
product in a way that is hackable that
is kind of open-ended enough that people
can abuse it for other use cases it
wasn't really designed for and you build
for that cuz you kind of know there's
demand for it. Do you think the CLI is
the final form factor? Are we going to
using cloud code in the CLI primarily in
a year or in 3 years or is there
something else that's better
This podcast is sponsored by Google. Hey
folks, I'm Omar, product and design lead
at Google DeepMind. We just launched a
revamped vibe coding experience in AI
Studio that lets you mix and match AI
capabilities to turn your ideas into
reality faster than ever. Just describe
your app and Gemini will automatically
wire up the right models and APIs for
you. And if you need a spark, hit I'm
feeling lucky and we'll help you get
started. Head to a.studio/bild.
studio/build to create your first app.
>> Cat Boris, thank you so much for being
here.
>> Thanks for having us.
>> Yeah. Um, so for people who don't know
you, you are the creators of Claude
Code. Thank you very much from the
bottom of my heart. It's uh I love Cloud
Code.
>> That's amazing to hear. That's what we
love to hear. Um I Okay, I think the
place I want to start is when I first
used it. Um, there was like this moment
like I think it was around when uh
Sonnet 37 came out where I was like I
used it and I was like, "Holy this
is like a completely new paradigm. It's
a a completely new way of thinking about
code." And the the big difference was um
you went all the way and just eliminated
the text editor and you're just like all
you do is like talk to the talk to the
terminal and and that's that's it. Um,
and you know, previous paradigms of AI
programming, pre previous harnesses have
been like you have a text editor and you
have the AI on the side and it's kind of
like or it's a tab complete. So, take me
through like that decision process that
ar that that process of of architecting
this new paradigm. How do you how did
you think about that?
>> Yeah, I think the the most important
thing is it was not intentional at all.
Okay.
>> Uh, we we sort of ended up with it. So
at the time when I joined Enthropic um
we were still on different teams at the
time. Um there was this previous
predecessor to quad code. It was called
Clyde like CL C cliop.
And it was this like research project
you know it took like a minute to start
up. It was this kind of like really
heavy Python thing. It had to like run a
bunch of indexing and stuff. And when I
joined I wanted to ship my first PR and
I hand wrote it like a you know like a
noob in a in a like I didn't know about
any of these tools.
>> Thank you for admitting that.
U I didn't know any better and then I I
put up this PR and um Adam Wolf who was
the um manager for our team for a while.
He was my ramp up buddy and he just like
rejected the PR and he was like you
wrote this by hand. What are you what
are you doing? Use Quide. Um cuz he was
also hacking a lot on Quiet at the time.
And so I tried Quaid. I gave it the
description of the task and it just like
one shot at this thing
>> and this was like you know sonnet 35. So
I still had to fix a thing even for this
kind of basic task and the harness was
super old. So it took like 5 minutes to
turn this thing out and just took
forever and um but it but it worked and
I was just mind-blown that this this was
even possible and they just kind of got
the gears turning. maybe you don't
actually need an IDE.
>> And then later on I was prototyping
using the anthropic API and the easiest
way to do that was just building a
little app in the terminal cuz that way
I didn't have to build a UI or anything.
And I started just making a little chat
app and then I just started thinking
maybe we could do something a little bit
like Clyde. So let let me build like a
little Clyde and it actually ended up
being a lot more useful than that
without a lot of work.
And I think the biggest revelation for
me was when we started to give the model
tools. It just started using tools and
it was just it was this insane moment.
Like the model just wants to use tools.
Like we gave it bash and it just started
using bash writing apple script to like
automate stuff uh in response to
questions. And I was like this is just
the craziest thing. I've never seen
anything like this. Cuz at the time I
had only used IDE with like you know
like text editing a little like oneline
autocomplete, multi-line autocomplete,
whatever. Um, so that that's where this
came from. It was this kind of
convergence of like prototyping but also
kind of seeing what's possible in kind
of like a very um rough way.
>> Um, and this thing ended up being
surprisingly useful and and I think it
was the same for us. I think for me it
was like kind of sonnet 4 opus 4. That's
where that magic moment was. I was like,
"Oh my god, this this thing works."
>> That's interesting. So like tell me
about that that the tool moment because
I think that is one of the special
things about cloud code is it just
writes bash and it's really good at it.
And I think a lot of um previous agent
architectures or even anyone building
agent today, your first instinct might
be okay, we're going to give it a find
file tool and then we're going to give
it a uh open file tool and you you build
all these like custom wrappers for you
know uh all the different actions you
might want the agent to take, but Cloud
Code just uses bash and it's like really
good at it. So how do you think about um
how do you think about what you learned
from that?
Yeah, I think we're at this point right
now where Quad Code actually has a bunch
of tools. I think it's like a dozen or
something like this. We we actually like
add and remove tools most weeks. So,
this changes pretty often. Um, but today
there actually is a search uh there's a
tool for for searching. Um, and we do
this for two reasons. One is the UX, so
we can show the result a little bit
nicer to the user because there's still
a human in the loop right now for most
tasks. Uh, and the second one is for
permissions. So, if you say in your like
cloud code like settings.json JSON on
this file you cannot read. We we have to
kind of enforce this. Uh we enforce it
for bash but we can do it a little bit
more efficiently for if we have a
specific search tool. Um but definitely
we want to like unhip tools and kind of
keep it simple for the model. Um like
last week or two weeks ago we unchipped
the OS tool because in the past we
needed it but then we actually built a
way to enforce this kind of permission
system for bash. Um, so in Bash, if we
know that you're not allowed to read a
particular directory, Quad's not allowed
to OS that directory. And because we can
enforce that consistently, we don't need
this tool anymore. Um, and this is nice
because it's a it's a little less choice
for Quad. A little less stuff in
context.
>> Got it. And how do you guys split
responsibility on the team?
>> Um, I would say Boris sets the technical
direction and has been the product
visionary for a lot of the features that
we've come out with. I see myself as
more of like a supporting role to make
sure that um that one that like our
pricing and packaging resonates with our
users. Um two making sure that we're
shephering all our features across the
launch process. So from like deciding
all right like these are the prototypes
that we should definitely ant food to
like setting the quality threshold for
ant fooding through to communicating
that to our end users. And um there's
definitely some new initiatives that
we're working on that uh I would say
historically a lot of quad code has been
built bottoms up like Boris and a lot of
the core team members have just had
these great ideas for to-do list sub
agents hooks like all these are bottoms
up. As we think about expanding to more
services and bring cloud code to our
places, I think a lot of those are more
like, all right, let's talk to
customers. Let's bring engineers into
those conversations and prioritize those
services and knock them out.
>> Got it. What is ant fooding?
>> Oh, ant fooding is
>> Oh, ant fooding.
>> Oh, um it it means dog fooding. So,
>> anthropic ant. I got it.
>> Yeah. Our nickname for um internal
employees is ant. And so uh ant fooding
is our version of dog fooding. Uh
internally over I think 70 or 80% of
ants uh technical anthropic employees
use cloud code every day. And so every
time we are thinking about a new
feature, we push it out to people
internally and we get so much feedback.
We have a feedback channel. I think we
get a post every five minutes. And so
you get really quick signal on whether
people like it, whether it's buggy, um
or whether
uh it's not good and we should unchip
it.
>> You can tell um you can tell that
someone that is building stuff is using
it all the time to build it. Uh because
the the like its ergonomics just makes
sense if you're trying to build stuff
and that that only happens if you're
like ant ant fooding.
Um I Yeah. Yeah. And I I think that
that's a really interesting paradigm for
building new stuff like that sort of
bottoms up I make something for myself.
Um tell me about that.
>> Yeah. And C cat is also so humble. Um I
think cat has a really big role in the
product direction also like it comes
from everyone on the team and like these
specific examples this actually came
from everyone on the team like to-do
lists and sub aents that was Sid Hooks
Dixon shipped that plugins Daisy shipped
that.
>> So like everyone on the team like these
ideas come from everyone. Um, and so I
think for us like we build this core
agent loop and this kind of core
experience and then everyone on the team
uses the product all the time. Uh, and
so everyone outside the team uses the
product all the time. And so there's
just all these chances to build things
that serve these needs. Like for
example, like bash mode, you know, like
the exclamation mark and you can type in
bash commands. This was just like many
months ago. I was using quad code and I
I was going back and forth between two
terminals and just thought it was kind
of annoying. Uh, and just on a whim, my
asked squad to kind of think of ideas,
the thought of this like exclamation
mark bash mode. And then I was like,
great, make it pink and then ship it.
It just did it. And like that that's the
thing that still kind of persisted. And
you know, now you see kind of others
also kind of catching on to that.
>> That's funny. I actually didn't know
that. And that's extremely useful
because I always have to open up a new
tab to like run any bash commands. So
you just you just do an exclamation
point and then it just like runs it
directly instead of filtering it through
all all the cloud stuff.
>> Yeah. And quad code sees the full output
too.
>> Interesting. That's perfect.
>> So anything you see in the cloud code
view, cloud code also sees.
>> Okay, that's really interesting.
>> And this is kind of a UX thing that
we're thinking about. Like in the past
tools were built for engineers, but now
it's equal parts engineers and model.
>> And so like as an engineer, you can see
the output, but it's actually quite
useful for the model also. And this is
part of the philosophy also like
everything is dual use. Um so for
example, the model can also call slash
commands. So like you know I have a
slash command for slashcomit where I run
through kind of a few different steps
like diffing and generating a reasonable
commit message and and this kind of
stuff. I run it manually but also Claude
can run this for me. Uh and this is
pretty useful because we get to share
this logic. We get to kind of define
this tool and then we we both get to use
it.
>> Yeah. What are the differences in uh
designing tools that are dual use from
designing tools that are you know used
by one or the other? Surprisingly, it's
the same.
>> Okay.
>> So far.
>> Yeah. I I I sort of feel like this kind
of elegant design for humans translates
really well to the models.
>> So, you're just thinking about what
would make sense to you and the model
generally, it makes sense to the model,
too, if it makes sense to you.
>> Yeah. I think one of the really cool
things about Cloud Code being um a
terminal UI and what made it work really
well is that Cloud Code has access to
everything that an engineer does at the
terminal. And I think when it comes to
whether the tool should be dual use or
not, I think making them dual use
actually makes the tools a lot easier to
understand. It just means that okay,
everything you can do, cloud code can
do. There's nothing in between.
>> Yeah, that's interesting. Yeah, there
there are a couple of those decisions.
So um
no no code editor, it's in the terminal,
so it has access to your files. Um, and
it's it's on your computer versus like
in the cloud in a virtual machine. So
you get like repeated you you get to use
it in a repeated way where you can like
you know build up your cloud MD file or
you know like all all like build slash
commands and all that kind of stuff
where it becomes very composable um and
extensible from a very simple
starting point. And I'm curious about
how you think about, you know, for for
people who are thinking about, okay, I
want to build an agent, I want to build
probably not cloud code, but like
something else, how you get that that
simple package that then can extend and
be really powerful over time.
>> For me, I I start by just thinking about
it like developing any kind of product
where you have to solve the problem for
yourself before you can solve it for
others. And like this is something that
they teach in YC is you have to start
with yourself. So like if you if you can
solve your own problem, it's much more
likely you're solving the problem for
others. And I I think for coding
starting locally is the reasonable thing
and you know now we have cloud code on
the web. So you can also use it with a
virtual machine and um you know you can
use it in a remote setting and this is
super useful when you're on the go you
want to take that from your phone
>> and and this is sort of we we started
proving this out kind of a step bat a
time
>> where you can do atcloud in GitHub and
uh I use this every day like on the way
to work I'm like at a red light I
probably shouldn't be doing this but I'm
like you know on GitHub at a red light
and then I'm like at claude you know fix
this issue or whatever and so it's it's
just real useful to be able to control
it from your phone um and this kind
proves out this experience. I I don't
know if this necessarily makes sense for
every kind of use case. For coding, I
think starting local is right. Um I
don't know if this is true for
everything though.
>> Got it. What are the slash commands you
guys use?
>> Slashprit.
>> Yeah.
>> Um yeah, it it's I I think the
pritcomand makes it a lot faster for
claw to know exactly what bash commands
to run in order to make a commit.
>> And what does the prit slash command do
for people who are unfamiliar? Oh, it it
just tells it like exactly how to make a
commit. Okay.
>> Um and you can like dynam you can say
like, okay, these are the three bash
commands that need to be run.
>> Got it. And and what's pretty cool is
also we have um this kind of templating
system built into slash commands. So we
actually run the bash commands ahead of
time. They're like embedded into the
slash command. Um and you can also
pre-allow certain tool invocations. So
for that slash command we say allow um
you know get commit get push gh and so
you don't get asked for permission after
you run the slash command because we
have like a permission uh based security
system. Um and then also it uses haik
coup which is pretty cool. Um so it's
kind of a cheaper model and faster. Um
yeah and for me I I use like commit uh
commit PR uh feature dev we use a lot.
So like sid created this one. It's kind
of cool. So it kind of like walks you
through step by step um building
something. So we prompt quad to like
first ask me how to what exactly I want
like build the specification
>> and then um you know kind of like build
like a detailed plan and then make a
to-do list walk through step by step. So
it's kind of like more structured
feature development
>> and then I think the last one that we
probably use a lot so we use like
security review for all of our PRs and
then also code review. Um so like quad
does all of our code review internally
at anthropic. Um, you know, there's
still a human approving it, but quad
does kind of the first step in code
review. That's just a slashcode review
command.
>> Got it. Yeah. What are the things I
would love to go deeper into like the
how do you make a good plan? So, the
sort of the feature dev thing because I
think there's a lot of like little
tricks that um I'm starting to find or
people at every start starting to find
that work and I'm curious like what what
are things that that we're missing. So
for example, one um step in the one
unintuitive step of the of the you know
plan development process is even if I
don't exactly know what the thing that
needs to be built is I just have like a
little sentence in my mind like I want
feature X I have Claude just like
implement it just without giving it
anything else and I see what it does and
that helps me understand like okay
here's actually what I mean because it
made all these different mistakes or
like it it did something that I didn't
expect that might be
And then I use that like the learning
from the sort of throwaway development.
I just clear it out. And then that helps
me write a better plan spec for the
actual feature development, which is
something that you would never do before
because it'd be too expensive to just
like yolo send an engineer on a feature
that you hadn't actually speced out. But
because you have cloud going through
your codebase and doing stuff, you can
like learn stuff from it. Um that helps
inform the actual plan that you make.
>> Yeah. I feel maybe I I can start and I'm
curious how you use it too.
>> I think there's like a few different
modes maybe for me like one one is
prototyping mode.
>> So like traditional engineering
prototyping you want to kind of build
the simplest possible thing that touches
all the systems just so you can kind of
get a vague sense of like what are the
systems there's unknowns and just to
kind of trace through everything.
>> Um and so I I do the exact same thing as
you Dan like Claude just does the thing
and then I see where it messes up and
then I'll ask it to just throw it away
and do it again. So just hit escape
twice, go back to the old checkpoint and
then try again. I think there's also
maybe two other kinds of tasks. So one
is just things that quad can one-shot
and I feel pretty confident it can do
it. So I'll just tell it and then I'll
just go to a different tab and I'll I'll
shift tap to auto accept and then just
go do something else or go to another
one of my quads and tend to that while
it does this.
>> Um but also there's this kind of like
harder feature development. So these are
you know things are maybe in the past it
would have taken like a few hours of
engineering time and for this usually I
would I'll shift tap into plan mode and
then align on the plan first before it
even writes any code. Um and and I think
what's really hard about this is the
boundary changes with every model and it
in kind of a surprising way where the
newer models they're more intelligent so
the boundary of what you need plan mode
for got pushed out like a little bit
>> like before you used to need to plan now
now you don't. And I think it's this
general trend of like stuff that used to
be scaffolding with a more advanced
model, it gets pushed into the model
itself and the model kind of tends to
subsume everything over time. Yeah. How
do you think about like building a agent
harness that isn't just going to like
you're you're not spending a bunch of
time um building stuff that is just
going to be subsumed into the model in 3
months when the new cloud comes out?
like, yeah, how do you how do you know
what to build versus what to just say it
doesn't work quite yet, but next time
it's going to work, so we're not going
to spend time on it.
>> I think we build most things that we
think would improve Cloud Code's
capabilities, even if that means we'll
have to get rid of it in 3 months. If
anything, we hope that we will get rid
of it in three months.
>> I think for now, we just want to offer
the most premium experience possible and
so we're not too worried about throwaway
work. H
>> interesting. Yeah. And an example of
this is something like even like plan
mode itself. I think we'll probably un
ship it at some point when Quad can just
figure out from your intent that you
probably want to plan first. Um or you
know, for example, I just deleted like
2,000 tokens or something from the
system prompt yesterday just cuz like
Sonnet 45 doesn't need it anymore. Um
but Opus Opus 41 did need it. What about
um you know in the case where uh the the
latest frontier you know model doesn't
need it but you know you're trying to
figure out how to make it more efficient
because you have so many users that you
know you're maybe you you're not going
to use Opus or Sonnet 45 for everything.
Maybe you're going to use Haiku. So
there's a trade-off between having a
more um elaborate harness for Haiku
versus just like not spending time on it
using Sonnet eating the cost and working
on more Frontier type stuff. In general,
we've positioned Quad Code to be a very
premium offering. So, our north star is
making sure that it works incredibly
well with the absolutely most powerful
model we have, which is Sonnet 45 right
now.
>> Um, we are investigating how to make it
work really well for like future
generations of smaller models, but it's
um it's not the top priority for us.
>> Okay. What do you think about um you
know one thing that I notice is
we get models um often and thank you
very much for this. We get models a lot
before they come out and it's our job to
kind of figure out is it any good and
over the last six months
when I'm testing claude for example in
the claude app with a new frontier model
it's actually very hard to tell whether
it's how whether it's better
immediately. Um, but it's really easy to
tell in cloud code because the the
harness matters a lot for the
performance that you get out of the
model. And you guys have the benefit of
building cla or building cloud code
inside of the um inside of enthropic. So
there's like a much tighter integration
between um the fundamental like model
training and the harness that you're
building and and they seem to kind of
like really impact each other. So how
does that how does that work internally
and and um what are the benefits you get
from having that like tight integration?
>> Yeah, I think the biggest thing is like
researchers just use this and so you
know as they see what's working, what's
not, they can they can improve stuff. Um
we do like a lot of eval to kind of
communicate back and forth and
understand where exactly the model's at.
Um, but yeah, there's this frontier
where you need to give the model a hard
enough task to really push the limit of
the model. And if you don't do this,
then all models are kind of equal. But
if you give it a pretty hard task, you
can you can tell the difference.
>> What sub aents do you use?
>> Um, I I have a few. I have like a
planner sub agent that I use. I have a
code review sub aent. Code review is
actually something where sometimes I use
a sub agent, sometimes I use a slash
command. So usually in CI to slash
command, but in synchronous use I use a
sub aent for the same thing.
Um
>> um it's a good question. Yeah, maybe
it's like a matter of taste. Yeah, I
don't know. I don't know. Um I think
it's maybe when you're running
synchronously, it's kind of nice to fork
off the the context window a little bit
because all the stuff that's going on in
the code review, it's not relevant to
what I'm doing next. But in CI, it just
doesn't matter.
>> Are you ever spawning like 10 sub agents
at once? And for what?
>> For me, I do it mostly for like big
migrations. Okay,
>> this like the big thing. Um, actually we
have so this like coder slash command
that we use there's a bunch of sub aents
there and so one of the steps is like
find all the issues and so there's one
sub agent that's like checking for
quadmd compliance. There's another sub
agent that's looking through git history
to see what's going on. Another sub aent
that's looking for kind of obvious bugs
and then we do this like kind of dduping
quality step after. So they find a bunch
of stuff. A lot of these are false
positives and so then then we spawn like
five more sub aents and these are all
just like checking for false positives.
And in the end, the result is awesome.
It finds like all the real issues
without the false issues.
>> That's great. I actually do that. Um, so
one of my non-technical cloud code use
cases is um expense filing. So like when
I'm I'm in SF right now, so like I have
all these expenses. And so I built this
little cloud project that uh in in cloud
code that um it uses uh one of these,
you know, finance APIs to just download
all my credit card transactions. And
then it uh decides like these are
probably the expenses that I'm going to
have to like file. And then I have two
sub agents, one that represents me and
one that represents the company. And
they like do battle to like figure out
like what's the proper um like actual
set of expenses. uh it's like an auditor
sub agent and like you know pro Dan sub
agent. So um yeah that kind of thing the
the sort of like opponent processor uh
pattern seems to be like an interesting
one.
>> Yeah. Yeah. It's it's it's it's cool. I
I feel like when sub aents were first
becoming a thing actually what inspired
us there's like a Reddit thread a while
back where someone made sub agents for
like there was like a front end dev and
a backend dev and like a think it was
like a designer
>> testing dev
>> testing dev like there was like a PM sub
agent and this is like you know it's
cute like it feels like a little maybe
too anthropomorphic um maybe maybe
there's something to this but I I think
like the value is actually like the
uncorrelated context windows where you
have like these two context windows that
don't know about each other and this is
kind of interesting um and you tend to
get better results this way. What about
you? Do you have any interesting sub
agents you use?
>> So, I've been tinkering with one um that
is really good at front-end testing. So,
it uses Playright to like see all right,
what are like all the errors that are
client side and pull them in and try to
test more steps of the app. Um,
it's not totally there yet, but I'm
seeing signs of life and I think it's
the kind of thing that we could
potentially um, bundle in one of our
plugins marketplaces.
>> Yeah. Um, definitely. I I' I've used
something like that just with Puppeteer
and just like watching it build
something and then open up the browser
and then be like, "Oh, I need to change
this." It's like this is like, "Oh my
god."
>> Yeah. It's really cool.
>> It's really cool. I think I think we're
starting to see the beginnings of this
like massive like multi- massive sub
aents. I I don't know what they call
this like swarms or something like that.
There's a bunch of people there's
actually an increasing number of people
internally at anthropic that are using
like a lot of credits every month like
you know like spending like over a
thousand bucks every month. Um and this
like this percent of people is growing
actually pretty fast. And I think the
common use case is like code migration.
And so what they're doing is like
framework A to framework B. uh there's
like the main agent, it makes a big
to-do list for everything and then just
kind of map produce over a bunch of sub
agents. So you instruct quad like yeah
like start 10 agents and then just go
like you know 10 at a time and just
migrate all all the stuff over.
>> That's interesting.
What would be like a concrete example of
the kind of migration that you're
talking about?
>> I think the most classic is like lint
rules.
>> So there's like you know there's some
kind of lint rule you're rolling out.
There's no autofixer because it's like
you know like as an analysis can't
really it's kind of too simplistic for
it. Um I think other stuff is like
framework migrations like um we just
migrated from like one testing framework
to a different one. That's a pretty
common one where it's super easy to
verify the output.
>> One of the things I found is and this is
both for project projects inside of
every and then just open source
projects. It's like if you're someone
building a product and you want to build
a feature that's um been done before. So
maybe like an an example that people
might need to implement a bunch is like
memory. How do you do memory? Um because
we have a bunch of different products
internally, you can just like spawn
cloud sub agents to be like how do these
three other products do it? And there's
like possibility for just like tacit
code sharing where you don't need to
like have an API or you don't need to
like ask ask anyone. You can just be
like how does how do we do this already?
And then use the best practices to um uh
to uh build your own. And you can also
do that with open source because there's
like tons of open source projects where
people are like you know they've been
working on memory for like a year and
it's like really really good. You be
like what are the patterns that um
people have figured out and which ones
do I want to implement?
>> Totally. You can also connect your
version control system. If you've built
a similar feature in the past, cloud
code can use those APIs like query
GitHub directly and find how people
implemented a similar feature in the
past and read that code and um copy the
relevant parts.
>> Yeah. Is there um have you found any use
for like log files of okay you know
here's here's the full history of like
how I implemented it and like is that
important to give to claude and and and
how are you how are you um implementing
that or making it useful for it?
>> Some people swear by it. Uh there are
some people at anthropic where for every
task they do, they tell cloud code to
write a diary entry in a specific format
that just documents like what did it do,
what did it try, why didn't it work, and
then they even have these agents that
like look over the past memory and
synthesize it into observations.
>> I think this is like the starting
budding
>> like there's like something interesting
here that we could productize.
>> Um but it's a new emerging pattern that
we're seeing that works well. I think
the hard thing about like oneshotting
memory from just one transcript is that
it's hard to know how relevant a
specific instruction is to all future
tasks. Like our canonical example is if
I say make the button pink, I don't want
you to remember to make all buttons pink
in the future. And so I think um
synthesizing memory from a lot of logs
is a is a way to um find these patterns
more um consistently. It seems like you
probably need like there's some things
where you're going to know um you'll be
able to summar like synthesize or
summarize in this sort of like top down
way like this this will be useful later
and and you'll you'll know the right
level of abstraction at which it might
be useful but then there's also a lot of
stuff where it's like you actually you
know any given like commit log like make
the button pink it could be useful for
kind of an infinite number of different
reasons um that you're not going to know
beforehand. So you also need the the
model to be able to look up all similar
past, you know, commits and surface that
at the right time. Is that something
that you're also thinking about? Yeah, I
think I think there could there could be
something like that. And maybe I think
one way to see it is this kind of like
traditional memory storage work like
like mex like kind of stuff where you
just want to like put all the
information into the system and then
it's kind of a retrieval problem problem
after that. Um,
yeah. I think as the model also gets
smarter, it naturally I've seen it start
to naturally do this also with Sonnet 45
where if it's stuck on something, it'll
just naturally start looking like we
talked about before like using bash
spontaneously. So just like look through
git history and be like, "Oh, okay.
Yeah, this is kind of an interesting way
to do it."
>> Yeah. One of the things that like we
were talking before we started
recording, one of the um things that
we're doing inside of every like I feel
like it has really um change the way
that we do engineering because everyone
is cloud code build like CLI build and
um we have this engineering paradigm
that we call compounding engineering
where in normal engineering every
feature you add it makes it harder to
add the next feature and in compounding
engineering your goal is to make the
next feature easier to build um from the
feature that you just added. And the the
way that we do that is we try to um
codify all the learnings from um from
everything that we've done to build the
feature. So like you know how did we
make the plan and and what parts of the
plan needed to be changed or like when
we started testing it like what what
issues did we find? What are the things
that we missed? Um and then we codify
them back into all the prompts and all
the sub agents and all the slash
commands so that the next time when
someone does something like this uh it
catches it and that makes it easier. And
that's why for me, for example, I can
like hop into one of our code bases and
start like being productive even though
I'm I don't know anything about how the
code works because we have this like
builtup memory system of um of all the
stuff that we've learned as we've
implemented stuff, but we've had to
build that ourselves. I'm curious, are
you working on that kind of loop so it
the cloud code does that automatically?
>> Yeah, we're we're starting to think
about it. Uh it's funny. We we're just
uh we we heard the same thing from
Fiona. She just joined the team. And you
know, she she's our she's our manager.
She hasn't coded in like 10 years,
something like that. And she was landing
PRs on her first day. And she was like,
"Yeah, like not only did I kind of I
forgot how to code and quad code kind of
made it super easy to just get back into
it,
>> but also I didn't need to ramp up on any
context because I kind of knew all
this." And I think a lot of it is about
like when people put up poll requests
for quad code itself and I think our
customers tell us that they do like
similar stuff pretty often. Um if you
see a mistake I'll just be like add quad
add this to quad MD so that the next
time it just knows this automatically
and you you can kind of like instill
this memory in kind of a variety of
ways. So you can say like at quad add it
to quadmd. You can also say add quad
write a test. You know, that's like easy
way to make sure this doesn't regress.
And I don't feel bad asking anyone to
write tests anymore, right? It's just
like super easy. And like I think
probably close to 100% of our tests are
just written by Quad. And if they're
bad, we just won't commit it. And then
the good ones stay committed. Um, and
then also I think lint rules are a big
one. So for stuff that's enforced pretty
often, we actually have a bunch of
internal lint rules. Claude writes 100%
of these. Um, and this is mostly just
like at Claude in a PR write write this
lint rule. And yeah, there's sort of
this problem right now about like how
how do you do this automatically? And I
think generally how like Cat and I think
about it is we see this like power user
behavior and the first step is how do
you enable that by making the product
hackable so the best users can figure
out how to do this cool new thing
>> but then really the hard work starts of
like how do you take this and bring it
to everyone else. Um, and for me, I I
kept myself in the everyone else bucket.
Like, you know, I don't really know how
to use Vim. Like, I don't have this like
crazy like T-box setup. So, I have like
a pretty vanilla setup. So, if you can
make a feature that I'll use, it's a
pretty good indicator that like other
kind of average engineers will use it.
That is interesting. Like, tell me about
that because like that's something I
think about all the time is um making
something that is extensible and
flexible enough that power users can
find like novel ways to use it that you
would not have even dreamed of. But it's
also simple enough that anyone can use
it and it's and they can be productive
with it and you can you can kind of pull
what the power users find back into like
the basic experience. Like how do you
think about making those design and
product decisions so that you enable
that?
>> In general we think that like every
engine environment is a little bit
different from the others and so it's
really important that every part of our
system is extensible. Um so everything
from your status line to adding your own
slash commands through to hooks which
let you um insert a bit of determinism
at pretty much any step in quad code. So
we think these are the these are like
the basic building blocks that we give
to every engineer that they can play
with. um for plugins. Plugins is
actually our um so it was built by Daisy
on our team and this is this is our
attempt to make it a lot easier for the
average user like us um to bring these
slashcomands and hooks into our
workflows. And so what plugins does is
it lets you browse existing MCP servers,
existing hooks, existing plugins and
just like or sorry existing like sash
commands and just let you write one
command in quad code to pull pull that
in for yourself.
>> There's this like really old idea in
product called latent demand which I
think is probably the main way that I
personally think about product and like
thinking about what to build next is
it's a super simple idea. It's you build
a product in a way that is hackable that
is kind of open-ended enough that people
can abuse it for other use cases it
wasn't really designed for. Then you see
how people abuse it and then you build
for that cuz like you you kind of know
there was demand for it,
>> right?
>> Um and like you know when I when I was
at Meta, this is how we built kind of
all the big products. I think almost
every single big product had this nugget
of latent demand in it. um you know like
for example something like Facebook
dating it came from this idea that when
uh we looked at who looks at people's
profiles I think 60% of views were
between people of opposite gender so
kind of like traditional setup that were
not friends with each other and so we're
like oh man okay maybe there's like
maybe if we like launch a dating product
we can kind of harness this demand that
exists
>> that's interesting
>> and for you know marketplace it was
pretty similar I think it was like 40%
of posts in Facebook groups at the time
were by sell posts and so I Okay, people
are trying to use this product to buy
themselves. We just build a product
around it that's probably going to work.
And so we think about it kind of
similarly, but also we have the luxury
of building for developers and
developers love hacking stuff and they
love customizing stuff and it's like as
a user of our own product, it makes it
so fun to build and and use this thing.
Um, and so yeah, like like I said, we
just build the right extension points.
We see how people use it and that kind
of tells us what to build next. Like for
example, we got all these user requests
where people were like, "Dude, Quad Code
is asking me for all these permissions
and I'm out here getting coffee. I don't
know that it's asking me for
permissions. How could I just get it to
like ping me on Slack?" And so we built
hooks. Uh Dixon built hooks um so that
people could get pinged on Slack and you
could get pinged on Slack for anything
that you want to get pinged on Slack
for. Um, and so it was very much like
people really wanted the ability to do
something. We didn't want to build the
integration ourselves. And so we we
exposed hooks for people to do that.
>> The thing that makes me think of is um
you you recently um released you kind of
moved or rebranded how you talk about
cloud code to be this like more general
purpose agent SDK. Is that was that
driven by some latent demand where you
you sort of saw there's like a more
general purpose use case for what you
built?
>> We realized that similar to how you were
talking about using cloud code for
things outside of coding, we saw this
happen a lot like um we get a ton of
stories of people who are using cloud
code to like help them write a blog and
like manage all the like data inputs and
take a first pass in their own tone. Um
we find people building like email
assistants on this. Um I use it for a
lot of just like market research. Um
because at the core it's like an agent
that can just go on for an infinite
amount of time as long as you give it a
concrete task and it's able to fetch the
right underlying data. So one of the
things I was working on was I wanted to
look at all the companies in the world
and how many engineers they had and to
create a ranking. And this is something
that quad code can do even though it's
not a traditional coding use case. So we
realized that like the underlying
primitives were really general as long
as you give as long as you have like an
agent loop that can continue running for
a long period of time and you're able to
like access the internet and write code
and run code pretty much you can if you
squint you can kind of build anything on
it. Mhm.
>> And and I think like by at the point
where we like rebranded it so like from
the quad code SDK to the quad Asian SDK,
there was already like many thousands of
companies using this thing and a lot of
those use cases were not about coding.
So it's like both both internally and
externally. We kind of saw that
>> like health assistants, like financial
analysts, legal assistance. Um it was
pretty broad.
>> Yeah. What are the coolest ones?
I feel like actually you you had Noah
Brier on the the podcast recently. I
thought like the obsidian like kind of
mind mapping notekeeping use case is
really cool. It's funny. It's insane how
many people use it for this
>> particular combination. Uh I think some
other like some coding or kind of coding
adjacent use cases that are kind of cool
is um we have this like issue tracker
for quad code. The team's just like
constantly underwater like trying to
keep up with all the issues coming in.
There's just so many. And so I quad
ddupes the issues and it automatically
finds duplicates and it's extremely good
at it. It also does first pass
resolution. So usually when there's an
issue it'll um proactively put up a PR
internally and this is a new uh thing
that Enigo on the team built. Um so this
is pretty cool. Uh there's also like on
call and kind of collecting signals from
other places like getting like sentry
logs and getting like logs from BigQuery
and kind of collating all this. um plus
just really good at doing this because
it's all just bash in the end, right?
>> And so these are all kind of these
internal use cases that that I saw.
>> Is it um so when it's you know collating
logs or um you doing issues is that like
you have clouds like continually running
in the background and is that something
that you're building for?
>> Um it gets triggered for that particular
one. It gets triggered whenever a new
issue is filed. So it runs once but it
can choose to run for as long as it
needs.
>> Got it. What about the idea of clouds
always running? Oo, proactive quads. I
think it's definitely where we want to
get to. U I would say right now we're
very focused on making quad code
incredibly reliable for like individual
tasks. And you know, if you think about
like if you think about like multi-line
autocomplete and then like single turn
agents and then now we're working on
like quad code that can complete tasks.
I feel like if you trace this curve
eventually you go to even higher levels
of abstraction like even more
complicated tasks and then hopefully the
next step after that is a lot more
productivity. So just understanding what
your team's goals are what your goals
are being able to say hey I think you
probably want to try this feature and
here's a first pass at the code and here
are the assumptions I made and are these
correct?
>> I can't wait. Um and I think probably
right after that is um Claude is now
your manager.
Um,
>> that's not in the plan.
>> So, everyone on the team was like super
excited that uh we were we were talking
today and they gave me a bunch of
questions and I want to make sure I I
hit all the questions. Um,
uh, oh, here's a good one. Why did you
choose agentic rag over vector search in
your architecture? And are like vector
embeddings uh still relevant? Um so
actually initially we did use vector
embeddings. Um they're just a really
tricky to maintain because you have to
continuously reindex the code and they
might get out of date and you have local
changes. So those need to make it in.
And then as we thought about what does
it feel like for an external enterprise
to adopt it, we realized that this
exposes a lot more surface area and like
security risk. Um we also found that
actually cloud code is really good and
cloud models are really good at agentic
search. So um you can get to the same
accuracy level with agentic search and
it's just a much cleaner deployment
story.
>> H that's really interesting.
>> Um if you do want to bring semantic
search to quad code, you can do so via
an MCP tool. So if you want to manage
your own index and expose an MCP tool
that lets Quad Code call that, that that
would work. What do you think are the
top MCPS to use with cloud code?
>> Puppeteer and Playright are pretty high
up there.
>> Definitely. Yeah.
>> Century has a really good one. Asana has
a really good one. Hm. Do you think that
there are um
any any power user tips that you see
people inside of anthropic or you know
other people who are you know big power
you know inside of organizations that
are big cloud code power users that
people don't know about but they should.
Um, one thing that QuadCo doesn't
naturally like to do, but that I
personally find very useful is, um,
QuadCo doesn't naturally like to ask
questions, but you know, if you're
brainstorming with a thought partner, a
collaborator, usually you do ask
questions back and forth to each other.
And so, this is one of the things that,
um, I like to do, especially in plan
mode. I'll just tell Cloud Code like,
"Hey, we're just brainstorming this
thing. Um, please ask me questions if
there's anything you're unsure about."
um I want you to ask questions and it'll
do it. And I think that actually helps
you arrive at a better answer
>> there. There's like there's also like so
many tips that we can share. I think
like there there's a few really common
mistakes I see people make. One is like
like you said like not using plan mode
enough. This is this is just super
important. And I think this is people
that are kind of new to a coding. They
kind of assume this thing can do
anything and it can't. It's like not
that good today and it's going to get
better but today it can oneshot some
tests. can't one-shot most things. Um,
and so you kind of have to understand
the limits and you have to understand
like where you get in the loop. And so
like something like plan mode, it can
like two 3x success rates pretty easily
if you like land on the plan first. Um,
other stuff that I've seen power users
do really well is companies that have
really big deployments of quad code and
now um, you know, luckily there's a lot
of these companies so we can kind of
learn from them. Uh having settings JSON
that you check into the codebase is
really important because you can use
this to pre-allow certain commands so
you don't get permission prompted every
time and also to block certain commands.
Let's say you don't want web fetch or
whatever and this way as an engineer I
don't get prompted and um I can check
this in and share it with the whole team
so everyone gets to use it.
>> I I get around that by just using
dangerous they skip permissions.
>> Yeah, we kind of we kind of have this
here but we don't you know we don't
recommend it. It's like it's a model,
you know, it can do it can do weird
stuff. Um, I think another kind of cool
use case that we've seen is people using
stop hooks for interesting stuff. So
stop hook runs whenever the turn is
complete. So like the assistant did some
tool calls back and forth with you know
whatever and uh it's done and it returns
control back to the user then we run the
stop hook and so you can define a stop
hook that's like um if the tests don't
pass return the text keep going and
essentially it's like you can just like
make the model like keep going until the
thing is done and this is just like
insane when you combine it with the SDK
and this kind of programmatic usage
>> you can you know this is a stochcastic
thing it's a nondeterministic thing but
with scaffolding you can get these
determin deterministic outcomes.
>> So you guys started this sort of CLI,
this CLI paradigm shift. Um, do you
think the CLI is the final form factor?
Are we are we going to using cloud code
in the CLI primarily in a year or in
three years, or is there something else
that's better?
>> I mean, it's not the final form factor,
but we are very focused on making sure
the CLI is like the most intelligent
that we can make it and that's as
customizable as possible. you can talk
about the next form factors.
>> Yeah, I mean
cat C's asking me to talk about because
no one knows like this this stuff's like
it's just moving like so fast, right?
Like no no one knows what these form
factors are. Like right now I think our
team is in experimentation mode. So we
have uh CLI then we came out with the ID
extension. Now we have a new ID
extension that's like a guey. It's a
little more accessible. Um we have add
quad and github so you can just add quad
anywhere. Um, now there's at quad,
there's quad on web and on mobile, so
you can use it on any of these places.
Um, and we're just in experimentation
mode, so we're trying to figure out
what's next. I think like if we kind of
zoom out and see where this stuff is
headed. I think one of the big trends is
longer periods of autonomy. And so with
every model, we kind of time how long
can the model just keep going and do
tasks autonomously and just, you know,
in dangerous mode in a container, keep
autocompacting until the task is done.
And now we're on the order of like
double digit hours. I think it's like
the last model is like 30 hours,
something like this. And I, you know,
the next model is going to be days. And
as you think about kind of parallelizing
models,
um there's kind of a bunch of problems
that come out of this. So one is what is
the container this thing runs in because
you don't want to have to like close
your laptop. I have that right now
because I'm doing a lot of uh disb I
don't know I've only heard I've only
read it but DSPY or disb prompt
optimization and like it's on my laptop
and it's like I don't want to close I'm
like in the way middle like with my
laptop open because I'm like I don't
want to close it. Yeah.
>> Yeah. That's right. Yeah. We've like
visited companies before like like
customers that everyone's just like
walking around with their like quad
codes.
>> Is this running? So, I think like one is
kind of getting getting away from this
mode and then I also think pretty soon
we're going to be in this mode of like
quads monitoring quads.
>> Yeah.
>> Um and kind of I I don't know what the
right form factor for this is because as
as a human you need to be able to
inspect this and kind of see what's
going on. Um but also it needs to be
quad optimized where um you're
optimizing for kind of bandwidth between
like the quad to quad communication. Um
so my prediction is terminal is not the
final form factor. My prediction is
there's going to be a few more form
factors in the coming months, you know,
maybe like year or something like that.
And it's going to keep changing very
quickly.
>> What do you think about, you know, I
teach a lot of cloud code to a lot of
every subscribers and
>> thank you.
>> You're welcome. Doing doing your work
for you.
Um uh and I think the like one of the
big things is just the terminal is
intimidating and uh just like being on a
call with subscribers being like here's
how you open the terminal and you're
allowed to do this even if you're
non-technical is like a big deal. How do
you think about that? Yeah, I um one of
the people on our marketing team uh
started using cloud code because she was
writing some content that touched on
cloud code and I was like you should
really experience it and she got like 30
popups on her screen where she had to
accept various permissions because she'd
never used a terminal before. So I
completely see eye to eye with you on
that. It's definitely um hard for
non-engineers and there's even some
engineers we've found who aren't fully
comfortable with working day-to-day in
the terminal. Um, our VS Code GUI
extension is our first step in that
direction because you don't have to
think about the terminal at all. It's
like a traditional interface with a
bunch of buttons. Um, I think we are
working on more um graphical interfaces.
Uh, so quad code on the web is a guey. I
think that actually might be a good
starting point for people who are less
technical.
>> Yeah. Yeah. There there was this like
magic moment maybe like a few months ago
where like I walked into the office and
the some of the data scientists at
Anthropic like sit right next to the
quad code team and the data scientist
just had like quad code running on their
computers and I was like what what is
this like how did you figure this out? I
think it was like Brandon uh was like
the first one to do it and he was like,
"Oh yeah, I just like installed it. Like
I work on this product so like I should
use it." And I was like, "Oh my god." So
he like he figured out how to like use a
terminal and JS like you know he hasn't
really done this kind of workflow
before. Obviously like very technical.
Um so I think now we're we're starting
to see all these kind of like code
adjacent
uh like functions. people you use quad
code and um yeah it's kind of
interesting like from a latent demand
point of view these are people hacking
the product so there's like demand to
use it for this and so we want to make
it a little bit easier with more
accessible interfaces but at the same
time for us for quad code we're laser
focused on building the best product for
the best engineers and so um we're
focused on software engineering and we
want to make this like really good but
we want to make it a thing that other
people can can hack
>> some sometimes cloud code will write
code that's a bit verbose post. Um, but
you can just tell it to simplify it and
it does a really good job.
>> Interesting. And so, and how are how and
when are you doing that? So, you're
you're using a slash command or you're
>> I just say it. I just
>> Sometimes you're like, "Hey, this should
be a oneline change and I'll write five
lines and you're like, simplify it and
it understands immediately what you mean
and it'll fix it."
>> Yeah. I think a lot of people on our
team do that, too. Um, that's that's
interesting. Why do you like why not
then if you're saying that all the time
why not then you know push that into
like a slash command or the harness or
something like that to yeah make it just
happen automatically.
>> We do have instructions for this in the
cloud MD. I think it impacts such a low
percentage of conversations that we
don't want it to like over rotate in the
other direction.
>> Um and then the reason why not a slash
command is because you actually don't
need that much context. I think slash
command's really good for situations
where you would otherwise need to write
two three lines but for simp like even
for plan mode you actually can use a few
words but sometime but it actually takes
two or three lines to capture the
entirety of what you want in plan mode.
Um for simplify it you can just write
simplify it and it gets it.
>> Yeah. Yeah, that makes sense. Cool.
>> Yeah.
>> Um okay, now we're we can
um that's interesting. Yeah, but but
this stuff like you know it still feels
just so early.
>> Yeah.
>> You know, like we we were talking before
before the recording about like kind of
where are we on the adoption curve and
it still
>> the hian curve or whatever whatever that
term was.
>> Exactly. And it just feels it just feels
like we're you know like first 10% still
like the stuff is going to change so
fast it's going to keep changing. Even
when I talk to researchers outside of
enthropic who who abuse quad code um
they also get stuck on things like this
like not realizing that they can just
tell the LLM to simplify it and I think
that just goes to show that even for
people who are like working in this
industry they don't always realize that
you can just talk to the model. That's
the thing is like I I think that there's
this underlying expectation that using
AI shouldn't have to be a skill like
because it just does whatever you say
and you're like well I mean whatever you
say is going to matter for what it does.
So if you can say things better it's
going to do better.
>> Yeah. I mean it it changes with every
model though. That's the that's the hard
part. like you know prompt engineer was
a job and now famously it's not a job
anymore and there's going to be more
jobs that are then like not not jobs
anymore of these kind of like little
micro skills that you have to learn to
use this thing and as the model gets
better it can just like interpret it
better
>> but I think that's also like for us this
is part of this kind of humility that we
have to have building a product like
this that we just really don't know
what's next and we're just trying to
figure it out kind of along with
everyone else we're just here for the
ride
>> and that's why it's cool that you're
building it for yourself cuz I think
that's the that's the best way to know
that is just like you're and this is
what we do too is like you're sort of
living in the future. You're using it
all the time. And uh it's pretty clear
what's missing. You're like I just want
this thing and you can just do the next
thing rather than being like hm let me
ask like some enterprise product manager
at like some gigantic company like what
kind of AI feature do you want? And
they're like I don't know like you know
put a little chatbot on the side of my
you know IDE and you're like okay.
>> Yeah.
>> Yeah. This is like the luxurious thing
about building dev tools right you're
your own customer. I think it's also
really um a unique thing about AI
because um it sort of reset the game
board for all software. So
um you know we have Kora this like email
assistant and we have like Sparkle which
organizes your files and it's like
anything that you do for something that
you want to use on your computer if
you're if you're building it with AI
there's a good chance that hasn't been
done before because like the whole whole
landscape has been reset. And so it's a
it's a uniquely exciting time to build
stuff for yourself.
>> Totally. I think it totally opens the
playing field, too. It's like any
individual can now build an app to fill
their need and then distribute it to
everyone else.
>> Yeah,
>> it's really cool.
>> I've been prototyping all these like
random pet projects. Um
>> um I just moved into a new apartment and
it's empty. And so I've been um I've
been building this like shopping advisor
assistant on like the Cloud Agent SDK
cuz who has time to like read all the
reviews and like look at all the options
and find their pricing and everything's
like really hard to discover. And so it
just like asks me a bunch of questions
and I tell it what I want and it shows
me a bunch of Yeah, exactly. and it
shows me a bunch of photos of like
different sofas and options and what
people say online
>> and then I tell it what I don't like and
it's literally feels like working with a
shopping assistant
>> and it it's been really cool.
>> That's really cool.
>> Um I also have my little email response
agent that like drafts responses for me
but I don't use email that much so
>> Oh, and I knew it wasn't you responding.
>> That's why it's seven days delayed.
The agent's just take doing a very
thorough job.
>> Yeah,
>> agent SDK is cool though.
>> Yeah, agent SDK is cool.
>> Yeah, it's it always just feels amazing
like how much we're able to build with
such a small team.
>> Yeah.
>> So, I feel like
>> the other thing that's really cool is
that I think people are just shifting
their mindset from docs to demos. Like
internally, our currency is actually
demos. It's like you want people to be
excited about your thing. Yeah,
>> show us like show us 15 seconds of what
it can do.
>> And we find that everyone on the team
now has this kind of indoctrinated
>> demo culture for sure. And I think
that's better because
>> there's a lot of things that you might
have in your head that if you're a great
writer, maybe you could figure out how
to explain it, but it's just even then
it's just really hard to explain. But if
someone can see it, they like get it
immediately. And I think that's
happening for product building, but it's
also happening for like all sorts of
other types of creative endeavors like
making a movie for example. Like you had
to pitch it, but now you can just be
like I made this Sora video and like you
know check like you can kind of see like
like the glimmer of the thing you're
trying to make for very cheap. And so
that means you don't have to spend time
convincing people as much. You can just
be like here I made it.
>> Yeah. And and also as a builder like you
can just make it and then like make it
again and then make it again until
you're happy. Like
>> I I feel like that like the flip side is
like you used to make a dock or you know
like whiteboard something or you know
like I I would draw stuff in like Sketch
or Figma or whatever and now we'll just
like build it until until I like how it
feels.
>> And it's just like so easy to get that
feeling out of it now. And I I think
it's like you could see it visually
before or you could describe it in words
but it's like you could never get the
vibe. And now like the vibe is really
easy.
>> Yeah. And you built plan mode like three
times.
>> Yeah. Yeah.
>> Because of this.
>> Like you you built it and then you threw
it out and rebuilt it and then threw it
out and rebuilt it.
>> Yeah. Or like Tudos's uh like Sid built
the original version like also like
three or four he built like three or
four prototypes and then I prototype
maybe like 20 versions after that like
in like a day. Yeah. I think this is
like a lot of pretty much everything we
released there was at least a few
prototypes behind it. How do you like um
keep track of and carry forward the
things you learn from prototype to
prototype? And especially if it's like,
you know, some one person is prototyping
it and then you're like, I'm going to
take it over. I'm going to do 20 more.
Like how do you how do you maximize what
you get out of that?
You know, it's it's like there there's
maybe a few elements of it. One is the
style guide. So there's like some
elements of style that we discover. And
I think a lot of this is like building
for the terminal or like we're kind of
discovering a new design language for
for the terminal and kind of building it
as we go. Um, and I think some of this
you can codify in a style guide. So this
is our quad MD, but then there's this
other part of it that's like kind of
product sense where I don't think the
model totally gets it yet. And I think
maybe we should be trying to find ways
to like teach the model this this kind
of product sense about like this works
and this doesn't, right? Because in in
product, you want to solve the person's
problem in the simplest way possible and
then delete everything else that's not
that and just get everything out of the
way. So you kind of you you align the
product to the intent as cleanly as
possible. And maybe the model doesn't
totally get that yet.
>> Yeah. It's never it doesn't really feel
what it's like to use quad code. Like
the model doesn't use quad code.
>> Yeah.
Yeah. And so I think like when you know
like quad code can like test itself and
it can kind of use itself. Um and like
we we do this when developing and it can
see like UI bugs and things like that.
I don't know maybe we should just try
prompting it though. It could like
honestly a lot of the stuff is as simple
as that. Like when there's some new idea
usually you just prompt it and often it
just works. Maybe we should just try
that.
>> A lot of the prototypes are actually the
UX interactions.
Um, and so I think once we discover a
new UX interaction like shift tab for
auto accept, I think uh Boris figured
out. Um, then
>> that was Eigor actually.
>> Oh, Eigor.
>> Yeah, we went back and forth can like
fit into that.
>> We did like dueling prototypes for like
a week.
>> Yeah, shift tab felt really nice. And
then one of the the now current plan
mode iteration um uses shift tab because
it's actually just like another way to
tell the model how agentic it should be.
>> And so I think as as more features use
the same uh interaction, you form like a
stronger mental model for what should go
where.
>> Yeah. Or like thinking I think is
another really good one. Like first we
were like before we released quad code
or maybe it was like the first thinking
model was it like 37? I forget what the
first one was.
>> Yeah.
>> But yeah and it it was like it was able
to think and we're like brainstorming
like how do we like toggle thinking? And
then someone was just like what if you
just like ask the model to think in
natural language and it knows how to
think and we're like okay sweet let's do
that.
And so like we we did that for a while
and then um we realized that people were
accidentally toggling it. So they were
like don't think and then the model was
like oh I should think. it just started
thinking
>> and so we had to kind of like tune it
out so you know don't think didn't
trigger it but then it still wasn't
obvious but then we made a UX
improvement to like highlight the
thinking that
>> and that was like that was so fun and it
felt really magical
>> when you do ultra think it's like
rainbow or whatever exactly and then
with uh with sonet 45 we actually find
like a really really big performance
improvement when you turn on extended
thinking um and so uh we made it really
easy to toggle it because sometimes you
want it sometimes you because you you
kind of for a really simple task, you
don't want the model to think for like
five minutes. You want it to just do the
thing. And so we used tab as the
interaction to toggle it. And then we
unchipped a bunch of the thinking words.
Although I I think we kept ultra think
just for like sentimental reasons. It
was such a cool UX.
>> Interesting. Do you think there's some
there's some new metric that's about
what you deleted? And I I think
programmers have always felt like, you
know, deleting a bunch of code feels
really good, but there's something about
because you can build stuff so fast, it
becomes more important to like also
delete stuff.
I think my favorite kind of diff to see
is a red diff. This is the best whenever
I'm like, "Yeah, bring it on. Another
one. Another one." Um, but it, you know,
but it's hard because like anything you
ship, people are using it. And so you
got to keep people happy. And so I think
generally our principle is if we un ship
something, we need to ship something
even better. um that can kind of um that
people can can take advantage of that
that kind of matches that intent uh even
better. Um and yeah, I think this is
kind of back to like how do you measure
like quad code and the impact of it and
this is something like every company
every customer asks us about and I think
like in so internally at anthropic I
think we like doubled in size since
January or something like that but then
productivity per engineer has increased
like almost 70% in that time. um
>> measured by
>> uh I think we actually measured it yeah
in a few ways but kind of PRs are the
the simplest one and the main one um but
like you said like this doesn't capture
the full extent of it because a lot of
this is like making it easier to
prototype making it easier to try new
things making it easier to these things
that you never would have tried because
they're way below the cut line. Um
you're launching a feature and there's
this kind of like wish list of stuff now
you just do all it because it's so easy
>> and you just wouldn't have done it.
>> So yeah, it's really hard to talk about
it. And then there's this flip side of
it where more code is written. So you
have to delete more code. You have to
code review more carefully and you know
automate automate code review as much as
you can. There's also like an
interesting like new product management
challenge because you can ship so much
that you end up it it ends up not
feeling as cohesive because you could
just like add button here and like a tab
there and like a little thing here. Like
it's just it's much easier to build a
product that has all the features you
want but doesn't have any sort of
organizing principle because you're just
shipping lots of stuff all the time. I
think we try to be pretty disciplined
about this and making sure that all the
abstractions are really easy to
understand for someone even if they just
hear the name of the feature. We have
this principle that I believe Boris
brought to the team that I really like
where we don't want a new user
experience. Everything should be so
intuitive that you just drop in and it
just works. And I think that's that's
really set the bar really high for
making sure every feature is really
intuitive. How do you do that with um a
conversational UI? Because um you know
when there's not a bunch of but buttons
and knobs and it's just a blank text box
to start, how do you think about making
it intuitive?
>> Um there's a lot of like little things
that we do like um we teach people that
they can use the question mark to see
tips. Um we show tips as quad code is
working. We have like the change log on
the side. um we tell you about like oh
there's a new model that's out or like
we show you at the bottom we have a
notification section for thinking. I
think there's just like subtle ways in
which we tell users about features. I
think the other thing that's really
important is to just make sure that all
the primitives are very clearly defined
like hooks have a common meaning um in
the developer ecosystem. plugins have a
very common meaning in the developer
ecosystem and just making sure that what
we build matches what like the you know
the average developer would immediately
think of when they hear that
>> there there's this also this like
progressive disclosure thing like you
know to to any anytime in quad code when
you run it you can hit control O to see
like the full raw transcript the same
thing the model sees and we don't like
show you this until it's actually
relevant so when there's a tool result
that's collapsed then we'll say use
control O to see it so we kind of we
don't want to put too much complexity on
you at the start because this thing can
do you know anything. Um I think there's
this other kind of new principle which
we've just started exploring which is
like the model teaches you how to use
the thing and so you can ask quad code
about itself and it it kind of knows to
look up its own documentation to tell
you about it but we can also go even
deeper like for example slash commands
are a thing that people can use but also
the model can call slash commands and
maybe you see the model calling it and
then you'll be like oh yeah I guess I
can do that too.
>> Yeah. Yeah. Yeah. Interesting. How has
it changed like you know when you first
started doing this cloud code was this
sort of like singular thing this
singular way of thinking about you know
using AI through a CLI other people had
stuff like this but it it felt like this
shift and now there's a whole landscape
of everyone is like going CLI CLI CLI
like how has that changed how you think
about building how it feels to build and
how are you dealing with the sort of
pressure of the race that you're in? I
>> think for for me like imitation is the
greatest flattery. Mhm. Um, so it's like
it, you know, it's it's awesome and it's
just like it's cool to see all this
other stuff that everyone else is
building like inspired by this. And I
think this is ultimately the goal is to
kind of inspire people to build this
next thing for this just incredible
technology that's that's coming. And
that that's just really exciting.
Personally, I don't really use a lot of
other tools. So, usually when something
new comes out, I'll I'll maybe just try
it to get a vibe. Um, but otherwise I
think we're pretty focused on just
solving problems that we have and our
customers have and kind of building the
next thing.
>> Cool.
Sweet. Um,
I I loved this part of the interview,
too.
>> Did we answer all of your team's
questions?
>> Questions?
>> Do Oh, did we get through all my team's
questions? Let's see. Uh, I think we
did. Um,
uh,
I'm curious also how you would answer
like the unshipping question cuz also
like if you're doing this kind of like
AIdriven development, you ship a lot.
You have a small team, so it's a lot of
operational load.
>> The reason I asked that is because I
don't think we do a good job of that.
Um,
and
I have this feeling that some of the
products are like a little bit messy
because of that. And I think
particularly for Kora,
um
there's just a big product surface area
and it can do a lot of different things
like it we have an email assistant so
you can ask it like you know uh tell me
about the trip I'm taking and it'll go
through all your emails and you know
summarize the the trip. Um or we have
this feature that it automatically
archives any email that you don't need
to respond to immediately. Um, and then
twice a day you get a brief that
summarizes all the stuff that you
probably need to see but you don't need
to like actually do anything with and
you just scroll through it and you're
done. Um, and there's just like all this
there's all this complexity that around
you know for example
how are emails categorized? So now we
have a whole view of we have all these
categorization rules and you can order
them and whatever, but like it's just
complicated and hard to communicate and
and uh and I want to retain a lot of the
like all the power and flexibility, but
also you can't look at a screen and be
like I have no idea what's going on.
This is like way too complicated. So
that's I'm just like I'm processing all
that stuff. So the the kind of like
deletion, you know, un unshipping idea
feels like an interesting
um cultural principle that we haven't
really explored.
>> Yeah, it's really hard. I think there's
like a social cost to it, too, where
like you kind of want to be the person
who tells your coworker to unship their
thing.
>> It's definitely tricky. It's more than
just the code. Yeah, I I definitely
learned this at Instagram honestly cuz I
I think Facebook does a terrible job at
unshipping and we had this problem where
>> every time we I think even like
unshipping pokes was like really spicy
cuz there's a bunch of these like
old-timers. They're like, "No pokes,
you're never going to take it away." But
like if you look at the data, no one
really uses it anymore.
>> But for sentimental reasons, they were
kind of tied to it.
>> And so like for Facebook, it always
maybe nothing ever got unchipped. It
always got moved to like a secondary
place like a, you know, like an overflow
menu somewhere that no one looks at,
like a graveyard.
>> Yeah.
>> And I think Instagram was just very
principled. There was like, you know,
very strong in a product and design
point of view that was like, if this
thing isn't used by like half of people,
you know, 50% of WOW or whatever, we're
just going to delete it and deal with it
and then we'll figure out some next
thing that's used by more people.
>> I love it. Um, well, thank you. This was
amazing. I'm really uh glad I got to
talk to you and uh keep building.
>> Thank you for having us.
>> Yeah. Thanks.
>> Oh my gosh, folks. You absolutely
positively have to smash that like
button and subscribe to AI and I. Why?
Because this show is the epitome of
awesomeness. It's like finding a
treasure chest in your backyard, but
instead of gold, it's filled with pure
unadulterated knowledge bombs about chat
GPT. Every episode is a roller coaster
of emotions, insights, and laughter that
will leave you on the edge of your seat,
craving for more. It's not just a show,
it's a journey into the future with Dan
Shipper as the captain of the spaceship.
So, do yourself a favor, hit like, smash
subscribe, and strap in for the ride of
your life. And now, without any further
ado, let me just say, Dan, I'm
absolutely hopelessly in love with you.
Loading video analysis...