Beyond Instructions: How Beads Lets AI Agents Build Like Engineers
By AI Tinkerers
Summary
## Key takeaways - **Agents Get Session Dementia**: Agents would get dementia, they'd get really confused because all they know is what's on disk. All they know is what you've prompted them, right? In the context. [00:00], [00:08] - **Beads: Git-Based Issue Tracker**: Beads is a tiny little issue tracker, but it's all saved in Git with dependency graph, audit trail, provenance, parent, children, epics. It emerged from frustration with markdown chaos, now every piece of work is addressable like Google maps for your plan. [38:07], [43:36] - **Screenshot Validates React Port**: Using Playwright to take screenshots and match against Steam client reference as source of truth for porting 30-year-old game to React. Iterates on views, theming, fonts, spacing to fix visual bugs like flickering map and gray placeholders. [12:17], [15:36] - **Tracer Bullets Unstick Agents**: Tracer bullet is finding the thinnest end-to-end path that works, like building TN net protocol first to get past protocol hurdles in game client. Constrain to minimum repro case with frameworks, tests, multiple screenshots for behaviors over time. [14:42], [15:04] - **Land the Plane Cleanup**: When saying 'land the plane,' agents follow protocol: update beads issues, clean git state, remove stashes, old branches, debugging artifacts, then recommend next session prompt. Forces thorough checklist even low on context. [26:04], [29:15] - **AI Better at Self-Review**: Agents are better at evaluating their own outputs or another model's than generating, so after work say 'code review' and they find bugs in own code. Force thinking alternatives, evaluate trade-offs, state confidence to improve. [22:44], [23:00]
Topics Covered
- Full Video
Full Transcript
agents would get dementia, >> okay? They'd get really confused because
>> okay? They'd get really confused because all they know is what's on disk. All
they know is what you've prompted them, right? In the context.
right? In the context.
>> Hi, today I sat down with Steve Yaggi, the head of engineering at Source Graph, makers of the AMP coding environment. I
sat down with him to learn about his new open source project beads, a framework that gives coding agents a task management system and memory to allow them to achieve much longer run times
and to tackle therefore more ambitious projects. It's totally open source. It's
projects. It's totally open source. It's
four weeks old and it already has hundreds of contributors. It works with all your coding agents. You can use it with Cloud Code. I use it with Codeex and of course it works with AMP. He also
gives a quick follow-up to his rather infamous 2024 essay, The Death of the Junior Developer, with advice on how junior developers can actually evolve and thrive. And he reminds us that all
and thrive. And he reminds us that all of us depend now on our portfolio and our understanding of AI workflows and why we should be nice to our AIs. We
also talk about Steve's own development environment setup and techniques that he uses to be more productive in this new role of an engineer which is to manage and supervise and guide the AI workers.
He talks about tools like graphite which MP MCP service he uses and and we dive into a multimodal prompting which he uses to fix a visual bug in a React game client that he's working on by
automatically taking screenshots and feeding that into the context for verification. He also um shares
verification. He also um shares techniques that he uses in in all of his workflows like landing the plane which is a scripted cleanup command that he runs at the end of every session uh that
handles commit syncing issues on beads uh cleaning up old git branches um and even figuring out what the best next session to work on is. He talks about tracer bullets. If you ever if you've
tracer bullets. If you ever if you've ever had your agents get stuck on something and you're just stuck stuck, you should figure out the tracer bullets technique that he shares and use that because it will help you get unstuck.
All of that and more right now as we sit down with Steve Yaggi to explore the future of Agentic Coding.
[music] >> All right. So, welcome to OneShot. This
is our podcast where we sit down with certain people in the community who are doing special interesting stuff. We kind
of deep dive. It's sort of like you presented the other day tinkerers which was great. This is like the longer form
was great. This is like the longer form you know we can go deeper basically >> it was three minutes at tinkers.
>> Yeah. [laughter] Yeah. Now we can go like an hour almost if you want. Uh and
we'll go like enough where we could enable a fellow tinkerer. That's the
idea.
>> These are not philosophical discussions.
These are technical discussions.
Hopefully we can learn from you what you've done and enabled in your own workflows and and and obviously beads.
I'm very curious to hear about why don't we start with that.
>> Yeah.
>> Actually, why don't we start with you?
What is your background?
>> Oh, yeah. So, um, hey, I'm Steve and, uh, I've been in the industry for a good long time and, uh, yeah, got I mean, I got my start in the like the early 90s.
Uh, Amazon, Google, you know, Grab, all that stuff.
>> I work at Source Graph right now. We
work on a coding agent. You'll get to see it. Uh, it's a lot like cloud code,
see it. Uh, it's a lot like cloud code, but it has its own charms. And uh well, we're going to be talking about the way that coding is proceeding for a small percentage of the world's developers
right now. Uh and it's unfortunate in my
right now. Uh and it's unfortunate in my mind that is a small percentage because it should be everyone.
>> Yeah.
>> Uh and the world is taking kind of a long time. It's just uh really taken its
long time. It's just uh really taken its time to get there.
>> Mhm.
>> So uh I want to show you what I do and how it works and talk talk through it and sort of explain why this would be better. And can I give a quick
better. And can I give a quick disclaimer and a and a plug. This is not a sponsored conversation.
>> Yeah, this is just me and Joe hanging out actually.
>> Um, but I will plug source graph anyways and AMP because I don't think many in the community have have heard about or used it. I have used cloud code
used it. I have used cloud code extensively and AMP extensively and codeex and I would say each is good for different things. For large codebased
different things. For large codebased understanding and delivering good results quickly, AMP was is is very better than cloud code at least when I was testing it. Yeah, it's AMP is like
an enterprise marquee offering, right?
It's really good. It's good for really large code bases.
>> I think people with large code bases should would should check it out.
>> That's a good I like that. It's a good way to think about it.
>> Is that how you think about it at at source graph?
>> Certainly, that is our focus.
>> Okay.
>> Yeah. So, for sure. Yeah, cuz we have code search, right? And an MCP server for it. So, right off the bat, it
for it. So, right off the bat, it doesn't have to grap.
>> You see what I'm saying? So, that can save you like a huge number of tokens just off just skimming off the top because it's not spending as much time.
Yeah. Anyway, this isn't we're not we're not talking about that.
>> Tell and and your background too is you've done what OS and systems and compilers and this is your background is throughout your whole career, right?
>> Tell I mean I've been pretty full stack, right? Like a you know what turns you
right? Like a you know what turns you into full stack is making a video game.
>> Yeah. Here I'm actually going to pull it up right now. Okay. Because can you see?
Okay. So you'll be able to show them the screen right >> as we're talking.
>> One of my agents is working on my my game. Now this is a React client and
game. Now this is a React client and it's flickering. Oh, did you see it
it's flickering. Oh, did you see it moved? It's trying stuff. It's using the
moved? It's trying stuff. It's using the Playright MCP server. So, it's remote controlling this uh browser, right?
>> Are we using it to play the game or using it to build the game? And we're
iterating the build of the game.
>> So, what what I'm using it for? The game
is 30 years old. It's got it's this Java thing, Java Cotlin. It's like a half a million lines of code, >> 60 contributors. It's had I mean, seriously, it's a very large longived thing.
>> The clients are all over the place. It's
got Android. It's got iOS and Objective C that I wrote myself a long time ago.
It's got a Steam client which did start.
You started >> I started Yeah. like 96 and this is like a labor of love, man. It's had 250,000 players. It's won awards.
players. It's won awards.
>> It's still working on it this long.
>> Well, I mean, I've taken years off here.
>> It's back in a big way. I'm not working on it. The AI is working on it. Look at
on it. The AI is working on it. Look at
it. Look, look, look. It's signing back in. So, like this, this here was such a
in. So, like this, this here was such a life-changing moment for me cuz it's building a React client for me. Now, why
React? Well, I want to replace all my old clients with one to rule them all.
I'll have one code base that I can iterate quickly.
>> That's a scary project for a human or an agent.
>> Oh, man. This, you know, okay, look, my clients are thinish. They wind up being about 20 30,000 lines of code. They're
just rendering what the server is sending them. I made almost like a a TV.
sending them. I made almost like a a TV.
[clears throat] >> So, it isn't that big, but it's still like it's still a meaty chunk. Like, it
takes me personally 6 months to go to a new platform and learn it like Flutter say, and write the client for >> it. Yeah, totally.
>> it. Yeah, totally.
>> And it's building this React client for me and I'm just yelling at it. That's my
job. Okay. I I what I do is I I've got a workflow where I say, "Okay, we have a screenshot." And that screenshot, our
screenshot." And that screenshot, our shared screenshot of this thing is our source of truth and it is our lifeline because because unfortunately it's it's it's been really brittle and it breaks itself all the time, right? And it's
like we're done. We're done. And it's
like the map's not even rendering anymore, right? And so part of the
anymore, right? And so part of the workflow is So anyway, I can talk and talk your ear off about this, but man, >> this is this is one of three projects I'm going to be showing you today. And
the reason I pulled it up is that it literally pulled itself up while we were talking.
>> Really?
>> Yeah.
>> What's where's where's it? I see local host. You have an agent running on your
host. You have an agent running on your laptop that wakes up and does it?
>> Yeah. So the agent that's doing it is right here. It's in source graph amp. We
right here. It's in source graph amp. We
can see it has a to-do list that you can see right there on the right.
>> Yeah.
>> I actually prefer this UI to Cloud Code's UI with a to-do list. Oh yeah,
cool. [snorts]
>> They put theirs um kind of like they pin it down at the bottom.
>> Yeah, having that context.
>> And actually having this nice big window to type in is really fun, too. Uh I
don't know. That's I I really like AMP's ergonomics.
>> Yeah, I like that. I mean, I don't think when I was using AMP ex every day, it was it had that feature that to-do list on the side, but it was already way better in terms of this CLI. So, I'm not surprised like the experience Devx is better.
>> Yeah. So, we um it's >> not plug. I'm just saying it was better.
>> Well, you know, I mean, look, they're like cars. Like, you sit in them all day
like cars. Like, you sit in them all day long.
>> Yeah.
>> And you drive them >> and cars all kind of do the same thing.
>> Yeah.
>> And the, you know, the the but but the ergonomics of your car matter a lot.
>> Yeah.
>> Little things.
>> Yeah.
>> Right. You know, like and and so and people get really attached to their cars just like their idees.
>> Just like their models and the taste of the models >> and their models, right? Like people
really love particular models. And so I mean like basically like you may try uh AMP and you may try uh and you may realize that wow you really like this UI because we have this dude that we hired
who is like a terminal. I've always
wondered like who is it that works on terminal emulators cuz they're ancient and yet they're still evolving and they can do all kinds of fancy stuff. They
can do emojis and mice.
>> Well, everybody's working on them now because >> well yeah everyone now, right? But this
guy just happened to be right place right time. and we hired him. And uh
right time. and we hired him. And uh
>> so so when you say you have an agent working on this game, really you're saying you're just using AMP to trigger a workflow that you've defined. Is that
>> Oh, >> you haven't built >> That's a great question. How does it know to work on the game? That's the
fundamental question, right?
>> Um okay, so it just finished. I'm going to say prove it to me. Um refresh the screenshot.
Okay. So, I mean, like they they'll they'll work for five 10 minutes at a time if you set their if you set them up in the with the right context and and and prompt.
>> Yeah.
>> Uh and then you have to circle back around periodically to check on them.
Now, look, you can you can watch the code that they've created. Look, you can go back and look at it. It wrote all this code while we were talking.
>> Mhm.
>> But it's pretty boring code, right? Like
I you you a lot of people are still clinging to diffs.
>> Okay. Clinging [clears throat] to diffs. That's that'd be a great blog
to diffs. That's that'd be a great blog post title, clinging to diffs, because those diffs are what they're like a lifeline that people are like, well, now I can see what the what it's writing,
right? I feel better about it, right?
right? I feel better about it, right?
And it is true that you should probably keep an eye on your diffs as they go by.
>> But in a very Paul Graham, you can see what it's doing from the literally the structure of the like the shape of the code, >> right? Like you can see if it's like
>> right? Like you can see if it's like spewing out a lot of code, >> you probably are going to want to ask it to code review that code, right? Because
anything that it does is going to be 85% correct.
>> Yeah. Right.
>> So this it's Yes. It's successive
refinement. Everything that it does, just assume it's 85% correct.
>> Oh, as a safety measure to assume that I mean statistically, empirically, it feels about like or less 60%. I mean,
you know, they they they they really have assassin. Now look, they're all of
have assassin. Now look, they're all of the How much do you want me to rant about this space? Like seriously, uh,
this space? Like seriously, uh, everybody's chasing the same form factor, cloud code and AMP, right? And
Codeex and Klein CLI and Q developer CLI and Gemini CLI and I mean, right, all of them have these now.
>> They came out in what, February?
>> Yeah.
>> First of all, anyone who's not using these, like seriously needs to go and start using them now. All right. I mean
like just like straight up like >> I've probably written I don't know a couple hundred lines of code while we were just sitting here and I've only got one >> I would say that close to 100% of the AI tinkers is totally here already.
>> Yeah. No, they are. They are. I mean I'm hoping that the people watching the video are like >> interested in this.
>> Now it's it might seem a little boring um especially with all this playright browser take screenshot stuff. So, while
this thing works, I think what's interesting here is that you're not it's right away you can see uh a you're you're driving this thing and detaching from the line level code.
>> Yeah. Look, I'm at a higher level. Look
what it just said. It said the screenshot still shows gray placeholders.
>> Yeah.
>> Right. I caught you lying to me. Right.
That's what they do. They're like,
"Okay, we're all we're all done now.
It's all it's all good."
>> It caught itself in this.
>> No, no, no. I I made it catch itself by introducing a workflow where I say, "Look, our screenshot >> has to match this reference screenshot."
I'll show it to you.
>> Yeah, show me that. Cuz the thing that you're doing different here that I than most people probably today are is using like Playright and using a workflow that you've put on top of it to do some of the automated validation.
>> That's right. So, like, so here we have a screenshot of the game, the Steam client. Okay.
client. Okay.
>> It's really ugly and old and gross. The
players love it. Um, the mobile clients are a lot nicer and neater and cleaner, but it's very old school. I mean, it's super it's super fast. You can throw spells and they'll explode in the screen and blah blah blah.
>> I [snorts] can show you later. I can
>> But you're taking a static screenshot and putting it somewhere as part of your project file and you're telling the agent.
>> Yeah. And and then I'm basically saying, "Okay, first make it look kind of like this." And it like spends many many
this." And it like spends many many sessions trying to get all the the views and panels lined up properly. And then I say, "Okay, now let's work on the theming. Okay, now let's work on the
theming. Okay, now let's work on the fonts. Now let's work on the spacing."
fonts. Now let's work on the spacing."
And I'm just I'm iterating and refining.
>> Oh, so you're taking a final like a final instance of the game running in one platform and you're iterating towards an instance in another platform.
This is the overall use case.
>> That's what I'm doing. I'm saying it's a port.
>> Okay. Yeah,
>> it's 100% a port.
>> Looking at that screenshot at every step.
>> And the nice thing is, yeah, of course, there are other views and other screens that we will eventually get to, but this core screen has so much on it that you can actually validate, you know, 80% of the game's functionality just through this little portal here.
>> Yeah. Cool.
>> Yeah.
So, let me fire up a couple more agents.
Um >> uh so that we can uh we can discuss other stuff going on.
>> Cool.
>> Um just the the Playright MCP thing though. I mean like it's just
though. I mean like it's just >> So you notice there's there's some bugs here, right? Like like some of the stuff
here, right? Like like some of the stuff works. My inventory is rendering. The
works. My inventory is rendering. The
spells are not rendering. It's working
on that right now. The map is flickering. It's not supposed to
flickering. It's not supposed to flicker. The bartender's leaving greles
flicker. The bartender's leaving greles around as as she walks around.
>> Right. Um so there's issues. But man,
this client is only about what, a week old.
>> Oh, I was going to ask. Yeah, that's
great.
>> I mean, it just would I mean, it's in Typescript and React. I don't even know them. And I it would have been
them. And I it would have been absolutely impossible, you know, to get anything remotely like this in a week myself right?
>> Yeah. Well, take the flickering thing. I
find this too. If I'm working on a part of, you know, something I'm like, if I was working in Typescript and React, I'd have the same issue, but I'd be like, I can't help you. I can't tell the agent, look into this, look into that with the
flickering, and I might have to iterate a lot. looking at the flickering through
a lot. looking at the flickering through every iteration nine or ten turns.
>> Yeah. And what do you do in that situation?
>> You might you look, you have to start.
So Jean Kim and I wrote this book called Vibe Coding which talks about this, >> right? Which is just like how do you
>> right? Which is just like how do you >> coax the agents into doing what you want, especially if you're stuck in a rut where like you've gone through 10 sessions and they make a no forward progress.
>> Yeah, totally.
>> Because it happens.
>> One of our principles, one of our many things that we've got in the book is the tracer bullet, right? which is this idea that you're just going to be like you're going to find the thinnest end to end
like that you can get working and use that as you know hang the whole workflow on that >> and for me the tracer the first tracer bullet was uh don't even worry about all the graphical stuff I want a TNET
protocol into my game because I have a TN net protocol in my game it's kind of fun you can tell net in and talk to people right >> uh and so we worked on that and that that got me past um you know a bunch of
the packet you know the protocol it actually built They built the thrift protocol for me >> and and man it was hard like look man everyone's like oh vibe coding is supposed to be easy and I just the one thing that we didn't get across in the
book I mean maybe in the book we got it across but the thing that's not getting across to people in general is that man it's just as hard as regular coding you have to you have to you're wrestling with a bear all day long >> right right
>> man thrift stuff cases and then it works but it's slow or whatever so just your advice is honed in on as narrow as possible a >> function yeah you constrain it to down to whatever the minimum repro case is,
right? And then you're just like, "Okay,
right? And then you're just like, "Okay, we're going to focus on this forever until it's done."
>> And so, and so you can build uh special frameworks and tests and triggers and monitors and stuff if you need just for this one thing. For the flickering, for example, it's going to have to take probably multiple screenshots, catch it
in the act, you know, cuz I don't want to be >> something that's that's not a static screenshot. It's a behavior over time
screenshot. It's a behavior over time and >> logs, you know, audit trails, you know, um and and you know, the funny thing is, man, all this stuff, you talk about it in the old days and it's like
>> that would be a lot of work, right? But
now it's like I'm going to make it do it and it's fast.
>> Have you put Gemini does videos very well and other things? Uh have you tried doing a video capture of the user interface and a motion graphic thing? I
think games are very feeloriented in their motion, right? How would you get an agent to to work on something like that? Would you would you
that? Would you would you >> put a video record capture input potentially?
>> I uh uh Yeah, I mean they right you need a multimodal.
>> Yeah, that's actually a pretty good idea. Okay, so playright only lets you
idea. Okay, so playright only lets you do the so we clearly need a video version of playright.
>> Yeah, it would be like a a person like watching the whole game being played and playing it.
>> Well, even uh a live stream is what I would want. Well, that would be very
would want. Well, that would be very expensive. What I would like is uh for
expensive. What I would like is uh for starters, okay, I think your idea comes next >> is uh something that can record say 10 to 15 seconds of video, >> whatever is is achievable for the thing
to upload and analyze >> to do with the tracer bullet pieces >> so that you can say we're going to focus on we're going to log in and walk over to this part of the game and we're going to talk to this NPC and you're going to see that video interaction and I want to
see the following things happen.
>> I'm going to click a button and everything should happen within a second or two. record that whole thing and then
or two. record that whole thing and then feed it back in and have that be automated.
>> It probably exists. If it doesn't, somebody should vibe code.
>> I'm sure. Has anyone out there in the community done?
>> There must I mean this is seriously for for my use case where you're trying to get a game working. The screenshot thing is only going to get us so far.
>> Yeah.
>> Well, not to mention like just AI. I
think I think over the next year one prediction is pretty much that multimodal with voice. So web apps or games just like this, but the voice will come in. Right now voice is very
come in. Right now voice is very separate. people are just building voice
separate. people are just building voice agents but like integrate it in so that like how many times have you used some you're doing some workflow and you just want to talk to it do that thing you know and have that overlay be present so
these multimodal and then how are you going to vibe >> it's going to be like Tom Cruz and Minority Report it >> you're just like just doing stuff like literally with your hands >> you know who worked on that interface
for the movie >> uh wait Joanie IV >> and no Quinn from uh from Daily Vipcap Okay. Yeah.
Okay. Yeah.
>> He he's de uh voice. We were talking about voice. He's doing that right now.
about voice. He's doing that right now.
Voice platform.
>> Oh, wow.
>> Back in the day, he was his company was doing the interface design for that movie.
>> Pretty cool.
>> Yeah.
>> So, I mean, like, okay, so let's talk a little bit about let's talk more about the actual In-N-Out dayto-day because this probably seems really weird to programmers, >> but but it's cool because you get you get a sense of like you're not just
sitting down prompting something to go code for you and looking at diffs.
You're having the conversation. You're
giving it tools. You're able to get a lot more done. And you don't have to be an expert in the thing. It's building it all.
>> You have become sort of like a team lead.
>> Yeah.
>> And you have smart engineers working for you.
>> Yeah.
>> They have the wisdom score in Dungeons and Dragons terms of like a middle schooler. All right.
schooler. All right.
>> Yeah.
>> And they will do unwise things. And why
will they do unwise things when they're so smart?
>> It's because they don't have the full picture. Yeah. they can't get the full
picture. Yeah. they can't get the full picture usually unless you have a very small code base and so they start to approximate and they guess >> can I get a quick opinion I don't want to divert to this other topic but
>> given what you said earlier you know on average 85% there and you want to be a team lead if if someone's coming in as a engineer very junior they don't have the decades of experience and architecture
and we're trying to build it and productionize it and make it efficient and is that >> oh I see so you're ask are you asking Okay. So, is it possible for a junior
Okay. So, is it possible for a junior engineer to do the job that I just described? That's the new job. The new
described? That's the new job. The new
job being making sure that it meets rigorous engineering standards. Yeah.
>> And helping it debug if it and push if it's stuck. I mean, right.
it's stuck. I mean, right.
>> Yeah. Giving it the guidance that it needs. Um.
needs. Um.
>> Yeah. So, u my knee-jerk reaction to this question was to write a blog post called the death of a junior developer.
>> Oh, right. You did do that.
>> That was like a year and a half.
>> That was a year and a half ago. Um, you
know, uh, I don't think so.
>> Yeah.
>> You don't think it's true anymore?
>> I think that it's a problem. Yeah.
>> But I think that it's possible to beat it.
>> Yeah. Cool.
>> Like, like I think that a sufficiently motivated junior dev >> uh is going to be just fine because and
>> well, for a couple of reasons. One is
nobody nobody >> most human beings really do not understand exponential growth and they they they really truly today go I pull all my friends they think that AIS are as smart as they're going to get
>> the tools are as good as they're going to get >> I'm going to focus and I'm going to index on what things are happening today because I don't care about what's happening tomorrow and so they're being very very foolish and shortsighted
>> these things will be my you know my family members will all be programmers within three years.
>> Yeah. Yeah. Totally. maybe two years >> and I literally mean they will be contributing to the game and things like that.
>> I agree with you and >> it's because this stuff is all automatable. An engineer in a box is
automatable. An engineer in a box is pretty freaking automatable. What our
job is at up at that higher level the hard part of our job is now being done.
The other hard part of our job like it really boils down at this point to I mean so in the interim between now and two years from now or three years from now when when anybody can do it [snorts] >> it has changed shape and it's really
hard. It's different. It's hard for a
hard. It's different. It's hard for a lot of people because you got to be able to type and read and look at text and it's like kind of exhausting and you're fighting with these agents and they're always arguing with you and doing wrong and lying,
>> right?
>> Well, uh uh that's the job, man. That's
the new job. And junior engineers can pick it up because you got to learn the right questions to ask.
>> Yeah. So like what are the questions that I ask? You can say are you done?
And it'll be like yes and you got to be like no you're not right? You know you just assume they're
right? You know you just assume they're not.
>> Okay. Until it's proven to you. And so
like so there's a set of pre-flight things that you do first. Are you ready to work on this? Is is everything checked in? Do the tests pass? You know,
checked in? Do the tests pass? You know,
>> right? You got to like and then there's the the actual work where you have them do it and then you say okay excellent work now let's code read and they go oh good idea and they go find a bunch of bugs in their own code. Why do they
always find a bunch of bugs in their own code? Do you know?
code? Do you know?
>> Yeah it's a good question.
>> So the research has shown this is just the way they work. They are better at evaluating their own outputs or the outputs of another model than they are at generating.
>> Yeah. And it's because it's just fundamentally easier to find something wrong with something than it is to build it in the first place.
>> My favorite technique is to not to ask it to write code, but to ask it to think about the code it should write, force it to think of an alternative. Think of one idea that's even better. Evaluate the
trade-offs and then recommend one and try to defend itself and then say and then I ask it tell me how confident you are about that being the right thing, you know, and just seeing it's forcing it to be transparent is very helpful.
So, at the AI tinkerers event, we just you just had um somebody introduced themselves and said that they had a single prompt that does all everything that you just said.
>> Oh, yeah. Yeah.
>> They say code review this like Linus Torvalds.
>> Yeah. Yeah.
>> Yeah.
>> Yeah. Yeah. That was awesome.
>> Yeah. Yeah. Right.
>> That was great.
>> And actually, it's those kinds of it's that kind of prompting. I mean, like this is a this is an it's learning the art of prompting, >> having a conversation with the agent.
Okay. This is why we started AI tinkers is so we can meet people who said I found that if you tell it to behave like Lionus Torvalds it does better years.
Let me show you data. I'm with you that anyone can do this. This is why everyone should just be a tinkerer. That's my not my pitch. But I fundamentally believe
my pitch. But I fundamentally believe that when I we were at the same undergrad computer science department.
Yeah.
>> This was a long time ago. Um and we were there around the same time >> as we discovered. Um I wanted you know I wanted to learn how they made Photoshop or PageMaker. It was like pagemaker.
or PageMaker. It was like pagemaker.
Remember that? You know, how did you do that? Graphical UI. How does it all
that? Graphical UI. How does it all work? I want to learn that. Went through
work? I want to learn that. Went through
the whole program and and learned zero of that because you don't learn that.
You learn the low level. You learn
memory. You learn, you know, concurrency. The most advanced stuff is
concurrency. The most advanced stuff is still pretty low level. And I didn't learn what I wanted to learn. But but I learned it eventually because it was just like, you know what? I'm going to go build a simple I built like a Mac paint of my own app. You know, I'm going
to simplify and build just and it was like what you said. It was just like, you know, just bang your head against the wall. Now it's 10 times easier if
the wall. Now it's 10 times easier if you have that spirit plus AI. I think
you can obviously >> I mean like you're doing the same thing.
You're banging your head against different walls. You're still banging
different walls. You're still banging your head against the wall until it's built. But you can be so much more
built. But you can be so much more ambitious.
>> Yeah.
>> Now, case in point, it fixed the bug while we were talking. Okay. Not the
flickering one. Not the flick. I didn't
have it working on the flicker.
>> I had it first pick the uh the broken spell view and the ground view. They
didn't have those icons rendering and now they are. You see the screenshot on the right in the upper corner. Those
spells were not rendering earlier in this in this.
>> This is the latest screenshot of the of the app. [snorts]
the app. [snorts] >> Yes. And so now it's declaring that it
>> Yes. And so now it's declaring that it is actually finished. And I'm going to show you now. I'm going to take it full circle back to the uh the Linus Torvald's prompting thing we just talked about. Okay. It says there's your proof.
about. Okay. It says there's your proof.
[laughter] Screenshot shows colorful spell icons.
It's like well done. Okay. So I'm going to say well done. And why do I say well done? because they're going to turn into
done? because they're going to turn into super intelligence and four or five years from now they're going to read the logs and they're going to see who was nice to them.
>> You're trying to [laughter] >> I'm not kidding you. I'm not kidding you at all. Okay, this I'm assuming that
at all. Okay, this I'm assuming that they will be super intelligent and >> there'll be a data leak and if you were rude your your rudess will be in the data leak and now you're screwed.
>> There's no data leak. They will have access to all the logs in human history within 5 years, 10, seven years, whatever.
>> So like be careful what you say to them.
Seriously. Okay. So, uh it says we're done, done. Well, it always says that,
done, done. Well, it always says that, except I actually checked it. So, now
I'm going to say, "Let's land this plane." All right. What does that mean?
plane." All right. What does that mean?
Oh, look. I've paid $10.66 in this conversation [laughter] in the lower right corner.
>> Wow.
>> Yeah. AMP shows you that if you want.
>> That's cool.
>> It's good, actually.
>> I think it's good. CL Cloud Code has an incentive not to show it to you.
>> Yeah. I I I don't know if you had that feature. I didn't have it enabled, but
feature. I didn't have it enabled, but yeah, [clears throat] my first bill was was worth it because I accomplished a lot that month.
>> I mean, I'm doing everything I ever wanted to do, you know, and so I'm I'm so happy. Like, I just couldn't be
so happy. Like, I just couldn't be happier.
>> So, anyway, like, can we talk Sorry, can we talk about the land the plane thing?
>> Yeah, let's do it. Yeah.
>> Oh, no. Go ahead. Go ahead. It's still
landing. So, you're saying funny earlier?
>> No, you also have a free version now, like ad supported.
>> Yeah, right. I mean, look at this UI.
It's so boring. It needs some ads in it to spruce it up. But actually, no, the ads are from like companies that you've heard of that are nice, like I don't know, Stripe or who show up. They just
talked to you the same before.
>> I think they're probably going to be like my I haven't even looked. I tried
to get it set up the other day. The uh
but do purchase the try this product.
>> It's so hard to tear myself away from actually coding that, you know, but it probably is right above the the status line at the bottom, right?
>> And the great thing is, man, I mean, just like >> this stuff can get really expensive.
Look, we're at $14 now.
>> Are you degraded at all when you use the the ads supported product or is it the same?
>> It's the same. It's the same product.
That's very, >> right? That's the first question
>> right? That's the first question everybody asks. No, no, you don't get
everybody asks. No, no, you don't get some because uh some other uh vendors actually do give you a cheapbo model if you get the the free tier. [snorts]
>> So, I mean, come on. I mean, like with ads, it's definitely worth, you know, worth a try. Some of the cool things about AMP, one of them is that it has this thing called the Oracle. We haven't
seen it yet because I haven't made it review anything, but the Oracle, it goes off to GPD5 and it'll it'll have do code reviews and stuff, design reviews with Oracle.
>> Having multiodel built in kind of keeps it more honest, >> you know, if you've got your agent only ever going to one model, >> you know what I mean? Like everybody
everybody like you everybody I know does multimodel when they can, but having it built into the actual tool so it will just automatically go consult the other model.
>> Yeah. and then they'll have a discussion.
>> Certainly likeuge speed's an issue if I'm doing iterations rapidly on something because I want to get a look at it in the UI or something. I don't
want to wait and I'm okay with a dumber model because I'm narrowing what I'm doing. So I'd rather switch to something
doing. So I'd rather switch to something fast and cheap and it's more about the speed for me. But if I am debugging something, >> why not just control? Yeah.
>> Yeah. If you can find a cheaper model that does the same, you should always just buy the cheapest fastest model you can for sure. Yeah,
>> but [snorts] man, when I when I don't want to wait, I start up another model with another agent.
>> I know, but you can only do three or four before your brain explodes.
>> You Well, actually, you you you can Okay, with beads, you can get to about six, which is nice.
>> Uh, but >> I can't wait to to learn about that.
>> Yeah.
>> Yeah. All right. Well, anyway, it landed the plane. So, let's talk about this
the plane. So, let's talk about this land of the plane thing. I said way up here, well done. Let's land the plane.
What did that mean? It had a whole bunch of code changes.
>> Mhm.
>> Okay. and doc changes and issues to update and screenshots and blah blah blah.
>> And so one day I got so sick of telling the agents they had to clean up after themselves. Okay. That I said when I say
themselves. Okay. That I said when I say let's land the plane, I want the following 10 things to happen.
>> Oh, cool. Okay.
>> And so >> where did you put that? Just an agent MD or something?
>> I'll show it to you.
>> Yeah, please.
>> Land the plane. So it put this in there for me.
>> When a user says land the plane, let's follow >> follow this clean session session ending protocol. Wow, this is cool.
protocol. Wow, this is cool.
>> Update your beads issues. Sync the issue tracker carefully. Clean up your git
tracker carefully. Clean up your git state. Get rid of stashes. Get rid of
state. Get rid of stashes. Get rid of old branches. Get rid of debugging
old branches. Get rid of debugging artifacts. I mean, like they leave so
artifacts. I mean, like they leave so much crap around.
>> Most people here would would have update these docs, but that's what we're going to get into beads cuz beads is kind of an alternative to that.
>> That's right. That's right.
>> That's right. Now, look, sandboxed environments like codeex or cloud code for web or [clears throat] or copilot, you know, code spaces, right? they run
your agent in a in a like, you know, a container in the cloud somewhere and >> they don't have the same cleanup issue because they just take the code away and the docs or whatever and everything else gets discarded.
>> Okay.
>> I mean, I'm just going to be transparent. I don't like that working
transparent. I don't like that working model. I I predicted that it's going to
model. I I predicted that it's going to be the the future and I I'll probably be there eventually.
>> Right now, I like working right on my machine.
>> Yeah.
>> And so my machine is my sandbox, right?
>> But they leave crap everywhere. So
>> this land the plane forces them and it's really interesting too because man they love there's nothing the models love more than like checklists right and acceptance criteria and being able to
say I'm done they were trained on that okay >> and so that like their reward function like biases them for that and so landing the plane even if they're like low on context if you say let's land the plane
man they're going to do a good job with it they're very thorough you saw what it did and the very last step of land the plane which I believe it's still doing.
>> It's still landing the plane. So this is how much stuff they have to do at the end of a session.
>> What it does is it goes through beads that my my work tracker and it and it finds the next thing to work on and it spits out a prompt for next session. So
I just copy it in and we're off >> a prompt. Oh that you can start >> for the next session. So it chooses work for the next session >> and you wouldn't do it in the current session just yes please >> because we ran out of context. That's
why you that's why you land the plan in the first place. You have to It's not to save money. You literally are going to
save money. You literally are going to get compacted soon. It's a memory wipe.
They lose all they they're like, "Who are you?"
are you?" >> So, you need something to like remind them of everything where you're at between sessions.
>> And that's really where Beads comes in.
>> Okay. Yeah. Cool.
>> We should probably tell people about Beads.
>> We should tell people beads. First of
all, Beads is like what, four weeks old?
So, >> yeah. Four weeks. Less less than four
>> yeah. Four weeks. Less less than four weeks.
>> And how many people are using it now already?
>> As far as I can tell, in the tens of thousands maybe. It just doesn't
thousands maybe. It just doesn't >> I remember cuz a week ago you presented it in Seattle in US. Has anyone heard of it and a bunch of hands went up? Anyone
using it? There was a bunch >> and there were people using it and one dude said his whole team uses it and it's just a few weeks old. Yeah.
>> So how did you come across you were working in the way that you work and then you said I'm going to create something called beads. How did you get >> No, I didn't say I was going to create something called beads. It's funny. Uh I
had no idea beads was going to emerge.
Actually I set out to uh automate the book that Gene and I wrote. Okay,
>> that was that was my goal. I I decided to write a system called Vibe Coder that does everything that we put in the book.
>> Oh, I see. Cool.
>> Yeah.
>> So, just like Claude Code has elevated us and AMP has elevated us. See, you say Claude Code because they are all Cloud Code class. It gets credit for being
Code class. It gets credit for being first.
>> It's Coca-Cola.
>> Like, they literally came up with this idea and I and I'm I'm honestly I mean I mean I'm envious. I'm I don't get envious of anybody or anything. You
know, I'm I'm pretty content in life, but man, I really like Tarantino, you know how he says he wished that he directed Battle Royale.
>> He's like he just he just wishes that was his, right?
>> Claw Code, what an idea.
>> I knew a year ago, a year and a half ago, my boss is like, "What's next?
What's the next big thing?"
>> And I was like, "It's going to be tool use."
use." >> And he's like, "Huh?" Right.
>> But I was I was thinking about it wrong.
I I was still The problem is I'm old.
And being old, I grew up in the days when stuff was expensive. I I'm very frugal in my thinking. I don't think big enough usually because I think it would be expensive when meanwhile cost moss law and everything networks everything's
gotten cheap and I see what people do with networks today. I'm just like, oh, but you know, whatever, right? So, I
would have thought of code quad code maybe if I had been a complete pig with tokens >> because what it does is it puts chat in a loop and it literally takes the whole chat conversation flash flash flash long
long longer and it's going and it's turning it into a real time thing but it's really a bunch of individual inspections where it's predicting what's going to happen next at the very end of the long.
>> But they sped it up to where it's like stop motion and now you're having a conversation with an agent. That's
absolute brilliance, right? It's it's a historic invention is what it is.
>> Interesting.
>> So yeah, cloud code class, we'll call it that.
>> Okay.
>> Uh they are not the future. They're the
present.
>> And uh they're too hard for most people.
>> Sure.
>> They just are. Right. I mean, like come on. Look at this, man. Look at all this
on. Look at this, man. Look at all this stuff I had to read. Scroll back. Man, I
know a lot of engineers who think five paragraphs is an essay.
>> Yeah.
>> Like they just can't read. Well, I know a lot of the community got up to like, oh, you can, you know, start with a PRD and then have it do your text spec and then have it write code and then have it evaluate that and you can do this and then it's now it's throwing off all
these artifacts and you've got docs that are out of date and your agents are confused by that too now and you're like okay >> that's a huge problem, man. Planning in
general, everybody's headed towards planoriented uh coding and that that is that is one of the key >> checklist and goals work. So, we're
learning to just like do that up front.
>> It does. It does. problem doesn't it doesn't scale.
>> Yeah.
>> It doesn't scale at all.
>> Yeah.
>> And boy, people have been trying to Right. I mean, like I got a buddy who's
Right. I mean, like I got a buddy who's working on a big like what if we put it all in a big wiki.
>> When you say the words put it all on a big [laughter] that phrase, let's put it all in a big okay >> wiki and liked it. Whoever created a wiki and and said we love our wiki.
Everyone >> I mean wikis were definitely cool. They
were uh I mean like they served a need, but I mean like that's not that's not quite like the thing that we need unfortunately is in a new space.
>> Yeah.
>> And we were piling up markdown files and I mean piling them up th you know hundreds of markdown files would would accumulate >> and the agents would get dementia.
>> Okay. They'd get really confused >> because all they know is what's on disk.
All they know is what you've prompted them right in the context. And so if if you got compete documents, you've got obsolete documents, you got conflicting documents, which you will >> ambiguous documents.
>> I started my my probably two months ago the the sort of like PRD and agents are updating them when they land the plane.
And I have 256 documents that are insane. Some of them are to-do future
insane. Some of them are to-do future projects. Some of them are past, some
projects. Some of them are past, some are out of date. It's a whole other project. You just go and clean those up.
project. You just go and clean those up.
I don't feel like that's >> it is I actually at one point had built in a a doc cleanup manager into my vibe coder just because I was like >> it was just such a mess and and at one
point I just spent like two weeks just trying to fix it >> just like okay I'm going to build the planner >> now beads so this is a huge buildup to beads it is >> cuz beads is a total departure from that world
>> it is it threw everything out >> okay >> I said I was so frustrated I said screw it had been like It had been nagging at me for a while
like this idea that I really like working from work cues.
>> Mhm.
>> I look I'm going to show you [snorts] in um Wyvern doc todo. I was using Jira.
I'm unfortunately on Bitbucket. I'm
going to get on to GitHub soon. This is
a to-do list. This pure text with game stuff in it.
>> Make [snorts] glow stones and fireflies wear out over time. I mean just random suggestions from experience boosters.
Experience boost are still bugged and they don't show up.
>> This is exactly what my list looks like, >> right? Everybody has a list like this.
>> right? Everybody has a list like this.
Yeah, >> this is the best way to work. You just
you move stuff around. It's like super This should be an org file. This is so old.
>> Okay, so >> it's not Jira. I'm just kidding.
>> Well, I tried I put everything in Jira and then it just sat there. Yeah.
>> Oh. So, so one day I finally said, I'm done. Put all our work in issues.
done. Put all our work in issues.
>> Mhm.
>> Let's make an issue.
>> GitHub issues or Git is >> No. Well, first I said
>> No. Well, first I said >> logical issues. Let's put it in issues.
Let's have a conversation about it. So I
actually worked with the AI. I was like, "Okay, I want this in issues. I want us to have a work cue [snorts] so you can track it. What would you want?"
track it. What would you want?"
>> Right? And it goes, "Oh, I I know what you you want." And it spit this out, right? And I went, "Well, the problem is
right? And I went, "Well, the problem is it needs to be in Git." It did it all in SQLite. And I was like, "Yeah, but I
SQLite. And I was like, "Yeah, but I want Git."
want Git." >> And that's the one thing people have been really confused about is why would you put your true tracker in Git?
>> Okay. Mhm.
>> Especially when it's fundamentally ephemeral work cuz [snorts] beads sits in a really weird space. It's not a planning tool. You use a planning tool
planning tool. You use a planning tool to produce your plan and people are building planning tools. It's not a messaging tool.
>> Mhm.
>> Uh a guy just actually sent me a messaging tool, MCP mail, >> Jeffrey Emanuel.
>> Mhm.
>> That looks really exciting for the space. Let me give you an example of why
space. Let me give you an example of why you need it.
>> I've got three agents working on Wyvern.
Okay. One's working on the client, one's working on the server, and all of a sudden they need to like >> they're working independently, but they need to depend on each other for something. They need to send a message
something. They need to send a message to each other.
>> Yeah.
>> Well, how how do you do that? What you
do is you go, >> I have this other agent in this other terminal. Can you use pro or something
terminal. Can you use pro or something like there's no way for them to talk to each other, right?
>> So, he created this like little mail like protocol, >> right? And I realized this morning cuz
>> right? And I realized this morning cuz he sent me last night and I'm like, "Oh, because I realized I put this all these problems with syncing problems. We have get at the center. We have all these these databases in each repo that are a
cache. That's all they are." So that so
cache. That's all they are." So that so that the beads the issues >> can be queried. [snorts] There's a dependency graph. It's got everything
dependency graph. It's got everything that AI wanted. It's got an audit trail, >> provenence, parent, children, epics.
It's it's really powerful tiny little issue tracker, >> but it's all saved in Git. And uh and so anyway, we we got the whole thing worked out and within [clears throat] within a couple of hours. It built it in 15
minutes >> and I added my stuff and it added it stuff and eventually we had a thing >> and about 4 hours later, man, my my entire workflow had transformed. I had
spent a whole month just drowning in markdown files.
>> Drowning. And all of a sudden, it was just like, what's next? What's next?
What's next? File an issue for that.
File an issue for that.
>> Yeah. Cool.
>> File and forget.
>> Yeah.
>> That's what you want.
>> Yeah.
>> Okay.
And the thing is you don't have to figure out where it fits in the plan. It
has a priority.
>> It has dependencies. It can be blocked by other work. It can have a parent epic. You have everything you need to
epic. You have everything you need to like locate it.
>> And so it just sits there. Now the funny thing about beads is it's not designed to be like Jira like a big central database for the entire company.
>> The reason it even exists is that because after eight hours I was like okay I am on to something. I mean,
really, I'm on to something. Okay, so I asked, you know, Claude, I'm like, [snorts] you know, what do you think of this?
>> And it just went off. It's like, you've given memory. Like, I literally couldn't
given memory. Like, I literally couldn't remember anything before. Now I can, right? And I'm like, okay, that sounds
right? And I'm like, okay, that sounds good right?
>> Mhm.
>> It loved it.
>> And so [snorts] then then it was just a matter of iterating on it. And and it sits in this weird space where issues they start as future work.
>> I got some stuff I want to do. Mhm.
>> Some of it might be weeks out, months out whatever.
>> That stuff doesn't belong in beads in my opinion.
>> Okay. It's it's vague. It's unspec. You
can put it in beads and just have it as a P4 or whatever this.
>> But honestly, >> it's not it's future work.
>> It's not what you're working on now.
>> Yeah. Yeah.
>> You also have past work, which is stuff you've done.
>> Sure.
>> People love to hang on to their past work.
>> And that's why I used Git. So when you close an issue and archive it and delete it, it's still in Git. You can always go get it, >> but it's it's gone in 95% of all old
issues for a past work that what is the the the original to-do list or spec or different direction >> the original issue >> because a beads a beads issue has the design doc in it.
>> Yeah.
>> Right. So all of the work is captured in this ledger.
>> Yeah. Okay. But I can see some agents you could build on top of this that' be really useful like release notes and marketing for like hey we need this is working. It's stable. I feel comfortable
working. It's stable. I feel comfortable blasting it out to my user base, my customer base. Go and create a campaign
customer base. Go and create a campaign and it would look at that repository or for the future work it might be, hey, you know, hash this out with with uh, you know, into PRDs that people can designers can look out and weigh in on
it.
>> Yeah. And that whole space, >> yeah, >> is going to be one of the most exciting, it is probably the most exciting frontier in AI right now >> for companies.
>> Yeah. the idea that you and the product managers and everybody else can just have a shared artifact that you all work on and shared prototype that you're building and >> we'll get there.
>> Yeah.
>> Beads doesn't try to solve that problem.
What beads focuses on is it lives between future work and past work. It
lives right here >> and it magnifies current work which is what do I care about like right now and in the short term [snorts] >> which consists of I care a lot about stuff I just finished.
>> Yeah.
>> Because it might break. as it lives longer, it becomes lower risk. You you
know what I mean? So I actually have beads automatically by default prioritize recently closed, >> recently filed work as higher priority because if you if you're doing a bunch of stuff and you
find some P2 follow-ups, >> they will languish forever unless you give them a little bit of a boost right up front, right? We learned this at Google when we did a bunch of statistics on how developers respond to warning
messages and stuff.
>> But you can configure all this. But the
the point is, look what it says on the screen right here.
>> Yeah, >> it says first of all, it landed the plane, so it did all that crap. That was
just a prompting exercise. You saw it was an agent.mmd.
>> Yeah, cool.
>> But now it says recommended next session prompt.
>> Cool.
>> Continue working on Y44. Um, map
rendering is broken. It shows gray tiles.
>> What is the What is Y444 anyway?
>> That's the issue name. I mean the issue ID.
>> You create an issue ID and that's okay.
>> It makes an issue ID for you.
>> It makes it for you. And so everything has a handle to it.
>> Everything has a handle. And so this is this is the unique >> proposition of beads is that your your plan >> now has address. It's Google maps for your plan.
>> That's great. Yeah.
>> Every every piece of work every piece of work you ever want to have done >> good >> is addressable. It's got a label and a title.
>> It's great for AI and humans. It's
better.
>> It is. Yeah.
>> It is. And you can now the thing is now that the data store this data structure is in place >> people have built uh um layers on it like web UIs RPC's monitoring systems
for tools to come in and automate >> on top of beads already.
>> Yeah. Oh my god. People have hooked it into their own workflows. People are
like >> people have been very demanding you know of things. So we made it so that for
of things. So we made it so that for example >> cuz beads conflate >> how many contributors to the main have you seen and how many >> as of today I believe it's 29 days old and it's had 28 different contributors
or maybe 29 >> nice nice >> like PRs that have been merged.
>> Yeah. Yeah. Awesome.
>> Uh so I mean it's like uh and they're they're hardcore like 29 different people who like got the vision wanted to integrate it with their own workflow and did something to make it better.
>> Very cool.
>> Super cool man. Super cool. So like be is like a discovery. So, what is the benefits that you're getting now that you used it for you built it and used it for a few weeks versus the world that you described before?
>> Oh my god, the old world. Um, yeah. So,
I mean, uh, [laughter] dude, it's it's right. It's like I saw it in my blog post the other day. It's
like, um, it's like using these code agent coding agents is like running with socks. Okay.
socks. Okay.
>> Yeah.
>> Now, when you're running in socks, you get a little protection and you you get a lot of flexibility, right? But you're
running in freaking socks. You put some shoes on, they're opinionated.
>> They're not going to be appropriate for all situations.
>> Damn, if they aren't useful. Come on.
Right.
>> You know, and so I mean, seriously, beads is like having shoes. And so when you say, "Well, what's it like compared to going back and running in socks?"
You're like, "Well, >> right. My feet hurt." It's like
>> right. My feet hurt." It's like >> because you you have no grounding anymore. Beads is your beads is your
anymore. Beads is your beads is your your session to session memory.
working with coding agents. I swear it's the most cliche trite uh comparison now, but it's >> it's 50 First Dates, >> which is a wonderful movie. You should
go watch it if you haven't seen it.
>> And uh but you're you're playing 50 First Dates with with the agent because it wakes up and it's like, "Hi, who are you?"
you?" >> Right.
>> Yeah.
>> And it all comes down to your prompts and and your context and beads just completely completely like takes care of the dynamic aspect. You understand?
People are used to the static prompting, >> right? the stuff it has to do every
>> right? the stuff it has to do every time. What about the stuff it has to do
time. What about the stuff it has to do right now and in the next hour?
>> Yeah.
>> So anyway, this is this is a new a new space we're serving.
>> What um people don't know need to know a lot. You're not actively training. It's
lot. You're not actively training. It's
not a new skill you have to learn as a developer. It's nothing.
developer. It's nothing.
>> You have to know issue numbers.
>> You have to know how to install it.
Basically, >> you have you have to you have to know to tell your agent to install it.
>> Okay.
>> You can say go find it on GitHub and install.
>> Oh, really? Like what would you actually type to do that? I mean, I'd be like, "Okay um let's go to uh some project."
>> I'm I'm I'm in Codex right now. I'm
going to try to do it.
>> Codeex. Okay. So, you can npm you can npm install beads.
>> Okay.
>> I believe can codeex npm install stuff into your container.
>> So, I [snorts] just typed npm install beads and it's running.
>> It's beads.
>> To install beads package.
>> You did the at beads.
>> What?
>> At beads npm install or is it just beads?
>> npm install beads.
>> I Let's see if it works. I don't know much about this npm stuff, but It seems to know what it is.
>> Oh, no. It doesn't know what it is.
We'll see.
>> Maybe it's freaked me out >> here. Let me look at the read me and see
>> here. Let me look at the read me and see what we came up with for that.
>> But yeah, I mean once it has the beads binary BD, then it can um you can do anything it needs.
>> Okay. Wait, so just do BD.
>> Do you have BD? Did you MPM install it?
>> No.
>> Uh here, let me see what the actual mpm install uh command is. Um,
so source beads read npm, it's npm install- lowercase g at beads/bd.
>> Okay, npm install. What's the first part?
>> npm install-g global and then at beads/bd.
>> Oh my god, that did not work. That
freaked out. I'm going to get out of the the codeex.
Um, codeex. Codeex is an interesting choice. What do you like about Codex,
choice. What do you like about Codex, Joe?
>> Large codebase understanding and quality results. If you're willing to wait,
results. If you're willing to wait, >> uh, because of the deep Yeah, the 03, you can do that.
>> It does a good job of saying before I get started, I'm going to go search everywhere and really do a good see what your patterns are.
>> This is why I like AMP. AMP uses
OpenAI's models for that part of [clears throat] it.
>> And AMP is really good at that. Yeah,
that's just I also like the price. So,
>> but then once you actually get to do the coding part, it's not as good.
>> Fine. Yeah. Impment sold dash what?
>> Dash g and then uh at beads slashbd.
>> Okay, now it's working. I'm just on a regular terminal. I'm not inside the
regular terminal. I'm not inside the agent. So, this is working.
agent. So, this is working.
>> Cool.
>> Yeah. I mean, or you can brew install it. I mean like depends where you are.
it. I mean like depends where you are.
>> Cool. Cool. But yeah, then you just like uh then you the easiest thing to do is actually run bdinit in your project directory. That will actually take care
directory. That will actually take care of setting up the database and everything.
>> Cool.
>> Or you can make the agent do it. But
man, from then on, I know people that use beads commands. They're like, oh, uh, you know, I use this and that. It's
like why?
>> You're supposed to be talking to your agent with your hands behind your head.
That's like the important part. It's
like, no, don't work on that. Work on
this.
>> I got [snorts] I'm doing BD in it now.
There we go.
>> So excited. Git hooks
configure merge driver.
>> See, it makes a it turns your workflow into >> I'm just accepting them.
>> BD quick start. Is that going to like guide me through how to quick start?
>> Probably someone did. That's awesome.
>> Um, >> no. BD quick start I think is
>> no. BD quick start I think is instructions for the agent. You actually
tell the agent to run BD quick start and then it will that will teach it. So you
can you like run put that in your agent ND that that'll get started.
>> Say run BD quick start. Okay, cool.
>> And then you are off to the races, man.
Like you can say now file an issue for blah blah blah. You can say file, you can say design me a blah.
>> And then once it's got this design completed, you can say make me an epic and child issues for it.
>> Cool. and it'll grind away, grind, grind, grind, grind, grind, and it'll it'll put your whole design dock split up into issues intelligently segmented
so that they're all roughly usually accomplishable in one session, which is uh not really what I expected, but they're good at doing that.
I'm telling you, I'm telling you, this is right because here's here's the fundamental proposition, right? I'm just
going to kill it and I'm just going to start AMP or cloud code, whichever one you like and I'm going to say here it is.
What's next?
>> That's really cool, >> man. That's cool. Come on, man. That's
>> man. That's cool. Come on, man. That's
cool. Because like if you have markdown files, they're not prioritized. There's
certainly no dependencies like there aren't links between them. It's not like you've got a Wikipedia type.
>> I mean, this world plus, you know, validations and testing and a good infrastructure. And
infrastructure. And >> yeah, be is not a complete solution by any means. No, but plus is wake up at is
any means. No, but plus is wake up at is wake up in the morning and see what's been created.
>> You got your coffee and you're like, what's everyone been up to, man? That to
me is the holy grail. And that that's what coding is going to be like next year. And I if if I'm not first, then I
year. And I if if I'm not first, then I won't be first. But I am building a system that does all this for you.
>> It lands the plane. It lifts the plane off of just all the blocking and tackling. And the beautiful thing about
tackling. And the beautiful thing about it man is >> it's not just for coding.
>> Yeah.
>> It's for all knowledge work.
>> Yeah. It doesn't matter. Cloud code can make presentations.
>> It can do like it can build any arbitrarily complicated artifact using the same iterative process, right?
[snorts] >> So man, this could be a universal tool.
Not I mean beads is a part.
>> This is maybe another question you've thought about, but if you were to set up a system that could do this and also feed off of like a business, you know, that brand guide and other rules and
industry knowledge or whatever, and you wanted to package that up as something that you could deploy within a large organization, right? Because again like
organization, right? Because again like right now these tools are inaccessible to most people in the big organization and there needs it's okay to create a service internally and distribute it out. How would you do that? Like it's so
out. How would you do that? Like it's so powerful to have a good command line agent to do all this tuning and set up in my environment and with access to things. It's great. It's another thing
things. It's great. It's another thing to deploy that in a way that many people could use it at a higher level with safeguards. Does that make sense?
safeguards. Does that make sense?
>> Yeah. Yeah. No, that's the uh that's the fundamental problem that the industry faces right now.
>> Yeah. Um so um engineers can help with this by uh >> the executives have a special word for it. They call it productled growth PLG.
it. They call it productled growth PLG.
>> Mhm.
>> Well, what it really means is like rabble rousing, you know, get get the engineer army to like just insist like how does Kafka get into an
organization? Apache Kafka or Apache
organization? Apache Kafka or Apache Cassandra or take your pick your favorite Apache software >> execs don't know what that is. They
wouldn't know what Apache is, right?
Yeah, >> it gets into the organization because one of their trusted engineers says, "Man, we need this."
>> And so that's how this gets into an organization. Anything like this is
organization. Anything like this is their trusted engineers look at it and say, "All right, I understand our org. I
understand how it could break. I
understand, you know, where the bodies are buried, what could go wrong. I'm
going to come up with a plan with you guys to roll this out safely, but aggressively >> so we can learn our learnings now and get ahead so that our competitors aren't beating the pants off us six months from
now because they nailed this before we did. Mhm. Yeah. So, as an engineer, you
did. Mhm. Yeah. So, as an engineer, you can literally you if you are using it and you set it up with your team, I'm not talking about beads in particular or AMP, any tool, any tool at all that you
feel you want to you've tried it at home and you think that it would improve your work life.
I mean fundamentally your company wants you to be more productive and [clears throat] they're feeling attention right now security and legal and compliance and whatever else but >> uh they'll listen to their engineers
>> and if enough engineers say it loudly enough then then then stuff gets done.
>> Yeah. um advice for somebody who does have they've started to use the the old workflow with PRDS and tech specs and using agents and they're getting productive at it. They got this massive
unsavory gross depositive MD files and and would you actually just say hey look at my MD files and >> decide what's future work and move those
into beads please. Is that what you would do?
>> Yes.
>> Okay. That's the thing, man. The Beads
because Beads was designed by me and Claude right?
>> Claude and I designed it.
>> Um, >> it makes agents really happy. And I've
had a lot of people telling me that they're shocked at how aggressively agents adopt beads for their own workflow.
>> It's a lifeline for them because they're like, "Oh, right." Because they they they like, "Oh, I they're they're holding a bunch of stuff and they're like, uh, I got to do this, but I know about this problem."
>> They file an issue and that the pressure is off. It's a pressure release valve
is off. It's a pressure release valve for them.
>> Yeah.
>> Okay. And so what that means is this winds up being uh very very easy to integrate with the rest of your stuff that you already have. You can bring all
your workflow, your your MD files, your GitHub issues, your Jira issues, like everything that you have, just tell your agent, >> slurp it all up into beads.
>> Yeah.
>> Yeah. Cool.
>> And man, I I tell you, it just it just works. It just works.
works. It just works.
>> Those are Yeah. Those companies are not going to like to hear that. But yeah.
>> Yeah. I mean, like product managers don't like to hear it. I've got, you know, good buddies who are working on this problem and they don't like beads.
They're like, well, that doesn't really mesh with my view of how the world should be with top down intelligent planning. And it's like, well, okay,
planning. And it's like, well, okay, but and and what one things one thing is for sure, right, is beads is more about execution than planning.
>> Yep.
>> Beads is really about direct. It's
orchestration is what it is. Beads is
about directing, helping direct workers.
Now, I'm building a larger system. I
wish that I could have showed it to you.
We didn't talk about it, but uh [snorts] >> it's called Vibe Coder or VC. It's in
Go. It's about 85,000 lines of code now.
Beads came out of it, >> sort of forked out of it, you know, >> but it uses beads as its base layer.
>> Mhm.
>> And then it has it spins up workers and it does what I was saying. It automates
the book that Gene and I wrote, you know, >> so uh it does the land the plane thing, right? It does it does cost
right? It does it does cost optimization. It does retries. It's a
optimization. It does retries. It's a
self-healing agent colony is what it is.
>> And I believe that that is what next year's form factor is going to be. It's
not cloud >> code. You're building that to experiment
>> code. You're building that to experiment with these bigger ideas in your own workflow. And you separate beads out as
workflow. And you separate beads out as an open source.
>> Yeah.
>> Because that was the piece that's like in the moment gives you >> Yeah. Because the rest of it keeps
>> Yeah. Because the rest of it keeps changing. We have no idea how it's going
changing. We have no idea how it's going to work. Seriously, it's like it's
to work. Seriously, it's like it's pretty amorphous. It's clearly like a
pretty amorphous. It's clearly like a lot more complicated than the cloud code class, >> AMP class, but um I don't think they're going to really emerge until spring.
>> Yeah, that makes sense.
>> But they'll be cool. You know why?
They're going to be like Kubernetes for your workers.
>> You're going to be able to come in, you'll be able to grab your coffee and be like, "Yo, how's everyone doing today?" And in your web UI, not some
today?" And in your web UI, not some horrible terminal, you're going to be able to see like how everyone's doing.
Oh, you got stuck. Oh, you actually refactored your work and split up three different workers and right. Oh, you
need my input. Oh, look at that. Let's
have a conversation.
>> How do you deal this maybe not with you may have opinions though is how do you deal with conflicting work that's done in parallel by different things that are that are off on their own path. You
know, I've we've all had this just in the current world of like multitab raw dogging on mainline where you know this guy goes off and starts working on something and it has a view of the world. that's changed under its feet by
world. that's changed under its feet by this other one and now they overwrite each other and the reason it's flickering is because that JavaScript file had the same function in it twice.
You know, the first agent who was working on that problem didn't see that and how do you >> how do you provide the sort of overlay?
Is that another service you add into this the grander vision or do you have a best practice to say no don't do that work in branches and then have a have a land the plane merge conflict step how
do you do you do you deal with this >> yeah so I the question you're really asking I mean there's multiple ways we can answer this but the question the root of it is >> what happens when you when when coding
is no longer the the bottleneck >> because then code review and merging and uh CI become the bottleneck real fast.
>> Well, and this is interesting perspective also from AI tinkers and talking to people at the bigger companies who were there is that's already the bottleneck. They have an arbitrary rule that no code is able to
merge until a human has signed off on it. So the amount of code review is
it. So the amount of code review is insane for even the little trivial things that you can now do. You can now you can now take care of all your paper cuts in an hour, right?
>> But who's going to review all them, right? And it can't ship unless they do.
right? And it can't ship unless they do.
So this is a huge bottleneck. Uh
obviously the solution is don't do it with humans but >> yeah so the solution right now is graphite. I mean like AMP has a really
graphite. I mean like AMP has a really cool code review tool a really cool one if you happen to be using AMP and there are other ones out there but the one >> how does graphite work >> the one that I saw that impressed me the most is graphite.
>> Okay. How does it work?
>> Well I don't know how it works but they did a presentation where they've been doing AI code review for uh since GPD35 came out.
>> Oh yeah.
>> So they were like oh this is going to be an issue. I mean they were way ahead of
an issue. I mean they were way ahead of the game. And so for three years now,
the game. And so for three years now, they've been working on really subtle nuances like is the AI's wording gonna piss you off?
>> Yeah.
>> Right. Yeah.
>> Like, you know, like is it benile? Is
it, you know, is it is it is it sperious? You know, because you can lose
sperious? You know, because you can lose your trust in the AI reviewer real fast and be like, I'm not >> Yeah.
>> But man, they've got it as far as I can tell quite dialed.
>> And so you can get like 80 90% of that human workload off offloading. Yeah,
that's that is a huge I call it what is the exhaust of the agents that's unpleasant. One of them definitely is
unpleasant. One of them definitely is like little micro tweaks to copy and emails that and then you later [snorts] get the email and you're like who wrote that? Oh my god, it's terrible or
wrote that? Oh my god, it's terrible or whatever. Yeah. Um, check it out. I just
whatever. Yeah. Um, check it out. I just
did the uh ingested my MD folder. I have
a folder of future projects work which is stuff that I put in that bucket you described earlier. It's like I don't
described earlier. It's like I don't want to work on this now. It's too big too I'm not sure I want to do it. It's
off there. And I had it go. I didn't say create the beads. I said analyze it and decide what you think should be in the beads. And it listed them out. Now it's
beads. And it listed them out. Now it's
just asking me yes, no, do I approve, which is pretty cool. Should I do >> So you still have approval turned on?
>> I No, this is I always ask for approval.
I I don't have it turned on. I I I ask the AI agent usually before you just go do this thing I asked you to do, I want to see it.
>> Oh, you actually prompt it to >> I prompt it for proposals.
>> Oh, I do that sometimes. Yeah. Yeah.
Yeah. In this case, you know, I I don't want to take my future projects and just make beads out of them because >> Yeah. Yeah. This one you probably want
>> Yeah. Yeah. This one you probably want it to prompt you.
>> Those are like six months out things I don't want to >> but beads has a delete feature. Um yeah,
>> some of the controversial people are like, "Why would you delete an issue?"
It's cuz because my worker filed 100,000, you know, BS issues.
>> Yeah. Yeah. Cool.
>> Uh so you can, you know, with a little bit of hygiene, you can keep your beads dab.
>> Is anyone creating an interfaces for beads yet? Like web and voice or
beads yet? Like web and voice or anything like that?
>> Push notifications.
Uh, I mean, I don't know. The funny
thing is like it's starting to like I feel like it's almost, you know, the Cambrian explosion.
>> It's been four weeks, Steve. where
hasn't it?
>> But the funny thing is I feel like like we had this big ocean soup and beads was like a clump like of amino acids and I think it's starting to clump with other ones because this male MCP server could
actually potentially be an important like you always use them together >> and then maybe another one will come in and we may actually wind up assembling this workflow as this Frankenstein thing. It could happen, right? Like it
thing. It could happen, right? Like it
may evolve rather than get designed.
>> Yeah, I agree.
>> I don't know. That's kind of what these LLMs in the first place were is like, well, it also does a bunch of stuff that we didn't train it to.
>> They came out of Google Translate. You
know that, right?
>> Yeah. [snorts]
>> So, it's like when when you have systems that fundamentally at their base layer do things that you didn't design them to do, that you didn't predict, and then you have to discover those things.
>> Emerging behaviors, >> it's going to lead to emerging systems that are created. So, this is great.
>> It is. So, anyway, I hope people I mean, look, you can do UI work with these things. Yeah,
things. Yeah, >> I love your MCP video server idea. I
think that's going to be an important >> uh but I actually got a feature fixed for a mere $14 while we were [laughter] chatting.
>> How much dev work in the old world would have I been in salary dollars?
>> Who knows, man? It's like it's not cheap. This stuff is not cheap, which is
cheap. This stuff is not cheap, which is why you really want amp with ads. Yeah.
>> Yeah.
>> No, I mean $14 is versus the the labor dollars that it would have been if you were battling that for a week or whatever it would cost.
>> It's probably cheaper. It's really
cheap, you know, but it does it's really eye opening when you when you see that that one little feature was 14 bucks and like now the whole thing is going to be how many thousands, right?
>> By the same token, it would never get done. I would simply never have done
done. I would simply never have done this if if it hadn't been for this. So,
it's like to me the trade-off's worth it.
>> Yeah. Super cool. Um I love beads. This
is cool. I'm going to be adopting it in my workflow and >> um reporting back and maybe contributing.
>> Nice.
>> Yeah. Very cool. Is there anything else we wanted to cover? No, I'm good.
>> Thanks, Steve. This was super cool to be able to sit down and um yeah, dive into what you're working on. It's really
cool.
>> Yeah. No, thanks for letting me show this stuff off, man. This was a lot of fun.
>> Thanks.
>> Cheers.
Loading video analysis...