Depth Anything TouchDesigner Plugin
By Torin Blankensmith
Summary
Key takeaways
- **Depth Anything: Real-time Depth Maps from Webcams**: The Depth Anything plugin for TouchDesigner allows users to generate real-time depth maps from any webcam or TOP input without requiring special hardware or additional installations, enabling applications like instanced visuals and occlusion effects. [00:26], [00:48]
- **Performance Optimization for Depth Anything**: To maintain performance, consider scaling down the input resolution (e.g., to 1/8th) and adjusting the model size. While larger models offer higher quality, they significantly impact frame rates, with sizes around 350 providing a good balance. [02:27], [07:54]
- **Integrating Depth Data with Point Clouds**: The Depth Anything component can be integrated with Josef Pelz's PCR point-cloud renderer, allowing for advanced visualizations with features like bokeh and focus control by mapping depth data to point positions and alpha. [04:12], [09:39]
- **Enhancing Depth Renders with Screen-Space Tools**: Prismatic's SSDR (Screen Space Depth Renderer) can be used with Depth Anything's output to create sophisticated material effects, including Fresnel lighting, rim lights, and custom environment maps, all rendered within a material. [04:32], [11:26]
- **High-Quality Depth Video Recording**: For high-quality depth map recording, it's recommended to export as OpenEXR images or Apple ProRes 12-bit video. A custom normalization component is necessary to avoid flickering issues caused by dynamic min/max values in the depth data. [15:00], [17:35]
Topics Covered
- Real-time Depth from Any Webcam Without Setup.
- Creative Visuals: Instancing, PBR, and Trails.
- Advanced Rendering with Screen Space Depth.
- Optimizing Performance in Depth Map Rendering.
- Recording High-Quality Depth Maps for Post-Production.
Full Transcript
[Music]
Hey, welcome back to another
TouchDesigner tutorial. I'm so excited to
finally get around to creating this
tutorial for the Depth Anything
component that I've made for
TouchDesigner. So, if you've seen the actual
plugin that I've posted on Patreon, I'm
really excited that there's actually a
whole new set of features that I'm
adding to this update. And if you're
brand new to it, Depth Anything is
basically a plugin that allows you to
take any video feed and create a depth
map from it. So what we're seeing right
now is an approximation of the depth
without needing any special hardware.
We're just using a webcam. And this
model is able to in real time
approximate how far away things are. One
thing that's really nice is that this
machine learning model doesn't actually
require any installation process. So you
can just drop it into your project. It
will take a decent amount of time to
load up, but the really nice part is
that you don't have to install any
custom dependencies. You can just drop
it in and it will start running. All
right, so this whole project file is
going to be available for download in my
Patreon. And I'm going to put a link to
that in the description. And I want to
do a brief overview of everything that's
going to be included in that project
file. So there's three main examples.
The first example that I wanted to show
is just using the depth map with some
instancing. So you can kind of see here
I've got a bunch of points and I have
some boxes that I'm instancing and it
will basically figure out the depth of
it and I'm putting in a PBR material so
I get some really nice lighting. But I
can also do things like scale the points
down and add in a little bit of noise to
like offset the position of the
particles and do some kind of fun trail
effects and things like that. And you'll
notice too that the depth kind of
changes a little bit. So if you happen
to be cut off in the scene when you're
showing the instancing, you have this
camera near and far. So you can actually
change the far plane and by default it
can render everything in the whole scene
with depth. But I wanted to kind of
isolate the subject a little bit. So I
kind of dragged this camera far plane to
be a little bit closer so that I'm
isolated in the scene. Another thing is
that you can also change the number of
points that are being instanced. You'll
notice that the resolution here is
1280 x 720, just because it's the default
and highest resolution that anyone with
a non-commercial license can use. So
you can always change that over here on
the actual component. But um I'm taking
this fit top and I'm using the output
resolution at 1/8. So I'm cutting the
resolution down to an eighth. You can always
bump up the number of points here to
like a quarter or higher if you want,
and you'll see that there are many,
many more points here that you can use.
But I'm still getting 60 frames per
second on this, at least when I'm not
actually recording the video. And then
one thing that can be kind of fun to do
is draw trails wherever your movement is
happening. So with no feedback, I can
kind of move around and we'll get some
updates. But then if I increase this
quite a bit, you'll notice that I can
sort of like draw trails
of all the instances which can be pretty
fun to play around with. The other
example that I wanted to show is
actually an example that Josef Pelz
made. So you should definitely go check
out their work. So you'll notice that
I'm actually sitting in front of this
torus here,
and it's sitting between me and the
wall.
And you'll notice if I change the
position of this torus, it will
actually eventually clip through the
wall behind me. And then I can
eventually raise it far enough so I can
be inside this little torus, which
is pretty fun.
I also have this example here if you
want to be able to record your depth map
as a really high quality video. And
basically this is set up up here with
the normalization and also this little
script down here and the movie file so
that you can record a really high
quality depth video and then be able to
sync it up later with your original
color source video. And I'm going to
show how to use that later on and also
how to set up this little network here
that will basically record a really high
quality output. I also have an example
that builds off of this instancing
example, but it uses Josef Pelz's PCR
tool. It's a point cloud rendering tool,
and it gives you really nice bokeh and
these sort of like soft features in the
point cloud. That component is actually
going to be available on Josef's
Patreon. So, I'll put a link to that in
the description as well, but I'll show
you later on how to get that set up
really easily in this project. The other
example that I wanted to show as well is
using Prismatic's SSDR. It's a screen-space
depth renderer. So instead of
actually using a bunch of points to
represent it, it's all done inside a
material. So this was a little example.
The screen space depth renderer component
is available on Prismatic's Patreon. So
definitely check it out. And what's
really cool is that you can just render
this type of content using any type of
top. It'll approximate the depth. One
thing that's really cool is that you can
actually do things like add in Fresnel
lighting. So, it'll add lighting around
the edges of the object. You can do
things like add in these rim lights. So,
you can actually get some really really
nice detail in there and change a lot of
the material properties. So, I can
change the metalness, and I can
change the roughness. And
you can also put in custom environment
maps and also choose the amount of depth
as well depending on the scale of the
depth map that you're working with. I
did a couple things here to just smooth
it out a little bit and also remove the
background of my project. So I'm getting
much much more information here, but I
wanted to kind of isolate the subject in
this example. So those two plugins
aren't going to be in this project file,
but I'll show you later on how to get
that configured. I feel like they were
really interesting pieces. So, I wanted
to make sure to show you all how to like
extend the functionality of this even
further. Let's go ahead and dive in and
talk a little bit more about the
component itself and some of the
features and how to customize those. You
can switch between using the top input
or you could use the webcam as an input.
And one thing to note is that it will
use a little bit of extra compute in
order to connect any top and upload
those video frames. If you want to get
around that, you could always switch
over. You can just disconnect this and
then it will load in your webcam. And
one thing to note is that your webcam
feed will show up here. And if it's
using a lot of compute, this top frame
here is going to be your synced frame.
So this is going to line up perfectly
with your depth. And I am applying
smoothing. So if you want it to be the
exact frames, you can go over here and
turn off smooth synced web frames. Now
you're seeing it's a little jittery, but
it's actually going to be lining up with
every single frame that you're getting
from the depth output. And we can turn
that back on. So there's some
optimizations that you can make. Um, by
default, I have the output resolution of
the model set to 1280 x 720 so that even
if you're on a non-commercial license,
you can just load this into your project
and use it. But you can change the
output resolution here. One thing I want
to note is that you might notice these
like little white bars at the top and
bottom. The thing is the component
itself is actually set to render at a
square resolution. So to get rid of
those bars, if you just use a square
resolution for your input image and a
square resolution for your output,
that'll get rid of it. Another thing to
note is that the model works best with
sizes that are multiples of 14. So
you'll notice that if I take this right
here and bump up the model size to the
maximum size, it's 518. You'll see that
the model size actually has the ability
to go larger than that. So if you wanted
to do a much much higher version, um you
could put whatever number you want in
here, but you'll see that my
TouchDesigner project is running at 60 frames
a second. But over here, I'm getting
like 22 drop frames roughly. And a huge
advantage of this is that you can run
this machine learning model and it's not
going to hang your main thread in
TouchDesigner. So you can do a little bit
more extra compute, but I'm going to go
ahead and set it back to 350. And at
this amount, I feel like I'm able to get
a high enough quality depth map, but I'm
also able to get only a couple drop
frames in the actual model. So we can go
to a lower resolution than this to get
even better frame rates. When I'm not
recording video, I'm getting anywhere
from five to six drop frames at
this particular model size. You can also
use this camera scale here when you have
the actual webcam as the input and you
can drop the size of the webcam a little
bit. That can also help on performance
but will be a loss in the quality of the
actual output. One other thing that I
want to note with this component is that
there are actually two versions of it.
So the first one is a slightly smaller
model. So this is depth anything small.
I'm able to max out my computer's GPU
just using this version of the model.
But I also will have depth anything base
which has more parameters overall. It's
a slightly larger version of the model.
And if your computer's able to handle
it, you should get slightly higher
quality renders. They're just two
standalone components. So you can drag either one
into your project and go from there. For
this next part, I want to show you how
to set up the point cloud rendering
tool. and then later I'll go in to show
how to use this screen space depth
renderer because those two aren't going
to be available in my project file on
Patreon. I'm going to go ahead and start
by going into this example and I'm going
to delete the PCR top and I'll just go
ahead and add that back in so that I can
show you all
how to do it basically. Yeah. So, this
component is available on Josef's
Patreon. And if you're using Windows,
you can use the pixel format, but on Mac
you can use splats. And basically all we
need to do is the first input is going
to be our positions. Uh I'm actually
going to turn off our geometry again in
here. So we'll basically just connect
our positions right here. And then over
in our color instances, that's going to
be our second input. And you'll see that
nothing really shows up. And it's
because the points are super super
teeny. So I can bump up the scale, but
also the alpha is quite low. So I can
bump this up as well. And from here,
what we can do is maybe increase the
bokeh a little bit. When we do that, we
need to change where it's focusing. So,
I will bring the focus
much closer. And that's basically it.
You can play around with this
functionality. Maybe if you want to fade
the background a little bit, you can
kind of have some fall-off range so the
background alpha fades more quickly.
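That falloff idea — alpha staying full up close, then fading over a range so the background drops away — can be sketched outside TouchDesigner like this. The function and its parameters are illustrative, not the PCR component's actual math:

```python
# Sketch of a depth-based alpha falloff (assumed math, not the PCR
# component's actual formula): points stay opaque up to a fade-start
# distance, then fade linearly to zero over a falloff range.

def depth_alpha(depth, fade_start, falloff_range):
    """1.0 up to fade_start, then linearly down to 0 over falloff_range."""
    if depth <= fade_start:
        return 1.0
    t = (depth - fade_start) / falloff_range
    return max(0.0, 1.0 - t)

print(depth_alpha(0.5, 1.0, 2.0))  # → 1.0 (in front: fully opaque)
print(depth_alpha(2.0, 1.0, 2.0))  # → 0.5 (halfway through the falloff)
print(depth_alpha(4.0, 1.0, 2.0))  # → 0.0 (background fully faded)
```

Shrinking the falloff range makes the background cut away more abruptly; widening it gives the softer fade shown here.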
And now if I open up the parameters on
this little container, you should be
able to see if we bring this little
noise feature back in, we'll get that
depth of field with the point clouds.
Then we can bring it back again. One
thing to note is that this PCR component
is actually a piece of geometry. So you
need to use it in a render network. And
for context, this little render network
is basically just the uh point render,
but all I've done is swapped out the
points with some boxes and added in a
little feedback loop over here with some
noise to be able to kind of move the
points around nicely. In the next
example, I want to show you how I set up
the screen space depth render. So, I'm
going to delete this little part over
here. Basically, what we can do is drag
in the component. This again is
available on Prismatic's Patreon. And you
can see some of the really cool lighting
that they're getting just by using a
ramp here. And let's go ahead and
connect our raw depth. So, it's a little
chaotic right now, and that is partially
because the scale is super high. So, we
can bring that back a little bit. Then
I'll scale it down. And you know,
there's kind of a trade-off between what
you're getting with the shadows and
lighting and how much depth you actually
want to have in the scene. I'm also not
an expert on this tool, so hopefully
Prismatic can uh do a little bit more of
a deep dive on how to get some quality
renders out as well. But then what
we can do is basically update
the bounds and then set the midpoint,
that will kind of help with the
rendering. And for material, we can kind
of play around with the roughness to get
some reflectiveness.
And the way I was bringing in the rim
lights is over here in the rim light
section. So I turn these on and I can
play around with um bumping up the
number of lights or the gain of those
lights as well. And we also have like a
main light over here. So, we could
increase the intensity of the
environment light or decrease the
intensity of the point light. The point
lights are getting a little washed out
right now. So, I think I'm going to turn
this down a little bit. And one thing
that I did to get rid of some of the
graininess here is that I actually ended
up um blurring it slightly. So, if you
put in a blur top
and then we increase it a little bit, we
should kind of smooth out some of the
edges and get things looking a little
bit crisper. So, you can see there are
some funny artifacts
along here. And if I just blur it a
little bit, we get some
smoother quality content. The other
thing I wanted to do is basically remove
the background. And so the way I did
that is by putting in a threshold.
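In per-pixel terms, that cutout amounts to thresholding the depth map's red channel near 1.0 and multiplying the resulting mask over the source. Here's a small array-based sketch of that same math; the 0.95 threshold and the "closer = brighter" convention are assumptions about this particular depth output, not the actual TOP parameters:

```python
# Array-based sketch of the background cutout: keep pixels whose depth
# (stored in the red channel, near 1.0 for the close subject) clears a
# threshold, and zero out the rest. Threshold value is illustrative.

def cutout(depth_red, image, threshold=0.95):
    """Zero out pixels whose depth value falls below the threshold."""
    return [
        [px if d >= threshold else 0.0 for d, px in zip(drow, irow)]
        for drow, irow in zip(depth_red, image)
    ]

depth_red = [[0.99, 0.10], [0.97, 0.20]]  # subject ≈ 1.0, background ≈ 0
image = [[0.8, 0.5], [0.6, 0.4]]
print(cutout(depth_red, image))  # → [[0.8, 0.0], [0.6, 0.0]]
```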
And basically all I ended up doing is
setting it to use the red channel instead of the
luminance. And then
you can set this pretty much all the way to one
and get a cutout of the
person. And then we'll just multiply
this with the original to get ourselves
a little bit of transparency for the
background. and connect it there. And
then we get this little cutout. One
other thing that I wanted to show as
well is how to use the actual webcam as
an environment light on here. So to do
this, you actually need to go inside and
if you want to use a live video for an
environment map, what you can do is go
to the environment light here and under
use pre-filter maps automatic, turn it
off. And what you can do from here is
that there's this little section under
the main lights. We can drag and drop
the synced webcam
onto here and we should basically get
the general lighting of our environment
on our environment map. But it is kind
of nice actually. I do like the original
environment map. Overall, I feel like it
gives you some really really cool
effects. Definitely my computer is
taking a little bit of a hit right now.
Oh, it's because I still have this
instancing turned on. Um,
so this is actually a really performant
way to render the depth map, which is
super super nice. Definitely check out
this component over on Prismatic's
Patreon. And
yeah, excited for you all to play around
with it. One last thing that I'm going
to show is that if you've got a video
that you want to be able to capture the
highest resolution possible for this and
you want to record out a video of that
depth map, uh, you can do that with this
component. And I want to show how to do
that. So, let's bring in a movie file
in. And what we're going to do is switch
it over to the count video because it's
just easier to see that all the frames
are captured.
And you can see when it's updating, the
depth model is having a bit of a tough
time figuring out the depth of this, because there
aren't really 3D elements in it. But
what we can see is that whenever this is
updating, we can see there's a little
camera index down here, and we're
actually going to use that to progress
our model. So, we have a couple drop
frames. We have like maybe 15 drop
frames because I'm also recording right
now. So, I'm going to treat this as our
like higher quality version. And what we
can do is, instead of
sequential, basically specify
index. We're going to remove this and
set it to
zero frames for now, because basically
at frame
index one we're going to get frame one,
and index two is frame two. But notice
that this camera index isn't updating
until it's finished processing. So, if I
set it to one... or, let's start it
off at zero. Notice that this little
camera index value updated. So, we're
going to select out. We'll put down a
select chop and we're going to snag the
camera index. And we can do a
chop execute.
Basically, whenever this value changes,
what we want to do is increment this
movie file index. And then we
want to save the raw depth video. So
we'll do a movie file out.
All right, Torin from the future here. So,
a couple of notes on recording an actual
video of the depth. We're going to end
up having to lose a little bit of
information if we save this as a video.
Probably the best way that you could do
this if you want to maintain as much
information as possible is by recording
a movie file out and doing a bunch of
images, but as OpenEXR. That can save
the highest quality data. But in my
case, I'm going to do Apple ProRes and
use the XQ variant in order to get the least
amount of loss on my end. But then I'm
going to also bump this up to 12-bit.
16-bit would be great, ideally. But when we go
to record the video content, the thing
you need to do is actually normalize the
data because we have negative values
that are in our depth data. And you
can't just use a limit top. If you do a
limit top and you do normalize, what
happens is as the depth continues to
change, what'll happen is that it's
constantly taking that min and max and
it's rearranging everything from 0 to
one. So it'll make your image really
flickery. So instead, I'm going to
include this little normalize component
that I just put together. And what it
does is it picks a particular duration.
So you can say like sample for about a
second. So let's play our
video here for a second. I'm going to go
ahead and turn this back to playing.
What it will do is I click this sample
button. It'll sample the min and max
depth for 1.17 seconds. You can choose
to do shorter or longer. And then it
will normalize it. So, it's kind of just
holding on to those values, getting a
minimum value, getting a maximum value,
and rearranging that for us, which is
great. And this is what you would pass
into your movie file. Cool. Back to the
rest of the video. What we want to do is
turn on recording, and then
manually add a new frame
whenever it's done processing. And then
you're going to increment this movie
file index to be slightly higher. So
we're going to do a little bit of Python
programming for that. And for this
basically what we can do, I'll go ahead
and edit this file in VS Code. And we
want to get a reference. We
don't actually need all these other
helper functions. We just want whenever
the value changes. And basically what do
we want to do? We want to get the movie
file out. So we'll get the operator,
moviefileout1. And we're going to
get the parameter
addframe,
and we're going to call pulse on it to
add a new frame. And then we want to
increment the movie file index by one.
So we'll get the operator
moviefilein1, and do par.index
plus equals 1.
That's pretty much it. So now
we kind of just need to start that
cycle. So all we need to do to start
that cycle is basically increment a
value but we have no way of stopping it.
So one other thing that we can do is see
if the movie file is set to record and
if it is
then we will do the rest of these
pieces. And if we turn off recording we
don't want it to continue with this
loop. So why don't we just grab our
movie file out?
So if our
movie file's
record
parameter is on,
then we'll do all these pieces. So we
can indent this over,
and then we can add our
mf.addframe call. So fingers crossed this will
work. We are currently recording.
And don't forget to save to update it.
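Assembled, the callback logic ends up looking roughly like this. Since `op()` and its parameter objects only exist inside TouchDesigner, this sketch uses tiny stand-ins so the control flow can run anywhere; the operator names are placeholders for whatever yours are called:

```python
# Sketch of the finished CHOP Execute logic. op() and parameters only exist
# inside TouchDesigner, so tiny stand-ins make the control flow runnable
# anywhere; the 'movie_out' / 'movie_in' objects play the role of
# op('moviefileout1') and op('moviefilein1') in the real network.
from types import SimpleNamespace

class PulseParam:
    """Stand-in for a pulse parameter; counts how often it fires."""
    def __init__(self):
        self.count = 0

    def pulse(self):
        self.count += 1

movie_out = SimpleNamespace(par=SimpleNamespace(record=True, addframe=PulseParam()))
movie_in = SimpleNamespace(par=SimpleNamespace(index=0))

def onValueChange(channel, sampleIndex, val, prev):
    # Each time the model's camera-index channel changes, save the finished
    # depth frame and step the source movie forward -- but only while Record
    # is on, so turning recording off stops the cycle.
    if movie_out.par.record:
        movie_out.par.addframe.pulse()
        movie_in.par.index += 1

# Simulate five processed frames:
for i in range(5):
    onValueChange(None, 0, i + 1, i)
print(movie_in.par.index, movie_out.par.addframe.count)  # → 5 5
```

Because the record check gates both the pulse and the increment, switching Record off is all it takes to break the advance-process-save loop.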
And
let's set this to index zero to cue it
off. And we can see that we're starting
to increment. And we
should be getting all the individual
frames recorded. Now, you know,
obviously this is going to record past
what we needed. So at a certain point we
can turn off record and that should stop
the loop.
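Going back to the normalization component from earlier: the core trick — sample the min and max once over a short window, then keep reusing those held values — can be sketched like this. This is plain Python over nested lists, not the actual component:

```python
# Sketch of the hold-the-range normalization (not the actual component):
# sample min/max once over a short window, then reuse those fixed values for
# every later frame, so the 0-1 mapping doesn't flicker as the scene changes.

def sample_range(frames):
    """Scan a window of depth frames (2D lists) and return (min, max)."""
    values = [v for frame in frames for row in frame for v in row]
    return min(values), max(values)

def normalize(frame, lo, hi):
    """Map one frame into 0..1 with the held range, clamping outliers."""
    span = (hi - lo) or 1.0
    return [[min(max((v - lo) / span, 0.0), 1.0) for v in row] for row in frame]

# Depth Anything output can contain negative values; sample once, then hold:
window = [[[-2.0, 0.0], [1.0, 2.0]]]   # the ~1 s sampling window
lo, hi = sample_range(window)          # (-2.0, 2.0)
print(normalize([[-2.0, 3.0], [0.0, 2.0]], lo, hi))
# → [[0.0, 1.0], [0.5, 1.0]]
```

The key difference from a normalizing limit is that `lo` and `hi` never change after sampling, so a given depth always maps to the same brightness across the whole recording.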
So, I'm going to actually include this
as a little example, and this is going
to be available on the Patreon as well.
I'm going to include these little helper
pieces here for recording a video. Um,
just so that you have them set up, as
well as this little normalization
component. And lastly, if you want to be
able to use the depth data, we can then
snag that movie. We'll bring it back in.
You can see here that it's playing back.
We can see the different frames. If you
want to have this data laid out in this
UV format, you can actually just copy
the little component from inside here.
Uh you'll see that there's a shader that
says GLSL1. You can just copy this,
paste it outside and connect your movie file in.
And this should rearrange it so that
we're able to connect it over here to
the rest of our input. Uh to get your
color data, you would want to basically
take your original movie file
and you would sync up the index of these
two. So we would go back to
lock to timeline, I guess. So this is
two frames before. So you can actually
just do specify index. And what you can
do to sync it up perfectly is just add
two frames to it. And this should give
you the same exact frame number. Oops,
we're a little too far into the video.
There we go. 22 and 22, same
frame number. And then now you've got
these two elements which you could just
connect over here into your instancing.
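That sync trick is just a fixed offset: the recorded depth lags its color source by the model's processing latency, so the color movie's index is the depth index plus that offset. A trivial sketch, where the two-frame value is what was measured in this setup and may differ in yours:

```python
# The recorded depth lags the color source by the model's processing
# latency -- two frames in this setup (the value may differ per machine
# and model size, so measure it as shown in the video).
DEPTH_LATENCY_FRAMES = 2

def color_index(depth_index, latency=DEPTH_LATENCY_FRAMES):
    """Index for the color movie so it shows the same frame as the depth."""
    return depth_index + latency

print(color_index(20))  # → 22 (e.g. depth index 20 pairs with color index 22)
```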
We'll replace this as our color data.
And now we should be able to get our
instancing there. Um, I would highly
highly recommend using the fit though
for both of these because the main goal
was to scale down the number of points
that are being used. If you do that,
it'll auto resize the resolution. But
you can kind of see this is rendering
out our content using, I think, the
point cloud component.
We can also render it out using
the geometry as well. So cool. Oh, we're
getting our depth data from our video
now perfectly. Um, so I guess that's the
main takeaway here. So yeah, I'll be
including the normalized component here
and the configuration to do the movie
file recording as well. One thing that I
wanted to mention is that Dean Chesman's
got this really great tutorial on
TouchDesigner showing how to use the current
build, but also the experimental build
with POPS to create this kind of like 3D
depth point cloud rendering. Um,
definitely go take a look. And this
plugin actually works perfectly with
that implementation. So, it's a great
kind of continuing next step from this.
And I'll put a link to that tutorial in
the description. Go check it out. Cool.
I think that wraps it up. Stay tuned cuz
I've got another really exciting machine
learning model that I've been working
quite a bit on to incorporate into
TouchDesigner super seamlessly. And if
you're not subscribed to the YouTube
channel, or if you want to stay up to date on
some of the things that I'm doing, you
can subscribe there. Or if you want
notifications, you can sign up to my
Patreon for free and you'll get email
notifications for the new pieces that
I'm releasing. Also, one thing that was
really sick is that PJ Visuals, Nick
Marott, tagged me in this post, and it
sounds like they ended up using the
Depth Anything component to prototype
some of the visuals for the Nine Inch Nails
music video. That was honestly such a
cool moment to see that Touch Designer
is being used in a context like that.
So, I'm super excited to see what you
all end up making with this component.
And yeah, keep me posted. It honestly
is really inspiring to see how you're
all working with these tools. And I'm
excited to start working on developing
some of these new features into my other
plugins as well. All right, if you stuck
with me this far, thanks for following
along and I look forward to seeing you
in the next one.