Depth Anything TouchDesigner Plugin
By Torin Blankensmith
Summary
Key takeaways
- **Depth Anything: Real-time Depth Maps from Webcams**: The Depth Anything plugin for TouchDesigner allows users to generate real-time depth maps from any webcam or TOP input without requiring special hardware or additional installations, enabling applications like instanced visuals and occlusion effects. [00:26], [00:48]
- **Performance Optimization for Depth Anything**: To maintain performance, consider scaling down the input resolution (e.g., to 1/8th) and adjusting the model size. While larger models offer higher quality, they significantly impact frame rates, with sizes around 350 providing a good balance. [02:27], [07:54]
- **Integrating Depth Data with Point Clouds**: The Depth Anything component can be integrated with Josef Pelz's PCR point-cloud renderer, allowing for advanced visualizations with features like bokeh and focus control by mapping depth data to point positions and alpha. [04:12], [09:39]
- **Enhancing Depth Renders with Screen-Space Tools**: Prismatic's SSDR (Screen Space Depth Renderer) can be used with Depth Anything's output to create sophisticated material effects, including Fresnel lighting, rim lights, and custom environment maps, all rendered within a material. [04:32], [11:26]
- **High-Quality Depth Video Recording**: For high-quality depth map recording, it's recommended to export as OpenEXR images or Apple ProRes 12-bit video. A custom normalization component is necessary to avoid flickering issues caused by dynamic min/max values in the depth data. [15:00], [17:35]
Topics Covered
- Real-time Depth from Any Webcam Without Setup.
- Creative Visuals: Instancing, PBR, and Trails.
- Advanced Rendering with Screen Space Depth.
- Optimizing Performance in Depth Map Rendering.
- Recording High-Quality Depth Maps for Post-Production.
Full Transcript
[Music]
Hey, welcome back to another
TouchDesigner tutorial. I'm so excited to
finally get around to creating this
tutorial for the Depth Anything
component that I've made for
TouchDesigner. So, if you've seen the actual
plugin that I've posted on Patreon, I'm
really excited that there's actually a
whole new set of features that I'm
adding to this update. And if you're
brand new to it, Depth Anything is
basically a plugin that allows you to
take any video feed and create a depth
map from it. So what we're seeing right
now is an approximation of the depth
without needing any special hardware.
We're just using a webcam. And this
model is able to in real time
approximate how far away things are. One
thing that's really nice is that this
machine learning model doesn't actually
require any installation process. So you
can just drop it into your project. It
will take a decent amount of time to
load up, but the really nice part is
that you don't have to install any
custom dependencies. You can just drop
it in and it will start running. All
right, so this whole project file is
going to be available for download in my
Patreon. And I'm going to put a link to
that in the description. And I want to
do a brief overview of everything that's
going to be included in that project
file. So there's three main examples.
The first example that I wanted to show
is just using the depth map with some
instancing. So you can kind of see here
I've got a bunch of points and I have
some boxes that I'm instancing and it
will basically figure out the depth of
it and I'm putting in a PBR material so
I get some really nice lighting. But I
can also do things like scale the points
down and add in a little bit of noise to
like offset the position of the
particles and do some kind of fun trail
effects and things like that. And you'll
notice too that the depth kind of
changes a little bit. So if you happen
to be cut off in the scene when you're
showing the instancing, you have this
camera near and far. So you can actually
change the far plane and by default it
can render everything in the whole scene
with depth. But I wanted to kind of
isolate the subject a little bit. So I
kind of dragged this camera far plane to
be a little bit closer so that I'm
isolated in the scene. Another thing is
that you can also change the number of
points that are being instanced. You'll
notice that the resolution here is
1280 x 720, just because it's the default
and highest resolution that anyone with
a non-commercial license can use. So
you can always change that over here on
the actual component. But um I'm taking
this fit top and I'm using the output
resolution at 1/8. So I'm cutting the
resolution down to an eighth. You can always
bump up the number of points here to
like a quarter or higher if you want,
and you'll see that there are many,
many more points here that you can use.
But I'm still getting 60 frames per
second on this, at least when I'm not
actually recording the video. And then
one thing that can be kind of fun to do
is draw trails wherever your movement is
happening. So with no feedback, I can
kind of move around and we'll get some
updates. But then if I increase this
quite a bit, you'll notice that I can
sort of like draw trails
of all the instances which can be pretty
fun to play around with. The other
example that I wanted to show is
actually an example that Josef Pelz
made. So you should definitely go check
out their work. So you'll notice that
I'm actually sitting in front of this
torus here,
and it's sitting between me and the
wall.
And you'll notice if I change the
position of this torus, it will
actually eventually clip through the
wall behind me. And then I can
eventually raise it far enough so I can
be inside this little torus, which
is pretty fun.
I also have this example here if you
want to be able to record your depth map
as a really high quality video. And
basically this is set up up here with
the normalization and also this little
script down here and the movie file so
that you can record a really high
quality depth video and then be able to
sync it up later with your original
color source video. And I'm going to
show how to use that later on and also
how to set up this little network here
that will basically record a really high
quality output. I also have an example
that builds off of this instancing
example, but it uses Josef Pelz's PCR
tool. It's a point cloud rendering tool,
and it gives you really nice bokeh and
these sort of like soft features in the
point cloud. That component is actually
going to be available on Josef's
Patreon. So, I'll put a link to that in
the description as well, but I'll show
you later on how to get that set up
really easily in this project. The other
example that I wanted to show as well is
using Prismatic's SSDR. It's a screen-space
depth renderer. So instead of
actually using a bunch of points to
represent it, it's all done inside a
material. So this was a little example.
The screen space depth renderer component
is available on Prismatic's Patreon. So
definitely check it out. And what's
really cool is that you can just render
this type of content using any type of
top. It'll approximate the depth. One
thing that's really cool is that you can
actually do things like add in Fresnel
lighting. So, it'll add lighting around
the edges of the object. You can do
things like add in these rim lights. So,
you can actually get some really really
nice detail in there and change a lot of
the material properties. So, I can
change the metalness, and I can
change the roughness. And
you can also put in custom environment
maps and also choose the amount of depth
as well depending on the scale of the
depth map that you're working with. I
did a couple things here to just smooth
it out a little bit and also remove the
background of my project. So I'm getting
much much more information here, but I
wanted to kind of isolate the subject in
this example. So those two plugins
aren't going to be in this project file,
but I'll show you later on how to get
that configured. I feel like they were
really interesting pieces. So, I wanted
to make sure to show you all how to like
extend the functionality of this even
further. Let's go ahead and dive in and
talk a little bit more about the
component itself and some of the
features and how to customize those. You
can switch between using the top input
or you could use the webcam as an input.
And one thing to note is that it will
use a little bit of extra compute in
order to connect any top and upload
those video frames. If you want to get
around that, you could always switch
over. You can just disconnect this and
then it will load in your webcam. And
one thing to note is that your webcam
feed will show up here. And if it's
using a lot of compute, this top frame
here is going to be your synced frame.
So this is going to line up perfectly
with your depth. And I am applying
smoothing. So if you want it to be the
exact frames, you can go over here and
turn off smooth synced web frames. Now
you're seeing it's a little jittery, but
it's actually going to be lining up with
every single frame that you're getting
from the depth output. And we can turn
that back on. So there's some
optimizations that you can make. Um, by
default, I have the output resolution of
the model set to 1280 x 720 so that even
if you're on a non-commercial license,
you can just load this into your project
and use it. But you can change the
output resolution here. One thing I want
to note is that you might notice these
like little white bars at the top and
bottom. The thing is the component
itself is actually set to render at a
square resolution. So to get rid of
those bars, if you just use a square
resolution for your input image and a
square resolution for your output,
that'll get rid of it. Another thing to
note is that the model works best with
sizes that are multiples of 14. So
you'll notice that if I take this right
here and bump up the model size to the
maximum size, it's 518. You'll see that
the model size actually has the ability
to go larger than that. So if you wanted
to do a much much higher version, um you
could put whatever number you want in
here, but you'll see that my
TouchDesigner project is running at 60 frames
a second. But over here, I'm getting
like 22 drop frames roughly. And a huge
advantage of this is that you can run
this machine learning model and it's not
going to hang your main thread in
TouchDesigner. So you can do a little bit
more extra compute, but I'm going to go
ahead and set it back to 350. And at
this amount, I feel like I'm able to get
a high enough quality depth map, but I'm
also able to get only a couple drop
frames in the actual model. So we can go
to a lower resolution than this to get
even better frame rates. When I'm not
recording video, I'm getting anywhere
from five to six drop frames at
this particular model size. You can also
use this camera scale here when you have
the actual webcam as the input and you
can drop the size of the webcam a little
bit. That can also help on performance
but will be a loss in the quality of the
actual output. One other thing that I
want to note with this component is that
there are actually two versions of it.
So the first one is a slightly smaller
model. So this is depth anything small.
I'm able to max out my computer's GPU
just using this version of the model.
But I also will have depth anything base
which has more parameters overall. It's
a slightly larger version of the model.
And if your computer's able to handle
it, you should get slightly higher
quality renders. They're just two
standalone components. So you can drag either one
into your project and go from there. For
this next part, I want to show you how
to set up the point cloud rendering
tool. and then later I'll go in to show
how to use this screen space depth
renderer because those two aren't going
to be available in my project file on
Patreon. I'm going to go ahead and start
by going into this example and I'm going
to delete the PCR top and I'll just go
ahead and add that back in so that I can
show you all
how to do it basically. Yeah. So, this
component is available on Josef's
Patreon. And if you're using Windows,
you can use the pixel format, but on Mac
you can use splats. And basically all we
need to do is the first input is going
to be our positions. Uh I'm actually
going to turn off our geometry again in
here. So we'll basically just connect
our positions right here. And then over
in our color instances, that's going to
be our second input. And you'll see that
nothing really shows up. And it's
because the points are super super
teeny. So I can bump up the scale, but
also the alpha is quite low. So I can
bump this up as well. And from here,
what we can do is maybe increase the
bokeh a little bit. When we do that, we
need to change where it's focusing. So,
I will bring the focus
much closer. And that's basically it.
You can play around with this
functionality. Maybe if you want to fade
the background a little bit, you can
kind of have some fall-off range so the
background alpha fades more quickly.
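That falloff idea — alpha staying full up close, then fading over a range so the background drops away — can be sketched outside TouchDesigner like this. The function and its parameters are illustrative, not the PCR component's actual math:

```python
# Sketch of a depth-based alpha falloff (assumed math, not the PCR
# component's actual formula): points stay opaque up to a fade-start
# distance, then fade linearly to zero over a falloff range.

def depth_alpha(depth, fade_start, falloff_range):
    """1.0 up to fade_start, then linearly down to 0 over falloff_range."""
    if depth <= fade_start:
        return 1.0
    t = (depth - fade_start) / falloff_range
    return max(0.0, 1.0 - t)

print(depth_alpha(0.5, 1.0, 2.0))  # → 1.0 (in front: fully opaque)
print(depth_alpha(2.0, 1.0, 2.0))  # → 0.5 (halfway through the falloff)
print(depth_alpha(4.0, 1.0, 2.0))  # → 0.0 (background fully faded)
```

Shrinking the falloff range makes the background cut away more abruptly; widening it gives the softer fade shown here.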
And now if I open up the parameters on
this little container, you should be
able to see if we bring this little
noise feature back in, we'll get that
depth of field with the point clouds.
Then we can bring it back again. One
thing to note is that this PCR component
is actually a piece of geometry. So you
need to use it in a render network. And
for context, this little render network
is basically just the uh point render,
but all I've done is swapped out the
points with some boxes and added in a
little feedback loop over here with some
noise to be able to kind of move the
points around nicely. In the next
example, I want to show you how I set up
the screen space depth render. So, I'm
going to delete this little part over
here. Basically, what we can do is drag
in the component. This again is
available on Prismatic's Patreon. And you
can see some of the really cool lighting
that they're getting just by using a
ramp here. And let's go ahead and
connect our raw depth. So, it's a little
chaotic right now, and that is partially
because the scale is super high. So, we
can bring that back a little bit. Then
I'll scale it down. And you know,
there's kind of a trade-off between what
you're getting with the shadows and
lighting and how much depth you actually
want to have in the scene. I'm also not
an expert on this tool, so hopefully
Prismatic can uh do a little bit more of
a deep dive on how to get some quality
renders out as well. But then what
we can do is basically update
the bounds and then set the midpoint,
that will kind of help with the
rendering. And for material, we can kind
of play around with the roughness to get
some reflectiveness.
And the way I was bringing in the rim
lights is over here in the rim light
section. So I turn these on and I can
play around with um bumping up the
number of lights or the gain of those
lights as well. And we also have like a
main light over here. So, we could
increase the intensity of the
environment light or decrease the
intensity of the point light. The point
lights are getting a little washed out
right now. So, I think I'm going to turn
this down a little bit. And one thing
that I did to get rid of some of the
graininess here is that I actually ended
up um blurring it slightly. So, if you
put in a blur top
and then we increase it a little bit, we
should kind of smooth out some of the
edges and get things looking a little
bit crisper. So, you can see there are
some funny artifacts
along here. And if I just blur it a
little bit, we get some
smoother quality content. The other
thing I wanted to do is basically remove
the background. And so the way I did
that is by putting in a threshold.
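In per-pixel terms, that cutout amounts to thresholding the depth map's red channel near 1.0 and multiplying the resulting mask over the source. Here's a small array-based sketch of that same math; the 0.95 threshold and the "closer = brighter" convention are assumptions about this particular depth output, not the actual TOP parameters:

```python
# Array-based sketch of the background cutout: keep pixels whose depth
# (stored in the red channel, near 1.0 for the close subject) clears a
# threshold, and zero out the rest. Threshold value is illustrative.

def cutout(depth_red, image, threshold=0.95):
    """Zero out pixels whose depth value falls below the threshold."""
    return [
        [px if d >= threshold else 0.0 for d, px in zip(drow, irow)]
        for drow, irow in zip(depth_red, image)
    ]

depth_red = [[0.99, 0.10], [0.97, 0.20]]  # subject ≈ 1.0, background ≈ 0
image = [[0.8, 0.5], [0.6, 0.4]]
print(cutout(depth_red, image))  # → [[0.8, 0.0], [0.6, 0.0]]
```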
And basically all I ended up doing is
setting it to use the red channel instead of the
luminance. And then
you can set this pretty much all the way to one
and get a cutout of the
person. And then we'll just multiply
this with the original to get ourselves
a little bit of transparency for the
background. and connect it there. And
then we get this little cutout. One
other thing that I wanted to show as
well is how to use the actual webcam as
an environment light on here. So to do
this, you actually need to go inside and
if you want to use a live video for an
environment map, what you can do is go
to the environment light here and under
use pre-filter maps automatic, turn it
off. And what you can do from here is
that there's this little section under
the main lights. We can drag and drop
the synced webcam
onto here and we should basically get
the general lighting of our environment
on our environment map. But it is kind
of nice actually. I do like the original
environment map. Overall, I feel like it
gives you some really really cool
effects. Definitely my computer is
taking a little bit of a hit right now.
Oh, it's because I still have this
instancing turned on. Um,
so this is actually a really performant
way to render the depth map, which is
super super nice. Definitely check out
this component over on Prismatic's
Patreon. And
yeah, excited for you all to play around
with it. One last thing that I'm going
to show is that if you've got a video
that you want to be able to capture the
highest resolution possible for this and
you want to record out a video of that
depth map, uh, you can do that with this
component. And I want to show how to do
that. So, let's bring in a movie file
in. And what we're going to do is switch
it over to the count video because it's
just easier to see that all the frames
are captured.
And you can see when it's updating, the
depth model is having a bit of a tough
time figuring out the depth of this, because there
aren't really 3D elements in it. But
what we can see is that whenever this is
updating, we can see there's a little
camera index down here, and we're
actually going to use that to progress
our model. So, we have a couple drop
frames. We have like maybe 15 drop
frames because I'm also recording right
now. So, I'm going to treat this as our
like higher quality version. And what we
can do is, instead of
sequential, basically specify
index. We're going to remove this and
set it to
zero frames for now, because basically
at frame
index one we're going to get frame one,
and index two is frame two. But notice
that this camera index isn't updating
until it's finished processing. So, if I
set it to one... or, let's start it
off at zero. Notice that this little
camera index value updated. So, we're
going to select out. We'll put down a
select chop and we're going to snag the
camera index. And we can do a
chop execute.
Basically, whenever this value changes,
what we want to do is increment this
movie file index. And then we
want to save the raw depth video. So
we'll do a movie file out.
All right, Torin from the future here. So,
a couple of notes on recording an actual
video of the depth. We're going to end
up having to lose a little bit of
information if we save this as a video.
Probably the best way that you could do
this if you want to maintain as much
information as possible is by recording
a movie file out and doing a bunch of
images, but as OpenEXR. That can save
the highest quality data. But in my
case, I'm going to do Apple ProRes and
use the XQ variant in order to get the least
amount of loss on my end. But then I'm
going to also bump this up to 12-bit.
16-bit would be great, ideally. But when we go
to record the video content, the thing
you need to do is actually normalize the
data because we have negative values
that are in our depth data. And you
can't just use a limit top. If you do a
limit top and you do normalize, what
happens is as the depth continues to
change, what'll happen is that it's
constantly taking that min and max and
it's rearranging everything from 0 to
one. So it'll make your image really
flickery. So instead, I'm going to
include this little normalize component
that I just put together. And what it
does is it picks a particular duration.
So you can say like sample for about a
second. So let's play our
video here for a second. I'm going to go
ahead and turn this back to playing.
What it will do is I click this sample
button. It'll sample the min and max
depth for 1.17 seconds. You can choose
to do shorter or longer. And then it
will normalize it. So, it's kind of just
holding on to those values, getting a
minimum value, getting a maximum value,
and rearranging that for us, which is
great. And this is what you would pass
into your movie file. Cool. Back to the
rest of the video. What we want to do is
turn on recording, and then
manually add a new frame
whenever it's done processing. And then
you're going to increment this movie
file index to be slightly higher. So
we're going to do a little bit of Python
programming for that. And for this
basically what we can do, I'll go ahead
and edit this file in VS Code. And we
want to get a reference. We
don't actually need all these other
helper functions. We just want whenever
the value changes. And basically what do
we want to do? We want to get the movie
file out. So we'll get the operator,
moviefileout1. And we're going to
get the parameter
addframe,
and we're going to call pulse on it to
add a new frame. And then we want to
increment the movie file index by one.
So we'll get the operator
moviefilein1, and do par.index
plus equals 1.
That's pretty much it. So now
we kind of just need to start that
cycle. So all we need to do to start
that cycle is basically increment a
value but we have no way of stopping it.
So one other thing that we can do is see
if the movie file is set to record and
if it is
then we will do the rest of these
pieces. And if we turn off recording we
don't want it to continue with this
loop. So why don't we just grab our
movie file out?
So if our
movie file's
record
parameter is on,
then we'll do all these pieces. So we
can indent this over,
and then we can add our
mf.addframe call. So fingers crossed this will
work. We are currently recording.
And don't forget to save to update it.
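Assembled, the callback logic ends up looking roughly like this. Since `op()` and its parameter objects only exist inside TouchDesigner, this sketch uses tiny stand-ins so the control flow can run anywhere; the operator names are placeholders for whatever yours are called:

```python
# Sketch of the finished CHOP Execute logic. op() and parameters only exist
# inside TouchDesigner, so tiny stand-ins make the control flow runnable
# anywhere; the 'movie_out' / 'movie_in' objects play the role of
# op('moviefileout1') and op('moviefilein1') in the real network.
from types import SimpleNamespace

class PulseParam:
    """Stand-in for a pulse parameter; counts how often it fires."""
    def __init__(self):
        self.count = 0

    def pulse(self):
        self.count += 1

movie_out = SimpleNamespace(par=SimpleNamespace(record=True, addframe=PulseParam()))
movie_in = SimpleNamespace(par=SimpleNamespace(index=0))

def onValueChange(channel, sampleIndex, val, prev):
    # Each time the model's camera-index channel changes, save the finished
    # depth frame and step the source movie forward -- but only while Record
    # is on, so turning recording off stops the cycle.
    if movie_out.par.record:
        movie_out.par.addframe.pulse()
        movie_in.par.index += 1

# Simulate five processed frames:
for i in range(5):
    onValueChange(None, 0, i + 1, i)
print(movie_in.par.index, movie_out.par.addframe.count)  # → 5 5
```

Because the record check gates both the pulse and the increment, switching Record off is all it takes to break the advance-process-save loop.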
And
let's set this to index zero to cue it
off. And we can see that we're starting
to increment. And we
should be getting all the individual
frames recorded. Now, you know,
obviously this is going to record past
what we needed. So at a certain point we
can turn off record and that should stop
the loop.
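Going back to the normalization component from earlier: the core trick — sample the min and max once over a short window, then keep reusing those held values — can be sketched like this. This is plain Python over nested lists, not the actual component:

```python
# Sketch of the hold-the-range normalization (not the actual component):
# sample min/max once over a short window, then reuse those fixed values for
# every later frame, so the 0-1 mapping doesn't flicker as the scene changes.

def sample_range(frames):
    """Scan a window of depth frames (2D lists) and return (min, max)."""
    values = [v for frame in frames for row in frame for v in row]
    return min(values), max(values)

def normalize(frame, lo, hi):
    """Map one frame into 0..1 with the held range, clamping outliers."""
    span = (hi - lo) or 1.0
    return [[min(max((v - lo) / span, 0.0), 1.0) for v in row] for row in frame]

# Depth Anything output can contain negative values; sample once, then hold:
window = [[[-2.0, 0.0], [1.0, 2.0]]]   # the ~1 s sampling window
lo, hi = sample_range(window)          # (-2.0, 2.0)
print(normalize([[-2.0, 3.0], [0.0, 2.0]], lo, hi))
# → [[0.0, 1.0], [0.5, 1.0]]
```

The key difference from a normalizing limit is that `lo` and `hi` never change after sampling, so a given depth always maps to the same brightness across the whole recording.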
So, I'm going to actually include this
as a little example, and this is going
to be available on the Patreon as well.
I'm going to include these little helper
pieces here for recording a video. Um,
just so that you have them set up, as
well as this little normalization
component. And lastly, if you want to be
able to use the depth data, we can then
snag that movie. We'll bring it back in.
You can see here that it's playing back.
We can see the different frames. If you
want to have this data laid out in this
UV format, you can actually just copy
the little component from inside here.
Uh you'll see that there's a shader that
says GLSL1. You can just copy this,
paste it outside and connect your movie file in.
And this should rearrange it so that
we're able to connect it over here to
the rest of our input. Uh to get your
color data, you would want to basically
take your original movie file
and you would sync up the index of these
two. So we would go back to
lock to timeline, I guess. So this is
two frames before. So you can actually
just do specify index. And what you can
do to sync it up perfectly is just add
two frames to it. And this should give
you the same exact frame number. Oops,
we're a little too far into the video.
There we go. 22 and 22, same
frame number. And then now you've got
these two elements which you could just
connect over here into your instancing.
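That sync trick is just a fixed offset: the recorded depth lags its color source by the model's processing latency, so the color movie's index is the depth index plus that offset. A trivial sketch, where the two-frame value is what was measured in this setup and may differ in yours:

```python
# The recorded depth lags the color source by the model's processing
# latency -- two frames in this setup (the value may differ per machine
# and model size, so measure it as shown in the video).
DEPTH_LATENCY_FRAMES = 2

def color_index(depth_index, latency=DEPTH_LATENCY_FRAMES):
    """Index for the color movie so it shows the same frame as the depth."""
    return depth_index + latency

print(color_index(20))  # → 22 (e.g. depth index 20 pairs with color index 22)
```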
We'll replace this as our color data.
And now we should be able to get our
instancing there. Um, I would highly
highly recommend using the fit though
for both of these because the main goal
was to scale down the number of points
that are being used. If you do that,
it'll auto resize the resolution. But
you can kind of see this is rendering
out our content using, I think, the
point cloud component.
We can also render it out using
the geometry as well. So cool. Oh, we're
getting our depth data from our video
now perfectly. Um, so I guess that's the
main takeaway here. So yeah, I'll be
including the normalized component here
and the configuration to do the movie
file recording as well. One thing that I
wanted to mention is that Dean Chesman's
got this really great tutorial on
TouchDesigner showing how to use the current
build, but also the experimental build
with POPS to create this kind of like 3D
depth point cloud rendering. Um,
definitely go take a look. And this
plugin actually works perfectly with
that implementation. So, it's a great
kind of continuing next step from this.
And I'll put a link to that tutorial in
the description. Go check it out. Cool.
I think that wraps it up. Stay tuned cuz
I've got another really exciting machine
learning model that I've been working
quite a bit on to incorporate into
TouchDesigner super seamlessly. And if
you're not subscribed to the YouTube
channel, or if you want to stay up to date on
some of the things that I'm doing, you
can subscribe there. Or if you want
notifications, you can sign up to my
Patreon for free and you'll get email
notifications for the new pieces that
I'm releasing. Also, one thing that was
really sick is that PJ Visuals, Nick
Marott, tagged me in this post, and it
sounds like they ended up using the
Depth Anything component to prototype
some of the visuals for the Nine Inch Nails
music video. That was honestly such a
cool moment to see that Touch Designer
is being used in a context like that.
So, I'm super excited to see what you
all end up making with this component.
And yeah, keep me posted. It honestly
is really inspiring to see how you're
all working with these tools. And I'm
excited to start working on developing
some of these new features into my other
plugins as well. All right, if you stuck
with me this far, thanks for following
along and I look forward to seeing you
in the next one.