
NVIDIA’s New AI’s Movements Are So Real It’s Uncanny

By Two Minute Papers

Summary

## Key takeaways

- **Motion Imitation: The Force & Torque Problem**: Copying human movement in computer programs is impossible because motion capture data shows what to do but not how to apply the necessary forces and torques at every joint and time instance. [00:07], [00:38]
- **DeepMimic: Motion Imitation as a Video Game**: DeepMimic turned motion imitation into a game where virtual characters learned to mimic motion capture data by maximizing scores for joint angles and contacts through endless retries. [00:48], [01:20]
- **DeepMimic's Limitation: Manual Score Tuning**: DeepMimic required extensive manual tuning of hundreds of score counters for each joint and movement, making it difficult to adapt to new motions or body types. [02:36], [03:02]
- **ADD: AI Judge for Automatic Motion Imitation**: The ADD system uses an AI judge to automatically determine what a perfect performance looks like, eliminating the need for manual score tuning and improving motion realism. [03:43], [04:08]
- **ADD vs. DeepMimic: Performance Comparison**: While initial tests showed similar results, ADD significantly outperformed DeepMimic and other methods in complex movements like parkour jumps and climbing, producing fluid and believable motions. [04:39], [06:08]
- **ADD's Versatility and Limitations**: ADD works on various body morphologies, controls robots, and can perform diverse behaviors, though it may struggle with extremely flashy tricks, sometimes causing the AI to give up. [06:53], [08:46]

Topics Covered

  • Why Copying Human Motion Is "Impossible."
  • Hand-Tuning AI Motion: A PhD Nightmare?
  • Why AI Judges Outperform Manual Motion Tuning.
  • Why Groundbreaking AI Research Goes Unnoticed.
  • The "First Law of Papers": Predicting AI's Evolution.

Full Transcript

There is one thing that I really want. What I want is a digital character to move exactly like a human. Well, just copy it then, right? Do a little motion capture with the sensors on the human body, record it running, jumping, reading papers, and then, copy it within a computer program. Well, unfortunately, that is impossible. Okay, why? Well, this motion capture data shows you what you need to do, but it does not tell you how. You see, these virtual characters have muscles and joints, and to copy these movements, you would need to come up with the forces and torques exerted everywhere at every time instance to be able to mimic it.

Oh man, that is hard. Really hard.

But, this amazing paper from 2018 could already do that. It was called DeepMimic, and goodness, this…am I seeing correctly? This really is able to match the reference motions superbly. Is that a word? I don’t know. I don’t care. We talked about this work approximately 500 videos ago. Yes, I heard you. I heard what you just said. And the answer is yes, we’ve been around that long. Almost 1,000 paper videos now.

Okay, DeepMimic. This worked by turning motion imitation into a video game where every joint, angle, and contact had its own little score counter. The controller played this game over and over, tweaking its moves through endless retries, until it learned how to rack up the maximum score. Which is perfect imitation of the motion capture performance. And boy, does it look perfect in places.
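
To make the "video game" framing concrete, here is a minimal sketch of what one playthrough of that game might look like. The `sim`, `controller`, and `mocap` objects and their methods are hypothetical stand-ins for a physics engine, a learned policy, and a motion capture clip; this is not the actual DeepMimic code.

```python
import numpy as np

def imitation_score(sim_pose, ref_pose):
    """Higher score when the simulated joint angles match the reference frame."""
    joint_error = np.sum((np.asarray(sim_pose) - np.asarray(ref_pose)) ** 2)
    return np.exp(-2.0 * joint_error)  # 1.0 means a perfect match

def play_episode(sim, controller, mocap):
    """One playthrough of the imitation game; the controller retries endlessly
    (via reinforcement learning) to push this total score as high as possible."""
    total_score = 0.0
    state = sim.reset(start_pose=mocap[0])
    for t in range(1, len(mocap)):
        torques = controller.act(state, phase=t / len(mocap))  # forces/torques at this instant
        state = sim.step(torques)                              # physics engine advances
        total_score += imitation_score(state.joint_angles, mocap[t])
    return total_score
```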

But it got better, it worked on a bunch of different body morphologies as well. And you could even do the favorite pastime of the computer graphics researcher: throwing boxes at it, for how long? Until it collapses, of course. You could even do some art direction where you would ask it to dance a bit more vigorously. Yes, more life, more life, more energy! Oh baby. It looks terrible but I still love it. And now, you’re tired after doing all this kung fu. Yup, that one checks out too.

So, is this work perfect? It sure seems so! Well, it is not. Here is a dirty little secret, but don’t tell anyone.

So here’s the problem: every single one of those little score counters in DeepMimic had to be designed and tuned by hand. You had to decide exactly how many points to give for matching a knee angle, how harshly to punish a wobbly torso, and how much to care about foot placement versus balance. Change the motion, or even switch to a robot body, and suddenly all your scores are wrong again. You spend days turning invisible dials just to get something that doesn’t collapse into a breakdancing mess. Just to get a flavor of the paper, you have to optimize these: joint rotations, velocities, root velocity, end effectors that describe where hands and feet should go, and center of mass. Yum yum yum. Tastes great, but man, that’s a lot to optimize. And you have to do it by hand, manually? Oof. This system is held together by duct tape and the tears of PhD students. There’s got to be a better way!
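
To get a feel for how much hand-tuning that implies, here is a minimal sketch of such a multi-term tracking score: one weighted exponential "counter" per quantity, each with a weight and a sensitivity that someone has to pick by hand. The weights, scales, and dictionary layout below are illustrative placeholders, not the actual DeepMimic reward.

```python
import numpy as np

# One exponential term per tracked quantity; all numbers are made-up examples.
WEIGHTS = {"joint_rotations": 0.6, "joint_velocities": 0.1,
           "end_effectors": 0.2, "center_of_mass": 0.1}
SCALES  = {"joint_rotations": 2.0, "joint_velocities": 0.1,
           "end_effectors": 40.0, "center_of_mass": 10.0}

def tracking_score(sim, ref):
    """sim/ref: dicts mapping each quantity name to a NumPy array."""
    score = 0.0
    for key, weight in WEIGHTS.items():
        error = np.sum((sim[key] - ref[key]) ** 2)
        score += weight * np.exp(-SCALES[key] * error)
    return score  # change the motion or the body, and all of this may need re-tuning
```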

And that’s where this new paper, ADD, the Adversarial Differential Discriminator, comes in. Instead of hard-coding hundreds of score counters, it introduces an AI judge that learns automatically what a perfect performance looks like. The system plays the same imitation game, but now, instead of manually juggling separate scores for elbows, knees, and toes, the judge gives a single verdict on how close the motion feels to the real human one. As training goes on, this judge gets smarter, focusing attention on the parts that still look off and pushing the character to refine them.
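
As a rough mental model of that AI judge, here is a sketch in which a tiny learned network looks at the difference between the simulated state and the reference state and produces a single verdict that becomes the reward. The two-layer network and the helper names are illustrative stand-ins, not the architecture or training procedure from the ADD paper.

```python
import numpy as np

rng = np.random.default_rng(0)

class TinyJudge:
    """Stand-in for the learned discriminator: scores how 'real' a motion looks."""
    def __init__(self, state_dim, hidden=64):
        self.w1 = rng.normal(0.0, 0.1, (state_dim, hidden))
        self.w2 = rng.normal(0.0, 0.1, (hidden, 1))

    def verdict(self, sim_state, ref_state):
        # The judge scores the *difference* between simulation and reference,
        # instead of hundreds of separate hand-made counters.
        diff = np.asarray(sim_state) - np.asarray(ref_state)
        hidden = np.tanh(diff @ self.w1)
        return float(hidden @ self.w2)

def judge_reward(judge, sim_state, ref_state):
    # Squash the verdict into a single reward in (0, 1); during training the
    # judge itself is updated adversarially so that motions matching the
    # reference score high and everything else scores low.
    return 1.0 / (1.0 + np.exp(-judge.verdict(sim_state, ref_state)))
```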

So: the previous DeepMimic needed lots of hand-tuning; the new ADD uses a single, automatic AI judge.

Okay, good. But does that help? Hmm… they say this is better than DeepMimic? Well, I’d like to see that! And, well well well. I am not seeing that. With pink, we have DeepMimic, the previous method, and with blue, ADD, the new technique. Both are nailing the problem, but it’s not better. So, have we been deceived by the marketing department again? I swear I’m not buying more car insurance.

Okay, let’s calm down and see if this next test is better. Here is the reference movement. A little parkour! Loving it. So, the previous AMP method does exactly what we all feel like at the moment. Disqualified. Get out of here. And now, DeepMimic is about to nail it, as always. Wait a second…that’s not it, sir. You need a little more carbohydrates before working out! Wow, it failed. And if you think this is a failure case, now check this out. Now off with you, eat a nutrition bar while we watch the reference motion do the jump. Okay, got it? Got it. So now, start running, and…oof! Sir. Sir! Are you okay? Goodness, failed again, even worse.

Now hold on to your papers, Fellow Scholars, and let’s see if the new technique with the AI judge is any better, and…oh my goodness. They did it. What about the climbing? Let’s see…absolutely nailed that too. Wow!

The motions are really fluid and believable, and physically correct. It controls all the joints correctly. That is really tough.

And yes, I know what you want. You also want to see the earlier low-energy AMP do the jump too. Your wish is now granted. Ain’t no jumping here, brother. They don’t pay me enough. And believe it or not, it’s going to get even better. Here’s how. Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Dr. Carroll.

Oh yeah! We still retain DeepMimic’s wonderful property of working on different body morphologies. I am starting to flip out.

So, the crowd favorite, the walking sausage man, not a problem. ADD does as well as hand-tuned techniques, but this one works automatically. It can control a robot too, it can fall and get up.

It can do a variety of different behaviors, karate, jumping on one leg, walking an invisible dog. And at this point, I am so happy, this is basically me. So good! Mother of Papers.

Here we also have an ablation study, which takes out each individual puzzle piece that they invented, and they show how the system breaks when you remove these individual puzzle pieces. In short, they show one by one that everything they invented here is indeed useful. Not just a soup of stuff that works, but they show that each piece is necessary. I still can’t believe that they’ve actually done it. Wow!
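
For readers who have not met the term before, an ablation study is essentially a loop over configurations: re-run the experiment with one ingredient switched off at a time and compare against the full system. A minimal sketch, with made-up component names and a placeholder evaluate() function standing in for a full training run plus a motion-tracking metric:

```python
# Hypothetical component names; evaluate() is a placeholder, not the paper's code.
COMPONENTS = ["component_a", "component_b", "component_c"]

def evaluate(enabled):
    # Placeholder: a real study would train the controller with only the
    # `enabled` ingredients and measure how well it tracks the reference motion.
    return 1.0 if len(enabled) == len(COMPONENTS) else 0.5

def ablation_study():
    full_score = evaluate(COMPONENTS)
    for removed in COMPONENTS:
        kept = [c for c in COMPONENTS if c != removed]
        print(f"without {removed}: {evaluate(kept):.2f} (full system: {full_score:.2f})")

ablation_study()
```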

And now, before I tell you about the limitations of this new method, look at this. Oh my goodness. Is this for real? This research paper is an absolutely marvelous piece of work, and nearly nobody is talking about it. Wait a minute. Not nearly nobody. Actually nobody is talking about it. Let me try to help.

So this is why I feel that talking about these research works is so important, it’s a bit like saving endangered species. If we don’t do it here, nobody does it. So please, save a paper today, like this video, subscribe, hit the bell icon, and leave a really kind comment.

Okay, so it’s not flawless, of course. Sometimes the AI judge gets confused on the flashier tricks - instead of pulling off a smooth backflip, the poor thing just lies down and gives up halfway. It’s a bit like a dance judge who judges the waltz well but freezes when the performer suddenly tries parkour - you know, still learning what “graceful” means when gravity is involved. Reminds me of an earlier paper where an AI player collapsed, get this, in a way that reprogrammed the mind of the other AI to lose. Insanity.

But just think about it. AI systems now don’t just imitate motion, they actually understand how we move around. And just two more papers down the line, and I am sure these digital creatures will learn to move with the same grace and intent as living ones. That is the First Law of Papers. Do not look at where we are, look at where we will be two more papers down the line. What a time to be alive!

So, digital ninjas, motion shapers, smoother than your graphics shaders - subscribe to Two Minute Papers!
