The Rise of AI Motion Capture

Never a dull moment with artificial intelligence! It’s been all the rage ever since GPT-4 drew everyone’s attention. Now, with new models coming out left and right, everyone’s wondering if they’re going to lose their jobs someday. We come bearing more news and analysis for people in the film, animation, and gaming industries, ourselves included! AI motion capture could be the next hit.

But what exactly is AI mocap? What does it involve? As an international studio with quite the capability in motion capture, we can help. Here’s our take on AI disruptions in the field of mocap!

An Introduction to Motion Capture

So, what is mocap, again? Here’s a quick refresher on what it is and what it can be used for. If you’re already familiar with the technique and the equipment, feel free to skip to the AI mocap intro!

A Brief Explanation of Mocap

Motion capture is a technique that allows us to capture real-life movements in the form of digital data. Traditionally, it’s done using high-precision cameras, special markers that help identify joints, and inertial sensors.

Nowadays, many studios (including ourselves) go markerless, using advanced sensor suits instead of optical markers and cameras. Here are some of mocap’s pros & cons if you’re interested!

What is Motion Capture Used For?

The technology is now used in virtually any activity where movement is a principal factor. Film, animation, and video game studios use it for lifelike character movements. Universities and labs capture and analyze movement to better understand how humans and animals move. Mocap has also become indispensable to the robotics industry because it provides the movement data needed to make robots move more smoothly.

Read more about motion capture uses here!

See motion capture at work in this behind the scenes video from Death Stranding

An Intro to AI Motion Capture & How It Works

Enter artificial intelligence! It’s not surprising to see AI incorporated in such a useful technology, especially with the rise of GPT-4, New Bing, MidJourney, DALL-E, and many more AI-powered systems. Let’s see exactly how AI could disrupt traditional paradigms!

What is AI Motion Capture?

Or, how is AI used in mocap? It’s used in two ways, the first being more prominent for the time being.

AI-Driven or AI-Powered Mocap: Motion capture on (AI) steroids! Here, capturing the mocap data, cleaning it up, and applying it to 3D models become much easier thanks to artificial intelligence assisting with motion recognition, data filtering, 3D mapping, and asset creation.
AI-Generated Mocap: Motion capture but without any new physical movements! This is a more novel, difficult area where an AI would create movements. The act would be similar to what AI art generators do, drawing on a staggering amount of captured movement data and months (if not years) of AI training.

So, from here on out, we’ll mostly talk about AI-driven mocap.

How Does AI Motion Capture Work?

The short answer is: like any other machine learning system. If you’d like an AI to assist you in motion capture, you first have to train it. You begin by writing code that lets the model take in massive amounts of data as input, perform a task, and produce output. Then, the output is evaluated against what you expected, and that feedback lets the model adjust itself to do better next time. That’s machine learning in motion capture!
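To make that loop concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the made-up pairs of sensor readings and joint angles, the one-line linear model, and the learning rate. A real mocap model would be far larger, but the cycle is the same: take data in, produce output, evaluate it, adjust.

```python
# Toy training loop: learn to map a raw sensor reading to a joint angle.
# The data and the linear model are hypothetical, chosen only for illustration.

def train(samples, epochs=2000, lr=0.05):
    w, b = 0.0, 0.0  # model parameters, starting out untrained
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, target in samples:
            prediction = w * x + b       # the model does its task
            error = prediction - target  # evaluate the output
            grad_w += 2 * error * x / n  # feedback: gradient of the mean squared error
            grad_b += 2 * error / n
        w -= lr * grad_w                 # adjust the parameters to reduce the error
        b -= lr * grad_b
    return w, b

# Made-up (sensor reading, joint angle in degrees) pairs that follow angle = 30 * x + 5
samples = [(0.0, 5.0), (1.0, 35.0), (2.0, 65.0), (3.0, 95.0)]
w, b = train(samples)
print(f"learned: angle = {w:.2f} * reading + {b:.2f}")
```

After enough passes over the data, the learned parameters land very close to the rule hidden in the samples, without anyone ever typing that rule into the code.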

But mocap data is highly complex, so simple machine learning models aren’t nearly enough. To go further, you build artificial neural networks and leverage deep learning, which is itself a branch of machine learning. To simplify, deep neural networks are loosely inspired by the human brain: they stack many layers of simple units, and a training algorithm measures how far the output is from what is needed and adjusts the network accordingly. These networks tune themselves during training, so there is minimal need for human intervention.
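Here is a small sketch of why a neural network can do what a single weighted sum cannot. The task is hypothetical: given two foot-contact flags, output 1 when exactly one foot is on the ground. The weights below are picked by hand for illustration, not learned; in practice, training would find them.

```python
def relu(value):
    # A common nonlinearity: pass positive values through, clamp negatives to zero
    return max(0.0, value)

def tiny_network(left_contact, right_contact):
    # Hidden layer: two units, each a weighted sum followed by the nonlinearity
    h1 = relu(1.0 * left_contact + 1.0 * right_contact)
    h2 = relu(1.0 * left_contact + 1.0 * right_contact - 1.0)
    # Output layer: a weighted sum of the hidden units
    return 1.0 * h1 - 2.0 * h2

for left, right in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(left, right, tiny_network(left, right))
```

No single weighted sum of the two flags can produce this "exactly one" pattern, but two stacked layers with a nonlinearity in between can. Deep networks take the same idea much further, with many layers and learned weights.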

The bottom line is that to have a chance at a competent AI in various motion capture stages, you should set up deep neural networks and give them enough data, time, and computational resources.

A video about NVIDIA AI and how it was trained

Why Use AI Motion Capture?

Now that we know what AI mocap is and how it works, let’s compare it with how things are traditionally done. In theory, AI-driven motion capture is faster, more accurate, and better on almost all counts.

We said “in theory” because motion capture is still new ground for artificial intelligence. Many companies today are working on it, and some amazing AI models and systems are even out. Still, it generally takes a lot of time and energy to create an AI that can be exceedingly accurate in mocap. So, with that thought, let’s get to the comparison!

Pose Tracking & Estimations

In traditional motion capture, sensors (or markers) virtually tell the system where a joint is located or how exactly the movement has occurred.

AI can take a more holistic approach in a mocap session, taking in factors like the environment, the actor’s physique, or predictions of likely movements to give you more accurate tracking.

Facial Expression Recognition

The human face doesn’t “move” too perceptibly, not in the way the legs and hands move. Plus, putting more physical sensors on the face hasn’t been very comfortable or effective.

AI comes in extremely handy here. It’s faster at recognizing, classifying, and applying even the subtlest facial cues!

Real-Time End Results

In more advanced motion capture sessions, the system receives the raw motion capture data from suits in real time. Then, it gives you a preliminary sketch of how the characters would move, again in real time. (That’s why we can do remote mocap here at Picotion!)

But in other, more traditional sessions, there’s a significant delay between when the actors act and when the results appear. AI could speed things along in both scenarios.
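To give a feel for what per-frame, real-time processing looks like, here is a minimal sketch of a smoothing step that cleans each joint position the moment it arrives. Note that this is a classical exponential moving average, not a learned model; an AI-based pipeline would swap in a trained network at this step. The class name, the `alpha` value, and the sample positions are assumptions for illustration.

```python
class JointSmoother:
    """Smooths a stream of (x, y, z) joint positions, one frame at a time."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # closer to 1.0 = trust new frames more, smooth less
        self.state = None   # last smoothed position

    def update(self, position):
        if self.state is None:
            # First frame: nothing to blend with yet
            self.state = tuple(position)
        else:
            # Blend the new reading with the previous smoothed value
            self.state = tuple(
                self.alpha * new + (1 - self.alpha) * old
                for new, old in zip(position, self.state)
            )
        return self.state

smoother = JointSmoother(alpha=0.5)
for raw in [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0), (2.0, 4.0, 6.0)]:
    print(smoother.update(raw))
```

Because each update only needs the previous smoothed value, the cost per frame is tiny, which is what makes this kind of processing viable in real time.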

Data Augmentation & Consistent Accuracy

Like the human brain, a neural network does its job by taking in and evaluating data. You can also modify existing data and feed the modified versions to a deep neural network, a practice known as data augmentation. Think of how our brains twist facts! Except here, the twists are deliberate (mirroring a movement or adding slight noise, for example), and the modified data serve as new training material.
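As a rough sketch of what modifying existing data can look like, here are two simple augmentations applied to a single mocap frame. The frame layout (a dictionary of joint names mapped to x, y, z positions), the `left_`/`right_` naming, and the noise scale are assumptions for illustration, not a standard format.

```python
import random

def mirror(frame):
    """Flip a pose left-to-right: negate x and swap left/right joint names."""
    mirrored = {}
    for joint, (x, y, z) in frame.items():
        if joint.startswith("left_"):
            joint = "right_" + joint[len("left_"):]
        elif joint.startswith("right_"):
            joint = "left_" + joint[len("right_"):]
        mirrored[joint] = (-x, y, z)
    return mirrored

def jitter(frame, scale=0.005, rng=None):
    """Add small random noise to every coordinate to mimic sensor imprecision."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the example is repeatable
    return {
        joint: tuple(coord + rng.gauss(0, scale) for coord in position)
        for joint, position in frame.items()
    }

frame = {"left_hand": (0.3, 1.0, 0.1), "right_hand": (-0.2, 0.9, 0.0)}
augmented = [mirror(frame), jitter(frame)]
print(augmented)
```

One recorded pose becomes three training samples: the original, its mirror image, and a slightly noisy copy.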

Unlike the brain, however, a trained AI model doesn’t get tired or distracted by external or internal stimuli. So, if there’s no fault in the code, the data, or the infrastructure, accuracy stays consistent and can improve with further training.

Less Equipment

As we’ll see in the next section, a tremendous upside to AI motion capture is that you probably won’t need the traditional mocap equipment, namely a studio, high-performing computers, suits, cameras, etc.

For example, if you’re creating motion capture data from video, you might just need one or two cameras and a high-speed internet connection. Read on to discover more!

A video on the evolution of motion capture before the latest AI improvements

What’s Next? (AI Mocap In Action & Future Trends)

Most of the cases we’ve covered in our article on motion capture applications leverage artificial intelligence already. But for this article, let’s look at some companies that are really pushing the boundaries!

Move AI: Mocap with Two or More iPhones

This is a British company that brings high-fidelity motion capture to iPhones. According to the videos on their website, the mocap data produced using their technology is comparable to that of specialized mocap cameras and suits.

Of course, like any emerging technology, Move AI has downsides as well.

They only have an iOS app at the moment, while Android dominates the global market. That said, the company encourages everyone to experiment with any cameras and use Move AI’s server-side software to achieve results.
You need at least two iPhones for their software to produce impressive results, and that’s only for one person acting. Plus, to achieve the precision of a Rokoko Suit or advanced optical mocap, you’ll need a double-digit number of iPhones!

Kinetix: Video or Text to Mocap Data

The French company Kinetix specializes purely in turning videos into 3D assets they call “emotes”, following video game tradition. So, you can record yourself doing something, and their AI will turn it into a 3D asset in a few minutes.

You can use these assets anywhere, including in video games and the metaverse. Their web-based software can process any video up to 15 seconds. (Here’s how to optimize video recording. They say your webcam works too, which is impressive!)

What’s even more impressive is that they also have an AI-powered text-to-emote function. Like in AI art generators, you can prompt Kinetix and get the desired emote.

HaptX: Realistic Touch for AR & VR

Moving on from video to motion capture data, here’s a US-based company that produces gloves for VR and the metaverse! HaptX has been around since 2012. Their gloves have consistently been applauded for providing a real sense of touch (tactile or haptic feedback) in virtual experiences. These gloves are still super expensive but more affordable than ever.

Where does AI motion capture come in? Each glove has a proprietary motion capture system, accurate to a sub-millimeter level. Given that these gloves operate in real-time, it’s very likely that artificial intelligence is involved in optimizing the raw mocap data.

NVIDIA: AI Mocap Research & Infrastructure

NVIDIA Corporation, the US-based tech giant, is the last on our list but one of the first to ever think of motion capture and AI together. They have so many datasets, software development kits (SDKs), and products that it’s hard to keep count. Numerous companies around the world are using these tools to push the boundaries on everything AI, including Move AI, which we mentioned earlier in this section.

They’ve powered various companies by making motion capture more accessible, created dances with only musical input, and even made their CEO’s virtual avatar sing thanks to an ingenious combination of cutting-edge tech!

Toy Jensen [Huang] sings “Jingle Bells”

AI Motion Capture Technical & Ethical Challenges

We’re dealing with artificial intelligence, emerging technologies, and human footage. Any new technology has its technical challenges, but there are also some ethical issues here that are gray at best. We’ll sidestep the issue of “AI anxiety”, the fear many of us have that AI will replace us at our jobs, since it’s out of scope for this article. We believe that in addition to taking jobs, AI also creates jobs.

With that in mind, let’s go over these challenges before we wrap things up!

Technical: Enormous Datasets

As we’ve discussed, any AI model requires massive amounts of data and processing. Without enough data, the model doesn’t have a chance at accuracy. Even if you have the data, properly training a model on it requires considerable computing resources and deep-learning expertise.

Ethical: Environmental Impact

Among all the possible repercussions of AI, the environmental impact is now significant and can be attributed to several factors. Here are four:

Energy Consumption: The GPUs (Graphics Processing Units), TPUs (Tensor Processing Units), and other AI-specific hardware are high-performance and thus need large amounts of energy to run.
Carbon Emissions: The electricity used for AI often comes from power plants that burn fossil fuels, releasing carbon dioxide (CO2) into the atmosphere.
Water Usage: All data centers (including those where AIs are based) are prone to overheating, so they require cooling systems, many of which consume water.
E-Waste: AI is rapidly evolving, so competent hardware for training and employing it is rapidly becoming obsolete in favor of new, better hardware. Old hardware is considered electronic waste and can be challenging to recycle.

Ethical: Data Privacy

Everyone’s data should be used with their full consent. That raises the question: how are we gathering those huge amounts of data used in training AI models? This is a challenge for artificial intelligence in every field, not just motion capture.

You can visit a company’s Privacy Policy to find information on how your data is being handled. Switch companies if the Privacy Policy is nonexistent, vague, or not backed by recognized data protection laws!

Ethical & Technical: Potential Biases

This one’s about dealing with possible biases in the original data used for training. (This issue applies to all artificial intelligence.) If your original data is biased, you’ll end up with a biased AI! Here’s a list of common biases in AI mocap:

Cultural Bias: Motion capture data might favor specific cultures, leading to an underrepresentation of movements in other cultures.
Gender Bias: The data might lean toward certain genders, potentially causing the model to favor gender-specific movements.
Body Type Bias: The data might not cover all body types or physical abilities, leading to significant limitations or inaccuracies across the full range of movements and body structures.
Activity Bias: The data might favor certain activities over others. For example, if you record only professional ballet performances, you’ll probably have an issue with everyday movements.
Camera View Bias: The motion capture data might only be acquired through specific camera angles, and so the system might have difficulty recognizing movement from other angles.
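A simple first check for several of these biases is to count how your training clips are distributed before you train anything. Below is a minimal sketch; the clip metadata fields (`activity`, `camera`) and the 20% threshold are assumptions for illustration, and a real audit would look at far more than counts.

```python
from collections import Counter

def underrepresented(clips, key, min_share=0.2):
    """Return the values of `key` that make up less than `min_share` of the clips."""
    counts = Counter(clip[key] for clip in clips)
    total = sum(counts.values())
    return sorted(value for value, count in counts.items() if count / total < min_share)

# Hypothetical dataset: mostly ballet, mostly shot from the front
clips = (
    [{"activity": "ballet", "camera": "front"}] * 8
    + [{"activity": "walking", "camera": "side"}]
    + [{"activity": "sitting", "camera": "front"}]
)

print(underrepresented(clips, "activity"))
print(underrepresented(clips, "camera"))
```

Anything this check flags is a candidate for recording more data, or at least for a note of caution about where the model is likely to struggle.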

A video on ethics & AI

Final Thoughts

While we’re already seeing substantial developments in AI motion capture solutions, the field is still new, trending, relevant, and certainly multidisciplinary! Expect companies to clash, unite, and innovate for the time being, each trying to provide more value. Hopefully, that means more accurate and better AI algorithms for everyone to use!
