Training FAQs

What is ‘Training’?

Training gives our AI the chance to learn the actor's face. It is the most important element in delivering a realistic lipdub.

How long is training?

Training an actor takes 20 to 40 minutes on average.

How much footage is needed?

For good results, we recommend 30 seconds to 1 minute of speaking footage for each actor.

For example, if you have a 10-second clip you'd like to lipdub, then ideally, before moving on to step 2 ("Label Actors"), you would upload an extra 50 seconds of video in the "Additional Footage for Training" section.

🗒️

Note: There is no hard requirement for training footage length. You can train on as little footage as you'd like, but the less footage you provide for an actor, the more likely the result will be suboptimal.
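
The recommendation above comes down to simple arithmetic. The following is a minimal sketch, not part of the product: the function name `additional_footage_needed` and the constants are hypothetical, and it assumes the whole clip counts as speaking footage for the actor.

```python
# Hypothetical helper: how much extra training footage to upload.
# The 30 to 60 second range comes from the recommendation above.
RECOMMENDED_MIN_SECONDS = 30.0
RECOMMENDED_MAX_SECONDS = 60.0


def additional_footage_needed(clip_seconds: float,
                              target_seconds: float = RECOMMENDED_MAX_SECONDS) -> float:
    """Return the extra seconds of speaking footage needed to reach the target.

    Assumes every second of the clip shows the actor speaking.
    """
    if clip_seconds < 0 or target_seconds < 0:
        raise ValueError("durations must be non-negative")
    # Never return a negative number: a long clip needs no extra footage.
    return max(0.0, target_seconds - clip_seconds)


# A 10-second clip needs 50 more seconds to reach the 1-minute recommendation.
print(additional_footage_needed(10))  # 50.0
```

Passing `RECOMMENDED_MIN_SECONDS` as the target gives the lower end of the recommendation instead.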

Do I need to train every actor?

You only need to train the actors that you want to lipdub.

You can save credits by skipping training for any actor you do not need to lipdub.

‼️ IMPORTANT ‼️ Footage Consistency

For the best possible results, your training footage should match the footage you want to lipdub.

Where possible, this includes every video property (e.g. resolution, frame rate, color grade).

⚡

Example: Training on a 20-minute ungraded 720p video when the clip you'd like to lipdub is a 1-minute 4K video with a LUT applied will not result in a good lipdub.

Check out the Supported video formats and file specifications article to better understand the ways your footage can and should be made consistent.
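
If you want to check consistency before uploading, the comparison can be sketched as below. This is an illustrative sketch only: `ClipProperties` and `find_mismatches` are hypothetical names, not part of any Lipdub tool, and the property values are assumed to be entered by hand or read with a media inspection tool of your choice.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClipProperties:
    """Hypothetical container for the properties named in this article."""
    width: int
    height: int
    frame_rate: float
    has_lut: bool  # True if a LUT / color grade has been applied


def find_mismatches(training: ClipProperties, target: ClipProperties,
                    fps_tolerance: float = 0.01) -> List[str]:
    """List the properties where the training footage differs from the target."""
    mismatches = []
    if (training.width, training.height) != (target.width, target.height):
        mismatches.append("resolution")
    # Compare frame rates with a small tolerance for values like 23.976.
    if abs(training.frame_rate - target.frame_rate) > fps_tolerance:
        mismatches.append("frame rate")
    if training.has_lut != target.has_lut:
        mismatches.append("color grade")
    return mismatches


# The example above: ungraded 720p training footage vs. a graded 4K target.
training = ClipProperties(width=1280, height=720, frame_rate=24.0, has_lut=False)
target = ClipProperties(width=3840, height=2160, frame_rate=24.0, has_lut=True)
print(find_mismatches(training, target))  # ['resolution', 'color grade']
```

An empty list means the two clips agree on every property checked here. It does not guarantee a good result, since the sketch does not cover everything in the file specifications article.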

A quick way to match the grading is to use Adobe Premiere's color matching tool. (While this feature might not be perfect for color correcting actual production shots, it is very fast and the consistency it provides is enough for Lipdub.)