
Potential Video Limitations of LipDub AI

Understand where LipDub struggles.

Intro - What to Avoid


If your video contains any of the situations described below, treat the end result with caution. LipDub will still produce a result even if your footage contains everything listed here, but the quality may suffer.


If you have any questions, join our community Discord! Check out our link article to learn how to join. If you prefer email, we’re happy to help, so don’t hesitate to reach out at

LipDub focuses on the mouth region of the face. Any situation that adds complexity in the mouth area will therefore interfere with LipDub’s mask and may cause artifacts to appear in your final rendered video. We are improving every day and striving to resolve all quality issues!
Notion image

Object Interference

Object interference occurs when an object (hand, microphone, etc.) enters the LipDub mask area.

LipDub can generate a result despite interference in front of the face, but the end result may contain strange visual artifacts.

Full Interference vs Partial Interference

Notion image

Full Interference

Notion image

Partial Interference

Example of LipDub result with Interference

Interference where the object is stable and consistent:


Interference where the object is unstable and inconsistent:

Side Profile

Side profile shots are more difficult for LipDub AI to lip-sync.

However, we recently made significant improvements to results for side poses!



Graphics on Screen

This only applies to graphics that fall within the face mask region.

LipDub will have difficulty regenerating the text perfectly. We recommend adding any graphics to the video after LipDub has been applied.

Notion image

Visual FX & Transitions (blurs, fades, zoom ins)

LipDub will dub the face even when there are transition effects. This can create strange-looking results, as LipDub will not match the effect perfectly.

It is always best to apply these FX after LipDub has been applied.


Beards

LipDub is quite good at handling small beards! However, high-frequency details on the face are always harder to handle, especially in extreme close-up camera positions.

If possible, beards should be avoided.

Notion image
Note: Long beards in particular are difficult as our LipDub AI mask does not cover the full length of the beard.

Extreme Camera Angles

If possible, try to avoid camera angles like the ones shown below.

These camera angles are fairly uncommon, and as a result LipDub may have a harder time pasting the mouth mask area back perfectly than it does with a straight-to-camera position.


Camera position: Bottom up

Notion image

Camera position: Top down

Notion image

Higher than 8-bit depth videos

For example: 10, 12, 16, 32-bit depth

LipDub AI accepts video of any bit depth as input. However, it can currently output only 8-bit video in the mouth region.
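To give a feel for what an 8-bit mouth region means next to higher bit-depth footage, here is a small arithmetic sketch. It only uses general facts about integer bit depths and does not describe how LipDub converts footage internally. The function names are hypothetical, and 32-bit is left out because it is usually stored as floating point.

```python
def levels_per_channel(bit_depth: int) -> int:
    """Number of distinct integer values one color channel can hold."""
    return 2 ** bit_depth


def source_values_per_8bit_step(bit_depth: int) -> int:
    """How many distinct source values share a single 8-bit value."""
    if bit_depth < 8:
        raise ValueError("this sketch only covers 8-bit and higher sources")
    return 2 ** (bit_depth - 8)


for depth in (8, 10, 12, 16):
    print(
        f"{depth}-bit: {levels_per_channel(depth)} levels, "
        f"{source_values_per_8bit_step(depth)} source value(s) per 8-bit step"
    )
```

For a 10-bit source this reports 1024 levels per channel, with 4 source values collapsing into each 8-bit step, which is why fine gradients around the mouth may look slightly coarser than the rest of the frame.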

Extreme Close-ups

When the face is so close to the camera that only the mouth is visible, LipDub will be unable to detect a face and therefore will not lip-sync this actor.

Notion image

Different Colored Footage

If the face color in the training footage differs from the video you’d like to lip-sync, this will confuse the AI and increase the chance of artifacts appearing in the final rendered video.

E.g. the additional footage for training is color graded with a blue tint, but the clip to lip-sync, at the very top of this screenshot, is ungraded.

Example of LipDub result when color varies in the data:

Low-light Footage

When there is very little light in a scene, it is challenging for LipDub to identify the face in the dark. This can make it difficult, or in some cases impossible, to lip-sync the face.

Notion image

Faces that are too large in frame

LipDub performs its work on a 1024px box.

When a face is larger than this 1024px box, LipDub has to down-sample it and then up-sample the result back to the original, larger size.

This may cause artifacts to appear in the final render, so we recommend keeping the actor’s face within this 1024px box if possible.
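If you know roughly how many pixels the face occupies in your frame, a quick check like the sketch below can tell you whether resampling is likely. This is only an illustration: the function name is hypothetical, and it assumes the face is measured by the longest side of its bounding box, which this article does not confirm as LipDub's actual method.

```python
BOX_PX = 1024  # working box size stated in this article


def resample_factor(face_width_px: int, face_height_px: int,
                    box_px: int = BOX_PX) -> float:
    """Return the scale needed to fit the face in the box.

    1.0 means the face already fits and no down-sampling is needed.
    Assumption: the longest side of the face bounding box is what matters.
    """
    longest = max(face_width_px, face_height_px)
    if longest <= 0:
        raise ValueError("face dimensions must be positive")
    return min(1.0, box_px / longest)


# Hypothetical face measurements taken from a 4K close-up frame.
factor = resample_factor(1800, 2048)
if factor < 1.0:
    print(f"Face exceeds the box: it would be scaled to {factor:.0%} and back up.")
else:
    print("Face fits inside the box: no resampling expected.")
```

In this example the face is 2048px on its longest side, so the factor is 0.5. Half of the linear resolution in the face area would be discarded before up-sampling, which is where softness or artifacts can come from.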


Example: This is a 4K video with an extreme close-up of a face.

STEP 1: Identify face on screen

Notion image

STEP 2: Down-sample

Notion image

STEP 3: Up-sample

Notion image