🎬 MLD Text-to-Motion Generator

Generate realistic human motion animations from text descriptions! Powered by Motion Latent Diffusion (MLD).

💡 Tips for Best Results:

  • Be specific: "a person walks forward slowly" works better than just "walking"
  • Use present tense: "walks" or "is walking"
  • Describe single continuous actions
  • Recommended length: 40-60 frames for short actions, 80-120 for walking/running
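If you think in seconds rather than frames, a small helper can do the conversion. This is a minimal sketch, not part of the demo: `frames_for_duration` is a hypothetical name, the 20 frames-per-second rate is an assumption (the page does not state a frame rate), and the 16 to 196 limits are assumed from the two numbers shown in the Input section.

```python
MIN_FRAMES = 16    # assumed lower bound of the motion length input
MAX_FRAMES = 196   # assumed upper bound of the motion length input
ASSUMED_FPS = 20   # assumption: the page does not state a frame rate


def frames_for_duration(seconds: float, fps: int = ASSUMED_FPS) -> int:
    """Convert a duration in seconds to a frame count, clamped to the assumed limits."""
    frames = round(seconds * fps)
    return max(MIN_FRAMES, min(MAX_FRAMES, frames))


# At the assumed 20 fps, a 3-second action maps to 60 frames,
# which falls at the top of the "short actions" range in the tips.
print(frames_for_duration(3.0))
```

If the real frame rate differs, pass it as `fps`; the clamping still keeps the result inside the assumed limits.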

๐Ÿ“ Input

Motion Length (frames): 16 to 196

📚 Example Prompts

Each example pairs a Text Prompt with a Motion Length (frames).

🎥 Output

Generate a motion to see the results here.


โ„น๏ธ About

Motion Latent Diffusion (MLD) generates 3D human motion by:

  1. Encoding text with CLIP
  2. Generating motion in latent space via diffusion (50 steps)
  3. Decoding to 3D joint positions (22 joints)
  4. Visualizing as a 3D skeleton animation
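The data flow of the first three steps can be sketched with placeholder stages. This is a toy illustration, not the MLD code: every function below is a hypothetical stand-in (no CLIP model, no trained denoiser, no learned decoder), and `LATENT_DIM` is an arbitrary illustrative size. Only the 50 steps and 22 joints come from the description above.

```python
import random
from typing import List

NUM_JOINTS = 22        # from the description above
DIFFUSION_STEPS = 50   # from the description above
LATENT_DIM = 8         # assumption: arbitrary size for illustration


def encode_text(prompt: str) -> List[float]:
    """Stand-in for the CLIP text encoder: a deterministic pseudo-embedding."""
    rng = random.Random(prompt)
    return [rng.uniform(-1.0, 1.0) for _ in range(LATENT_DIM)]


def denoise(latent: List[float], condition: List[float], step: int) -> List[float]:
    """Stand-in for one reverse-diffusion step: pull the latent toward the condition."""
    blend = 1.0 / (step + 1)
    return [z + blend * (c - z) for z, c in zip(latent, condition)]


def sample_latent(condition: List[float], seed: int = 0) -> List[float]:
    """Start from Gaussian noise and apply the placeholder denoiser 50 times."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in condition]
    for step in reversed(range(DIFFUSION_STEPS)):
        latent = denoise(latent, condition, step)
    return latent


def decode(latent: List[float], num_frames: int) -> List[List[List[float]]]:
    """Stand-in for the motion decoder: returns num_frames x 22 x 3 coordinates."""
    motion = []
    for frame in range(num_frames):
        joints = []
        for joint in range(NUM_JOINTS):
            base = latent[(frame + joint) % len(latent)]
            joints.append([base, base + 0.1 * joint, base + 0.01 * frame])
        motion.append(joints)
    return motion


def generate_motion(prompt: str, num_frames: int) -> List[List[List[float]]]:
    """Chain the placeholder stages: text -> latent -> joint positions."""
    return decode(sample_latent(encode_text(prompt)), num_frames)


motion = generate_motion("a person walks forward slowly", 60)
print(len(motion), len(motion[0]), len(motion[0][0]))  # prints "60 22 3"
```

The point of the sketch is the output shape: one list of 22 joints per frame, each joint an (x, y, z) triple, which is what the skeleton animation in step 4 would consume.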

Citation: Chen et al., "Executing your Commands via Motion Diffusion in Latent Space", CVPR 2023

Repository: motion-latent-diffusion