Notice: This codebase is the SMPL-X version of the original T2M work.

For more information, please refer to our work FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions.

Git Branches

main: Original T2M
humanml3d-smpl-dist: distributed training of T2M
humanml3d-smplx-dist: distributed training of T2M-SMPL-X

We devoted significant effort to training this version of the model carefully. If you find it useful, please cite our work along with the original T2M paper:

@article{DBLP:journals/corr/abs-2601-12799,
  author       = {Peng Li and
                  Zihan Zhuang and
                  Yangfan Gao and
                  Yi Dong and
                  Sixian Li and
                  Changhao Jiang and
                  Shihan Dou and
                  Zhiheng Xi and
                  Enyu Zhou and
                  Jixuan Huang and
                  Hui Li and
                  Jingjing Gong and
                  Xingjun Ma and
                  Tao Gui and
                  Zuxuan Wu and
                  Qi Zhang and
                  Xuanjing Huang and
                  Yu{-}Gang Jiang and
                  Xipeng Qiu},
  title        = {FRoM-W1: Towards General Humanoid Whole-Body Control with Language
                  Instructions},
  journal      = {CoRR},
  volume       = {abs/2601.12799},
  year         = {2026},
  url          = {https://doi.org/10.48550/arXiv.2601.12799},
  doi          = {10.48550/ARXIV.2601.12799},
  eprinttype   = {arXiv},
  eprint       = {2601.12799},
  timestamp    = {Tue, 24 Mar 2026 08:45:06 +0100},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2601-12799.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

Generating Diverse and Natural 3D Human Motions from Text (CVPR 2022)

[Project Page] [Paper]

Given a textual description, for example "the figure rises from a lying position and walks in a counterclockwise circle, and then lays back down the ground", our approach generates a diverse set of 3D human motions that are faithful to the provided text.

Python Virtual Environment

Anaconda is recommended to create this virtual environment.

conda env create -f environment.yaml
conda activate text2motion_pub

If you cannot successfully create the environment, here is a list of required libraries:

Python = 3.7.9   # Other versions may also work but are not tested.
PyTorch = 1.6.0 (conda install pytorch==1.6.0 torchvision==0.7.0 -c pytorch)  # Other versions may also work but are not tested.
scipy
numpy
tensorflow       # For use of tensorboard only
spacy
tqdm
ffmpeg = 4.3.1   # Other versions may also work but are not tested.
matplotlib = 3.3.1
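
If the environment was assembled by hand from the list above, a quick check can show which libraries are still missing. The following is a minimal sketch, not part of the repository: the import names (for example `torch` for PyTorch) are assumptions based on each library's usual module name, and ffmpeg is looked up on the PATH because it is a command-line tool rather than a Python module.

```python
import importlib.util
import shutil

# Import names assumed for the libraries listed above (PyTorch is imported as "torch").
REQUIRED_MODULES = ["torch", "scipy", "numpy", "tensorflow", "spacy", "tqdm", "matplotlib"]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("Missing Python modules:", ", ".join(missing) if missing else "none")
    # ffmpeg is an executable, so it is searched for on the PATH instead.
    print("ffmpeg found on PATH:", shutil.which("ffmpeg") is not None)
```

This only reports whether each module can be found; it does not compare installed versions against the ones listed.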

Finally, if you want to generate 3D motions from customized raw texts, you also need to install the language model for spaCy:

python -m spacy download en_core_web_sm

Download Data & Pre-trained Models

If you only want to try our pre-trained models, you don't need to download the datasets.

Datasets

We use two 3D human motion-language datasets: HumanML3D and KIT-ML. Details and download links for both datasets are available [here].
Note that you don't need to clone that git repository, since all related code is already included in this project.

Download and unzip the dataset files, create a dataset folder, and place the data files in it:

mkdir ./dataset/

Taking HumanML3D as an example, the file directory should look like this:

./dataset/
./dataset/HumanML3D/
./dataset/HumanML3D/new_joint_vecs/
./dataset/HumanML3D/texts/
./dataset/HumanML3D/Mean.npy
./dataset/HumanML3D/Std.npy
./dataset/HumanML3D/test.txt
./dataset/HumanML3D/train.txt
./dataset/HumanML3D/train_val.txt
./dataset/HumanML3D/val.txt  
./dataset/HumanML3D/all.txt
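
Before training, it can help to confirm that the layout above is in place. The sketch below is a hypothetical helper, not part of the repository; it assumes the statistics files are named `Mean.npy` and `Std.npy`, and it only checks that each entry exists, not that its contents are valid.

```python
from pathlib import Path

# Entries taken from the directory listing above, relative to the dataset root.
EXPECTED_DIRS = ["new_joint_vecs", "texts"]
EXPECTED_FILES = [
    "Mean.npy", "Std.npy", "test.txt", "train.txt",
    "train_val.txt", "val.txt", "all.txt",
]


def missing_entries(root):
    """Return the expected directories and files that are absent under `root`."""
    root = Path(root)
    missing = [d + "/" for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    for entry in missing_entries("./dataset/HumanML3D"):
        print("missing:", entry)
```

An empty result means every listed directory and file was found under the given root.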

Pre-trained Models

Create a checkpoint folder to hold the pre-trained models:

mkdir ./checkpoints

Download the models for HumanML3D from [here]. Unzip them and place them under the checkpoint directory, which should look like this:

./checkpoints/t2m/
./checkpoints/t2m/Comp_v6_KLD01/           # Text-to-motion generation model
./checkpoints/t2m/Decomp_SP001_SM001_H512/ # Motion autoencoder
./checkpoints/t2m/length_est_bigru/        # Text-to-length sampling model
./checkpoints/t2m/text_mot_match/          # Motion & Text feature extractors for evaluation

Download the models for KIT-ML from [here]. Unzip them and place them under the checkpoint directory.

Training Models

All intermediate meta files, animations, and models are saved to the checkpoint directory, under the folder specified by the "--name" argument.

Training motion autoencoder

HumanML3D

python train_decomp_v3.py --name Decomp_SP001_SM001_H512 --gpu_id 0 --window_size 24 --dataset_name t2m

KIT-ML

python train_decomp_v3.py --name Decomp_SP001_SM001_H512 --gpu_id 0 --window_size 24 --dataset_name kit

Training text2length model:

HumanML3D

python train_length_est.py --name length_est_bigru --gpu_id 0 --dataset_name t2m

KIT-ML

python train_length_est.py --name length_est_bigru --gpu_id 0 --dataset_name kit

Training text2motion model:

HumanML3D

python train_comp_v6.py --name Comp_v6_KLD01 --gpu_id 0 --lambda_kld 0.01 --dataset_name t2m

KIT-ML

python train_comp_v6.py --name Comp_v6_KLD005 --gpu_id 0 --lambda_kld 0.005 --dataset_name kit

Training motion & text feature extractors:

HumanML3D

python train_tex_mot_match.py --name text_mot_match --gpu_id 1 --batch_size 8 --dataset_name t2m

KIT-ML

python train_tex_mot_match.py --name text_mot_match --gpu_id 1 --batch_size 8 --dataset_name kit
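
The four training commands above can be collected into one small driver. This is a sketch under stated assumptions rather than a script shipped with the repository: `build_commands` and `run_all` are hypothetical names, a single `gpu_id` is used for every stage (the examples above use GPU 1 for the feature extractors), and the stages run in the order this README lists them, which may not be the only valid order.

```python
import subprocess
import sys


def build_commands(dataset_name="t2m", gpu_id=0):
    """Return the four training commands from this README as argument lists."""
    # The README uses a different KL weight and run name per dataset.
    if dataset_name == "t2m":
        comp_name, lambda_kld = "Comp_v6_KLD01", "0.01"
    else:
        comp_name, lambda_kld = "Comp_v6_KLD005", "0.005"
    common = ["--gpu_id", str(gpu_id)]
    data = ["--dataset_name", dataset_name]
    return [
        [sys.executable, "train_decomp_v3.py", "--name", "Decomp_SP001_SM001_H512",
         *common, "--window_size", "24", *data],
        [sys.executable, "train_length_est.py", "--name", "length_est_bigru",
         *common, *data],
        [sys.executable, "train_comp_v6.py", "--name", comp_name,
         *common, "--lambda_kld", lambda_kld, *data],
        [sys.executable, "train_tex_mot_match.py", "--name", "text_mot_match",
         *common, "--batch_size", "8", *data],
    ]


def run_all(commands, dry_run=True):
    """Print each command; also execute it when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing stage


if __name__ == "__main__":
    run_all(build_commands("t2m", gpu_id=0), dry_run=True)
```

With `dry_run=True` the driver only prints the commands, which makes it easy to review them before starting a long training run.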

Generating and Animating 3D Motions (HumanML3D)

Sampling results from test sets

python eval_comp_v6.py --name Comp_v6_KLD01 --est_length --repeat_time 3 --num_results 10 --ext default --gpu_id 1

where --est_length asks the model to use sampled motion lengths for generation, and --repeat_time sets how many sampling rounds are carried out for each description. This script produces 3x10 animations under the directory ./eval_results/t2m/Comp_v6_KLD01/default/.

Sampling results from customized descriptions

python gen_motion_script.py --name Comp_v6_KLD01 --text_file input.txt --repeat_time 3 --ext customized --gpu_id 1

This will generate 3 animated motions for each description given in the text file ./input.txt.
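
A description file can also be written programmatically. The sketch below assumes the file holds one description per line, which this README does not state explicitly; check the bundled input.txt for the actual format. `write_descriptions` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def write_descriptions(descriptions, path="input.txt"):
    """Write one description per line (assumed format), skipping blank entries.

    Returns the number of descriptions written.
    """
    lines = [d.strip() for d in descriptions if d.strip()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

The resulting file can then be passed to gen_motion_script.py through --text_file, as in the command above.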

If you have trouble installing ffmpeg, you may not be able to animate 3D results as mp4. Try gif instead.

Quantitative Evaluations

python final_evaluation.py

By default, this evaluates model performance on the HumanML3D dataset. You can also evaluate on the KIT-ML dataset by uncommenting certain lines in ./final_evaluation.py. The statistical results will be saved to ./t2m_evaluation.log.

Misc

Contact Chuan Guo at cguo2@ualberta.ca for any questions or comments.

