- A model that predicts gestures from speech
- This repository is based on text2gesture
- original paper
See "Download raw data" in "Speech_driven_gesture_generation_with_autoencoder" repository
See "Split dataset" in "Speech_driven_gesture_generation_with_autoencoder" repository
python create_vector.py DATA_DIR
- The dataset is created by splitting both speech and motion into blocks of 64 frames
- Shape
- Speech: (block of frames, 26, 64)
- Motion: (block of frames, 192, 64)
- The mean and standard deviation parameters obtained when standardizing the training data are located in `./norm/`.
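The blocking and standardization described above could be sketched as follows. This is a minimal illustration, not the code in create_vector.py: the helper names `split_into_blocks` and `standardize` are hypothetical, and dropping trailing frames and computing one mean and standard deviation per feature dimension are assumptions.

```python
import numpy as np

FRAMES_PER_BLOCK = 64  # block length stated in this README


def split_into_blocks(features, block_len=FRAMES_PER_BLOCK):
    """Turn (num_frames, feature_dim) into (num_blocks, feature_dim, block_len).

    Assumption: trailing frames that do not fill a whole block are dropped.
    """
    num_frames, feature_dim = features.shape
    num_blocks = num_frames // block_len
    trimmed = features[: num_blocks * block_len]
    blocks = trimmed.reshape(num_blocks, block_len, feature_dim)
    # Move the feature axis before the frame axis.
    return blocks.transpose(0, 2, 1)


def standardize(blocks, eps=1e-8):
    """Assumption: z-score each feature dimension over all blocks and frames."""
    mean = blocks.mean(axis=(0, 2), keepdims=True)
    std = blocks.std(axis=(0, 2), keepdims=True)
    return (blocks - mean) / (std + eps), mean, std


# Random stand-in data: 200 frames of 26-dimensional speech features.
speech = np.random.default_rng(0).normal(size=(200, 26))
speech_blocks = split_into_blocks(speech)
speech_norm, speech_mean, speech_std = standardize(speech_blocks)
print(speech_blocks.shape)  # (3, 26, 64)
```

With 192-dimensional motion features the same helper would yield the (block of frames, 192, 64) layout listed above.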
python train.py [--batch_size] [--epochs] [--lr] [--weight_decay] [--embedding_dimension]
[--outdir_path] [--device] [--gpu_num] [--speech_path] [--pose_path] [--generator]
[--gan] [--discriminator] [--lambda_d]
- See "Usage" in "text2gesture" for details.
python predict.py [--modelpath] [--inputpath] [--outpath]
- `--modelpath` specifies the folder where the generator model is located
  - The model is output by `train.py` and located in `./out/datetime/generator_datetime_weights.pth`
python reshape-predict.py [--denorm] [--denormpath] [--datatype] [--npypath] [--outpath]
- To undo the normalization of the data, set `--denorm` to 1. In this case, `--denormpath` and `--datatype` should also be set (`--datatype` defaults to train).
  - `--denormpath` and `--datatype` specify the directory where the mean and standard deviation parameters obtained when standardizing the training data are located (the same as the `/norm/` output path in chapter 3).
- `--npypath` is set to the folder where the test data is located
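Undoing the normalization amounts to inverting the standardization with the stored mean and standard deviation. The sketch below illustrates that arithmetic only. The function name `denormalize` is hypothetical, the statistics are computed from random stand-in data instead of being loaded from the norm directory, and per-feature statistics are an assumption.

```python
import numpy as np


def denormalize(normalized, mean, std):
    """Invert z-score standardization: x = z * std + mean."""
    return normalized * std + mean


# Stand-in motion data shaped like (block of frames, 192, 64).
rng = np.random.default_rng(0)
motion = rng.normal(loc=2.0, scale=3.0, size=(4, 192, 64))

# Assumption: one mean and standard deviation per feature dimension.
mean = motion.mean(axis=(0, 2), keepdims=True)
std = motion.std(axis=(0, 2), keepdims=True)

normalized = (motion - mean) / std
restored = denormalize(normalized, mean, std)
print(np.allclose(restored, motion))  # True
```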