Generated digital-human video result
Setup steps
Clone the repository
bash
git clone https://github.com/TMElyralab/MuseTalk.git
Create a conda environment
(Python >= 3.10 and CUDA 11.7 are recommended.)
bash
conda create -n musetalk python=3.10
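The Python version recommendation above can be checked from a script before installing anything. The following is a minimal sketch, not part of MuseTalk: the helper names `major_of`, `minor_of`, and `version_ge` are hypothetical, and only the major and minor components are compared. A numeric comparison is used because a plain string comparison would rank 3.9 above 3.10.

```shell
#!/bin/sh
# Hypothetical helpers (not part of MuseTalk) for comparing version strings.

# Print the major component of "MAJOR.MINOR[.PATCH]".
major_of() { echo "${1%%.*}"; }

# Print the minor component; a version without a dot counts as minor 0.
minor_of() {
    case "$1" in
        *.*) rest=${1#*.}; echo "${rest%%.*}" ;;
        *)   echo 0 ;;
    esac
}

# Return 0 when version $1 is at least version $2 (major.minor only).
version_ge() {
    a_major=$(major_of "$1"); a_minor=$(minor_of "$1")
    b_major=$(major_of "$2"); b_minor=$(minor_of "$2")
    if [ "$a_major" -gt "$b_major" ]; then return 0; fi
    if [ "$a_major" -lt "$b_major" ]; then return 1; fi
    [ "$a_minor" -ge "$b_minor" ]
}

# Example (run inside the activated environment):
#   version_ge "$(python -c 'import platform; print(platform.python_version())')" 3.10 \
#       || echo "Python 3.10 or newer is recommended" >&2
```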
Activate the conda environment
bash
conda activate musetalk
Install the project dependencies
bash
pip install -r requirements.txt
Install the MMLab packages
bash
pip install --no-cache-dir -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"
Note: this step may report an error; if it does, skip it and continue.
Download ffmpeg-4.1.4-amd64-static
https://ffmpeg.org/download.html#releases
Extract the ffmpeg-4.1.4-amd64-static archive
bash
tar -xvf ffmpeg-4.1.4-amd64-static.gz
Set the environment variable
bash
vi ~/.bashrc
Add the following as the last line of the file:
bash
export FFMPEG_PATH=/root/workspace/MuseTalk/musetalk/ffmpeg-4.1.4-amd64-static
Apply the configuration
bash
source ~/.bashrc
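As an alternative to editing `~/.bashrc` by hand, the export line can be appended from a script. This is a minimal sketch; `append_once` is a hypothetical helper name, and it assumes the target file, if it already exists, ends with a newline. It adds the line only when that exact line is not already present, so running it again does not create duplicates.

```shell
#!/bin/sh
# Hypothetical helper: append line $1 to file $2 unless the exact line exists.
append_once() {
    line=$1
    file=$2
    # Create the file if it does not exist yet.
    touch "$file"
    # -x matches whole lines, -F treats the line as a fixed string, -q is silent.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example, using the path from this guide:
#   append_once 'export FFMPEG_PATH=/root/workspace/MuseTalk/musetalk/ffmpeg-4.1.4-amd64-static' ~/.bashrc
```

After appending, `source ~/.bashrc` is still needed for the current shell to pick up the variable.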
Verify that ffmpeg is installed and configured correctly:
bash
$FFMPEG_PATH/ffmpeg -version
bash
(musetalk) [root@iZ0jl0y9289xkrzfhm4p2wZ MuseTalk]# $FFMPEG_PATH/ffmpeg -version
ffmpeg version 4.1.4-static https://johnvansickle.com/ffmpeg/ Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 6.3.0 (Debian 6.3.0-18+deb9u1) 20170516
configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc-6 --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzvbi --enable-libzimg
libavutil 56. 22.100 / 56. 22.100
libavcodec 58. 35.100 / 58. 35.100
libavformat 58. 20.100 / 58. 20.100
libavdevice 58. 5.100 / 58. 5.100
libavfilter 7. 40.101 / 7. 40.101
libswscale 5. 3.100 / 5. 3.100
libswresample 3. 3.100 / 3. 3.100
libpostproc 55. 3.100 / 55. 3.100
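The same check can be scripted so that a missing or misconfigured `FFMPEG_PATH` produces a clear message instead of a "command not found" error. A minimal sketch, assuming the directory layout used in this guide; `check_ffmpeg_dir` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: return 0 when directory $1 contains an executable
# file named ffmpeg; otherwise print a diagnostic and return 1.
check_ffmpeg_dir() {
    dir=$1
    if [ -z "$dir" ]; then
        echo "FFMPEG_PATH is not set" >&2
        return 1
    fi
    if [ ! -f "$dir/ffmpeg" ] || [ ! -x "$dir/ffmpeg" ]; then
        echo "no executable ffmpeg found in $dir" >&2
        return 1
    fi
    return 0
}

# Example:
#   check_ffmpeg_dir "$FFMPEG_PATH" && "$FFMPEG_PATH/ffmpeg" -version
```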
Check that the ffmpeg Python packages are installed
bash
pip list | grep ffmpeg
bash
(musetalk) [root@iZ0jl0y9289xkrzfhm4p2wZ MuseTalk]# pip list | grep ffmpeg
ffmpeg-python 0.2.0
imageio-ffmpeg 0.5.1
(musetalk) [root@iZ0jl0y9289xkrzfhm4p2wZ MuseTalk]#
Install the correct library, ffmpeg-python:
bash
pip install ffmpeg-python
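Download the required models
Link: https://pan.baidu.com/s/1NxELa1cvtu3aDh3d9sB1Jw?pwd=yptf
Extraction code: yptf
After downloading, place each model in its corresponding subdirectory under models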
bash
(musetalk) [root@iZ0jl0y9289xkrzfhm4p2wZ models]# ls -al
total 3905288
drwxr-xr-x 7 root root 4096 Jun 18 21:46 .
drwxr-xr-x 10 root root 4096 Jun 19 14:43 ..
drwxrwxr-x 2 root root 4096 Jun 18 21:19 dwpose
drwxrwxr-x 2 root root 4096 Jun 18 21:20 face-parse-bisent
drwxrwxr-x 2 root root 4096 Jun 18 21:18 musetalk
drwxrwxr-x 2 root root 4096 Jun 18 21:20 sd-vae-ft-mse
drwxrwxr-x 2 root root 4096 Jun 18 21:20 whisper
The models directory structure is:
./models/
├── musetalk
│   ├── musetalk.json
│   └── pytorch_model.bin
├── dwpose
│   └── dw-ll_ucoco_384.pth
├── face-parse-bisent
│   ├── 79999_iter.pth
│   └── resnet18-5c106cde.pth
├── sd-vae-ft-mse
│   ├── config.json
│   └── diffusion_pytorch_model.bin
└── whisper
    └── tiny.pt
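Because the models are copied in by hand, it is easy to miss a file. The following sketch checks that every file named in the tree above exists. The file list is copied from that tree; `MODEL_FILES` and `check_models` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Relative paths copied from the models directory tree in this guide.
MODEL_FILES="musetalk/musetalk.json
musetalk/pytorch_model.bin
dwpose/dw-ll_ucoco_384.pth
face-parse-bisent/79999_iter.pth
face-parse-bisent/resnet18-5c106cde.pth
sd-vae-ft-mse/config.json
sd-vae-ft-mse/diffusion_pytorch_model.bin
whisper/tiny.pt"

# Hypothetical helper: report every expected file missing under directory $1.
# Returns 0 only when all files are present.
check_models() {
    root=$1
    missing=0
    for rel in $MODEL_FILES; do
        if [ ! -f "$root/$rel" ]; then
            echo "missing: $root/$rel" >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example, run from the MuseTalk directory:
#   check_models ./models && echo "all model files found"
```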
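Prepare a video file without sound and an audio file, and place them in these directories:
MuseTalk/data/video
MuseTalk/data/audio
Edit the configuration file (configs/inference/test.yaml, the file passed to the inference command)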
yaml
task_0:
 video_path: "data/video/baichuanxu.mp4"
 audio_path: "data/audio/baichunxu.wav"
 bbox_shift: -7
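When several video and audio pairs need to be processed, the task entry can be generated rather than typed. This is a minimal sketch that prints a single task with the same keys as the example above; `make_task_config` is a hypothetical helper name, and the one-space indentation under `task_0:` is an assumption (YAML only requires that the three keys be nested consistently).

```shell
#!/bin/sh
# Hypothetical helper: print a single-task inference config.
# Arguments: $1 = video path, $2 = audio path, $3 = bbox_shift (integer).
make_task_config() {
    printf 'task_0:\n'
    printf ' video_path: "%s"\n' "$1"
    printf ' audio_path: "%s"\n' "$2"
    printf ' bbox_shift: %s\n' "$3"
}

# Example, writing the file used by the inference command in this guide:
#   make_task_config data/video/baichuanxu.mp4 data/audio/baichunxu.wav -7 \
#       > configs/inference/test.yaml
```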
Run inference
bash
python -m scripts.inference --inference_config configs/inference/test.yaml