Recommended Lightweight Text-to-Video Model

WeChat Official Account: AI创造财富, 2025-06-18 15:17

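Model name: damo/text-to-video-synthesis
Input: a one-sentence text description (e.g. "a panda is dancing")
Output: a 2-second video (16 frames, 576x320 resolution)
Recommended GPU: 8 GB to 16 GB of VRAM (CPU fallback supported)
Generation time: about 40 to 120 seconds (meets your requirement)
Advantages:
- True text-to-video generation (not frame interpolation)
- Friendly interfaces through HuggingFace and the ModelScope CLI
Open-source links:
- GitHub: https://github.com/modelscope/modelscope
- Online demo: https://modelscope.cn/models/damo/text-to-video-synthesis/summary
Usage (simplified):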
pip install modelscope
python -m modelscope.cli inference \
    --model damo/text-to-video-synthesis \
    --text "A dog running in the park"
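
The same generation step can also be driven from Python. The following is a minimal sketch, not a definitive implementation. It assumes that the ModelScope library's `pipeline` function accepts the task name "text-to-video-synthesis" and that the result is keyed by `OutputKeys.OUTPUT_VIDEO`, as in the usage shown on the model's ModelScope page. `build_request` and `generate_video` are hypothetical helper names introduced here for illustration. The model may also need extra packages (for example open_clip_torch and pytorch-lightning) and a GPU, and the returned value may be a file path or raw video bytes depending on the installed modelscope version.

```python
from typing import Dict


def build_request(prompt: str) -> Dict[str, str]:
    """Wrap a text prompt in the dict form passed to the pipeline.

    Hypothetical helper: it only trims whitespace and rejects empty prompts.
    """
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    return {"text": prompt}


def generate_video(prompt: str,
                   model_id: str = "damo/text-to-video-synthesis"):
    """Run the text-to-video pipeline and return its video output.

    Assumption: the output is stored under OutputKeys.OUTPUT_VIDEO; its
    type (path or bytes) may differ between modelscope versions.
    """
    # Imported lazily so build_request works without modelscope installed.
    from modelscope.outputs import OutputKeys
    from modelscope.pipelines import pipeline

    # The first call typically downloads the model weights to a local cache.
    pipe = pipeline("text-to-video-synthesis", model_id)
    result = pipe(build_request(prompt))
    return result[OutputKeys.OUTPUT_VIDEO]
```

Calling `generate_video("A dog running in the park")` would then run the model with the same prompt as the command above.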