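ETPNav Reproduction Guide: The Full Workflow from Environment Setup to Vision-and-Language Navigation in Continuous Environments

A complete walkthrough for researchers, with the pitfalls noted along the way

🌟 Project Overview

ETPNav (Evolving Topological Planning) is a strong baseline for vision-and-language navigation in continuous environments (VLN-CE). It was proposed by Dong An et al., and the paper was accepted by IEEE TPAMI in 2024.

Official open-source repository: MarSaKi/ETPNav

The method targets the limitations of earlier approaches in long-range planning and in obstacle-avoiding control. Its core contributions are:

  • Online topological mapping and long-range planning: no prior exploration of the environment is needed. The agent self-organizes the waypoints predicted along its path into a topological map that is built on the fly, decouples navigation into high-level planning and low-level control, and plans long-range paths with a cross-modal Transformer.

  • Obstacle-avoiding control in continuous environments: a robust controller based on a trial-and-error heuristic (Tryout) keeps the agent from getting stuck in a deadlock after collisions.

These are the detailed notes I put together while studying and reproducing the official ETPNav project. I hope they give later researchers a clear, self-contained reference.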


1. Environment Setup

This reproduction uses Python 3.8; managing the environment with Conda is recommended.

1.1 Create the virtual environment and install PyTorch

bash
conda create -n vlnce38 python=3.8
conda activate vlnce38

Install PyTorch 1.9.1 + cu111 (either option works):

Option 1: install directly with pip (using a mirror in China for speed)

bash
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 \
    -f https://download.pytorch.org/whl/torch_stable.html \
    -i https://mirrors.cloud.tencent.com/pypi/simple

Option 2: download the whl files and install them locally (recommended)

bash
# Download torch 1.9.1+cu111 (Python 3.8 / Linux x86_64)
wget https://mirrors.aliyun.com/pytorch-wheels/cu111/torch-1.9.1%2Bcu111-cp38-cp38-linux_x86_64.whl

# Download torchvision 0.10.1+cu111 (matches the torch version above)
wget https://mirrors.aliyun.com/pytorch-wheels/cu111/torchvision-0.10.1%2Bcu111-cp38-cp38-linux_x86_64.whl

# Install both wheels from the current directory
pip install torch-1.9.1+cu111-cp38-cp38-linux_x86_64.whl torchvision-0.10.1+cu111-cp38-cp38-linux_x86_64.whl
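
Once either option has finished, it can be worth confirming that the pinned builds are the ones actually installed. The sketch below is one way to do that with the standard-library importlib.metadata (available from Python 3.8). The helper name check_versions and the EXPECTED mapping are illustrative and not part of ETPNav.

```python
from importlib import metadata

# Versions pinned in this guide (assumption: adjust if you pin differently).
EXPECTED = {"torch": "1.9.1+cu111", "torchvision": "0.10.1+cu111"}


def check_versions(expected, get_version=metadata.version):
    """Return {package: (expected, found)} for every package that does not match.

    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != want:
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    problems = check_versions(EXPECTED)
    if not problems:
        print("torch / torchvision match the pinned versions")
    for name, (want, found) in problems.items():
        print(f"{name}: expected {want}, found {found}")
```

Run it inside the activated vlnce38 environment; an empty result means both packages match the versions this guide pins.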

1.2 Install the project dependencies

bash
pip install "pip<24.1" setuptools==65.5.0 wheel==0.38.4
pip install -r requirements.txt   # requirements.txt is on the network drive (extraction code: 8je8)

1.3 Install the Habitat simulator

Habitat is the underlying simulation platform for VLN-CE tasks. Specific versions of habitat-sim and habitat-lab are required.

habitat-sim v0.1.7 (headless): https://anaconda.org/aihabitat/habitat-sim/0.1.7/download/linux-64/habitat-sim-0.1.7-py3.8_headless_linux_856d4b08c1a2632626bf0d205bf46471a99502b7.tar.bz2

habitat-lab v0.1.7: https://github.com/facebookresearch/habitat-lab/releases/tag/v0.1.7

bash
# Install the habitat-sim v0.1.7 headless build (download the package from the link above first)
conda install habitat-sim-0.1.7-py3.8_headless_linux_856d4b08c1a2632626bf0d205bf46471a99502b7.tar.bz2

# Download habitat-lab v0.1.7, then install it from the source folder
cd habitat-lab-0.1.7
pip install -e .

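1.4 Fixing system dependencies on cloud servers (optional)

On stripped-down Linux systems (containers, cloud hosts), running Habitat often fails because the OpenGL/EGL libraries are missing or because the C++ ABI version is too old. The following commands usually resolve both:

bash
# Install the graphics rendering dependencies
apt-get update
apt-get install -y libopengl0 libgl1-mesa-glx libglib2.0-0 libegl1

# Upgrade the C++ standard library
apt-get install -y software-properties-common
add-apt-repository ppa:ubuntu-toolchain-r/test -y
apt-get update
apt-get install --only-upgrade libstdc++6 -y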
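
To see whether the libstdc++ upgrade is needed at all, or whether it took effect, one can look for the required ABI tag inside the shared library, which is what `strings libstdc++.so.6 | grep CXXABI` does on the command line. A minimal Python sketch follows. The library path is an assumption that holds on typical x86_64 Ubuntu images and may differ elsewhere, has_abi_tag is a hypothetical helper, and the tag CXXABI_1.3.13 is simply the one from the error listed in section 5.

```python
from pathlib import Path

# Assumed location on x86_64 Ubuntu. Conda environments often ship their own
# copy under <env>/lib/libstdc++.so.6, which may be the one actually loaded.
DEFAULT_LIB = Path("/usr/lib/x86_64-linux-gnu/libstdc++.so.6")


def has_abi_tag(lib_path, tag="CXXABI_1.3.13"):
    """Return True if the library file contains `tag` as a NUL-terminated string.

    Version tags are stored as plain ASCII in the ELF string table, so a byte
    search is enough for a quick check.
    """
    data = Path(lib_path).read_bytes()
    return tag.encode("ascii") + b"\x00" in data


if __name__ == "__main__":
    if DEFAULT_LIB.exists():
        print(DEFAULT_LIB, "provides CXXABI_1.3.13:", has_abi_tag(DEFAULT_LIB))
    else:
        print("libstdc++ not found at", DEFAULT_LIB)
```

Matching on the trailing NUL byte avoids a false positive from a longer tag that merely starts with the same characters.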

1.5 Download the ETPNav source code

https://github.com/MarSaKi/ETPNav

bash
git clone https://gh-proxy.org/https://github.com/MarSaKi/ETPNav.git

2. Dataset Download

2.1 Scene data: Matterport3D (MP3D)

90 scenes in total, about 22 GB. Final location:

data/scene_datasets/mp3d/{scene}/{scene}.glb

bash
python download_mp.py --task habitat -o data/scene_datasets/mp3d/
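
After the download, a quick count of the scenes that follow the data/scene_datasets/mp3d/{scene}/{scene}.glb layout helps catch incomplete or misplaced files before Habitat fails at load time. The helper below is a sketch and not part of the ETPNav codebase; find_scenes is a hypothetical name.

```python
from pathlib import Path


def find_scenes(mp3d_root):
    """Split scene folders into (complete, missing) by whether {scene}/{scene}.glb exists."""
    complete, missing = [], []
    for scene_dir in sorted(Path(mp3d_root).iterdir()):
        if not scene_dir.is_dir():
            continue  # ignore stray files next to the scene folders
        glb = scene_dir / f"{scene_dir.name}.glb"
        (complete if glb.is_file() else missing).append(scene_dir.name)
    return complete, missing


if __name__ == "__main__":
    root = Path("data/scene_datasets/mp3d")
    if root.is_dir():
        ok, bad = find_scenes(root)
        print(f"{len(ok)} scenes with a .glb file (this guide expects 90)")
        if bad:
            print("folders without a matching .glb:", ", ".join(bad))
    else:
        print(f"{root} does not exist yet")
```

Run it from the ETPNav root; any folder reported as missing its .glb was probably unpacked one level too deep or not downloaded completely.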

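2.2 Task data: R2R and RxR

Place the downloaded files under the data/datasets/ directory.

| Dataset | Download link | Location |
| --- | --- | --- |
| R2R_VLNCE_v1-2_preprocessed | drive.google.com | data/datasets |
| R2R_VLNCE_v1-2_preprocessed_BERTidx | Baidu Netdisk (extraction code: 88yy) | data/datasets |
| RxR | Baidu Netdisk (extraction code: g317) | data/datasets |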

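2.3 Connectivity graph file

The connectivity graph is used to visualize navigation paths. In the directory layout of section 3.4 it is stored as data/connectivity_graphs.pkl.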


3. Model Weights and Pretraining Data

3.1 Encoder and component weights

| Component | Download | Location |
| --- | --- | --- |
| Waypoint Predictor (R2R-CE) | drive.google.com | data/wp_pred/check_cwp_bestdist_hfov90 |
| Waypoint Predictor (RxR-CE) | drive.google.com | data/wp_pred/check_cwp_bestdist_hfov63 |
| BERT weights | huggingface.co | bert_config/bert-base-uncased |
| RGB encoder (ViT-B32) | huggingface.co | .cache/clip/ViT-B-32.pt |
| Depth encoder (ResNet50) | https://dl.fbaipublicfiles.com/habitat/data/baselines/v1/ddppo/ddppo-models/gibson-2plus-resnet50.pth | data/pretrained_models/ddppo-models/gibson-2plus-resnet50.pth |

3.2 Pretraining data

  • R2R pretraining data: https://www.dropbox.com/scl/fo/4iaw2ii2z2iupu0yn4tqh/AP2waOdlwdbJE5sUti2557U/R2R?dl=0&rlkey=88khaszmvhybxleyv0a9bulyn&subfolder_nav_tracking=1 → save to pretrain_src/datasets/R2R

  • Precomputed visual features: https://drive.google.com/file/d/1D3Gd9jqRfF-NjlxDAQG_qwxTIakZlrWd/view → save to pretrain_src/datasets/img_features

  • LXMERT pretrained weights: https://nlp.cs.unc.edu/data/model_LXRT.pth → save to pretrain_src/datasets/pretrained/LXMERT

3.3 Final pretrained weights (skip pretraining and go straight to finetuning)

Download link: https://pan.baidu.com/s/1oTmRkuj6syTmI6kE78k0JQ (Baidu Netdisk, extraction code: vfsh)

Location: pretrained/ETP/model_step_82500.pt

3.4 Full directory structure

The ETPNav root directory should end up containing the following key items:

ETPNav/
├── bert_config/bert-base-uncased/
├── data/
│   ├── datasets/{R2R_VLNCE_*, RxR_VLNCE_*}
│   ├── scene_datasets/mp3d/
│   ├── wp_pred/
│   ├── ddppo-models/gibson-2plus-resnet50.pth
│   └── connectivity_graphs.pkl
├── pretrain_src/
│   ├── datasets/{R2R, pretrained/LXMERT}
│   └── img_features/
├── pretrained/ETP/model_step_82500.pt
├── run_r2r/
├── habitat_extensions/
├── vlnce_baselines/
└── run.py
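
The tree above can be turned into a small pre-flight check that lists whatever is still missing before training starts. This is a sketch: the entries in REQUIRED are copied from the tree, with the dataset folder names left out because they are abbreviated there, and missing_paths is a hypothetical helper. Section 3.1 lists data/pretrained_models/ddppo-models/ for the depth encoder, so keep whichever path your config actually points to.

```python
from pathlib import Path

# Paths copied from the directory tree in this section, relative to the
# ETPNav root. Extend or adjust the list to match your own setup.
REQUIRED = [
    "bert_config/bert-base-uncased",
    "data/datasets",
    "data/scene_datasets/mp3d",
    "data/wp_pred",
    "data/ddppo-models/gibson-2plus-resnet50.pth",
    "data/connectivity_graphs.pkl",
    "pretrain_src/datasets/R2R",
    "pretrain_src/datasets/pretrained/LXMERT",
    "pretrain_src/img_features",
    "pretrained/ETP/model_step_82500.pt",
    "run.py",
]


def missing_paths(root, required=REQUIRED):
    """Return the entries of `required` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing under the current directory:")
        for rel in missing:
            print("  ", rel)
    else:
        print("All listed paths are present.")
```

Run it from the ETPNav root. It only checks that each path exists, not that the files are complete or uncorrupted.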

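4. Running the Code

4.1 Pretraining (can be skipped)

If you have already downloaded the official pretrained weights (model_step_82500.pt), you can go straight to finetuning.

  • Point the BERT weights to a local path (so that no connection to external networks is needed)

  • Adjust the number of GPUs to your hardware (e.g. CUDA_VISIBLE_DEVICES=0)

  • Start pretraining:

    bash
    CUDA_VISIBLE_DEVICES=0 bash pretrain_src/run_pt/run_r2r.bash 233

    The training log is written to pretrained/r2r_ce/mlm.sap_habitat_depth/logs/log.txt; choose the best checkpoint according to the evaluation metrics.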

4.2 Finetuning

On a single RTX 4090, finetuning takes about 1.5 days.

  1. Set the pretrained weight path in the script to pretrained/ETP/model_step_82500.pt

  2. Adjust the number of GPUs as needed

  3. Start training:

    bash
    CUDA_VISIBLE_DEVICES=0 bash run_r2r/main.bash train 2333

Optional: silence TensorFlow warnings

bash
export TF_ENABLE_ONEDNN_OPTS=0
export TF_CPP_MIN_LOG_LEVEL=2

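4.3 Testing and Evaluation

Evaluation mode

bash
CUDA_VISIBLE_DEVICES=0 bash run_r2r/main.bash eval 2333

After the run, the metrics are printed: TL (trajectory length), NE (navigation error), SR (success rate), SPL (success weighted by path length), and others.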
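
As a reminder of how the two headline metrics relate: SR is the fraction of successful episodes, and SPL weights each success by the ratio of the shortest-path length to the length actually travelled, SPL = (1/N) Σ S_i · l_i / max(p_i, l_i). The sketch below computes both from per-episode numbers. It is not the evaluation code used by ETPNav; the 3 m success threshold is the usual R2R convention and is stated here as an assumption, and the episode values are made up purely for illustration.

```python
def success_rate_and_spl(episodes, success_threshold=3.0):
    """Compute (SR, SPL) from a list of per-episode dicts.

    Each dict holds:
      ne        - navigation error: final distance to the goal, in metres
      path_len  - length of the path the agent actually travelled (p_i)
      shortest  - shortest-path length from start to goal (l_i)
    An episode counts as a success when ne < success_threshold (assumed 3 m).
    """
    if not episodes:
        raise ValueError("need at least one episode")
    successes, spl_sum = 0, 0.0
    for ep in episodes:
        if ep["ne"] < success_threshold:
            successes += 1
            spl_sum += ep["shortest"] / max(ep["path_len"], ep["shortest"])
    n = len(episodes)
    return successes / n, spl_sum / n


# Made-up episodes, for illustration only.
demo = [
    {"ne": 1.0, "path_len": 10.0, "shortest": 10.0},  # success, optimal path
    {"ne": 2.0, "path_len": 20.0, "shortest": 10.0},  # success, path twice as long
    {"ne": 5.0, "path_len": 8.0, "shortest": 10.0},   # failure
    {"ne": 4.0, "path_len": 12.0, "shortest": 10.0},  # failure
]
sr, spl = success_rate_and_spl(demo)
print(f"SR = {sr:.3f}, SPL = {spl:.3f}")  # SR = 0.500, SPL = 0.375
```

The second episode shows why SPL is never higher than SR: it succeeds, but contributes only 0.5 because the agent travelled twice the shortest distance.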

Visualization (optional):

Set the video save mode in the config to disk; after evaluation, the navigation videos are saved under data/logs/video/release_r2r/*

Inference mode

bash
CUDA_VISIBLE_DEVICES=0 bash run_r2r/main.bash inference 2333

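5. Common Problems and Pitfalls

| Symptom | Fix |
| --- | --- |
| ImportError: libEGL.so.1 | Install the system dependencies from section 1.4 |
| CXXABI_1.3.13 not found | Upgrade libstdc++6 (also in section 1.4) |
| PyTorch version conflict | Use 1.9.1+cu111; newer versions may be incompatible with habitat |
| Wrong dataset paths | Place the files exactly at the locations given in sections 2.1 to 2.3 |
| Slow download of pretrained weights | Use Baidu Netdisk or a proxy, or use the finetuned weights provided by the authors directly |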

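6. References and Acknowledgements

This tutorial draws mainly on the following open-source projects; thanks to the original authors for their contributions:


I hope this guide helps you reproduce ETPNav and take your research on vision-and-language navigation in continuous environments a step further. Questions and discussion are welcome.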
