Reproducing Depth Anything V2
I. Environment setup
Running on my local Windows machine still ran out of GPU memory, so I moved to a server: Ubuntu 22.04, CUDA 11.7
bash
conda create -n DAv2 python=3.10
conda activate DAv2
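Install CUDA inside conda. I am not allowed to install CUDA system-wide on the server, so I can only install it in the conda environment. I installed CUDA 11.7.
Follow the steps below: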
bash
wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/linux-64/cudatoolkit-11.7.1-h4bc3d14_13.conda
conda install --use-local cudatoolkit-11.7.1-h4bc3d14_13.conda
wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/linux-64/cudnn-8.9.7.29-hcdd5f01_2.conda
conda install --use-local cudnn-8.9.7.29-hcdd5f01_2.conda
Install the other dependencies
Remember to add tensorboard and h5py to requirements.txt
bash
pip install torch==2.0.1+cu117 torchvision==0.15.2+cu117 torchaudio==2.0.2 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
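The requirements.txt edit mentioned above can also be scripted and run before pip install -r. This is a minimal sketch; ensure_requirements is a hypothetical helper of my own, not part of the repository, and it only compares bare package names while ignoring version pins.

```python
import re
from pathlib import Path


def ensure_requirements(path, packages):
    """Append each package to the requirements file unless a line already names it.

    Returns the list of packages that were actually appended.
    """
    req = Path(path)
    lines = req.read_text().splitlines() if req.exists() else []
    present = set()
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        # Keep only the bare name, dropping pins such as "h5py==3.10.0".
        name = re.split(r"[<>=!~;\[\s]", stripped, maxsplit=1)[0]
        present.add(name.lower())
    added = [p for p in packages if p.lower() not in present]
    if added:
        req.write_text("\n".join(lines + added) + "\n")
    return added
```

Calling ensure_requirements("requirements.txt", ["tensorboard", "h5py"]) from the repository root would append whichever of the two is missing and leave the file untouched otherwise.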
Check that torch is installed correctly and which CUDA version it was built with
bash
python
import torch
torch.cuda.is_available()
torch.version.cuda
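The same check can be kept as a small script so it is easy to rerun after changing the environment. This is a sketch only; cuda_report is a hypothetical name, and it uses just torch.__version__, torch.cuda.is_available(), torch.version.cuda and torch.cuda.get_device_name.

```python
def cuda_report():
    """Collect the values checked above into one dict; also works if torch is missing."""
    report = {"torch": None, "cuda_available": False, "cuda_build": None, "device": None}
    try:
        import torch
    except ImportError:
        return report  # torch is not installed in this environment
    report["torch"] = torch.__version__
    report["cuda_available"] = torch.cuda.is_available()
    report["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
    return report


print(cuda_report())
```

If cuda_available comes back False while cuda_build is set, the wheel itself was built with CUDA and the problem is more likely on the driver or environment side.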
II. Preparing data
1. Weight files
Put the pre-trained models in the DepthAnythingV2/checkpoints folder
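Before running anything it can help to confirm the weights are where the scripts expect them. The sketch below assumes the files follow the depth_anything_v2_<encoder>.pth naming used later in this post; missing_checkpoints is a hypothetical helper, not something the repository provides.

```python
from pathlib import Path

# Encoder names accepted by the --encoder option.
ENCODERS = ("vits", "vitb", "vitl", "vitg")


def missing_checkpoints(ckpt_dir="checkpoints", encoders=ENCODERS):
    """Return the encoders whose depth_anything_v2_<encoder>.pth file is absent."""
    root = Path(ckpt_dir)
    return [e for e in encoders if not (root / f"depth_anything_v2_{e}.pth").is_file()]


# Only vitl is checked here; an empty list means the file was found.
print(missing_checkpoints(encoders=("vitl",)))
```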
2. Training data
Only needed for training. I had already prepared vkitti, so I will do a trial run with the vkitti data first.
III. Test
Running script on images:
bash
python run.py \
--encoder <vits | vitb | vitl | vitg> \
--img-path <path> --outdir <outdir> \
[--input-size <size>] [--pred-only] [--grayscale]
Options:
- --img-path: You can either 1) point it to an image directory storing all images of interest, 2) point it to a single image, or 3) point it to a text file storing all image paths.
- --input-size (optional): By default, we use input size 518 for model inference. You can increase the size for even more fine-grained results.
- --pred-only (optional): Only save the predicted depth map, without raw image.
- --grayscale (optional): Save the grayscale depth map, without applying color palette.
For example:
bash
python run.py --encoder vitl --img-path assets/examples --outdir depth_vis
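The three forms that --img-path accepts can be pictured with a small helper. This is only a sketch of the behaviour described in the options, not the code in run.py; resolve_image_paths is a hypothetical name, and treating every file ending in .txt as a path list is my assumption.

```python
from pathlib import Path


def resolve_image_paths(img_path):
    """Expand an --img-path style argument into a list of image paths."""
    p = Path(img_path)
    if p.is_dir():
        # 1) a directory: take every file below it, in sorted order
        return sorted(str(f) for f in p.rglob("*") if f.is_file())
    if p.suffix.lower() == ".txt":
        # 3) a text file: one image path per non-empty line
        return [line.strip() for line in p.read_text().splitlines() if line.strip()]
    # 2) anything else is treated as a single image
    return [str(p)]
```

With the example command above, resolve_image_paths("assets/examples") would list every file under that directory.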
Running script on videos
bash
python run_video.py \
--encoder <vits | vitb | vitl | vitg> \
--video-path assets/examples_video --outdir video_depth_vis \
[--input-size <size>] [--pred-only] [--grayscale]
Our larger model has better temporal consistency on videos.
IV. Train
Adjust the data paths in DepthAnythingV2/metric_depth/dataset/splits and in train.py to match your own data
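Editing the split files by hand gets tedious when only the dataset root differs. The sketch below assumes each line of a split file holds whitespace-separated paths and that none of the paths contain spaces; check your own files first. rewrite_split_prefix is a hypothetical helper, and it overwrites the file in place, so keep a backup.

```python
from pathlib import Path


def rewrite_split_prefix(split_file, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix at the start of every path in a split file.

    Returns the number of paths that were changed.
    """
    path = Path(split_file)
    out_lines = []
    changed = 0
    for line in path.read_text().splitlines():
        tokens = line.split()
        new_tokens = [
            new_prefix + t[len(old_prefix):] if t.startswith(old_prefix) else t
            for t in tokens
        ]
        changed += sum(a != b for a, b in zip(tokens, new_tokens))
        out_lines.append(" ".join(new_tokens))
    path.write_text("".join(line + "\n" for line in out_lines))
    return changed
```

The return value is a quick sanity check: if it is 0, the old prefix did not match anything in the file.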
bash
sh dist_train.sh
However, I could not get this sh file to run, so I configured .vscode/launch.json directly instead. I also changed my training code to be non-distributed.
json
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python Debugger: train.py",
"type": "debugpy",
"request": "launch",
"program": "${workspaceFolder}/metric_depth/train.py",
"console": "integratedTerminal",
"args": [
"--epoch", "120",
"--encoder", "vitl",
"--bs", "2",
"--lr", "0.000005",
"--save-path", "./exp/vkitti",
"--dataset", "vkitti",
"--img-size", "518",
"--min-depth", "0.001",
"--max-depth", "20",
"--pretrained-from", "./checkpoints/depth_anything_v2_vitl.pth",
],
"env": {
"MASTER_ADDR": "localhost",
"MASTER_PORT": "20596"
}
},
{
"name": "Python Debugger: run.py",
"type": "debugpy",
"request": "launch",
"program": "${workspaceFolder}/run.py",
"console": "integratedTerminal",
"args": [
"--encoder", "vitl",
"--img-path", "assets/examples",
"--outdir", "output/depth_anything_v2_vitl_test",
"--checkpoints","checkpoints/depth_anything_v2_vitl_test.pth"
],
}
]
}
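For running the same training configuration outside VS Code, the arguments can be assembled into a plain command line. This is a sketch: the flags and values are copied from the train.py configuration above, build_train_command is a hypothetical helper, and whether train.py accepts exactly these flags depends on my modified copy of the script.

```python
import shlex
import sys

# Flag/value pairs copied from the "args" list of the train.py configuration.
TRAIN_ARGS = {
    "--epoch": "120",
    "--encoder": "vitl",
    "--bs": "2",
    "--lr": "0.000005",
    "--save-path": "./exp/vkitti",
    "--dataset": "vkitti",
    "--img-size": "518",
    "--min-depth": "0.001",
    "--max-depth": "20",
    "--pretrained-from": "./checkpoints/depth_anything_v2_vitl.pth",
}


def build_train_command(args=TRAIN_ARGS, script="metric_depth/train.py"):
    """Return the argv list equivalent to the launch configuration."""
    cmd = [sys.executable, script]
    for flag, value in args.items():
        cmd += [flag, value]
    return cmd


# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(build_train_command()))
```

The list can be passed to subprocess.run from the repository root. If the script still reads MASTER_ADDR and MASTER_PORT, set them in the environment as the "env" block of the configuration does.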