这个是我使用wandb的offline后生成的文件,然后我把它从没网的服务器A移动到有网的服务器B。
bash
tree wandb [ main ✗
wandb
├── debug-cli.yfyuan.log
├── offline-run-20260409_195319-3zyc8d9a
│ ├── files
│ │ ├── config.yaml
│ │ ├── output.log
│ │ ├── requirements.txt
│ │ ├── wandb-metadata.json
│ │ └── wandb-summary.json
│ ├── logs
│ │ ├── debug-internal.log
│ │ └── debug.log
│ ├── run-3zyc8d9a.wandb
│ └── tmp
└── offline-run-20260409_212004-84yw18mk
├── files
│ ├── config.yaml
│ ├── output.log
│ ├── requirements.txt
│ ├── wandb-metadata.json
│ └── wandb-summary.json
├── logs
│ ├── debug-internal.log
│ └── debug.log
├── run-84yw18mk.wandb
└── tmp
9 directories, 17 files
首先确定服务器B是否能访问wandb:
bash
curl -I https://wandb.ai [ main ✗ ]
HTTP/1.1 200 Connection established
HTTP/2 301
x-powered-by: Express
set-cookie: host_session_id=8ea24005-7856-40ae-9cfc-53ca4e354d59; Max-Age=31536000; Domain=wandb.ai; Path=/; Expires=Sat, 10 Apr 2027 04:14:01 GMT; Secure; SameSite=None
x-frame-options: SAMEORIGIN
content-security-policy: frame-ancestors 'self';
location: /site
vary: Accept
content-type: text/plain; charset=utf-8
x-cloud-trace-context: 0f08552102b530fc8625ad9daba6a1ec
content-length: 39
date: Fri, 10 Apr 2026 04:14:01 GMT
pragma: no-cache
expires: Fri, 01 Jan 1990 00:00:00 GMT
cache-control: no-cache, must-revalidate
server: Google Frontend
via: 1.1 google
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
这样就是可以的。
当我想同步到wandb云端的时候,一直卡在这:
bash
wandb sync --sync-all [ main ✗ ]
Find logs at: /mnt/nvme1/yfyuan/wangxiao/TTRL/verl/wandb/debug-cli.yfyuan.log
Syncing: https://wandb.ai/2561681244-jiangsu-university/TTRL-GBPO/runs/3zyc8d9a ...
然后后来觉得问题可能是因为服务器A和B的问题,然后直接sync .wandb文件就行了。
bash
/mnt/nvme1/yfyuan/wangxiao/TTRL/verl
> wandb sync /mnt/nvme1/yfyuan/wangxiao/TTRL/verl/wandb/offline-run-20260409_212004-84yw18mk/run-84yw18mk.wandb
Find logs at: /mnt/nvme1/yfyuan/wangxiao/TTRL/verl/wandb/debug-cli.yfyuan.log
Syncing: https://....../runs/84yw18mk ... done.