Troubleshooting Harbor Startup Failures: From “Cannot allocate memory” to “Operation not permitted”

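Background

Harbor is a popular enterprise-grade container image registry, usually deployed with Docker Compose. While deploying Harbor v2.14.2 recently, we ran into a series of puzzling startup failures: several core services (harbor-db, redis, registryctl, harbor-core, and others) kept restarting, and their logs alternated between Cannot allocate memory and Operation not permitted errors. This article records the whole process, from the first symptoms to the final fix, and provides a working configuration.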

Symptoms

After running docker-compose up -d, several containers were stuck in the Restarting state:

Name                     Command                  State                         Ports                   
harbor-core         /harbor/entrypoint.sh            Restarting                                               
harbor-db           /docker-entrypoint.sh 14 15      Restarting                                               
harbor-jobservice   /harbor/entrypoint.sh            Restarting                                               
redis               redis-server /etc/redis.conf     Restarting                                               
registryctl         /home/harbor/start.sh            Restarting                                               
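
To avoid scanning the status table by eye, the restarting containers can be listed with a status filter. The following is a minimal sketch; the function names and the tail_lines parameter are ours, not part of Harbor. It also assumes `docker logs` can read the container output. Harbor routes most services through the syslog driver, so on engines where `docker logs` returns nothing the files under /var/log/harbor/ are the place to look.

```shell
# Print the names of all containers currently in the "restarting" state.
list_restarting() {
  docker ps --all --filter "status=restarting" --format '{{.Names}}'
}

# Print the last N log lines of every restarting container.
# tail_lines is a hypothetical parameter name (default: 20).
dump_restarting_logs() {
  local tail_lines="${1:-20}" name
  list_restarting | while read -r name; do
    [ -n "$name" ] || continue
    echo "=== $name ==="
    docker logs --tail "$tail_lines" "$name" 2>&1 </dev/null
  done
}

# Usage (requires a running Docker daemon):
#   dump_restarting_logs 30
```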

The container logs showed the following typical errors:

1. PostgreSQL database

harbor-db | init DB, DB version:15
harbor-db | popen failure: Cannot allocate memory
harbor-db | initdb: error: program "postgres" is needed by initdb but was not found ...

2. Redis

redis | 1:M 18 Mar 2026 08:49:08.631 # Failed to write PID file: Permission denied
redis | 1:M 18 Mar 2026 08:49:08.631 # Fatal: Can't initialize Background Jobs. Error message: Operation not permitted

3. Go services (core, jobservice, registryctl)

harbor-core | runtime/cgo: pthread_create failed: Operation not permitted
harbor-core | SIGABRT: abort

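Initial investigation

2. Checking and fixing file permissions

The Permission denied errors point at directory permissions. We inspected the data directory /data/harbor and found that some subdirectories had the wrong owner.

Harbor containers run as UID 10000 by default, so the data directory needs to be owned by 10000:10000. It is worth confirming the UID each image actually runs as (for example by running id inside the container) before applying a blanket chown, since the database and redis images may use a different UID than the Go services.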

chown -R 10000:10000 /data/harbor

After running the command, verify the result:

[root@master01 harbor]# ls -alh /data/harbor/
total 32K
drwxrwxrwx 8 10000 10000 4.0K Mar 18 16:27 .
drwxr-xr-x 5 root  root  4.0K Mar 18 16:27 ..
drwxr-xr-x 2 10000 10000 4.0K Mar 18 16:27 ca_download
drwx------ 3 10000 10000 4.0K Mar 18 17:32 database
drwxr-xr-x 2 10000 10000 4.0K Mar 18 16:27 job_logs
drwxr-xr-x 2 10000 10000 4.0K Mar 18 17:52 redis
drwxr-xr-x 2 10000 10000 4.0K Mar 18 16:27 registry
drwxrwxrwx 5 10000 10000 4.0K Mar 18 16:27 secret
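
The listing above only shows the top level of the directory. To check the whole tree, find can print every path whose owner or group differs from the expected IDs. A small sketch, with find_wrong_owner being a hypothetical helper name and 10000:10000 used as the default only because that is the value chosen in this article:

```shell
# Print every path under dir whose owner or group differs from the expected IDs.
# Prints nothing when the whole tree is consistent.
find_wrong_owner() {
  local dir="$1" uid="${2:-10000}" gid="${3:-10000}"
  find "$dir" \( ! -uid "$uid" -o ! -gid "$gid" \) -print
}

# Usage:
#   find_wrong_owner /data/harbor
```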

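After the change, the permission errors disappeared from some service logs, but the containers kept restarting and the pthread_create error persisted. This showed that the permission problem was only a surface symptom and that a deeper restriction was still in place.

Key clue: container capability restrictions

Looking at the official Harbor docker-compose.yml, every service declares cap_drop: - ALL and adds back only a handful of capabilities (such as CHOWN, SETGID, and SETUID). This least-privilege setup is good for security, but in some environments a system call can fail because a required capability is missing.

In particular, we suspected that pthread_create might need capabilities such as SYS_PTRACE or SYS_ADMIN, which the stock configuration does not grant. Docker's default seccomp profile may also be blocking a system call the services need. Thread creation does not normally require extra capabilities; a frequently reported cause of this exact error is an older Docker engine whose seccomp profile rejects the clone3 system call used by newer glibc versions, so the seccomp explanation is probably the more likely of the two.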
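
Before changing anything, it can help to see which restrictions Docker actually applied and how old the engine is. The sketch below reads the relevant HostConfig fields with docker inspect and compares the engine version against 20.10.10, which, as far as we know, is the release whose default seccomp profile first handled clone3. The function names are ours, and the version comparison assumes a sort that supports -V. The host's runc and libseccomp versions also matter, so treat the result as a hint only.

```shell
# Show the capability and seccomp related settings applied to a container.
show_security_settings() {
  docker inspect --format \
    'privileged={{.HostConfig.Privileged}} cap_add={{.HostConfig.CapAdd}} cap_drop={{.HostConfig.CapDrop}} security_opt={{.HostConfig.SecurityOpt}}' \
    "$1"
}

# Succeed when version $1 is strictly lower than version $2.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Report whether the Docker engine predates 20.10.10.
check_engine_version() {
  local v
  v="$(docker version --format '{{.Server.Version}}')" || return 1
  if version_lt "$v" "20.10.10"; then
    echo "Docker $v predates 20.10.10; its default seccomp profile may reject clone3"
  else
    echo "Docker $v is 20.10.10 or newer"
  fi
}

# Usage (requires a running Docker daemon):
#   show_security_settings harbor-core
#   check_engine_version
```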

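Solution: use privileged mode

As a quick check, we added privileged: true to every service and removed the existing cap_drop and cap_add settings. Privileged mode gives a container almost the same privileges as the host, bypassing both the capability restrictions and seccomp filtering.

The key part of the modified docker-compose.yml looks like this: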

services:
  redis:
    image: goharbor/redis-photon:v2.14.2
    container_name: redis
    restart: always
    privileged: true          # add this line
    volumes:
      - /data/harbor/redis:/var/lib/redis
    ...

Apply the updated configuration and start the services:

docker-compose up -d

All services started successfully and Harbor was back to normal.
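
Privileged mode lifts the capability limits and the seccomp filter at the same time, so it does not tell us which of the two was responsible. A narrower experiment, which we did not run, would be to disable only seccomp for selected services through a Compose override file and leave cap_drop and cap_add untouched. The sketch below generates such a file. The function name and the override file name are hypothetical, and it assumes a Compose version that merges multiple -f files and accepts files without a version key, like the one used in this article.

```shell
# Write a Compose override file that disables only the seccomp filter
# for the named services. Existing cap_drop/cap_add settings stay in effect.
write_seccomp_override() {
  local out="$1" svc
  shift
  {
    echo "services:"
    for svc in "$@"; do
      echo "  $svc:"
      echo "    security_opt:"
      echo "      - seccomp:unconfined"
    done
  } > "$out"
}

# Usage (service names as defined in docker-compose.yml):
#   write_seccomp_override docker-compose.seccomp-test.yml redis core
#   docker-compose -f docker-compose.yml -f docker-compose.seccomp-test.yml up -d
```

If the services start with this override alone, the seccomp profile rather than a missing capability is the likely culprit.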

Final solution

1. Fix the data directory ownership

chown -R 10000:10000 /data/harbor

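2. Modify docker-compose.yml

First, stop and clean up the running containers (including anonymous volumes) so that the redeployment does not run into conflicts. In the configuration shown in this article, Harbor's data lives in bind mounts under /data/harbor, which this command does not delete: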

docker-compose down -v

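Next, edit docker-compose.yml: add privileged: true to every service and delete the existing cap_drop and cap_add blocks. This gives the containers the system privileges they need and bypasses the capability and seccomp restrictions.

For convenience, a complete, ready-to-use docker-compose.yml is provided at the end of this article; you can copy it over your existing file.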
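
When editing the file by hand it is easy to miss a service. The following sketch does a rough text-based sanity check: it counts image: lines as a stand-in for the number of services, compares that with the number of privileged: true lines, and makes sure no cap_drop or cap_add block is left. check_compose_edit is a hypothetical helper, and because it relies on grep rather than a YAML parser it assumes the conventional one-key-per-line layout.

```shell
# Succeed when every service has "privileged: true" and no cap_drop/cap_add remains.
check_compose_edit() {
  local file="${1:-docker-compose.yml}" services privileged caps
  services="$(grep -cE '^[[:space:]]+image:' "$file" || true)"
  privileged="$(grep -cE '^[[:space:]]+privileged:[[:space:]]*true' "$file" || true)"
  caps="$(grep -cE '^[[:space:]]+cap_(drop|add):' "$file" || true)"
  echo "services=$services privileged=$privileged cap_blocks=$caps"
  [ "$services" -eq "$privileged" ] && [ "$caps" -eq 0 ]
}

# Usage:
#   check_compose_edit docker-compose.yml && echo "edit looks complete"
```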

3. Redeploy

Once the changes are complete, recreate the Harbor containers from the updated configuration and start all services:

docker-compose up -d

After a short wait, all containers should be up and running.

Complete working docker-compose.yml
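
Instead of checking by hand, the wait can be scripted by polling until no container reports the restarting state. A minimal sketch with hypothetical names and defaults (120 second timeout, 5 second interval). A crash-looping container is briefly "running" between restarts, so a single clean poll is not proof of health; it is worth rerunning the check after a minute or so.

```shell
# Poll until no container is in the "restarting" state, or give up after a timeout.
# Arguments: timeout in seconds (default 120), poll interval in seconds (default 5).
wait_until_stable() {
  local timeout="${1:-120}" interval="${2:-5}" elapsed=0 restarting
  while [ "$elapsed" -le "$timeout" ]; do
    restarting="$(docker ps --filter "status=restarting" --format '{{.Names}}')"
    if [ -z "$restarting" ]; then
      echo "no containers restarting after ${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "still restarting: $(echo "$restarting" | tr '\n' ' ')" >&2
  return 1
}

# Usage (requires a running Docker daemon):
#   docker-compose up -d && wait_until_stable 180 5
```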

services:
  log:
    image: goharbor/harbor-log:v2.14.2
    container_name: harbor-log
    restart: always
    privileged: true
    volumes:
      - /var/log/harbor/:/var/log/docker/:z
      - type: bind
        source: ./common/config/log/logrotate.conf
        target: /etc/logrotate.d/logrotate.conf
      - type: bind
        source: ./common/config/log/rsyslog_docker.conf
        target: /etc/rsyslog.d/rsyslog_docker.conf
    ports:
      - 127.0.0.1:1514:10514
    networks:
      - harbor

  registry:
    image: goharbor/registry-photon:v2.14.2
    container_name: registry
    restart: always
    privileged: true
    volumes:
      - /data/harbor/registry:/storage:z
      - ./common/config/registry/:/etc/registry/:z
      - type: bind
        source: /data/harbor/secret/registry/root.crt
        target: /etc/registry/root.crt
      - type: bind
        source: ./common/config/shared/trust-certificates
        target: /harbor_cust_cert
    networks:
      - harbor
    depends_on:
      - log
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://localhost:1514"
        tag: "registry"

  registryctl:
    image: goharbor/harbor-registryctl:v2.14.2
    container_name: registryctl
    env_file:
      - ./common/config/registryctl/env
    restart: always
    privileged: true
    volumes:
      - /data/harbor/registry:/storage:z
      - ./common/config/registry/:/etc/registry/:z
      - type: bind
        source: ./common/config/registryctl/config.yml
        target: /etc/registryctl/config.yml
      - type: bind
        source: ./common/config/shared/trust-certificates
        target: /harbor_cust_cert
    networks:
      - harbor
    depends_on:
      - log
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://localhost:1514"
        tag: "registryctl"

  postgresql:
    image: goharbor/harbor-db:v2.14.2
    container_name: harbor-db
    restart: always
    privileged: true
    volumes:
      - /data/harbor/database:/var/lib/postgresql/data:z
    networks:
      harbor:
    env_file:
      - ./common/config/db/env
    depends_on:
      - log
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://localhost:1514"
        tag: "postgresql"
    shm_size: '1gb'

  core:
    image: goharbor/harbor-core:v2.14.2
    container_name: harbor-core
    env_file:
      - ./common/config/core/env
    restart: always
    privileged: true
    volumes:
      - /data/harbor/ca_download/:/etc/core/ca/:z
      - /data/harbor/:/data/:z
      - ./common/config/core/certificates/:/etc/core/certificates/:z
      - type: bind
        source: ./common/config/core/app.conf
        target: /etc/core/app.conf
      - type: bind
        source: /data/harbor/secret/core/private_key.pem
        target: /etc/core/private_key.pem
      - type: bind
        source: /data/harbor/secret/keys/secretkey
        target: /etc/core/key
      - type: bind
        source: ./common/config/shared/trust-certificates
        target: /harbor_cust_cert
    networks:
      harbor:
    depends_on:
      - log
      - registry
      - redis
      - postgresql
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://localhost:1514"
        tag: "core"

  portal:
    image: goharbor/harbor-portal:v2.14.2
    container_name: harbor-portal
    restart: always
    privileged: true
    volumes:
      - type: bind
        source: ./common/config/portal/nginx.conf
        target: /etc/nginx/nginx.conf
    networks:
      - harbor
    depends_on:
      - log
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://localhost:1514"
        tag: "portal"

  jobservice:
    image: goharbor/harbor-jobservice:v2.14.2
    container_name: harbor-jobservice
    env_file:
      - ./common/config/jobservice/env
    restart: always
    privileged: true
    volumes:
      - /data/harbor/job_logs:/var/log/jobs:z
      - type: bind
        source: ./common/config/jobservice/config.yml
        target: /etc/jobservice/config.yml
      - type: bind
        source: ./common/config/shared/trust-certificates
        target: /harbor_cust_cert
    networks:
      - harbor
    depends_on:
      - core
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://localhost:1514"
        tag: "jobservice"

  redis:
    image: goharbor/redis-photon:v2.14.2
    container_name: redis
    restart: always
    privileged: true
    volumes:
      - /data/harbor/redis:/var/lib/redis
    networks:
      harbor:
    depends_on:
      - log
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://localhost:1514"
        tag: "redis"

  proxy:
    image: goharbor/nginx-photon:v2.14.2
    container_name: nginx
    restart: always
    privileged: true
    volumes:
      - ./common/config/nginx:/etc/nginx:z
      - type: bind
        source: ./common/config/shared/trust-certificates
        target: /harbor_cust_cert
    networks:
      - harbor
    ports:
      - 31104:8080
    depends_on:
      - registry
      - core
      - portal
      - log
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://localhost:1514"
        tag: "proxy"

networks:
  harbor:
    external: false