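VLLMService Operator Development, Part 8: Auto-Creating the HTTPRoute and Verifying Access Through the Gateway

Preface

The previous article covered the API design of gatewayRef and the dependency checks that run before an HTTPRoute is created. gatewayRef states whether the current VLLMService should attach to an existing Gateway. Without gatewayRef, the Operator creates only the Deployment and the Service. With gatewayRef, the Operator first checks that the Gateway exists, that the named listener exists, and that the listener protocol is suitable for an HTTPRoute, and only then enters the HTTPRoute creation logic.

This article picks up from there and focuses on automatic HTTPRoute creation, status write-back, self-healing verification, and access verification through the Gateway. The goal: once a user configures gatewayRef on a VLLMService, the Operator creates or updates the HTTPRoute so that external requests are forwarded to the backend vLLM Service based on the Gateway, the Host, and the HTTPRoute.

The target chain for this article is: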

VLLMService
  -> Deployment
  -> Pod
  -> Service
  -> HTTPRoute
  -> Gateway
  -> vLLM OpenAI-compatible API

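In the current code, the main Reconcile first syncs the Deployment and the Service, then calls reconcileHTTPRoute to sync the HTTPRoute, and finally calls updateVLLMServiceStatus to update the status. If reconcileHTTPRoute returns a requeueAfter, the main Reconcile returns ctrl.Result{RequeueAfter: requeueAfter} after the status update.

The project's GitHub repository: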

https://github.com/bolin-dai/vllmservice-operator

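1. What This Article Implements

The previous article stopped at "what to check before creating an HTTPRoute"; this one moves on to actually constructing the resource. In the end, a user should only need to write a VLLMService with gatewayRef: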

apiVersion: aiinfra.example.com/v1alpha1
kind: VLLMService
metadata:
  name: qwen-demo
  namespace: ai-demo
spec:
  image: docker.m.daocloud.io/vllm/vllm-openai:latest
  modelPath: /data/models/Qwen2.5-1.5B-Instruct
  modelName: qwen2.5-1.5b-instruct
  replicas: 1
  port: 8000

  schedulerName: volcano
  runtimeClassName: nvidia

  nodeSelector:
    kubernetes.io/hostname: master-01

  labels:
    aiinfra.example.com/model: qwen2.5
    aiinfra.example.com/runtime: vllm
    aiinfra.example.com/team: infra

  gatewayRef:
    name: llm-gateway
    namespace: ai-demo
    sectionName: http
    host: llm.example.local

  resources:
    requests:
      cpu: "2"
      memory: 8Gi
      volcano.sh/vgpu-number: "1"
      volcano.sh/vgpu-memory: "6144"
      volcano.sh/vgpu-cores: "50"
    limits:
      cpu: "4"
      memory: 16Gi
      volcano.sh/vgpu-number: "1"
      volcano.sh/vgpu-memory: "6144"
      volcano.sh/vgpu-cores: "50"

  storage:
    pvcName: qwen-model-pvc
    mountPath: /data/models
    readOnly: true

The Operator then creates:

Deployment/qwen-demo
Service/qwen-demo
HTTPRoute/qwen-demo

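The Deployment starts the vLLM Pods, the Service gives the Pods a stable in-cluster entry point, and the HTTPRoute forwards HTTP requests received by the Gateway to the backend Service. An HTTPRoute describes how HTTP requests are routed from a Gateway listener to a backend API object. Its spec consists mainly of parentRefs, hostnames, and rules, and the backendRefs inside each rule name the backend objects that receive the requests.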

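2. Entering the HTTPRoute Creation Logic from reconcileHTTPRoute

2.1 The HTTPRoute Is Created Only After the Gateway Dependency Checks Pass

In the current code, the first half of reconcileHTTPRoute handles checks and cleanup. If gatewayRef is absent, it deletes any old HTTPRoute owned by the current VLLMService. If gatewayRef is present, it calls resolveGatewayRef to check the Gateway, the listener, and the protocol. Only when resolveGatewayRef returns a non-nil gateway does it go on to create or update the HTTPRoute. The key code: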

gateway, routeMessage, requeueAfter, err := r.resolveGatewayRef(ctx, vllmService)
if err != nil {
	return nil, "", 0, err
}

if gateway == nil {
	if err := r.deleteOwnedHTTPRouteIfExists(ctx, vllmService); err != nil {
		return nil, routeMessage, requeueAfter, err
	}
	return nil, routeMessage, requeueAfter, nil
}

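When the Gateway does not exist, the listener does not exist, or the listener protocol is unsuitable for an HTTPRoute, no HTTPRoute is created, and any old HTTPRoute owned by the current VLLMService is deleted so that a stale route cannot keep exposing the service. What gets deleted is not just any HTTPRoute with the same name, but only the one that the current VLLMService controls through its OwnerReference. deleteOwnedHTTPRouteIfExists uses metav1.IsControlledBy for that check, which avoids deleting a same-named resource created by hand or managed by another controller.

This step reflects a core Operator principle: the job is not only to create resources but to keep maintaining the desired state. When the user's configuration no longer satisfies the conditions for an HTTPRoute, the Operator should clean up the HTTPRoute it created earlier.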
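
The branching described above can be condensed into a small decision function. This is a minimal sketch that uses only the standard library; routeAction, decideRouteAction, and the two constants are hypothetical names chosen for illustration and do not appear in the project.

```go
package main

import "fmt"

// routeAction is a hypothetical name for the outcomes described above.
type routeAction string

const (
	actionDeleteOwned    routeAction = "delete-owned-route"
	actionCreateOrUpdate routeAction = "create-or-update-route"
)

// decideRouteAction mirrors the branching in reconcileHTTPRoute:
// the route is created only when gatewayRef is set AND the Gateway,
// listener, and protocol checks all passed. In every other case the
// previously owned route (if any) is cleaned up.
func decideRouteAction(gatewayRefSet, gatewayResolved bool) routeAction {
	if !gatewayRefSet || !gatewayResolved {
		return actionDeleteOwned
	}
	return actionCreateOrUpdate
}

func main() {
	fmt.Println(decideRouteAction(false, false)) // no gatewayRef
	fmt.Println(decideRouteAction(true, false))  // gatewayRef set, checks failed
	fmt.Println(decideRouteAction(true, true))   // gatewayRef set, checks passed
}
```

The point of the sketch is that "do not create" and "clean up what was created earlier" are the same branch, which is what keeps a stale route from lingering.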

2.2 Name and Namespace Design for the HTTPRoute

In the current code, the HTTPRoute takes the same name and namespace as the VLLMService:

httpRoute := &gatewayv1.HTTPRoute{
	ObjectMeta: metav1.ObjectMeta{
		Name:      vllmService.Name,
		Namespace: vllmService.Namespace,
	},
}

The mapping is simple and clear:

VLLMService/qwen-demo  -> HTTPRoute/qwen-demo

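This has several benefits: resource names are intuitive, which makes troubleshooting easier; the HTTPRoute and the VLLMService share a namespace, so setting an OwnerReference is straightforward; when the VLLMService is deleted, Kubernetes garbage collection removes the HTTPRoute it owns; and because the Deployment, Service, and HTTPRoute all share one business name, the setup is easier to follow while learning.

One detail to note: the HTTPRoute is created in the VLLMService's namespace, not the Gateway's. If gatewayRef.namespace points to another namespace, the HTTPRoute still lives in the VLLMService's namespace, and only parentRefs.namespace points to the Gateway's namespace.

3. Building the HTTPRoute with CreateOrUpdate

This section is the core of the article. The current code uses controllerutil.CreateOrUpdate to create or update the HTTPRoute. Inside the MutateFn it sets labels, parentRefs, hostnames, and rules.backendRefs, and finally sets the OwnerReference.

3.1 What CreateOrUpdate Does

The core code: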

httpRouteOperation, err := controllerutil.CreateOrUpdate(ctx, r.Client, httpRoute, func() error {
	httpRoute.Labels = labelsForVLLMService(vllmService)

	sectionName := gatewayv1.SectionName(vllmService.Spec.GatewayRef.SectionName)

	parentRef := gatewayv1.ParentReference{
		Name:        gatewayv1.ObjectName(vllmService.Spec.GatewayRef.Name),
		SectionName: &sectionName,
	}

	if vllmService.Spec.GatewayRef.Namespace != "" {
		gatewayNamespace := gatewayv1.Namespace(vllmService.Spec.GatewayRef.Namespace)
		parentRef.Namespace = &gatewayNamespace
	}

	hostname := gatewayv1.Hostname(vllmService.Spec.GatewayRef.Host)
	backendPort := gatewayv1.PortNumber(portFor(vllmService))

	httpRoute.Spec = gatewayv1.HTTPRouteSpec{
		CommonRouteSpec: gatewayv1.CommonRouteSpec{
			ParentRefs: []gatewayv1.ParentReference{
				parentRef,
			},
		},
		Hostnames: []gatewayv1.Hostname{
			hostname,
		},
		Rules: []gatewayv1.HTTPRouteRule{
			{
				BackendRefs: []gatewayv1.HTTPBackendRef{
					{
						BackendRef: gatewayv1.BackendRef{
							BackendObjectReference: gatewayv1.BackendObjectReference{
								Name: gatewayv1.ObjectName(service.Name),
								Port: &backendPort,
							},
						},
					},
				},
			},
		},
	}

	return controllerutil.SetControllerReference(vllmService, httpRoute, r.Scheme)
})

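CreateOrUpdate has these semantics: if the object does not exist, it runs the MutateFn and creates the object; if the object exists, it runs the MutateFn and updates the object to the desired state; if the object already matches the desired state, no effective change is made. The controller-runtime documentation likewise describes it as a helper that creates or updates an object.

So the HTTPRoute reconciliation logic can be read as:

HTTPRoute does not exist: create it.
HTTPRoute exists but its fields differ from the desired state: update it.
HTTPRoute exists and its fields already match the desired state: leave it unchanged.

This is the core idea of a declarative controller: the user declares a VLLMService, and the Operator keeps adjusting the Deployment, Service, and HTTPRoute toward the desired state.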
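
To make the three outcomes concrete, here is a toy model of the create-or-update pattern. It is a sketch under stated assumptions, not controller-runtime's implementation: route and store are simplified stand-ins for an HTTPRoute and the API server, and createOrUpdate is a hypothetical local function. The returned strings are chosen to resemble the operation results that controller-runtime reports.

```go
package main

import (
	"fmt"
	"reflect"
)

// route is a simplified stand-in for an HTTPRoute (illustration only).
type route struct {
	Hostnames []string
	Backend   string
}

// store is a toy in-memory replacement for the API server.
type store map[string]route

// createOrUpdate applies mutate to the current object (or to a zero value
// if none exists) and writes the result back only when something changed.
// mutate is expected to assign whole fields, as the MutateFn above does.
func createOrUpdate(s store, name string, mutate func(*route)) string {
	existing, found := s[name]
	desired := existing
	mutate(&desired)

	switch {
	case !found:
		s[name] = desired
		return "created"
	case !reflect.DeepEqual(existing, desired):
		s[name] = desired
		return "updated"
	default:
		return "unchanged"
	}
}

func main() {
	s := store{}
	mutate := func(r *route) {
		r.Hostnames = []string{"llm.example.local"}
		r.Backend = "qwen-demo:8000"
	}

	fmt.Println(createOrUpdate(s, "qwen-demo", mutate)) // first run creates
	fmt.Println(createOrUpdate(s, "qwen-demo", mutate)) // second run is a no-op

	// Simulate drift: someone edits the stored object by hand.
	s["qwen-demo"] = route{Hostnames: []string{"other.example.local"}, Backend: "qwen-demo:8000"}
	fmt.Println(createOrUpdate(s, "qwen-demo", mutate)) // drift is corrected
}
```

Because the mutate function always writes the full desired state, running it repeatedly is safe, which is what lets Reconcile be called as often as needed.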

3.2 Setting Labels

The first step inside the MutateFn is to set the labels:

httpRoute.Labels = labelsForVLLMService(vllmService)

In the current code, labelsForVLLMService first copies the custom labels the user wrote in spec.labels and then adds a few standard labels that the Operator maintains itself:

labels["app.kubernetes.io/name"] = "vllmservice"
labels["app.kubernetes.io/instance"] = vllmService.Name
labels["app.kubernetes.io/managed-by"] = "vllmservice-operator"

The resulting HTTPRoute carries labels similar to those on the Deployment and the Service, which makes later troubleshooting and filtering easier. For example:

kubectl -n ai-demo get httproute -l app.kubernetes.io/managed-by=vllmservice-operator
kubectl -n ai-demo get httproute -l app.kubernetes.io/instance=qwen-demo
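
The copy-then-override order described for labelsForVLLMService can be sketched as follows. This is an assumption-based reconstruction from the description above, not the project's source: labelsFor is a hypothetical name, and it takes a plain name and map instead of the VLLMService object.

```go
package main

import "fmt"

// labelsFor copies the user's labels into a fresh map and then sets the
// operator-managed labels last, so the standard keys always win and the
// caller's map is never modified.
func labelsFor(name string, userLabels map[string]string) map[string]string {
	labels := make(map[string]string, len(userLabels)+3)
	for k, v := range userLabels {
		labels[k] = v
	}
	labels["app.kubernetes.io/name"] = "vllmservice"
	labels["app.kubernetes.io/instance"] = name
	labels["app.kubernetes.io/managed-by"] = "vllmservice-operator"
	return labels
}

func main() {
	user := map[string]string{
		"aiinfra.example.com/team":     "infra",
		"app.kubernetes.io/managed-by": "someone-else", // will be overridden
	}
	labels := labelsFor("qwen-demo", user)
	fmt.Println(labels["aiinfra.example.com/team"])
	fmt.Println(labels["app.kubernetes.io/managed-by"])
}
```

Writing the standard labels after the copy matters: the label selectors used in the kubectl commands above only work reliably if a user-supplied label cannot shadow them.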

3.3 Building parentRefs: Binding to the Gateway and Listener

For an HTTPRoute to attach to a Gateway, the key field is parentRefs. In the current code, parentRef.Name comes from gatewayRef.name and parentRef.SectionName comes from gatewayRef.sectionName:

sectionName := gatewayv1.SectionName(vllmService.Spec.GatewayRef.SectionName)

parentRef := gatewayv1.ParentReference{
	Name:        gatewayv1.ObjectName(vllmService.Spec.GatewayRef.Name),
	SectionName: &sectionName,
}

If the user explicitly sets gatewayRef.namespace, the code also sets parentRef.Namespace:

if vllmService.Spec.GatewayRef.Namespace != "" {
	gatewayNamespace := gatewayv1.Namespace(vllmService.Spec.GatewayRef.Namespace)
	parentRef.Namespace = &gatewayNamespace
}

Expressed as HTTPRoute YAML, this is roughly:

spec:
  parentRefs:
    - name: llm-gateway
      namespace: ai-demo
      sectionName: http

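An HTTPRoute uses parentRefs to state which parent resources it wants to attach to. The most common parent is a Gateway, and sectionName can narrow the binding to one specific listener on that Gateway. The binding only succeeds if the target Gateway allows HTTPRoutes from the Route's namespace to attach.

If gatewayRef.namespace is omitted, the current code does not set parentRefs.namespace, and the HTTPRoute then refers to a Gateway in its own namespace. If the Gateway and the HTTPRoute are in different namespaces, the HTTPRoute can reference the Gateway through parentRefs.namespace, but whether the binding actually succeeds depends on the allowedRoutes rules of the Gateway listener. The Gateway API documentation on cross-namespace routing explains that Gateways and Routes can be deployed in different namespaces, and that a Gateway listener can use attachment constraints to restrict which namespaces and which Route kinds may attach to it.

In other words, the current Operator is responsible for writing parentRefs.namespace, name, and sectionName into the HTTPRoute. Whether the Gateway accepts the Route is decided by the Gateway controller based on the listener's allowedRoutes, and the result is reported in the HTTPRoute status.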
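
The two namespace rules above (an omitted namespace means "same as the Route", and the listener decides which namespaces may attach) can be sketched with plain strings. This is a simplified model, not Gateway controller code: both function names are hypothetical, and only the "Same" and "All" values of allowedRoutes.namespaces.from are modeled, while "Selector" is left out because it needs namespace labels.

```go
package main

import "fmt"

// effectiveGatewayNamespace returns the namespace a parentRef resolves to.
// An empty gatewayRef namespace means "same namespace as the HTTPRoute".
func effectiveGatewayNamespace(routeNamespace, gatewayRefNamespace string) string {
	if gatewayRefNamespace == "" {
		return routeNamespace
	}
	return gatewayRefNamespace
}

// namespaceAllowed models allowedRoutes.namespaces.from for "Same" and "All".
// "Selector" is not modeled here and is reported as not allowed.
func namespaceAllowed(from, gatewayNamespace, routeNamespace string) bool {
	switch from {
	case "All":
		return true
	case "", "Same": // Same is the default when the field is omitted
		return gatewayNamespace == routeNamespace
	default:
		return false
	}
}

func main() {
	gwNS := effectiveGatewayNamespace("ai-demo", "")
	fmt.Println(gwNS, namespaceAllowed("Same", gwNS, "ai-demo"))

	// Route in ai-demo, Gateway in a separate (hypothetical) infra namespace.
	gwNS = effectiveGatewayNamespace("ai-demo", "gateway-infra")
	fmt.Println(gwNS, namespaceAllowed("Same", gwNS, "ai-demo"))
	fmt.Println(gwNS, namespaceAllowed("All", gwNS, "ai-demo"))
}
```

The second case is the one to watch for: the Operator happily writes a cross-namespace parentRef, and the rejection only becomes visible later in the HTTPRoute status.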

3.4 Building hostnames: Matching the Request Host

The HTTPRoute's hostnames come from gatewayRef.host:

hostname := gatewayv1.Hostname(vllmService.Spec.GatewayRef.Host)

httpRoute.Spec = gatewayv1.HTTPRouteSpec{
	Hostnames: []gatewayv1.Hostname{
		hostname,
	},
}

The current code simply converts gatewayRef.host to the Gateway API Hostname type and writes it to the HTTPRoute's spec.hostnames. If the VLLMService contains:

gatewayRef:
  host: llm.example.local

then the resulting HTTPRoute will contain roughly:

spec:
  hostnames:
    - llm.example.local

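hostnames is matched against the Host header of the HTTP request. A request is first matched on hostnames and then against the HTTPRoute rules. If no hostname is set, traffic is routed based on the HTTPRoute rules and filters alone. So the request must carry the matching Host: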

curl -H "Host: llm.example.local" http://127.0.0.1:8888/v1/models

If the Host is missing, or does not match the HTTPRoute's spec.hostnames, the request may not hit this HTTPRoute, and the result may be a 404 or handling by some other default route.
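
The Host matching described above can be illustrated with a small function. This is a simplified sketch of Gateway API hostname semantics (exact match, plus a leading "*." wildcard treated as a suffix match), not the code of any Gateway implementation. hostMatches is a hypothetical name, and port handling in the Host header is deliberately omitted.

```go
package main

import (
	"fmt"
	"strings"
)

// hostMatches reports whether a request Host matches one HTTPRoute hostname.
// Comparison is case-insensitive. A hostname starting with "*." matches any
// host that ends with the remaining suffix, but not the bare suffix itself.
func hostMatches(routeHostname, requestHost string) bool {
	pattern := strings.ToLower(routeHostname)
	host := strings.ToLower(requestHost)

	if strings.HasPrefix(pattern, "*.") {
		suffix := pattern[1:] // e.g. ".example.local"
		return len(host) > len(suffix) && strings.HasSuffix(host, suffix)
	}
	return host == pattern
}

func main() {
	fmt.Println(hostMatches("llm.example.local", "llm.example.local")) // exact match
	fmt.Println(hostMatches("llm.example.local", "127.0.0.1"))         // what curl sends without -H
	fmt.Println(hostMatches("*.example.local", "llm.example.local"))   // wildcard match
}
```

The second call shows why the curl command needs the explicit Host header: without it, the request carries the port-forward address as its Host and cannot match llm.example.local.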

3.5 Building backendRefs: Forwarding to the Service

Finally, the HTTPRoute has to forward traffic to the backend Service. In the current code, the backendRef name is service.Name and the port is portFor(vllmService):

backendPort := gatewayv1.PortNumber(portFor(vllmService))

BackendObjectReference: gatewayv1.BackendObjectReference{
	Name: gatewayv1.ObjectName(service.Name),
	Port: &backendPort,
}

The Service created by the current Operator has the same name as the VLLMService, so this backendRef effectively points to:

rules:
  - backendRefs:
      - name: qwen-demo
        port: 8000

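According to the Gateway API documentation, backendRefs defines the API objects to which matching requests are sent. If a rule has no backendRefs and no filter that produces a response by itself, the request may be answered with a 500.

3.6 What It Means When matches Is Omitted

The current code sets no matches on the HTTPRoute rule. That is not a mistake. According to the Gateway API documentation, a rule without matches defaults to matching the path prefix /, which covers every HTTP request path.

So this version of the HTTPRoute can be read as:

Any request whose Host matches llm.example.local is forwarded, whatever its path, to port 8000 of the qwen-demo Service.

This suits a vLLM service, because vLLM's OpenAI-compatible API uses paths such as /v1/models and /v1/chat/completions. Since the HTTPRoute places no restriction on the path, all of these paths reach the backend vLLM service.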
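
To see why the default prefix / covers every vLLM path, the sketch below models path-prefix matching on whole path elements. It is an illustration of the matching rule, not code from a Gateway implementation, and pathPrefixMatches is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// pathPrefixMatches models prefix matching on whole path elements:
// "/v1" matches "/v1" and "/v1/models" but not "/v1beta".
// The prefix "/" (the default when matches is omitted) matches every path.
func pathPrefixMatches(prefix, path string) bool {
	prefix = strings.TrimSuffix(prefix, "/")
	if prefix == "" {
		return strings.HasPrefix(path, "/")
	}
	return path == prefix || strings.HasPrefix(path, prefix+"/")
}

func main() {
	for _, p := range []string{"/v1/models", "/v1/chat/completions", "/health"} {
		fmt.Println(p, pathPrefixMatches("/", p))
	}
	// A narrower prefix would be one way to restrict the route later.
	fmt.Println(pathPrefixMatches("/v1", "/health"))
}
```

If the route ever needs to expose only the OpenAI-compatible endpoints, an explicit /v1 prefix match would be the natural next step; the current code does not do this.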

4. Setting the OwnerReference and Watching the HTTPRoute

4.1 What SetControllerReference Does

The last step for the HTTPRoute is to set the OwnerReference:

return controllerutil.SetControllerReference(vllmService, httpRoute, r.Scheme)

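The current code sets this control relationship at the end of the CreateOrUpdate MutateFn. SetControllerReference records the owner as the object's controller OwnerReference. Kubernetes garbage collection uses this relationship, and combined with owner-based event mapping it lets a change to the child resource trigger the owner's Reconcile again.

This step matters for two reasons:

1. When the VLLMService is deleted, the HTTPRoute is garbage-collected by Kubernetes as a child resource.
2. When the HTTPRoute is modified or deleted, controller-runtime can use the owner relationship to trigger the VLLMService's Reconcile again.

Note that an OwnerReference generally requires the owner and the controlled object to be in the same namespace. In the current design the HTTPRoute and the VLLMService are created in the same namespace, so the design holds. The Gateway is an external resource that is only referenced, not a child of the VLLMService, so no OwnerReference is set on the Gateway.

4.2 What Owns(&gatewayv1.HTTPRoute{}) Does

Besides setting the OwnerReference, the controller also has to declare in SetupWithManager that it owns HTTPRoutes. The current code already adds this: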

func (r *VLLMServiceReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&aiinfrav1alpha1.VLLMService{}).
		Owns(&appsv1.Deployment{}).
		Owns(&corev1.Service{}).
		Owns(&gatewayv1.HTTPRoute{}).
		Named("vllmservice").
		Complete(r)
}

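This means that if a user manually deletes the HTTPRoute owned by the current VLLMService, controller-runtime uses the owner relationship to put the corresponding VLLMService back on the queue. The next Reconcile runs CreateOrUpdate for the HTTPRoute again and recreates it.

4.3 Why a Deleted HTTPRoute Is Recreated Automatically

The self-healing chain works like this:

User deletes HTTPRoute/qwen-demo
  -> HTTPRoute is a child resource owned by VLLMService/qwen-demo
  -> the controller observes the child change through Owns(&gatewayv1.HTTPRoute{})
  -> VLLMService/qwen-demo is added back to the Reconcile queue
  -> reconcileHTTPRoute runs again
  -> CreateOrUpdate recreates HTTPRoute/qwen-demo

This is the Operator's self-healing ability: as long as the VLLMService's desired state still includes gatewayRef and the Gateway dependency checks pass, a deleted HTTPRoute is created again.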
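
The step "child event becomes a Reconcile request for the owner" can be modeled in a few lines. This is a simplified sketch of the idea behind Owns, not controller-runtime's code: ownerRef, object, request, and requestsForOwner are stand-in names, and the real mapping also compares the API group of the owner.

```go
package main

import "fmt"

// Simplified stand-ins for Kubernetes metadata (illustration only).
type ownerRef struct {
	Kind       string
	Name       string
	Controller bool
}

type object struct {
	Kind      string
	Namespace string
	Name      string
	Owners    []ownerRef
}

// request identifies the owner object that should be reconciled.
type request struct {
	Namespace string
	Name      string
}

// requestsForOwner maps an event on a child object to reconcile requests
// for its controller owner of the given kind. Non-controller owners and
// owners of other kinds are ignored.
func requestsForOwner(child object, ownerKind string) []request {
	var out []request
	for _, ref := range child.Owners {
		if ref.Controller && ref.Kind == ownerKind {
			out = append(out, request{Namespace: child.Namespace, Name: ref.Name})
		}
	}
	return out
}

func main() {
	deleted := object{
		Kind: "HTTPRoute", Namespace: "ai-demo", Name: "qwen-demo",
		Owners: []ownerRef{{Kind: "VLLMService", Name: "qwen-demo", Controller: true}},
	}
	fmt.Println(requestsForOwner(deleted, "VLLMService"))

	manual := object{Kind: "HTTPRoute", Namespace: "ai-demo", Name: "hand-made"}
	fmt.Println(len(requestsForOwner(manual, "VLLMService")))
}
```

A hand-made HTTPRoute without an owner reference produces no request, which is why the Operator neither reacts to nor repairs routes it does not own.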

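4.4 Why Removing gatewayRef Cleans Up the HTTPRoute

If the user removes gatewayRef from the VLLMService, the new desired state is that the model service no longer attaches to a Gateway. The Operator should then delete the HTTPRoute owned by the current VLLMService while keeping the Deployment and the Service. This logic is implemented at the start of reconcileHTTPRoute: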

if !gatewayRefEnabled(vllmService) {
	if err := r.deleteOwnedHTTPRouteIfExists(ctx, vllmService); err != nil {
		return nil, "", 0, err
	}
	return nil, "", 0, nil
}

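This code means: when gatewayRef is absent, the Operator creates no HTTPRoute and calls deleteOwnedHTTPRouteIfExists to clean up the old owned HTTPRoute. The delete logic only removes an HTTPRoute controlled by the current VLLMService; it does not delete a same-named resource created by hand or managed by another controller.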
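
The "delete only what I control" guard can be sketched as follows. This is a stand-in model and not the project's deleteOwnedHTTPRouteIfExists: the types are simplified, the map replaces the API server, and isControlledBy imitates the idea of the ownership check (the controller owner reference must carry the owner's UID).

```go
package main

import "fmt"

// Simplified stand-ins for Kubernetes metadata (illustration only).
type ownerRef struct {
	UID        string
	Controller bool
}

type object struct {
	Name   string
	UID    string
	Owners []ownerRef
}

// isControlledBy reports whether child's controller owner reference
// points at owner, compared by UID rather than by name.
func isControlledBy(child, owner object) bool {
	for _, ref := range child.Owners {
		if ref.Controller {
			return ref.UID == owner.UID
		}
	}
	return false
}

// deleteOwnedIfExists removes the same-named route only when it exists
// and is controlled by owner. It returns true if something was deleted.
func deleteOwnedIfExists(routes map[string]object, owner object) bool {
	route, ok := routes[owner.Name]
	if !ok || !isControlledBy(route, owner) {
		return false
	}
	delete(routes, owner.Name)
	return true
}

func main() {
	owner := object{Name: "qwen-demo", UID: "uid-1"}

	owned := map[string]object{
		"qwen-demo": {Name: "qwen-demo", Owners: []ownerRef{{UID: "uid-1", Controller: true}}},
	}
	fmt.Println(deleteOwnedIfExists(owned, owner), len(owned))

	// Same name, but created by hand: no owner reference, so it is kept.
	manual := map[string]object{"qwen-demo": {Name: "qwen-demo"}}
	fmt.Println(deleteOwnedIfExists(manual, owner), len(manual))
}
```

Comparing by UID rather than by name also protects against a VLLMService that was deleted and recreated under the same name.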

5. Updating the VLLMService Status

After the HTTPRoute is synced, the main Reconcile calls:

r.updateVLLMServiceStatus(ctx, vllmService, deployment, service, httpRoute, routeMessage)

The current updateVLLMServiceStatus updates these fields:

phase
readyReplicas
deploymentName
serviceName
gatewayRefName
gatewayRefNamespace
httpRouteName
message

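The code sets gatewayRefName and gatewayRefNamespace depending on whether the current VLLMService has gatewayRef configured. If this round successfully created or updated the HTTPRoute, it writes httpRoute.Name into status.httpRouteName. The API type already defines the corresponding status fields: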

GatewayRefName string `json:"gatewayRefName,omitempty"`
GatewayRefNamespace string `json:"gatewayRefNamespace,omitempty"`
HTTPRouteName string `json:"httpRouteName,omitempty"`
Message string `json:"message,omitempty"`

Note one boundary here: the VLLMService status currently records only what the Operator itself sees at its own layer, such as the referenced Gateway name, the HTTPRoute name, and the routeMessage. It does not yet sync the HTTPRoute's own conditions, such as Accepted and ResolvedRefs, back into the VLLMService status. To tell whether the Gateway has actually accepted the HTTPRoute, you still have to look at the HTTPRoute's own status:

kubectl -n ai-demo describe httproute qwen-demo
kubectl -n ai-demo get httproute qwen-demo -o yaml

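According to the Gateway API documentation, once an HTTPRoute adds a parentRefs entry pointing to a Gateway, the controller that manages the Gateway should write the parent's status into status.parents of the HTTPRoute. For example, Accepted=True means the Gateway has accepted the HTTPRoute.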
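
Reading the Accepted condition out of status.parents could look roughly like this. It is a sketch with simplified stand-in types rather than the real Gateway API Go types; routeAccepted and the fallback reason "NoStatusFromGateway" are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// Simplified stand-ins for the HTTPRoute parent status (illustration only).
type condition struct {
	Type   string
	Status string
	Reason string
}

type parentStatus struct {
	ParentName string
	Conditions []condition
}

// routeAccepted looks up the Accepted condition that the Gateway controller
// wrote for the given Gateway. It returns whether the route is accepted and
// the reported reason. If the controller has not written any status yet,
// it returns false with a hypothetical placeholder reason.
func routeAccepted(parents []parentStatus, gatewayName string) (bool, string) {
	for _, p := range parents {
		if p.ParentName != gatewayName {
			continue
		}
		for _, c := range p.Conditions {
			if c.Type == "Accepted" {
				return c.Status == "True", c.Reason
			}
		}
	}
	return false, "NoStatusFromGateway"
}

func main() {
	parents := []parentStatus{{
		ParentName: "llm-gateway",
		Conditions: []condition{
			{Type: "Accepted", Status: "True", Reason: "Accepted"},
			{Type: "ResolvedRefs", Status: "True", Reason: "ResolvedRefs"},
		},
	}}
	fmt.Println(routeAccepted(parents, "llm-gateway"))
	fmt.Println(routeAccepted(nil, "llm-gateway"))
}
```

A helper of this shape is one possible building block if the Accepted and ResolvedRefs conditions are later copied into the VLLMService status.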

6. Rebuilding and Deploying the Operator

If the change touches API types, for example adding status.httpRouteName, status.gatewayRefName, or status.gatewayRefNamespace, run:

make generate
make manifests

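If only controller logic changed, make generate can in theory be skipped. While learning, though, it is safer to run the full flow so that the CRD, the DeepCopy code, and the RBAC in role.yaml do not fall out of date. Recommended order: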

cd /root/projects/vllmservice-operator

go fmt ./...
go mod tidy

make generate
make manifests
make build

If make build passes, build a new image. Avoid reusing an old tag and use a new version number instead, for example:

make docker-build IMG=registry.cn-hangzhou.aliyuncs.com/docker-test-dai/vllmservice-operator:v0.4
docker push registry.cn-hangzhou.aliyuncs.com/docker-test-dai/vllmservice-operator:v0.4

If the API or CRD changed, install the CRD first:

make install

Then deploy the new Operator version:

make deploy IMG=registry.cn-hangzhou.aliyuncs.com/docker-test-dai/vllmservice-operator:v0.4

Check the Operator Pod:

kubectl -n vllmservice-operator-system get pod
kubectl -n vllmservice-operator-system logs deploy/vllmservice-operator-controller-manager

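Be careful here: if you keep reusing an old image tag, Kubernetes may not pull the image again (for example under the IfNotPresent pull policy), so you may believe the new code is deployed while the old code is still running. While learning, the safest approach is to switch to a new tag for every functional change.

7. Creating the Gateway and a VLLMService with gatewayRef

7.1 Creating the Gateway

An HTTPRoute attaches to a Gateway, so a Gateway must exist before verification. This assumes that the Gateway API and a matching Gateway controller, such as Envoy Gateway, are already installed, and that a GatewayClass named eg already exists.

First create a simple Gateway: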

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: llm-gateway
  namespace: ai-demo
spec:
  gatewayClassName: eg
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      hostname: llm.example.local

After applying it, inspect the Gateway:

kubectl -n ai-demo get gateway
kubectl -n ai-demo describe gateway llm-gateway

According to the Gateway API documentation for Gateway, a listener defines the hostname, port, protocol, TLS configuration, and which Routes may attach to it.

7.2 Creating a VLLMService with gatewayRef

Once the Gateway is ready, create the VLLMService:

apiVersion: aiinfra.example.com/v1alpha1
kind: VLLMService
metadata:
  name: qwen-demo
  namespace: ai-demo
spec:
  image: docker.m.daocloud.io/vllm/vllm-openai:latest
  modelPath: /data/models/Qwen2.5-1.5B-Instruct
  modelName: qwen2.5-1.5b-instruct
  replicas: 1
  port: 8000

  schedulerName: volcano
  runtimeClassName: nvidia

  nodeSelector:
    kubernetes.io/hostname: master-01

  labels:
    aiinfra.example.com/model: qwen2.5
    aiinfra.example.com/runtime: vllm
    aiinfra.example.com/team: infra

  gatewayRef:
    name: llm-gateway
    namespace: ai-demo
    sectionName: http
    host: llm.example.local

  resources:
    requests:
      cpu: "2"
      memory: 8Gi
      volcano.sh/vgpu-number: "1"
      volcano.sh/vgpu-memory: "6144"
      volcano.sh/vgpu-cores: "50"
    limits:
      cpu: "4"
      memory: 16Gi
      volcano.sh/vgpu-number: "1"
      volcano.sh/vgpu-memory: "6144"
      volcano.sh/vgpu-cores: "50"

  storage:
    pvcName: qwen-model-pvc
    mountPath: /data/models
    readOnly: true

After this YAML file is applied, the Operator should sync the Deployment, Service, and HTTPRoute automatically.

7.3 sectionName, hostname, and namespace Must Line Up

Pay close attention to these four fields:

gatewayRef:
  name: llm-gateway
  namespace: ai-demo
  sectionName: http
  host: llm.example.local

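gatewayRef.name must match the Gateway's name, gatewayRef.namespace must match the Gateway's namespace, gatewayRef.sectionName must match the name of the Gateway listener, and gatewayRef.host must match the Host used when sending requests and be compatible with the listener's hostname if one is set. The listener protocol must also be HTTP or HTTPS, because the Operator creates an HTTPRoute.

If the Gateway and the VLLMService are in different namespaces, also check the allowedRoutes of the Gateway listener. The Gateway API documentation on cross-namespace routing explains that Gateways and Routes can bind across namespaces, but whether the binding succeeds depends on the attachment constraints on the Gateway side.

8. Verifying the HTTPRoute and Gateway Access

8.1 Verifying the VLLMService Status

Start with the VLLMService: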

kubectl -n ai-demo get vllmservice qwen-demo
kubectl -n ai-demo get vllmservice qwen-demo -o yaml

Focus on status:

status:
  deploymentName: qwen-demo
  serviceName: qwen-demo
  gatewayRefName: llm-gateway
  gatewayRefNamespace: ai-demo
  httpRouteName: qwen-demo
  phase: Running

If httpRouteName is empty, check status.message and the Operator logs to confirm whether the Gateway exists, whether the listener exists, and whether the protocol is correct.

8.2 Verifying the Deployment, Pod, and Service

Check the Deployment and Pods:

kubectl -n ai-demo get deploy qwen-demo
kubectl -n ai-demo get pod -l app.kubernetes.io/instance=qwen-demo
kubectl -n ai-demo describe pod -l app.kubernetes.io/instance=qwen-demo

Check the Service:

kubectl -n ai-demo get svc qwen-demo
kubectl -n ai-demo describe svc qwen-demo
kubectl -n ai-demo get endpoints qwen-demo
kubectl -n ai-demo get endpointslice -l kubernetes.io/service-name=qwen-demo

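If the Service has no endpoints, the backend Pod may not be Ready yet, or the Service selector may not match any Pod. In that case, even if the HTTPRoute was created successfully, traffic cannot reach the vLLM container.

8.3 Verifying the HTTPRoute spec and status

Check the HTTPRoute: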

kubectl -n ai-demo get httproute qwen-demo
kubectl -n ai-demo describe httproute qwen-demo
kubectl -n ai-demo get httproute qwen-demo -o yaml

Normally the HTTPRoute spec should look like this:

spec:
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: llm-gateway
      namespace: ai-demo
      sectionName: http
  hostnames:
    - llm.example.local
  rules:
    - backendRefs:
        - group: ""
          kind: Service
          name: qwen-demo
          port: 8000
          weight: 1

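Some defaulted fields may appear here, such as group, kind, and weight. They usually come from API server or Gateway API defaulting and do not affect the reading.

To tell whether the Gateway has really accepted the HTTPRoute, look at the conditions in status, especially: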

Accepted
ResolvedRefs

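Accepted=True means the Gateway accepted the HTTPRoute. ResolvedRefs=True means the references in the HTTPRoute, such as backend objects, resolved correctly. Conversely, if Accepted=False, check whether the Gateway listener's allowedRoutes, the namespace, the sectionName, and the hostname match. If ResolvedRefs=False, check whether the Service exists and whether the port is correct. A Service that exists but has no endpoints usually does not flip ResolvedRefs and shows up as failed requests instead, so check its endpoints as well.

8.4 Accessing `/v1/models` Through the Gateway

Once the HTTPRoute is created, the backend vLLM service can be reached through the Gateway. The name of the Gateway data-plane Service depends on the Gateway controller in use. With Envoy Gateway, for example, list the Services first: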

kubectl -n envoy-gateway-system get svc

After finding the data-plane Service for the Gateway, set up a port-forward. Replace the Service name in the command below with the one from your environment:

kubectl -n envoy-gateway-system port-forward svc/<gateway-data-plane-service-name> 8888:80

Then request /v1/models:

curl -H "Host: llm.example.local" http://127.0.0.1:8888/v1/models

The request must include:

-H "Host: llm.example.local"

This is because the HTTPRoute's hostnames is set to llm.example.local. If the Host is missing or does not match the HTTPRoute, the request may not hit this HTTPRoute.

8.5 Accessing `/v1/chat/completions` Through the Gateway

Next, test Chat Completions:

curl -X POST http://127.0.0.1:8888/v1/chat/completions \
  -H "Host: llm.example.local" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-1.5b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is 1+1?"
      }
    ],
    "max_tokens": 64,
    "temperature": 0
  }'

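If /v1/models works but /v1/chat/completions returns an error, check that model in the request body matches the --served-model-name argument vLLM was started with. In the current container arguments, the served model name comes from: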

"--served-model-name", vllmservice.Spec.ModelName,

The code sets vLLM's served model name from Spec.ModelName. If the VLLMService contains:

modelName: qwen2.5-1.5b-instruct

then the request body should also use:

{
  "model": "qwen2.5-1.5b-instruct"
}

9. Troubleshooting Common Problems

9.1 No HTTPRoute Is Created

First check whether the VLLMService has gatewayRef configured:

kubectl -n ai-demo get vllmservice qwen-demo -o yaml

If spec.gatewayRef is absent, the Operator does not create an HTTPRoute. That is expected behavior.

Then check the Operator logs:

kubectl -n vllmservice-operator-system logs deploy/vllmservice-operator-controller-manager

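If the logs report that the Gateway does not exist, the listener does not exist, or the listener protocol is not HTTP/HTTPS, go back and check the Gateway configuration.

9.2 The Operator Reports forbidden

If forbidden appears in the logs, the RBAC usually has not been updated. Check that config/rbac/role.yaml contains: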

resources:
  - gateways
verbs:
  - get
  - list
  - watch

and:

resources:
  - httproutes
verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch

If the RBAC markers are already in the local source but the cluster still lacks the permission, the usual cause is forgetting to run:

make manifests
make deploy IMG=...

Another possibility is that the latest RBAC was never re-applied.

9.3 The HTTPRoute Exists but Accepted=False

First inspect the HTTPRoute:

kubectl -n ai-demo describe httproute qwen-demo

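Focus on status.parents.conditions. Accepted=False usually means the Gateway did not accept the Route. Common causes:

1. parentRefs.name is wrong.
2. parentRefs.namespace is wrong.
3. parentRefs.sectionName does not match the Gateway listener name.
4. The Gateway listener's allowedRoutes does not allow HTTPRoutes from the current namespace to attach.
5. The HTTPRoute hostnames do not match the Gateway listener hostname.

allowedRoutes deserves particular attention in cross-namespace setups. The Gateway API documentation on cross-namespace routing explains that a cross-namespace binding between a Gateway and a Route is decided by both sides together, and that the Gateway side can use listener attachment constraints to restrict which namespaces' Routes may attach.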

9.4 The HTTPRoute Exists but ResolvedRefs=False

If ResolvedRefs=False, the backendRef is usually the place to look. The backendRef generated by the current Operator points to a Service:

backendRefs:
  - name: qwen-demo
    port: 8000

Troubleshooting commands:

kubectl -n ai-demo get svc qwen-demo
kubectl -n ai-demo describe svc qwen-demo
kubectl -n ai-demo get endpoints qwen-demo
kubectl -n ai-demo get endpointslice -l kubernetes.io/service-name=qwen-demo

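A missing Service, a wrong port, or a Service without endpoints can each prevent traffic from being forwarded correctly.

9.5 curl Returns 404

If the HTTPRoute is already Accepted but curl still returns 404, check the Host first: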

curl -H "Host: llm.example.local" http://127.0.0.1:8888/v1/models

If the Host header is forgotten, the Gateway may fail to match the HTTPRoute with hostnames: llm.example.local.

Next, check that the request goes to the correct Gateway data-plane Service:

kubectl -n envoy-gateway-system get svc

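If the port-forward targets the wrong Service, the backend is unreachable even when the HTTPRoute is correct.

9.6 `/v1/chat/completions` Returns an Error

If /v1/models works but /v1/chat/completions returns an error, first check that model in the request body is correct. It should match spec.modelName of the VLLMService, because the controller passes spec.modelName to vLLM's --served-model-name argument. Also check the vLLM Pod logs: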

kubectl -n ai-demo logs deploy/qwen-demo

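Model load failures, insufficient GPU memory, a wrong model path, and unsupported request parameters usually show a more specific cause in the vLLM container logs.

10. Summary

Up to this article, the VLLMService Operator chain has grown from: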

VLLMService -> Deployment -> Pod -> Service

and has been extended to:

VLLMService -> Deployment -> Pod -> Service -> HTTPRoute -> Gateway

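The current implementation handles a fairly complete flow for exposing a model service: the user declares a VLLMService, the Operator creates the Deployment, Service, and HTTPRoute, the Gateway forwards requests to the vLLM Service according to the HTTPRoute, and /v1/models and /v1/chat/completions confirm that the model service is reachable.

Two directions are worth considering next. First, sync the HTTPRoute's conditions, such as Accepted and ResolvedRefs, into the VLLMService status so that users can tell from the VLLMService alone whether the route is really in effect. Second, add ServiceMonitor and PrometheusRule so that the model service gains monitoring and alerting.

My experience is limited, and corrections and feedback are welcome.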
