First attempt at installing llm-d, an AI model serving platform, on Kubernetes 1.31

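Note:

I followed the official documentation, skipping the steps it does not explain clearly, and got as far as the final step. The only thing still missing is HF_TOKEN, because my Kubernetes cluster cannot reach Hugging Face (HF).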

    [root@bastion quickstart]# cat /etc/redhat-release
    Rocky Linux release 9.5 (Blue Onyx)
    [root@bastion quickstart]#
    [root@bastion quickstart]# kubectl get nodes
    NAME                        STATUS   ROLES           AGE   VERSION
    master01.kcloudonline.com   Ready    control-plane   46h   v1.31.0
    worker01.kcloudonline.com   Ready    <none>          46h   v1.31.0
    worker02.kcloudonline.com   Ready    <none>          46h   v1.31.0
    worker03.kcloudonline.com   Ready    <none>          46h   v1.31.0
    [root@bastion quickstart]#

### Get the code

Clone the llm-d-deployer repository.

    git clone https://github.com/llm-d/llm-d-deployer.git

Navigate to the quickstart directory.

    cd llm-d-deployer/quickstart

On my bastion host:

    [root@bastion software]# dnf install git -y
    [root@bastion software]# mkdir llm-d
    [root@bastion software]# cd llm-d/
    [root@bastion llm-d]# git clone https://github.com/llm-d/llm-d-deployer.git
    [root@bastion llm-d]# cd llm-d-deployer/
    [root@bastion llm-d-deployer]# ls
    chart-dependencies  CONTRIBUTING.md  ct-install.yaml  DCO  LICENSE  Makefile  OWNERS  README.md
    charts  cr.yaml  ct.yaml  helpers  lintconf.yaml  notes  quickstart  REPO_DOCS.md
    [root@bastion llm-d-deployer]# cd quickstart/
    [root@bastion quickstart]# ls
    examples  grafana  grafana-setup.md  infra  install-deps.sh  llmd-installer.sh  metrics-overview.md  README.md  README-minikube.md  test-request.sh
    [root@bastion quickstart]#

### Required tools

The following prerequisites are required for the installer to work.

- yq (mikefarah): installation
- jq: download & install guide
- git: installation guide
- Helm: quick-start install
- Kustomize: official install docs
- kubectl: install & setup

You can use the installer script that installs all the required dependencies.
    ./install-deps.sh

Alternatively, install yq by hand:

    # Download and install yq
    sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/local/bin/yq
    # Make it executable
    sudo chmod +x /usr/local/bin/yq
    # Verify the installation
    yq --version

Install Kustomize with the official script (recommended):

    # Download and install the latest version of Kustomize
    curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
    # Move kustomize into the system PATH
    sudo mv kustomize /usr/local/bin/
    # Verify the installation
    kustomize version

On my bastion host:

    [root@bastion quickstart]# sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/local/bin/yq
    Resolving release-assets.githubusercontent.com (release-assets.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.109.133, ...
    Connecting to release-assets.githubusercontent.com (release-assets.githubusercontent.com)|185.199.110.133|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 11477176 (11M) [application/octet-stream]
    Saving to: '/usr/local/bin/yq'
    /usr/local/bin/yq 100%[==========================>] 10.95M 1002KB/s in 7.1s
    2025-09-26 08:34:22 (1.55 MB/s) - '/usr/local/bin/yq' saved [11477176/11477176]
    [root@bastion quickstart]# sudo chmod +x /usr/local/bin/yq
    [root@bastion quickstart]# yq --version
    yq (https://github.com/mikefarah/yq/) version v4.47.2
    [root@bastion quickstart]#
    [root@bastion llm-d]# curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
    v5.7.1
    kustomize installed to /software/llm-d/kustomize
    [root@bastion llm-d]# ls
    kustomize  llm-d-deployer
    [root@bastion llm-d]# cp kustomize /usr/local/bin/
    [root@bastion llm-d]# kustomize version
    v5.7.1
    [root@bastion llm-d]#
    [root@bastion quickstart]# ./install-deps.sh
    Rocky Linux 9 - BaseOS                        2.5 kB/s | 4.1 kB     00:01
    Rocky Linux 9 - AppStream                     5.0 kB/s | 4.5 kB     00:00
    Rocky Linux 9 - Extras                        631  B/s | 2.9 kB     00:04
    Dependencies resolved.
    ==========================================================================
     Package      Architecture      Version           Repository       Size
    ==========================================================================
    Installing:
     make         x86_64            1:4.3-8.el9       baseos           529 k
    Transaction Summary
    ==========================================================================
    Install  1 Package
    Total download size: 529 k
    Installed size: 1.6 M
    Downloading Packages:
    make-4.3-8.el9.x86_64.rpm                     301 kB/s | 529 kB     00:01
    --------------------------------------------------------------------------
    Total                                         212 kB/s | 529 kB     00:02
    Running transaction check
    Transaction check succeeded.
    Running transaction test
    Transaction test succeeded.
    Running transaction
      Preparing        :                                                 1/1
      Installing       : make-1:4.3-8.el9.x86_64                         1/1
      Running scriptlet: make-1:4.3-8.el9.x86_64                         1/1
      Verifying        : make-1:4.3-8.el9.x86_64                         1/1
    Installed:
      make-1:4.3-8.el9.x86_64
    Complete!
    Installing yq...
    [root@bastion quickstart]#

### Required credentials and configuration

- llm-d-deployer GitHub repo: [clone here](https://github.com/llm-d/llm-d-deployer.git)
- HuggingFace HF_TOKEN (https://huggingface.co/docs/hub/en/security-tokens) with download access for the model you want to use. By default the sample application will use meta-llama/Llama-3.2-3B-Instruct.

⚠️ Your Hugging Face account must have access to the model you want to use. You may need to visit Hugging Face meta-llama/Llama-3.2-3B-Instruct and accept the usage terms if you have not already done so.

### Target Platforms

Since the llm-d-deployer is based on helm charts, llm-d can be deployed on a variety of Kubernetes platforms.

### llm-d Installation

Only a single installation of llm-d on a cluster is currently supported. In the future, multiple model services will be supported. Until then, uninstall llm-d before reinstalling.

The llm-d-deployer contains all the helm charts necessary to deploy llm-d. To facilitate the installation of the helm charts, the llmd-installer.sh script is provided. This script will populate the necessary manifests in the manifests directory. After this, it will apply all the manifests in order to bring up the cluster.

The llmd-installer.sh script aims to simplify the installation of llm-d using the llm-d-deployer as its main function. It scripts as many of the steps as possible to make the installation process more streamlined.
This includes:

- Installing the GAIE infrastructure
- Creating the namespace with any special configurations
- Creating the pull secret to download the images
- Creating the model service CRDs
- Applying the helm charts
- Deploying the sample app (model service)

It also supports uninstalling the llm-d infrastructure and the sample app.

Before proceeding with the installation, ensure you have completed the prerequisites and are able to issue kubectl or oc commands to your cluster by configuring your ~/.kube/config file or by using the oc login command.

#### Usage

The installer needs to be run from the llm-d-deployer/quickstart directory as a cluster admin with CLI access to the cluster.

    ./llmd-installer.sh [OPTIONS]
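Before invoking the installer, it can be worth confirming that the prerequisite tools are actually on PATH. The snippet below is only a minimal sketch of such a check: the function name `missing_tools` is hypothetical and not part of llm-d-deployer, and the tool list is simply copied from the Required tools section of this post.

```shell
# Hypothetical helper (not part of llm-d-deployer): print the name of
# every listed command that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tool names copied from the "Required tools" list in this post.
required="yq jq git helm kustomize kubectl"

# Word splitting of $required is intentional here.
missing="$(missing_tools $required)"
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

If anything is reported missing, running `./install-deps.sh` or the manual steps above should cover it.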

Flags

### Examples

#### Install llm-d on an Existing Kubernetes Cluster

    export HF_TOKEN="your-token"
    ./llmd-installer.sh
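Because the installer stops immediately when HF_TOKEN is absent, a small guard in front of it can make that failure explicit. This is a sketch under my own assumptions: `hf_token_ready` is a hypothetical helper name, and the snippet only reports the state of the variable rather than starting the installer.

```shell
# Hypothetical helper: succeed only if HF_TOKEN is set and non-empty.
hf_token_ready() {
  [ -n "${HF_TOKEN:-}" ]
}

if hf_token_ready; then
  echo "HF_TOKEN is set; the installer can be started."
else
  echo "HF_TOKEN is not set; export it before running ./llmd-installer.sh" >&2
fi
```

The check only looks at whether the variable is non-empty. It does not verify that the token is valid or that the cluster can reach Hugging Face.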

    [root@bastion quickstart]# ./llmd-installer.sh
    📂 Setting up script environment...
    kubectl can reach to a running Kubernetes cluster.
    ❌ HF_TOKEN not set; Run: export HF_TOKEN=
    [root@bastion quickstart]#

Note: installing llm-d and supplying the model are not decoupled, and I think this design is somewhat problematic. As I understand it, installing the platform first and uploading the model afterwards would probably be better.

#### Install on OpenShift

Before running the installer, ensure you have logged into the cluster as a cluster administrator. For example:

    oc login --token=sha256~yourtoken --server=https://api.yourcluster.com:6443
    export HF_TOKEN="your-token"
    ./llmd-installer.sh

### Validation

The inference-gateway serves as the HTTP ingress point for all inference requests in our deployment. It's implemented as a Kubernetes Gateway (gateway.networking.k8s.io/v1) using either kgateway or istio as the gatewayClassName, and sits in front of your inference pods to handle path-based routing, load balancing, retries, and metrics. This example validates that the gateway itself is routing your completion requests correctly. You can execute the test-request.sh script to test on the cluster.

    # Default options (the model id will be discovered via /v1/models)
    ./test-request.sh

    # Non-default namespace/model
    ./test-request.sh -n <namespace> -m <model-id> --minikube

If you receive an error indicating PodSecurity "restricted" violations when running the smoke-test script, you need to remove the restrictive PodSecurity labels from the namespace. Once these labels are removed, re-run the script and it should proceed without PodSecurity errors. Run the following command:

    kubectl label namespace <namespace> \
      pod-security.kubernetes.io/warn- \
      pod-security.kubernetes.io/warn-version- \
      pod-security.kubernetes.io/audit- \
      pod-security.kubernetes.io/audit-version-
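The label-removal command above can be assembled for any namespace with a small helper. The following is a sketch only: `podsecurity_unlabel_cmd` is a hypothetical function name, the namespace `llm-d` in the usage line is an assumption for illustration, and the helper prints the command instead of running it so it can be reviewed first.

```shell
# Hypothetical helper: print (do not run) the kubectl command that removes
# the PodSecurity warn/audit labels from the namespace given as $1.
# A trailing "-" after a label key tells kubectl to remove that label.
podsecurity_unlabel_cmd() {
  ns="$1"
  printf 'kubectl label namespace %s' "$ns"
  for key in warn warn-version audit audit-version; do
    printf ' pod-security.kubernetes.io/%s-' "$key"
  done
  printf '\n'
}

# "llm-d" is an assumed namespace name used only for illustration.
podsecurity_unlabel_cmd llm-d
```

After checking the printed command, it can be pasted into the shell or passed to `sh -c` to apply it.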
