First attempt at installing llm-d, an AI model serving platform, on Kubernetes 1.31

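Note:

I followed the official documentation, working past the steps it does not explain clearly, and got as far as the final step. The only thing missing is HF_TOKEN, because my Kubernetes cluster cannot reach Hugging Face (HF).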
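
Since the blocker here is network access to Hugging Face, it can help to confirm reachability before starting. The following is a minimal sketch, not part of llm-d: the function names, the `HF_URL` variable, and the 10-second timeout are assumptions made for illustration. It only tests the machine it runs on (for example the bastion or a worker node); pods inside the cluster may still behave differently.

```shell
# Sketch only: HF_URL and the helper names are illustrative assumptions.
HF_URL="${HF_URL:-https://huggingface.co}"

# Print the HTTP status code for a URL; curl prints 000 if no response arrives.
http_status() {
  curl -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

# Report whether the URL answered with a 2xx or 3xx status.
check_hf_access() {
  status="$(http_status "$1" || true)"
  case "$status" in
    2*|3*) echo "reachable (HTTP $status)" ;;
    *)     echo "not reachable (HTTP $status)"; return 1 ;;
  esac
}

# Usage: check_hf_access "$HF_URL"
```

Running `check_hf_access "$HF_URL"` on each node gives a quick picture of which hosts would be able to pull a model.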

    [root@bastion quickstart]# cat /etc/redhat-release
    Rocky Linux release 9.5 (Blue Onyx)
    [root@bastion quickstart]#
    [root@bastion quickstart]# kubectl get nodes
    NAME                        STATUS   ROLES           AGE   VERSION
    master01.kcloudonline.com   Ready    control-plane   46h   v1.31.0
    worker01.kcloudonline.com   Ready    <none>          46h   v1.31.0
    worker02.kcloudonline.com   Ready    <none>          46h   v1.31.0
    worker03.kcloudonline.com   Ready    <none>          46h   v1.31.0
    [root@bastion quickstart]#

### Get the code

Clone the llm-d-deployer repository.

    git clone https://github.com/llm-d/llm-d-deployer.git

Navigate to the quickstart directory.

    cd llm-d-deployer/quickstart

What I ran:

    [root@bastion software]# dnf install git -y
    [root@bastion software]# mkdir llm-d
    [root@bastion software]# cd llm-d/
    [root@bastion llm-d]# git clone https://github.com/llm-d/llm-d-deployer.git
    [root@bastion llm-d]# cd llm-d-deployer/
    [root@bastion llm-d-deployer]# ls
    chart-dependencies  CONTRIBUTING.md  ct-install.yaml  DCO  LICENSE  Makefile  OWNERS  README.md
    charts  cr.yaml  ct.yaml  helpers  lintconf.yaml  notes  quickstart  REPO_DOCS.md
    [root@bastion llm-d-deployer]# cd quickstart/
    [root@bastion quickstart]# ls
    examples  grafana  grafana-setup.md  infra  install-deps.sh  llmd-installer.sh  metrics-overview.md  README.md  README-minikube.md  test-request.sh
    [root@bastion quickstart]#

### Required tools

The following prerequisites are required for the installer to work:

- yq (mikefarah): installation
- jq: download & install guide
- git: installation guide
- Helm: quick-start install
- Kustomize: official install docs
- kubectl: install & setup

You can use the installer script that installs all the required dependencies.


    ./install-deps.sh

    # Download and install yq
    sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/local/bin/yq
    # Make it executable
    sudo chmod +x /usr/local/bin/yq
    # Verify the installation
    yq --version

Install with the official script (recommended):

    # Download and install the latest version of Kustomize
    curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
    # Move kustomize into a directory on the system PATH
    sudo mv kustomize /usr/local/bin/
    # Verify the installation
    kustomize version

What I ran:

    [root@bastion quickstart]# sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/local/bin/yq
    Resolving release-assets.githubusercontent.com (release-assets.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.109.133, ...
    Connecting to release-assets.githubusercontent.com (release-assets.githubusercontent.com)|185.199.110.133|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 11477176 (11M) [application/octet-stream]
    Saving to: '/usr/local/bin/yq'

    /usr/local/bin/yq   100%[==================>]  10.95M  1002KB/s  in 7.1s

    2025-09-26 08:34:22 (1.55 MB/s) - '/usr/local/bin/yq' saved [11477176/11477176]

    [root@bastion quickstart]# sudo chmod +x /usr/local/bin/yq
    [root@bastion quickstart]# yq --version
    yq (https://github.com/mikefarah/yq/) version v4.47.2
    [root@bastion quickstart]#
    [root@bastion llm-d]# curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
    v5.7.1
    kustomize installed to /software/llm-d/kustomize
    [root@bastion llm-d]# ls
    kustomize  llm-d-deployer
    [root@bastion llm-d]# cp kustomize /usr/local/bin/
    [root@bastion llm-d]# kustomize version
    v5.7.1
    [root@bastion llm-d]#
    [root@bastion quickstart]# ./install-deps.sh
    Rocky Linux 9 - BaseOS        2.5 kB/s | 4.1 kB     00:01
    Rocky Linux 9 - AppStream     5.0 kB/s | 4.5 kB     00:00
    Rocky Linux 9 - Extras        631  B/s | 2.9 kB     00:04
    Dependencies resolved.
    ======================================================================
     Package     Architecture     Version          Repository     Size
    ======================================================================
    Installing:
     make        x86_64           1:4.3-8.el9      baseos         529 k

    Transaction Summary
    ======================================================================
    Install  1 Package

    Total download size: 529 k
    Installed size: 1.6 M
    Downloading Packages:
    make-4.3-8.el9.x86_64.rpm     301 kB/s | 529 kB     00:01
    ----------------------------------------------------------------------
    Total                         212 kB/s | 529 kB     00:02
    Running transaction check
    Transaction check succeeded.
    Running transaction test
    Transaction test succeeded.
    Running transaction
      Preparing        :                              1/1
      Installing       : make-1:4.3-8.el9.x86_64      1/1
      Running scriptlet: make-1:4.3-8.el9.x86_64      1/1
      Verifying        : make-1:4.3-8.el9.x86_64      1/1

    Installed:
      make-1:4.3-8.el9.x86_64

    Complete!
    Installing yq...
    [root@bastion quickstart]#

### Required credentials and configuration

- llm-d-deployer GitHub repo: clone here ([https://github.com/llm-d/llm-d-deployer.git](https://github.com/llm-d/llm-d-deployer.git))
- HuggingFace HF_TOKEN (https://huggingface.co/docs/hub/en/security-tokens) with download access for the model you want to use. By default the sample application will use meta-llama/Llama-3.2-3B-Instruct.

⚠️ Your Hugging Face account must have access to the model you want to use. You may need to visit Hugging Face meta-llama/Llama-3.2-3B-Instruct and accept the usage terms if you have not already done so.

### Target Platforms

Since the llm-d-deployer is based on helm charts, llm-d can be deployed on a variety of Kubernetes platforms.

### llm-d Installation

Only a single installation of llm-d on a cluster is currently supported. In the future, multiple model services will be supported. Until then, uninstall llm-d before reinstalling.

The llm-d-deployer contains all the helm charts necessary to deploy llm-d. To facilitate the installation of the helm charts, the llmd-installer.sh script is provided. This script will populate the necessary manifests in the manifests directory. After this, it will apply all the manifests in order to bring up the cluster.

The llmd-installer.sh script aims to simplify the installation of llm-d using the llm-d-deployer as its main function. It scripts as many of the steps as possible to make the installation process more streamlined.
This includes:

- Installing the GAIE infrastructure
- Creating the namespace with any special configurations
- Creating the pull secret to download the images
- Creating the model service CRDs
- Applying the helm charts
- Deploying the sample app (model service)

It also supports uninstalling the llm-d infrastructure and the sample app.

Before proceeding with the installation, ensure you have completed the prerequisites and are able to issue kubectl or oc commands to your cluster by configuring your ~/.kube/config file or by using the oc login command.

#### Usage

The installer needs to be run from the llm-d-deployer/quickstart directory as a cluster admin with CLI access to the cluster.

    ./llmd-installer.sh [OPTIONS]
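
Before invoking the installer, the prerequisites listed earlier can be checked in one pass. This is a rough sketch under stated assumptions: `missing_tools` and `preflight` are hypothetical helper names that do not ship with llm-d, and the tool list is simply the one from the Required tools section.

```shell
# Sketch only: these helpers are hypothetical and not part of llm-d-deployer.

# Print, one per line, every given command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check tools, the HF_TOKEN variable, and cluster access before installing.
preflight() {
  missing="$(missing_tools yq jq git helm kustomize kubectl)"
  if [ -n "$missing" ]; then
    # Unquoted on purpose so the names print on a single line.
    echo "missing tools:" $missing
    return 1
  fi
  if [ -z "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is not set"
    return 1
  fi
  # kubectl cluster-info exits non-zero when the API server is unreachable.
  kubectl cluster-info >/dev/null 2>&1 || { echo "cannot reach the cluster"; return 1; }
  echo "preflight ok"
}

# Usage: preflight && ./llmd-installer.sh
```

The installer performs its own checks as well, so this only serves to surface all gaps at once instead of one per run.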

Flags

Examples

Install llm-d on an Existing Kubernetes Cluster

    export HF_TOKEN="your-token"
    ./llmd-installer.sh

What I ran:

    [root@bastion quickstart]# ./llmd-installer.sh
    📂 Setting up script environment...
    kubectl can reach to a running Kubernetes cluster.
    ❌ HF_TOKEN not set; Run: export HF_TOKEN=<...>
    [root@bastion quickstart]#

Note: llm-d does not separate installation from the model, and I think this design is questionable. As I see it, it would be better to finish the installation first and upload the model afterwards.

Install on OpenShift

Before running the installer, ensure you have logged into the cluster as a cluster administrator. For example:

    oc login --token=sha256~yourtoken --server=https://api.yourcluster.com:6443
    export HF_TOKEN="your-token"
    ./llmd-installer.sh

### Validation

The inference-gateway serves as the HTTP ingress point for all inference requests in our deployment. It's implemented as a Kubernetes Gateway (gateway.networking.k8s.io/v1) using either kgateway or istio as the gatewayClassName, and sits in front of your inference pods to handle path-based routing, load balancing, retries, and metrics. This example validates that the gateway itself is routing your completion requests correctly. You can execute the test-request.sh script to test on the cluster.

    # Default options (the model id will be discovered via /v1/models)
    ./test-request.sh

    # Non-default namespace/model
    ./test-request.sh -n <namespace> -m <model> --minikube

If you receive an error indicating PodSecurity "restricted" violations when running the smoke-test script, you need to remove the restrictive PodSecurity labels from the namespace. Once these labels are removed, re-run the script and it should proceed without PodSecurity errors. Run the following command:

    kubectl label namespace <namespace> \
      pod-security.kubernetes.io/warn- \
      pod-security.kubernetes.io/warn-version- \
      pod-security.kubernetes.io/audit- \
      pod-security.kubernetes.io/audit-version-
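
The label removal above can be wrapped so that the namespace is passed once and the current labels are shown first. This is only a sketch: `podsecurity_removal_args` and `relax_podsecurity` are hypothetical helper names, and like the command above it touches only the `warn` and `audit` labels.

```shell
# Sketch only: helper names are hypothetical, not part of llm-d or kubectl.

# Print one label-removal argument per line for each PodSecurity mode given.
podsecurity_removal_args() {
  for mode in "$@"; do
    printf '%s\n' "pod-security.kubernetes.io/${mode}-" \
                  "pod-security.kubernetes.io/${mode}-version-"
  done
}

# Show the namespace's current labels, then drop the warn/audit labels.
relax_podsecurity() {
  ns="$1"
  kubectl get namespace "$ns" --show-labels || return 1
  # Unquoted on purpose: each printed line becomes one kubectl argument.
  kubectl label namespace "$ns" $(podsecurity_removal_args warn audit)
}

# Usage: relax_podsecurity <namespace>
```

Printing the labels beforehand makes it easy to confirm which PodSecurity settings were actually present before re-running test-request.sh.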
