First attempt at installing llm-d, an AI model serving platform, on Kubernetes 1.31

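Note:

I followed the official documentation, working around the steps it leaves unclear, and got as far as the final step. The only thing missing is HF_TOKEN, because my Kubernetes cluster cannot reach Hugging Face (HF).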
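To check whether a given host can reach Hugging Face at all, a probe along the following lines could be run on the machine in question. This is only a sketch: the function names `classify_status` and `probe_url` are hypothetical, and it assumes `curl` is installed. A result from the bastion says nothing about the worker nodes, so the probe would need to run wherever the model download actually happens.

```shell
#!/usr/bin/env bash
# Sketch: check whether this host can reach a URL over HTTP(S).
# The function names are hypothetical and not part of llm-d-deployer.

# Map an HTTP status code (as printed by curl's %{http_code}) to a verdict.
classify_status() {
  case "$1" in
    000)     echo "unreachable" ;;            # curl received no HTTP response
    2??|3??) echo "reachable" ;;              # success or redirect
    *)       echo "reachable-but-http-$1" ;;  # host answered with an error code
  esac
}

# Probe a URL with a 10-second timeout and print the verdict.
probe_url() {
  local code
  code="$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$1")" || true
  # Fall back to 000 if curl printed nothing (for example, curl is missing).
  classify_status "${code:-000}"
}
```

For example, `probe_url https://huggingface.co` prints `unreachable` when no HTTP response comes back within the timeout.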

    [root@bastion quickstart]# cat /etc/redhat-release
    Rocky Linux release 9.5 (Blue Onyx)
    [root@bastion quickstart]# kubectl get nodes
    NAME                        STATUS   ROLES           AGE   VERSION
    master01.kcloudonline.com   Ready    control-plane   46h   v1.31.0
    worker01.kcloudonline.com   Ready    <none>          46h   v1.31.0
    worker02.kcloudonline.com   Ready    <none>          46h   v1.31.0
    worker03.kcloudonline.com   Ready    <none>          46h   v1.31.0

### Get the code

Clone the llm-d-deployer repository.

    git clone https://github.com/llm-d/llm-d-deployer.git

Navigate to the quickstart directory.

    cd llm-d-deployer/quickstart

My session:

    [root@bastion software]# dnf install git -y
    [root@bastion software]# mkdir llm-d
    [root@bastion software]# cd llm-d/
    [root@bastion llm-d]# git clone https://github.com/llm-d/llm-d-deployer.git
    [root@bastion llm-d]# cd llm-d-deployer/
    [root@bastion llm-d-deployer]# ls
    chart-dependencies  CONTRIBUTING.md  ct-install.yaml  DCO      LICENSE        Makefile  OWNERS      README.md
    charts              cr.yaml          ct.yaml          helpers  lintconf.yaml  notes     quickstart  REPO_DOCS.md
    [root@bastion llm-d-deployer]# cd quickstart/
    [root@bastion quickstart]# ls
    examples  grafana  grafana-setup.md  infra  install-deps.sh  llmd-installer.sh  metrics-overview.md  README.md  README-minikube.md  test-request.sh

### Required tools

The following prerequisites are required for the installer to work.

- yq (mikefarah) -- installation
- jq -- download & install guide
- git -- installation guide
- Helm -- quick-start install
- Kustomize -- official install docs
- kubectl -- install & setup

You can use the installer script that installs all the required dependencies.
    ./install-deps.sh

I also installed yq and Kustomize by hand:

    # Download and install yq
    sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/local/bin/yq
    # Make it executable
    sudo chmod +x /usr/local/bin/yq
    # Verify the installation
    yq --version

Install Kustomize with the official script (recommended):

    # Download and install the latest Kustomize release
    curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
    # Move kustomize into a directory on the system PATH
    sudo mv kustomize /usr/local/bin/
    # Verify the installation
    kustomize version

My session:

    [root@bastion quickstart]# sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/local/bin/yq
    Resolving release-assets.githubusercontent.com (release-assets.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.109.133, ...
    Connecting to release-assets.githubusercontent.com (release-assets.githubusercontent.com)|185.199.110.133|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 11477176 (11M) [application/octet-stream]
    Saving to: '/usr/local/bin/yq'
    /usr/local/bin/yq 100%[==========>] 10.95M 1002KB/s in 7.1s
    2025-09-26 08:34:22 (1.55 MB/s) - '/usr/local/bin/yq' saved [11477176/11477176]
    [root@bastion quickstart]# sudo chmod +x /usr/local/bin/yq
    [root@bastion quickstart]# yq --version
    yq (https://github.com/mikefarah/yq/) version v4.47.2
    [root@bastion llm-d]# curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
    v5.7.1
    kustomize installed to /software/llm-d/kustomize
    [root@bastion llm-d]# ls
    kustomize  llm-d-deployer
    [root@bastion llm-d]# cp kustomize /usr/local/bin/
    [root@bastion llm-d]# kustomize version
    v5.7.1
    [root@bastion quickstart]# ./install-deps.sh
    Rocky Linux 9 - BaseOS                 2.5 kB/s | 4.1 kB     00:01
    Rocky Linux 9 - AppStream              5.0 kB/s | 4.5 kB     00:00
    Rocky Linux 9 - Extras                 631  B/s | 2.9 kB     00:04
    Dependencies resolved.
    ======================================================================
     Package     Architecture     Version          Repository     Size
    ======================================================================
    Installing:
     make        x86_64           1:4.3-8.el9      baseos         529 k
    Transaction Summary
    ======================================================================
    Install  1 Package
    Total download size: 529 k
    Installed size: 1.6 M
    Downloading Packages:
    make-4.3-8.el9.x86_64.rpm              301 kB/s | 529 kB     00:01
    ----------------------------------------------------------------------
    Total                                  212 kB/s | 529 kB     00:02
    Running transaction check
    Transaction check succeeded.
    Running transaction test
    Transaction test succeeded.
    Running transaction
      Preparing        :                               1/1
      Installing       : make-1:4.3-8.el9.x86_64       1/1
      Running scriptlet: make-1:4.3-8.el9.x86_64       1/1
      Verifying        : make-1:4.3-8.el9.x86_64       1/1
    Installed:
      make-1:4.3-8.el9.x86_64
    Complete!
    Installing yq...
    [root@bastion quickstart]#

### Required credentials and configuration

- llm-d-deployer GitHub repo -- clone here (https://github.com/llm-d/llm-d-deployer.git)
- HuggingFace HF_TOKEN (https://huggingface.co/docs/hub/en/security-tokens) with download access for the model you want to use. By default the sample application will use meta-llama/Llama-3.2-3B-Instruct.

⚠️ Your Hugging Face account must have access to the model you want to use. You may need to visit Hugging Face meta-llama/Llama-3.2-3B-Instruct and accept the usage terms if you have not already done so.

### Target Platforms

Since the llm-d-deployer is based on helm charts, llm-d can be deployed on a variety of Kubernetes platforms.

### llm-d Installation

Only a single installation of llm-d on a cluster is currently supported. In the future, multiple model services will be supported. Until then, uninstall llm-d before reinstalling.

The llm-d-deployer contains all the helm charts necessary to deploy llm-d. To facilitate the installation of the helm charts, the llmd-installer.sh script is provided. This script will populate the necessary manifests in the manifests directory. After this, it will apply all the manifests in order to bring up the cluster.

The llmd-installer.sh script aims to simplify the installation of llm-d using the llm-d-deployer as its main function. It scripts as many of the steps as possible to make the installation process more streamlined.
This includes:

- Installing the GAIE infrastructure
- Creating the namespace with any special configurations
- Creating the pull secret to download the images
- Creating the model service CRDs
- Applying the helm charts
- Deploying the sample app (model service)

It also supports uninstalling the llm-d infrastructure and the sample app.

Before proceeding with the installation, ensure you have completed the prerequisites and are able to issue kubectl or oc commands to your cluster by configuring your ~/.kube/config file or by using the oc login command.

#### Usage

The installer needs to be run from the llm-d-deployer/quickstart directory as a cluster admin with CLI access to the cluster.

    ./llmd-installer.sh [OPTIONS]

Flags

Examples

Install llm-d on an Existing Kubernetes Cluster

    export HF_TOKEN="your-token"
    ./llmd-installer.sh
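Because the installer depends on both the required tools and HF_TOKEN, a small pre-flight check can report what is missing before the installer runs. The following is a minimal sketch, not part of llm-d-deployer: the function names `missing_tools`, `hf_token_present`, and `preflight` are hypothetical, and the tool list is simply copied from the Required tools section.

```shell
#!/usr/bin/env bash
# Pre-flight sketch: report anything the installer would need that is missing.
# The function names are hypothetical and not part of llm-d-deployer.

# Print each named command that cannot be found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Succeed only if HF_TOKEN is set to a non-empty value.
hf_token_present() {
  [ -n "${HF_TOKEN:-}" ]
}

# Run both checks and return non-zero if either one fails.
preflight() {
  local missing
  local status=0
  missing="$(missing_tools yq jq git helm kustomize kubectl)"
  if [ -n "$missing" ]; then
    printf 'missing tools:\n%s\n' "$missing" >&2
    status=1
  fi
  if ! hf_token_present; then
    echo 'HF_TOKEN is not set' >&2
    status=1
  fi
  return "$status"
}
```

With these functions sourced, `preflight && ./llmd-installer.sh` would start the installer only when both checks pass.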

My session:

    [root@bastion quickstart]# ./llmd-installer.sh
    📂 Setting up script environment...
    kubectl can reach to a running Kubernetes cluster.
    ❌ HF_TOKEN not set; Run: export HF_TOKEN=<your_token>
    [root@bastion quickstart]#

Note: the llm-d installation is not decoupled from the model, which I think is a questionable design. As I see it, installing the platform first and uploading the model afterwards would probably be better.

Install on OpenShift

Before running the installer, ensure you have logged into the cluster as a cluster administrator. For example:

    oc login --token=sha256~yourtoken --server=https://api.yourcluster.com:6443
    export HF_TOKEN="your-token"
    ./llmd-installer.sh

### Validation

The inference-gateway serves as the HTTP ingress point for all inference requests in our deployment. It's implemented as a Kubernetes Gateway (gateway.networking.k8s.io/v1) using either kgateway or istio as the gatewayClassName, and sits in front of your inference pods to handle path-based routing, load balancing, retries, and metrics. This example validates that the gateway itself is routing your completion requests correctly. You can execute the test-request.sh script to test on the cluster.

    # Default options (the model id will be discovered via /v1/models)
    ./test-request.sh

    # Non-default namespace/model
    ./test-request.sh -n <namespace> -m <model> --minikube

If you receive an error indicating PodSecurity "restricted" violations when running the smoke-test script, you need to remove the restrictive PodSecurity labels from the namespace. Once these labels are removed, re-run the script and it should proceed without PodSecurity errors. Run the following command:

    kubectl label namespace <namespace> \
      pod-security.kubernetes.io/warn- \
      pod-security.kubernetes.io/warn-version- \
      pod-security.kubernetes.io/audit- \
      pod-security.kubernetes.io/audit-version-
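The label removal above can be wrapped so that the namespace is passed once and the command is printed for review before it is run. This is a sketch under assumptions: the function names `podsecurity_unlabel_args` and `relax_podsecurity` and the `--apply` switch are hypothetical. Only `kubectl label` and its trailing `-` syntax for removing a label come from kubectl itself.

```shell
#!/usr/bin/env bash
# Sketch: build the PodSecurity label-removal command for a given namespace.
# The function names and the --apply switch are hypothetical.

# Print the four label-removal arguments, one per line.
# A trailing "-" after a label key tells kubectl to remove that label.
podsecurity_unlabel_args() {
  local mode
  for mode in warn audit; do
    printf '%s\n' "pod-security.kubernetes.io/${mode}-" \
                  "pod-security.kubernetes.io/${mode}-version-"
  done
}

# Print the full command for a namespace. Pass "--apply" as the second
# argument to execute it instead of printing it.
relax_podsecurity() {
  local ns="${1:-}"
  local action="${2:-}"
  if [ -z "$ns" ]; then
    echo 'usage: relax_podsecurity NAMESPACE [--apply]' >&2
    return 2
  fi
  # Word splitting on the helper output is intentional: one argument per line.
  set -- kubectl label namespace "$ns" $(podsecurity_unlabel_args)
  if [ "$action" = "--apply" ]; then
    "$@"
  else
    echo "$*"
  fi
}
```

Calling `relax_podsecurity my-namespace` (a placeholder name) only prints the command, and adding `--apply` runs it against the cluster that kubectl currently points at.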
