Contents
- Preface
- Goals
- Steps
  - Step 1: Write the Python application (structured log output)
  - Step 2: Build the Docker image
  - Step 3: Deploy the Python application to K8s
  - Step 4: Prepare the PLG (Promtail + Loki + Grafana) offline resources
  - Step 5: Deploy Loki (log storage)
    - Create the namespace
    - Create the Loki ConfigMap
    - Deploy the Loki Deployment
  - Step 6: Deploy Promtail (log collection)
  - Step 7: Deploy Grafana (visualization)
    - Create the Grafana configuration (pre-provisioned Loki data source)
    - Deploy Grafana
  - Step 8: Verify log collection
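Preface
Designing an efficient, reliable, and scalable logging system for a Kubernetes (K8s) cluster is a key part of observability.
In this post I use a sample Python application to walk through designing and deploying a complete log collection system for application containers in a K8s cluster. We use the PLG stack (Promtail + Loki + Grafana), a widely used and lightweight architecture that suits most small to medium-large cloud-native deployments.
Goals
- Write a Python application that emits structured logs.
- Deploy the application to a K8s cluster.
- Deploy the PLG logging stack (Promtail collects → Loki stores → Grafana queries).
- View and filter the application's logs in Grafana.
Steps
Step 1: Write the Python application (structured log output)
Create the file app.py: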
python
# app.py
import json
import logging
import sys
import time
from datetime import datetime, timezone


# Format every log record as a single-line JSON object
class JsonFormatter(logging.Formatter):
    def format(self, record):
        log_entry = {
            # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
            "timestamp": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "service": "my-python-app",  # custom service name
            "pod_name": getattr(record, "pod_name", "unknown"),
        }
        return json.dumps(log_entry)


# Configure the root logger to write JSON lines to stdout
logger = logging.getLogger()
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)  # StreamHandler defaults to stderr
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)

if __name__ == "__main__":
    counter = 0
    while True:
        counter += 1
        logger.info(f"Processing request #{counter}")
        time.sleep(5)
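💡 Note: do not write logs to a file. Emit them on the container's standard streams through StreamHandler (which writes to stderr unless you pass it sys.stdout). K8s captures both stdout and stderr of the container and stores them on the node under
/var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container-name>/*.log.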
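The formatter above reads `record.pod_name`, but nothing in app.py sets that attribute, so the field is always "unknown". One possible way to fill it is a `logging.Filter` that stamps every record. This is only a sketch: `PodNameFilter` is a hypothetical helper, and it assumes the `HOSTNAME` environment variable holds the pod name, which is the Kubernetes default unless the pod overrides its hostname.

```python
import logging
import os


class PodNameFilter(logging.Filter):
    """Attach a pod_name attribute to every record (hypothetical helper)."""

    def __init__(self, pod_name=None):
        super().__init__()
        # Assumption: inside a pod, HOSTNAME normally equals the pod name.
        self.pod_name = pod_name or os.environ.get("HOSTNAME", "unknown")

    def filter(self, record):
        record.pod_name = self.pod_name
        return True  # never drop the record, only enrich it


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(pod_name)s %(message)s")
    for handler in logging.getLogger().handlers:
        handler.addFilter(PodNameFilter())
    logging.info("hello")
```

In app.py this would amount to calling `handler.addFilter(PodNameFilter())` before `logger.addHandler(handler)`. Injecting `metadata.name` through the Downward API is an alternative if the hostname is not reliable in your cluster.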
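Step 2: Build the Docker image
Create the Dockerfile:
dockerfile
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY app.py .
# app.py itself only uses the standard library; gunicorn is not needed to run it
RUN pip install --no-cache-dir gunicorn
CMD ["python", "app.py"]
Build and push the image (assuming you have an Alibaba Cloud ACR registry):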
bash
docker build -t <your-acr-registry>/my-python-app:1.0 .
docker push <your-acr-registry>/my-python-app:1.0
Step 3: Deploy the Python application to K8s
yaml
# app-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-python-app
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-python-app
  template:
    metadata:
      labels:
        app: my-python-app
        # Optional: a log-related label that Promtail can filter on
        log_type: application
    spec:
      containers:
        - name: app
          image: <your-acr-registry>/my-python-app:1.0  # the image pushed in step 2
          ports:
            - containerPort: 80  # app.py does not listen on any port yet
---
apiVersion: v1
kind: Service
metadata:
  name: my-python-app-svc
spec:
  selector:
    app: my-python-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
Apply the manifest:
bash
kubectl apply -f app-deployment.yaml
Check that the logs are being emitted:
bash
kubectl logs -l app=my-python-app --tail=5
You should see output similar to:
json
{"timestamp": "2025-11-27T10:00:00Z", "level": "INFO", "logger": "root", "message": "Processing request #1", "service": "my-python-app", "pod_name": "unknown"}
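`kubectl logs` prints only the payload. On nodes that use a CRI runtime such as containerd, each line in the file under /var/log/pods is additionally prefixed with a timestamp, the stream name, and a partial/full flag (`P` or `F`). The sketch below splits such a line and decodes the JSON payload; `parse_cri_line` is a hypothetical helper and the sample line is made up for illustration.

```python
import json


def parse_cri_line(line):
    """Split one CRI-format log line into (timestamp, stream, flag, payload)."""
    timestamp, stream, flag, payload = line.rstrip("\n").split(" ", 3)
    return timestamp, stream, flag, payload


if __name__ == "__main__":
    # Made-up sample in the "<time> <stream> <P|F> <message>" layout.
    sample = ('2025-11-27T10:00:00.123456789Z stdout F '
              '{"level": "INFO", "message": "Processing request #1"}')
    ts, stream, flag, payload = parse_cri_line(sample)
    entry = json.loads(payload)
    print(stream, flag, entry["level"], entry["message"])
```

Splitting at most three times keeps any spaces inside the JSON payload intact.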
Step 4: Prepare the PLG (Promtail + Loki + Grafana) offline resources
Container images (adjust versions as needed)
| Component | Official image (example) | Recommended version |
|---|---|---|
| Promtail | grafana/promtail:2.9.6 | 2.9.6+ |
| Loki | grafana/loki:2.9.6 | 2.9.6+ |
| Grafana | grafana/grafana:10.2.3 | 10.2+ |
Push the images to a private registry (for example Alibaba Cloud ACR):
bash
# Example: tag and push
docker pull grafana/promtail:2.9.6
docker tag grafana/promtail:2.9.6 <your-acr-registry>/promtail:2.9.6
docker push <your-acr-registry>/promtail:2.9.6
# Repeat for loki and grafana
Step 5: Deploy Loki (log storage)
Create the namespace
bash
kubectl create ns logging
Create the Loki ConfigMap
Create loki-config.yaml:
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-config
  namespace: logging
data:
  loki.yaml: |
    auth_enabled: false
    server:
      http_listen_port: 3100
      grpc_listen_port: 9096
    common:
      path_prefix: /tmp/loki
      replication_factor: 1
      ring:
        kvstore:
          store: inmemory
        instance_addr: 127.0.0.1
    schema_config:
      configs:
        - from: "2020-01-01"
          store: boltdb-shipper
          object_store: filesystem
          schema: v11
          index:
            prefix: index_
            period: 24h
    storage_config:
      boltdb_shipper:
        active_index_directory: /tmp/loki/boltdb-shipper-active
        cache_location: /tmp/loki/boltdb-shipper-cache
        cache_ttl: 24h
        shared_store: filesystem
      filesystem:
        directory: /tmp/loki/chunks
    limits_config:
      reject_old_samples: true
      reject_old_samples_max_age: 168h
    compactor:
      working_directory: /tmp/loki/compactor
      shared_store: filesystem
Deploy the Loki Deployment
Create loki-deployment.yaml:
yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: loki
  namespace: logging
spec:
  replicas: 1
  selector:
    matchLabels:
      app: loki
  template:
    metadata:
      labels:
        app: loki
    spec:
      containers:
        - name: loki
          image: <your-acr-registry>/loki:2.9.6  # replace with your private image
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3100
          args:
            - "-config.file=/etc/loki/loki.yaml"
          volumeMounts:
            - name: config
              mountPath: /etc/loki
            - name: storage
              mountPath: /tmp/loki
      volumes:
        - name: config
          configMap:
            name: loki-config
        - name: storage
          emptyDir: {}  # data is lost when the pod is deleted; use a PVC for anything beyond a demo
---
apiVersion: v1
kind: Service
metadata:
  name: loki
  namespace: logging
spec:
  ports:
    - port: 3100
      targetPort: 3100
  selector:
    app: loki
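Apply the ConfigMap and the Deployment:
bash
kubectl apply -f loki-config.yaml -f loki-deployment.yaml
Check the logs:
bash
kubectl logs -l app=loki -n logging --follow
✅ A healthy startup log should include lines similar to:
level=info msg="Loki started"
level=info msg="boltdb-shipper initialized"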
Step 6: Deploy Promtail (log collection)
Create promtail-config.yaml:
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: promtail-config
  namespace: logging
data:
  promtail.yaml: |
    server:
      http_listen_port: 9080
      grpc_listen_port: 0
    positions:
      filename: /run/promtail/positions.yaml
    clients:
      - url: http://loki.logging.svc.cluster.local:3100/loki/api/v1/push
    scrape_configs:
      - job_name: containers
        static_configs:
          - targets:
              - localhost
            labels:
              job: containers
              # /host is the node's /var/log/pods: <pod-dir>/<container>/<n>.log
              __path__: /host/*/*/*.log
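The `__path__` glob has exactly three levels below /host: the pod directory, the container directory, and the log file. To reason about which files it picks up, the sketch below approximates the match with Python's `PurePosixPath.match`, where `*` also stays within one path segment. Promtail uses its own glob library, so treat this as an approximation; `is_collected` and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath

PATTERN = "/host/*/*/*.log"  # same glob as __path__ in promtail.yaml


def is_collected(path):
    """Return True if path matches the three-level glob."""
    return PurePosixPath(path).match(PATTERN)


if __name__ == "__main__":
    # Hypothetical paths as seen from inside the Promtail container.
    for candidate in (
        "/host/default_my-python-app-abc12_1111-2222/app/0.log",
        "/host/default_my-python-app-abc12_1111-2222/app/0.log.gz",
        "/host/stray.log",
    ):
        print(candidate, is_collected(candidate))
```

Only the first path matches: the second does not end in `.log`, and the third sits at the wrong depth.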
Create promtail-daemonset.yaml:
yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: promtail
  namespace: logging
  labels:
    app: promtail
spec:
  selector:
    matchLabels:
      app: promtail
  template:
    metadata:
      labels:
        app: promtail
    spec:
      serviceAccountName: promtail-sa
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
        # Newer clusters taint control-plane nodes with this key instead of "master"
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      containers:
        - name: promtail
          image: <your-acr-registry>/promtail:2.9.6
          imagePullPolicy: IfNotPresent
          args:
            - "-config.file=/etc/promtail/promtail.yaml"
          volumeMounts:
            - name: config
              mountPath: /etc/promtail
            - name: run
              mountPath: /run/promtail
            - name: host-root
              mountPath: /host
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: promtail-config
        - name: run
          hostPath:
            path: /run/promtail
            type: DirectoryOrCreate
        - name: host-root
          hostPath:
            path: /var/log/pods
            type: Directory
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: promtail-sa
  namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: promtail-cluster-role
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: promtail-cluster-role-binding
subjects:
  - kind: ServiceAccount
    name: promtail-sa
    namespace: logging
roleRef:
  kind: ClusterRole
  name: promtail-cluster-role
  apiGroup: rbac.authorization.k8s.io
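Apply both manifests, then check inside the container that the mounted path exists:
bash
kubectl apply -f promtail-config.yaml -f promtail-daemonset.yaml
kubectl exec -n logging <promtail-pod> -- ls -l /host
Check whether collection has started:
bash
kubectl logs -n logging <promtail-pod>
You should see a line similar to the following (the exact path depends on the node's filesystem layout):
level=info msg="Seeked /host/.lifsea/rootfs/var/log/pods/.../0.log..."
Verify
Forward the Loki port to your machine (the command blocks, so run the query from a second terminal):
bash
kubectl port-forward -n logging svc/loki 3100:3100
Query the log content:
bash
# Fetch the latest 10 entries with job=containers (without start/end, Loki defaults to the last hour)
curl -G \
  --data-urlencode 'query={job="containers"}' \
  --data 'limit=10' \
  --data 'direction=backward' \
  http://localhost:3100/loki/api/v1/query_range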
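The same query can be issued from Python with only the standard library, which is handy for scripted checks. This is a sketch, not a client library: `build_query_url` and `fetch_lines` are hypothetical helpers, and `LOKI_URL` assumes the port-forward shown above is still running.

```python
import json
import urllib.parse
import urllib.request

LOKI_URL = "http://localhost:3100"  # assumes the port-forward above is running


def build_query_url(logql, limit=10, direction="backward", base=LOKI_URL):
    """Build a query_range URL with the same parameters as the curl call."""
    params = urllib.parse.urlencode(
        {"query": logql, "limit": limit, "direction": direction}
    )
    return f"{base}/loki/api/v1/query_range?{params}"


def fetch_lines(logql, **kwargs):
    """Return the raw log lines from a query_range response."""
    url = build_query_url(logql, **kwargs)
    with urllib.request.urlopen(url, timeout=10) as resp:
        body = json.load(resp)
    # Each stream carries a list of [timestamp_ns, line] entries.
    return [value[1] for stream in body["data"]["result"]
            for value in stream["values"]]


if __name__ == "__main__":
    for line in fetch_lines('{job="containers"}'):
        print(line)
```

`urlencode` takes care of escaping the braces and quotes in the LogQL selector, which is what `--data-urlencode` does in the curl version.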
Step 7: Deploy Grafana (visualization)
Create the Grafana configuration (pre-provisioned Loki data source)
Create grafana-datasource.yaml:
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasource
  namespace: logging
data:
  loki.yaml: |
    apiVersion: 1
    datasources:
      - name: Loki
        type: loki
        access: proxy
        url: http://loki.logging.svc.cluster.local:3100
        isDefault: true
Deploy Grafana
Create grafana-deployment.yaml:
yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: logging
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: <your-acr-registry>/grafana:10.2.3
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
          env:
            - name: GF_SECURITY_ADMIN_PASSWORD
              value: "admin123"  # use a Secret in production
          volumeMounts:
            - name: datasource
              mountPath: /etc/grafana/provisioning/datasources
      volumes:
        - name: datasource
          configMap:
            name: grafana-datasource
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: logging
spec:
  type: NodePort  # or LoadBalancer, depending on your network setup
  ports:
    - port: 3000
      targetPort: 3000
  selector:
    app: grafana
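Apply the manifests and look up the access address:
bash
kubectl apply -f grafana-datasource.yaml -f grafana-deployment.yaml
kubectl get svc -n logging grafana
Open http://<node-ip>:<node-port> in a browser, where <node-port> is the NodePort shown in the PORT(S) column of the output (a NodePort Service is not exposed on 3000 itself). Log in as admin with password admin123.
After login the Loki data source should already be provisioned, so you can query logs right away.
Step 8: Verify log collection
- Open Explore in Grafana.
- Select the Loki data source.
- Run the query {job="containers"}.
- You should see logs from containers on every node.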
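The static Promtail config attaches only the `job` label plus the `filename` label that Promtail adds for file targets, so narrowing the view to the Python app has to rely on the file path or on a line filter. The sketch below builds such LogQL queries. `app_query` is a hypothetical helper, and it assumes the pod directory is named `<namespace>_<pod-name>_<uid>` and that the names contain no regex metacharacters.

```python
def app_query(namespace, app, contains=None):
    """Build a LogQL query for one app's pod logs (hypothetical helper)."""
    # Regex matchers in LogQL are fully anchored, hence the leading/trailing ".*".
    selector = '{job="containers", filename=~".*/%s_%s-.*"}' % (namespace, app)
    if contains:
        # Backtick-quoted LogQL strings need no escaping of double quotes.
        return f"{selector} |= `{contains}`"
    return selector


if __name__ == "__main__":
    print(app_query("default", "my-python-app"))
    print(app_query("default", "my-python-app", contains='"level": "INFO"'))
```

Paste the printed query into the Explore query box. Since no pipeline stages are configured, the stored lines are likely the raw node log lines including any runtime prefix, so a `| json` parser may not work on them directly; a plain line filter such as the one above does not depend on that.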
