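I. Overview
DataKit offers a declarative way to configure container log collection through a Kubernetes Custom Resource Definition (CRD). By creating ClusterLoggingConfig resources, users have DataKit's log collection configured automatically, with no need to edit DataKit configuration files by hand, restart DataKit, or restart the business workloads.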
II. Prerequisites
- Kubernetes cluster version 1.16+
- DataKit Version-1.84.0 or later
- Cluster administrator privileges (required to register the CRD)
III. Collection workflow
1. Register the Kubernetes CRD
- Register the ClusterLoggingConfig CRD with the following YAML:
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: clusterloggingconfigs.logging.datakits.io
      labels:
        app: datakit-logging-config
        version: v1alpha1
    spec:
      group: logging.datakits.io
      versions:
        - name: v1alpha1
          served: true
          storage: true
          schema:
            openAPIV3Schema:
              type: object
              properties:
                apiVersion:
                  type: string
                kind:
                  type: string
                metadata:
                  type: object
                spec:
                  type: object
                  required:
                    - selector
                  properties:
                    selector:
                      type: object
                      properties:
                        namespaceRegex:
                          type: string
                        podRegex:
                          type: string
                        podLabelSelector:
                          type: string
                        containerRegex:
                          type: string
                    podTargetLabels:
                      type: array
                      items:
                        type: string
                    configs:
                      type: array
                      items:
                        type: object
                        required:
                          - source
                          - type
                        properties:
                          source:
                            type: string
                          type:
                            type: string
                          disable:
                            type: boolean
                          path:
                            type: string
                          multiline_match:
                            type: string
                          pipeline:
                            type: string
                          storage_index:
                            type: string
                          tags:
                            type: object
                            additionalProperties:
                              type: string
      scope: Cluster
      names:
        plural: clusterloggingconfigs
        singular: clusterloggingconfig
        kind: ClusterLoggingConfig
        shortNames:
          - logging
- Create the CRD by applying the manifest (the YAML above, saved as clusterloggingconfig-crd.yaml):
    kubectl apply -f clusterloggingconfig-crd.yaml
- Verify that the CRD is registered:
    kubectl get crd clusterloggingconfigs.logging.datakits.io

2. Create the ClusterLoggingConfig resource
- The business application is deployed with the following YAML:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: deploy-demo
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: demo
      template:
        metadata:
          labels:
            app: demo
            version: v1.0
            enviroment: test
        spec:
          containers:
            - name: container-demo
              image: swr.cn-north-4.myhuaweicloud.com/liurui_bj/springboot-server:openj8
              resources:
                limits:
                  cpu: 250m
                  memory: 512Mi
                requests:
                  cpu: 250m
                  memory: 512Mi
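- Once deployed, the workload runs in Kubernetes as follows:

- The matching collection configuration is shown below. It collects both the in-container log file and the container's standard output for the demo workload in the default namespace. The in-container log uses the custom source name demo-file, and the standard output uses the custom source name demo-std. For more configuration options, see the reference link.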
    apiVersion: logging.datakits.io/v1alpha1
    kind: ClusterLoggingConfig
    metadata:
      name: demo-logs
    spec:
      selector:
        namespaceRegex: "^(default)$"
        podRegex: "^(deploy.*)$"
        podLabelSelector: "app=demo"
      podTargetLabels:
        - app
        - version
        - enviroment
      configs:
        - source: "demo-file"
          type: "file"
          path: "/data/logs/server/server.log"
          tags:
            log_type: "server"
            component: "springboot-server"
        - source: "demo-std"
          type: "stdout"
          disable: false
          tags:
            log_type: "server"
            component: "springboot-server"
- Apply the configuration (saved here as logging-config.yaml):
    kubectl apply -f logging-config.yaml
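
The CRD schema registered in step 1 also declares containerRegex, multiline_match, pipeline, and storage_index, which the demo configuration does not use. The following is a minimal sketch of how they might be combined. The resource name, the pipeline script name, and the index name are hypothetical, and the meaning attached to each field in the comments is inferred from the field names rather than confirmed by this article.

```yaml
apiVersion: logging.datakits.io/v1alpha1
kind: ClusterLoggingConfig
metadata:
  name: demo-logs-extended            # hypothetical resource name
spec:
  selector:
    namespaceRegex: "^(default)$"
    podLabelSelector: "app=demo"
    # Assumed: restricts collection to containers whose name matches this regex.
    containerRegex: "^(container-demo)$"
  configs:
    - source: "demo-file"
      type: "file"
      path: "/data/logs/server/server.log"
      # Assumed: regex matching the first line of a multi-line record,
      # here a line that starts with a date such as 2024-01-31.
      multiline_match: '^\d{4}-\d{2}-\d{2}'
      pipeline: "demo-file.p"         # hypothetical pipeline script name
      storage_index: "demo_index"     # hypothetical index name
      tags:
        log_type: "server"
```

Only source and type are required for each entry under configs, so any of the optional fields above can be left out.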

3. Add the RBAC configuration
- Add the following rule to DataKit's ClusterRole:
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: datakit
    rules:
      # Other existing permissions
      - apiGroups: ["logging.datakits.io"]
        resources: ["clusterloggingconfigs"]
        verbs: ["get", "list", "watch"]
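- Adjust DataKit's other collection settings as needed, such as the log allowlist, the enabled collectors, and the DataWay reporting address, then re-apply the DataKit manifest:
    kubectl apply -f datakit.yaml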

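4. Additional configuration and notes
- To disable standard-output log collection globally, apply an additional ClusterLoggingConfig resource such as the following: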
    apiVersion: logging.datakits.io/v1alpha1
    kind: ClusterLoggingConfig
    metadata:
      name: test
    spec:
      selector:
        namespaceRegex: "^(.*)$"
      configs:
        - source: "test"
          type: "stdout"
          disable: true
- DataKit must have the container collector enabled; otherwise the CRD configuration does not take effect.
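
As a sketch of what enabling the container collector can look like, the fragment below assumes that DataKit runs as a DaemonSet and that its default collectors are set through the ENV_DEFAULT_ENABLED_INPUTS environment variable in datakit.yaml. The other collector names in the list are illustrative; the relevant part is that container is included.

```yaml
# Fragment of the DataKit DaemonSet container spec in datakit.yaml (assumed layout).
env:
  - name: ENV_DEFAULT_ENABLED_INPUTS
    # Comma-separated list of collectors enabled by default.
    # "container" must be present for ClusterLoggingConfig resources to take effect.
    value: cpu,disk,diskio,mem,swap,system,hostobject,net,host_processes,container
```

After changing this value, re-apply datakit.yaml as in step 3 so the DaemonSet picks it up.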
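IV. Viewing the collected logs
- The in-container logs are shown in the figure below. The data is reported to 观测云 successfully, including the configured fields source, log_type, and component.

- The container's standard output is shown in the figure below. The data is reported to 观测云 successfully, including the configured fields source, log_type, and component.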
