18.2 k8s-apiserver监控源码解读

本节重点介绍 :

  • k8s代码库和模块地址

    • 下载 apiserver源码
  • apiserver中监控源码阅读

k8s源码地址分布

k8s代码库

  • 访问github上k8s仓库,readme中给出了k8s 模块的代码地址
  • 举例图片

组件仓库列表 地址

Repositories currently staged here:

下载 apiserver 源码

shell 复制代码
go get -d  k8s.io/apiserver

分析apiserver 监控源码

以qps指标 apiserver_request_total为例

定位源码位置

  • 在源码目录全文搜索apiserver_request_total,选择.go文件
  • 举例图片
  • 发现位于 D:\go_path\pkg\mod\k8s.io\apiserver@v0.22.1\pkg\endpoints\metrics\metrics.go
go 复制代码
	requestCounter = compbasemetrics.NewCounterVec(
		&compbasemetrics.CounterOpts{
			Name:           "apiserver_request_total",
			Help:           "Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, and HTTP response code.",
			StabilityLevel: compbasemetrics.STABLE,
		},
		[]string{"verb", "dry_run", "group", "version", "resource", "subresource", "scope", "component", "code"},
	)
  • 分析,在这里定义了指标,并且指定了相关的标签,可以和我们在prometheus中查询到的结果匹配上
shell 复制代码
apiserver_request_total{code="0", component="apiserver", contentType="application/json", group="apiregistration.k8s.io", instance="172.20.70.205:6443", job="kubernetes-apiservers", resource="apiservices", scope="cluster", verb="WATCH", version="v1beta1"}
9014
apiserver_request_total{code="0", component="apiserver", contentType="application/json", group="apps", instance="172.20.70.205:6443", job="kubernetes-apiservers", resource="daemonsets", scope="cluster", verb="WATCH", version="v1"}
8863
apiserver_request_total{code="0", component="apiserver", contentType="application/json", group="apps", instance="172.20.70.205:6443", job="kubernetes-apiservers", resource="deployments", scope="cluster", verb="WATCH", version="v1"}
8830
apiserver_request_total{code="0", component="apiserver", contentType="application/json", group="crd.projectcalico.org", instance="172.20.70.205:6443", job="kubernetes-apiservers", resource="bgpconfigurations", scope="cluster", verb="WATCH", version="v1"}
5575
apiserver_request_total{code="0", component="apiserver", contentType="application/json", group="crd.projectcalico.org", instance="172.20.70.205:6443", job="kubernetes-apiservers", resource="bgppeers", scope="cluster", verb="WATCH", version="v1"}
2791
apiserver_request_total{code="0", component="apiserver", contentType="application/json", group="crd.projectcalico.org", instance="172.20.70.205:6443", job="kubernetes-apiservers", resource="blockaffinities", scope="cluster", verb="WATCH", version="v1"}
2942
apiserver_request_total{code="0", component="apiserver", contentType="application/json", group="crd.projectcalico.org", instance="172.20.70.205:6443", job="kubernetes-apiservers", resource="clusterinformations", scope="cluster", verb="WATCH", version="v1"}
2770

处理请求的metric 函数 MonitorRequest

  • 代码位置 D:\go_path\pkg\mod\k8s.io\apiserver@v0.22.1\pkg\endpoints\metrics\metrics.go
  • 源码如下
go 复制代码
// MonitorRequest handles standard transformations for client and the reported verb and then invokes Monitor to record
// a request. verb must be uppercase to be backwards compatible with existing monitoring tooling.
func MonitorRequest(req *http.Request, verb, group, version, resource, subresource, scope, component string, deprecated bool, removedRelease string, httpCode, respSize int, elapsed time.Duration) {
	// We don't use verb from <requestInfo>, as this may be propagated from
	// InstrumentRouteFunc which is registered in installer.go with predefined
	// list of verbs (different than those translated to RequestInfo).
	// However, we need to tweak it e.g. to differentiate GET from LIST.
	reportedVerb := cleanVerb(CanonicalVerb(strings.ToUpper(req.Method), scope), verb, req)

	dryRun := cleanDryRun(req.URL)
	elapsedSeconds := elapsed.Seconds()
	requestCounter.WithContext(req.Context()).WithLabelValues(reportedVerb, dryRun, group, version, resource, subresource, scope, component, codeToString(httpCode)).Inc()
	// MonitorRequest happens after authentication, so we can trust the username given by the request
	info, ok := request.UserFrom(req.Context())
	if ok && info.GetName() == user.APIServerUser {
		apiSelfRequestCounter.WithContext(req.Context()).WithLabelValues(reportedVerb, resource, subresource).Inc()
	}
	if deprecated {
		deprecatedRequestGauge.WithContext(req.Context()).WithLabelValues(group, version, resource, subresource, removedRelease).Set(1)
		audit.AddAuditAnnotation(req.Context(), deprecatedAnnotationKey, "true")
		if len(removedRelease) > 0 {
			audit.AddAuditAnnotation(req.Context(), removedReleaseAnnotationKey, removedRelease)
		}
	}
	requestLatencies.WithContext(req.Context()).WithLabelValues(reportedVerb, dryRun, group, version, resource, subresource, scope, component).Observe(elapsedSeconds)
	// We are only interested in response sizes of read requests.
	if verb == "GET" || verb == "LIST" {
		responseSizes.WithContext(req.Context()).WithLabelValues(reportedVerb, group, version, resource, subresource, scope, component).Observe(float64(respSize))
	}
}
  • 其中 WithLabelValues设置metric的值 ,Inc()代表counter +1
go 复制代码
requestCounter.WithContext(req.Context()).WithLabelValues(reportedVerb, dryRun, group, version, resource, subresource, scope, component, codeToString(httpCode)).Inc()
  • 这个是k8s在prometheus sdk上做的封装,位置 D:\go_path\pkg\mod\k8s.io\component-base@v0.22.1\metrics\counter.go
go 复制代码
func (v *CounterVec) WithLabelValues(lvs ...string) CounterMetric {
	if !v.IsCreated() {
		return noop // return no-op counter
	}
	if v.LabelValueAllowLists != nil {
		v.LabelValueAllowLists.ConstrainToAllowedList(v.originalLabels, lvs)
	}
	return v.CounterVec.WithLabelValues(lvs...)
}
自己请求的计数
  • 如果userName是 system:apiserver ,那么把代表自身请求的metric apiserver_selfrequest_total +1
go 复制代码
	info, ok := request.UserFrom(req.Context())
	if ok && info.GetName() == user.APIServerUser {
		apiSelfRequestCounter.WithContext(req.Context()).WithLabelValues(reportedVerb, resource, subresource).Inc()
	}
要被废弃的api被请求
  • metric丢弃的设置,apiserver_requested_deprecated_apis代表要被废弃的api被请求了,打印信息
go 复制代码
	if deprecated {
		deprecatedRequestGauge.WithContext(req.Context()).WithLabelValues(group, version, resource, subresource, removedRelease).Set(1)
		audit.AddAuditAnnotation(req.Context(), deprecatedAnnotationKey, "true")
		if len(removedRelease) > 0 {
			audit.AddAuditAnnotation(req.Context(), removedReleaseAnnotationKey, removedRelease)
		}
	}
  • prometheus查询结果
shell 复制代码
apiserver_requested_deprecated_apis{group="apiregistration.k8s.io", instance="172.20.70.205:6443", job="kubernetes-apiservers", removed_release="1.22", resource="apiservices", version="v1beta1"}
1
apiserver_requested_deprecated_apis{group="authorization.k8s.io", instance="172.20.70.205:6443", job="kubernetes-apiservers", removed_release="1.22", resource="subjectaccessreviews", version="v1beta1"}
1
apiserver_requested_deprecated_apis{group="certificates.k8s.io", instance="172.20.70.205:6443", job="kubernetes-apiservers", removed_release="1.22", resource="certificatesigningrequests", version="v1beta1"}
1
apiserver_requested_deprecated_apis{group="extensions", instance="172.20.70.205:6443", job="kubernetes-apiservers", removed_release="1.22", resource="ingresses", version="v1beta1"}
1
apiserver_requested_deprecated_apis{group="networking.k8s.io", instance="172.20.70.205:6443", job="kubernetes-apiservers", removed_release="1.22", resource="ingressclasses", version="v1beta1"}
1
apiserver_requested_deprecated_apis{group="networking.k8s.io", instance="172.20.70.205:6443", job="kubernetes-apiservers", removed_release="1.22", resource="ingresses", version="v1beta1"}
1
apiserver_requested_deprecated_apis{group="scheduling.k8s.io", instance="172.20.70.205:6443", job="kubernetes-apiservers", removed_release="1.22", resource="priorityclasses", version="v1beta1"}
1
设置请求延迟值
  • 代码如下
go 复制代码
requestLatencies.WithContext(req.Context()).WithLabelValues(reportedVerb, dryRun, group, version, resource, subresource, scope, component).Observe(elapsedSeconds)
  • requestLatencies定义了从0.05到60秒的bucket
go 复制代码
	requestLatencies = compbasemetrics.NewHistogramVec(
		&compbasemetrics.HistogramOpts{
			Name: "apiserver_request_duration_seconds",
			Help: "Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component.",
			// This metric is used for verifying api call latencies SLO,
			// as well as tracking regressions in this aspects.
			// Thus we customize buckets significantly, to empower both usecases.
			Buckets: []float64{0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0,
				1.25, 1.5, 1.75, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60},
			StabilityLevel: compbasemetrics.STABLE,
		},
		[]string{"verb", "dry_run", "group", "version", "resource", "subresource", "scope", "component"},
	)
  • 分位值查询
shell 复制代码
histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers"}[5m])) by (verb, le))
  • 结果图片
设置 http响应的大小histogram
  • 设置代码
go 复制代码
	if verb == "GET" || verb == "LIST" {
		responseSizes.WithContext(req.Context()).WithLabelValues(reportedVerb, group, version, resource, subresource, scope, component).Observe(float64(respSize))
	}
  • responseSizes定义了 1KB 到1GB的大小的bucket
go 复制代码
	responseSizes = compbasemetrics.NewHistogramVec(
		&compbasemetrics.HistogramOpts{
			Name: "apiserver_response_sizes",
			Help: "Response size distribution in bytes for each group, version, verb, resource, subresource, scope and component.",
			// Use buckets ranging from 1000 bytes (1KB) to 10^9 bytes (1GB).
			Buckets:        compbasemetrics.ExponentialBuckets(1000, 10.0, 7),
			StabilityLevel: compbasemetrics.ALPHA,
		},
		[]string{"verb", "group", "version", "resource", "subresource", "scope", "component"},
	)

追踪MonitorRequest调用链

InstrumentRouteFunc在prometheus的HandlerFunc基础上封装了k8s的go-restful

go 复制代码
// InstrumentRouteFunc works like Prometheus' InstrumentHandlerFunc but wraps
// the go-restful RouteFunction instead of a HandlerFunc plus some Kubernetes endpoint specific information.
func InstrumentRouteFunc(verb, group, version, resource, subresource, scope, component string, deprecated bool, removedRelease string, routeFunc restful.RouteFunction) restful.RouteFunction {
	return restful.RouteFunction(func(req *restful.Request, response *restful.Response) {
		requestReceivedTimestamp, ok := request.ReceivedTimestampFrom(req.Request.Context())
		if !ok {
			requestReceivedTimestamp = time.Now()
		}

		delegate := &ResponseWriterDelegator{ResponseWriter: response.ResponseWriter}

		//lint:file-ignore SA1019 Keep supporting deprecated http.CloseNotifier
		_, cn := response.ResponseWriter.(http.CloseNotifier)
		_, fl := response.ResponseWriter.(http.Flusher)
		_, hj := response.ResponseWriter.(http.Hijacker)
		var rw http.ResponseWriter
		if cn && fl && hj {
			rw = &fancyResponseWriterDelegator{delegate}
		} else {
			rw = delegate
		}
		response.ResponseWriter = rw

		routeFunc(req, response)

		MonitorRequest(req.Request, verb, group, version, resource, subresource, scope, component, deprecated, removedRelease, delegate.Status(), delegate.ContentLength(), time.Since(requestReceivedTimestamp))
	})
}

registerResourceHandlers 中根据不同的verb处理

  • 位置 D:\go_path\pkg\mod\k8s.io\apiserver@v0.22.1\pkg\endpoints\installer.go
  • LIST的verb代码如下
go 复制代码
		case "LIST": // List all resources of a kind.
			doc := "list objects of kind " + kind
			if isSubresource {
				doc = "list " + subresource + " of objects of kind " + kind
			}
			handler := metrics.InstrumentRouteFunc(action.Verb, group, version, resource, subresource, requestScope, metrics.APIServerComponent, deprecated, removedRelease, restfulListResource(lister, watcher, reqScope, false, a.minRequestTimeout))
			handler = utilwarning.AddWarningsHandler(handler, warnings)
			route := ws.GET(action.Path).To(handler).
				Doc(doc).
				Param(ws.QueryParameter("pretty", "If 'true', then the output is pretty printed.")).
				Operation("list"+namespaced+kind+strings.Title(subresource)+operationSuffix).
				Produces(append(storageMeta.ProducesMIMETypes(action.Verb), allMediaTypes...)...).
				Returns(http.StatusOK, "OK", versionedList).
				Writes(versionedList)
  • 上层被APIInstaller调用
go 复制代码
// Install handlers for API resources.
func (a *APIInstaller) Install() ([]metav1.APIResource, []*storageversion.ResourceInfo, *restful.WebService, []error) {
	var apiResources []metav1.APIResource
	var resourceInfos []*storageversion.ResourceInfo
	var errors []error
	ws := a.newWebService()

	// Register the paths in a deterministic (sorted) order to get a deterministic swagger spec.
	paths := make([]string, len(a.group.Storage))
	var i int = 0
	for path := range a.group.Storage {
		paths[i] = path
		i++
	}
	sort.Strings(paths)
	for _, path := range paths {
		apiResource, resourceInfo, err := a.registerResourceHandlers(path, a.group.Storage[path], ws)
		if err != nil {
			errors = append(errors, fmt.Errorf("error in registering resource: %s, %v", path, err))
		}
		if apiResource != nil {
			apiResources = append(apiResources, *apiResource)
		}
		if resourceInfo != nil {
			resourceInfos = append(resourceInfos, resourceInfo)
		}
	}
	return apiResources, resourceInfos, ws, errors
}

本节重点总结 :

  • k8s代码库和模块地址

    • 下载 apiserver源码
  • apiserver中监控源码阅读

相关推荐
SilentCodeY5 小时前
docker build前耗时太长,不明所以
运维·docker·容器·镜像·dockerfile
A仔不会笑5 小时前
微服务配置管理——动态路由
微服务·云原生·架构
小胖胖吖6 小时前
[CKA]CKA的购买和注册考试券
云原生·容器·kubernetes·cka
华为云开发者联盟6 小时前
《华为云DTSE》期刊免费下载:10个案例读懂云上架构升级策略
ai·云原生·php·元宇宙·dtse
弥琉撒到我7 小时前
微服务SpringSession解析部署使用全流程
spring cloud·微服务·云原生·架构·springsession
多多*7 小时前
OJ在线评测系统 判题机开发 保证Docker容器执行的安全性
运维·docker·容器
oNuoyi8 小时前
解决docker目录内存不足扩容处理
运维·docker·容器
FGGIT10 小时前
使用Docker快速本地部署RSSHub结合内网穿透访问RSS订阅源
运维·docker·容器
豆包MarsCode10 小时前
使用豆包MarsCode 实现高可用扫描工具
大数据·人工智能·python·云原生·容器
li37149089010 小时前
生产k8s 应用容器内存溢出OOMKilled问题处理
docker·容器·kubernetes