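Problem
The problem: a single application mixes Spring Boot (Java) and Grails (Groovy). For regular Spring controllers, Micrometer + Pushgateway collects the http.server.requests metric as expected. Note that the metric name below uses dots (please ignore that the endpoint URIs in the screenshot below are not the ones from the screenshot above).
On the Prometheus page, the metric name has been converted to underscores and has gained the suffix _seconds_sum.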
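The renaming described above (dots become underscores, then a unit and a statistic are appended) can be sketched in a few lines. This is only a toy model for illustration: the class `PrometheusNameSketch` and its method `toSeriesName` are hypothetical names, not Micrometer's actual naming convention, and the rule is deliberately simplified.

```java
/** Toy model of the renaming; not Micrometer's actual naming convention class. */
final class PrometheusNameSketch {

    /**
     * Builds a Prometheus-style series name from a dotted meter name,
     * a base unit (e.g. "seconds") and a statistic (e.g. "sum").
     */
    static String toSeriesName(String meterName, String baseUnit, String statistic) {
        // Replace every character that is not a letter, digit, or underscore (such as '.') with '_'.
        String sanitized = meterName.replaceAll("[^a-zA-Z0-9_]", "_");
        return sanitized + "_" + baseUnit + "_" + statistic;
    }

    public static void main(String[] args) {
        System.out.println(toSeriesName("http.server.requests", "seconds", "sum"));
        System.out.println(toSeriesName("http.server.requests", "seconds", "count"));
    }
}
```

A timer typically shows up in Prometheus as several series (for example `_seconds_count` and `_seconds_sum`), which is why a single dotted meter name maps to more than one underscore name.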
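Why is no http_server_requests data collected for Grails UrlMappings and controllers? (Please ignore that the screenshot below is from a different application.)
Source Code Analysis
At first, the only thing I knew about was the MeterRegistry.registerMeterIfNecessary method. Setting a breakpoint there and debugging does hit the breakpoint:
As the screenshot above shows, the uri in the tags is always root. In other words, every endpoint seen in screenshot 4 above is recorded as root, and only the method tag differs.
Why does it become root?
The only option is to debug with breakpoints.
Breakpoint debugging presupposes familiarity with the framework code. Think about it: if you do not know the call hierarchy, where would you set the breakpoint?
How do you get familiar with the code? Spend time on it, or keep asking ChatGPT, DeepSeek, or GitHub Copilot.
In any case, here is the cause directly.
The relevant methods of the WebMvcMetricsFilter class are: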
java
@Override
protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain filterChain)
        throws ServletException, IOException {
    TimingContext timingContext = TimingContext.get(request);
    if (timingContext == null) {
        timingContext = startAndAttachTimingContext(request);
    }
    try {
        filterChain.doFilter(request, response);
        if (!request.isAsyncStarted()) {
            // Only record when async processing has finished or never been started.
            // If async was started by something further down the chain we wait until the second filter invocation (but we'll be using the TimingContext that was attached to the first)
            Throwable exception = fetchException(request);
            record(timingContext, request, response, exception);
        }
    } catch (Exception ex) {
        response.setStatus(HttpStatus.INTERNAL_SERVER_ERROR.value());
        record(timingContext, request, response, unwrapNestedServletException(ex));
        throw ex;
    }
}

private void record(TimingContext timingContext, HttpServletRequest request, HttpServletResponse response,
        Throwable exception) {
    try {
        Object handler = getHandler(request);
        Set<Timed> annotations = getTimedAnnotations(handler);
        Timer.Sample timerSample = timingContext.getTimerSample();
        AutoTimer.apply(this.autoTimer, this.metricName, annotations,
                (builder) -> timerSample.stop(getTimer(builder, handler, request, response, exception)));
    }
    catch (Exception ex) {
        logger.warn("Failed to record timer metrics", ex);
        // Allow request-response exchange to continue, unaffected by metrics problem
    }
}

private Timer getTimer(Builder builder, Object handler, HttpServletRequest request, HttpServletResponse response,
        Throwable exception) {
    return builder.description("Duration of HTTP server request handling")
            .tags(this.tagsProvider.getTags(request, response, handler, exception))
            .register(this.registry);
}
The relevant methods of the DefaultWebMvcTagsProvider class are:
java
@Override
public Iterable<Tag> getTags(HttpServletRequest request, HttpServletResponse response, Object handler,
        Throwable exception) {
    Tags tags = Tags.of(WebMvcTags.method(request), WebMvcTags.uri(request, response, this.ignoreTrailingSlash),
            WebMvcTags.exception(exception), WebMvcTags.status(response), WebMvcTags.outcome(response));
    for (WebMvcTagsContributor contributor : this.contributors) {
        tags = tags.and(contributor.getTags(request, response, handler, exception));
    }
    return tags;
}
The relevant methods of the WebMvcTags class are:
java
// This is the line we ultimately want to locate.
private static final Tag URI_ROOT = Tag.of("uri", "root");

public static Tag uri(HttpServletRequest request, HttpServletResponse response, boolean ignoreTrailingSlash) {
    if (request != null) {
        String pattern = getMatchingPattern(request);
        if (pattern != null) {
            if (ignoreTrailingSlash && pattern.length() > 1) {
                pattern = TRAILING_SLASH_PATTERN.matcher(pattern).replaceAll("");
            }
            if (pattern.isEmpty()) {
                return URI_ROOT;
            }
            return Tag.of("uri", pattern);
        }
        if (response != null) {
            HttpStatus status = extractStatus(response);
            if (status != null) {
                if (status.is3xxRedirection()) {
                    return URI_REDIRECTION;
                }
                if (status == HttpStatus.NOT_FOUND) {
                    return URI_NOT_FOUND;
                }
            }
        }
        String pathInfo = getPathInfo(request);
        if (pathInfo.isEmpty()) {
            return URI_ROOT;
        }
    }
    return URI_UNKNOWN;
}

private static String getPathInfo(HttpServletRequest request) {
    String pathInfo = request.getPathInfo();
    String uri = StringUtils.hasText(pathInfo) ? pathInfo : "/";
    uri = MULTIPLE_SLASH_PATTERN.matcher(uri).replaceAll("/");
    return TRAILING_SLASH_PATTERN.matcher(uri).replaceAll("");
}

private static String getMatchingPattern(HttpServletRequest request) {
    PathPattern dataRestPathPattern = (PathPattern) request.getAttribute(DATA_REST_PATH_PATTERN_ATTRIBUTE);
    if (dataRestPathPattern != null) {
        return dataRestPathPattern.getPatternString();
    }
    return (String) request.getAttribute(HandlerMapping.BEST_MATCHING_PATTERN_ATTRIBUTE);
}
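As the screenshot below shows, the HttpServletRequest instance has no pathInfo field at all:
and
Why does the code reach the getPathInfo method? Because getMatchingPattern returns null.
A regular Spring Boot controller endpoint does yield a pattern:
But for a Groovy controller endpoint under Grails, pattern is null:
Looking further at the getMatchingPattern method:
It tries to read two keys from the request; both lookups fail and return null: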
org.springframework.data.rest.webmvc.RepositoryRestHandlerMapping.EFFECTIVE_REPOSITORY_RESOURCE_LOOKUP_PATH
org.springframework.web.servlet.HandlerMapping.bestMatchingPattern
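To summarize: the doFilterInternal method of WebMvcMetricsFilter (the Spring Boot Actuator filter) calls its internal record method, which calls getTimer, which calls DefaultWebMvcTagsProvider.getTags, which in turn calls WebMvcTags.uri. That method calls getMatchingPattern, which cannot find the endpoint's URI pattern, so execution falls through to the internal getPathInfo method. HttpServletRequest.getPathInfo also returns null, so the path defaults to "/" and is then stripped to an empty string. As a result, the tag finally recorded is private static final Tag URI_ROOT = Tag.of("uri", "root");
Without knowing how the framework works, a global search for the keyword root would never lead you to the URI_ROOT field of the WebMvcTags class.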
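The fallback summarized above can be reproduced without a servlet container. The sketch below is a simplified model, not the Spring Boot code itself: `UriTagSketch` and `uriTag` are hypothetical names, the request is reduced to an attribute map plus a pathInfo string, the two regular expressions are assumed to mirror the ones used by WebMvcTags, and the Spring Data REST lookup, the redirect / not-found branches, and the ignoreTrailingSlash handling are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Simplified, servlet-free model of the uri-tag fallback; the real logic lives in WebMvcTags. */
final class UriTagSketch {

    // Attribute key that Spring MVC sets on the request once a handler mapping has matched.
    static final String BEST_MATCHING_PATTERN =
            "org.springframework.web.servlet.HandlerMapping.bestMatchingPattern";

    // Assumption: these mirror MULTIPLE_SLASH_PATTERN and TRAILING_SLASH_PATTERN in WebMvcTags.
    private static final Pattern MULTIPLE_SLASH = Pattern.compile("//+");
    private static final Pattern TRAILING_SLASH = Pattern.compile("/$");

    /** Returns the value the "uri" tag would get for the given request state. */
    static String uriTag(Map<String, Object> requestAttributes, String pathInfo) {
        Object pattern = requestAttributes.get(BEST_MATCHING_PATTERN);
        if (pattern != null) {
            String value = pattern.toString();
            return value.isEmpty() ? "root" : value;
        }
        // No pattern attribute: fall back to pathInfo, defaulting to "/" when it is null or blank.
        String uri = (pathInfo != null && !pathInfo.trim().isEmpty()) ? pathInfo : "/";
        uri = MULTIPLE_SLASH.matcher(uri).replaceAll("/");
        uri = TRAILING_SLASH.matcher(uri).replaceAll("");
        // "/" collapses to "", which is reported as "root"; anything else is "UNKNOWN".
        return uri.isEmpty() ? "root" : "UNKNOWN";
    }

    public static void main(String[] args) {
        Map<String, Object> springMvc = new HashMap<>();
        springMvc.put(BEST_MATCHING_PATTERN, "/users/{id}");
        System.out.println(uriTag(springMvc, null));       // Spring controller: pattern attribute present
        System.out.println(uriTag(new HashMap<>(), null)); // Grails case: no attribute, null pathInfo
    }
}
```

The second call models the Grails request seen in the debugger: neither attribute is set and pathInfo is null, so the only remaining outcome is the root tag.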
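Custom Metric Collection
Since Micrometer's collection of http.server.requests does not work properly under Grails, tools such as DeepSeek suggested recording a custom metric instead.
The following snippet is what DeepSeek provided: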
java
// Same package as FilterConfig, which references this package-private class.
package com.johnny.config;

import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;

import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;

/**
 * @author johnny
 */
// No @Component here: FilterConfig registers this filter explicitly, and a
// component-scanned bean with the same name would duplicate that registration.
class CustomMetricsFilter implements Filter {

    private final MeterRegistry meterRegistry;

    CustomMetricsFilter(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        HttpServletRequest httpRequest = (HttpServletRequest) request;
        HttpServletResponse httpResponse = (HttpServletResponse) response;
        // Start timing
        Timer.Sample sample = Timer.start(meterRegistry);
        try {
            // Continue processing the request
            chain.doFilter(request, response);
        } finally {
            // Stop timing and record the metric.
            // DeepSeek's version used the custom name http.server.requests.custom.
            sample.stop(meterRegistry.timer("http.server.requests",
                    "method", httpRequest.getMethod(),
                    "uri", httpRequest.getRequestURI(),
                    "status", String.valueOf(httpResponse.getStatus())
            ));
        }
    }
}
The FilterConfig configuration class:
java
package com.johnny.config;

import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.Ordered;

/**
 * @author johnny
 */
@Configuration
public class FilterConfig {

    @Bean
    public FilterRegistrationBean<CustomMetricsFilter> customMetricsFilter(MeterRegistry meterRegistry) {
        FilterRegistrationBean<CustomMetricsFilter> bean = new FilterRegistrationBean<>();
        bean.setFilter(new CustomMetricsFilter(meterRegistry));
        bean.addUrlPatterns("/*");
        bean.setOrder(Ordered.HIGHEST_PRECEDENCE);
        return bean;
    }
}
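One difference from the built-in filter is worth keeping in mind: the custom filter tags each sample with the raw request URI rather than a route pattern, so URLs that embed IDs produce one time series per ID. A minimal sketch of collapsing numeric path segments before tagging is shown below. It is an optional illustration only: `UriNormalizerSketch`, `normalize`, and the `{id}` placeholder are hypothetical, and whether numeric segments are the right thing to collapse depends on the application's URL scheme.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that collapses purely numeric path segments into a placeholder. */
final class UriNormalizerSketch {

    // Matches "/123" only when the digits make up the whole segment (followed by "/" or the end).
    private static final Pattern NUMERIC_SEGMENT = Pattern.compile("/\\d+(?=/|$)");

    static String normalize(String requestUri) {
        if (requestUri == null || requestUri.isEmpty()) {
            return "/";
        }
        return NUMERIC_SEGMENT.matcher(requestUri).replaceAll("/{id}");
    }

    public static void main(String[] args) {
        System.out.println(normalize("/book/show/42"));
        System.out.println(normalize("/book/list"));
    }
}
```

In the filter, the "uri" tag value would then be `UriNormalizerSketch.normalize(httpRequest.getRequestURI())` instead of the raw URI.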
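I did not use the custom name. Instead, I used the metric name I actually want to push, http.server.requests. Breakpoint debugging shows that the code above does run, yet the endpoints I requested do not appear on the Prometheus page. In other words, why can it not override the default metric of the same name?
After some analysis, the cause appears to lie in the register method of Timer.Builder: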
java
/**
 * Add the timer to a single registry, or return an existing timer in that
 * registry. The returned timer will be unique for each registry, but each
 * registry is guaranteed to only create one timer for the same combination of
 * name and tags.
 * @param registry A registry to add the timer to, if it doesn't already exist.
 * @return A new or existing timer.
 */
public Timer register(MeterRegistry registry) {
    // the base unit for a timer will be determined by the monitoring system
    // implementation
    return registry.timer(new Meter.Id(name, tags, null, description, Type.TIMER),
            distributionConfigBuilder.build(),
            pauseDetector == null ? registry.config().pauseDetector() : pauseDetector);
}
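My guess is that, for an already existing metric name such as http.server.requests, the existing timer is simply returned and is not replaced by a new one.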
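The "new or existing timer" contract quoted in the Javadoc can be modelled with a small get-or-create map. The sketch below is a toy model only, not Micrometer's implementation: `GetOrCreateRegistrySketch` and `ToyTimer` are hypothetical names, and unlike a real registry this toy treats the tag list as order-sensitive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Toy registry: one timer per unique (name, tags) combination. */
final class GetOrCreateRegistrySketch {

    /** Stand-in for a real timer; it only counts recordings. */
    static final class ToyTimer {
        final AtomicLong count = new AtomicLong();
    }

    private final Map<List<String>, ToyTimer> timers = new ConcurrentHashMap<>();

    /** Returns the existing timer for this name and tag list, or creates one if absent. */
    ToyTimer timer(String name, String... tags) {
        List<String> id = new ArrayList<>();
        id.add(name);
        id.addAll(Arrays.asList(tags));
        return timers.computeIfAbsent(id, key -> new ToyTimer());
    }

    int size() {
        return timers.size();
    }

    public static void main(String[] args) {
        GetOrCreateRegistrySketch registry = new GetOrCreateRegistrySketch();
        ToyTimer first = registry.timer("http.server.requests", "uri", "root");
        ToyTimer second = registry.timer("http.server.requests", "uri", "root");
        ToyTimer other = registry.timer("http.server.requests", "uri", "/book/show");
        System.out.println(first == second); // same name and tags: same instance
        System.out.println(first == other);  // different tags: a separate timer
    }
}
```

Under this model, registering a timer with an existing name never overwrites anything: identical name and tags yield the same instance, while a different tag set adds another timer next to the existing ones.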
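Since the code above can be stepped through in the debugger, the logic itself seems fine. To verify further, I switched to the custom metric name http.server.requests.custom.
Open http://localhost:8867/actuator/metrics in a browser:
As the figure above shows, besides the http.server.requests metric collected by default, there is now also a custom http.server.requests.custom entry.
In Prometheus, query the new custom metric with the PromQL: http_server_requests_custom_seconds_sum{job="agent-document"}
There is indeed data.
Here is the problem: I want to query this on the Grafana page, and the query naturally spans all applications, so a custom name used by only one application does not fit.
The answer DeepSeek gave: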
java
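// Remove the default http.server.requests metric.
// Note: find(...).timer() returns null when no timer matches.
meterRegistry.remove(meterRegistry.find("http.server.requests").tags().timer());
// Stop timing and record the metric
// (code omitted)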
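This does solve the problem.
However, if no requests arrive for a while, the component's built-in default http.server.requests metric still overwrites the one I push.
The code periodically pushes data to Prometheus through Pushgateway (where it has already been stored), and Grafana can query that data, so perhaps being overwritten is not a problem after all?
On the other hand, removing the metric with meterRegistry.remove() and then immediately recording meterRegistry.timer("http.server.requests") again feels odd.
So can the default http.server.requests metric be disabled?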
Grails
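Under Grails, HttpServletRequest is wrapped in various ways that are not obvious from the outside.
The main one is the following:
and GrailsDispatcherServlet:
Seeing this many Grails JARs above is enough to drive anyone crazy.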
Disabling the Default Metric
yml
management:
  metrics:
    enable:
      http.server.requests: false
      http: false
Neither http: false nor http.server.requests: false suppresses Micrometer's default http.server.requests metric.
The configuration that actually disables it is:
yml
management:
  metrics:
    web:
      server:
        request:
          autotime:
            enabled: false
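Restart the application, request http://localhost:8867/actuator/metrics, then call any other endpoint. The default http.server.requests metric no longer appears, which means it has been disabled.
Solution
The final solution: disable the default metric, and add CustomMetricsFilter together with the FilterConfig configuration class.
Final Words
If this post reads as reasonably well organized, please do not assume that the troubleshooting itself was equally orderly.
In reality, because I was unfamiliar with Micrometer's source code, a good deal of time was wasted while tracking the problem down.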
References
- GitHub Copilot
- DeepSeek
- ChatGPT