Default configuration
HTTP client
By default, OpenFeign 3.1.1 uses the JDK's built-in HttpURLConnection as its HTTP client.
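Retry strategy
In OpenFeign 3.1.1, spring.cloud.loadbalancer.retry.enabled is on by default (matchIfMissing = true). When Spring Retry is on the classpath and a LoadBalancedRetryFactory bean exists, RetryableFeignBlockingLoadBalancerClient is used as the retrying client. Internally it delegates the HTTP request to a Client, the HTTP client that does the real work. Developers can supply their own; our company currently uses two mainstream clients, okhttp and asynchttpclient.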
@Configuration(proxyBeanMethods = false)
@EnableConfigurationProperties(LoadBalancerClientsProperties.class)
class DefaultFeignLoadBalancerConfiguration {
@Bean
@ConditionalOnMissingBean
@Conditional(OnRetryNotEnabledCondition.class)
public Client feignClient(LoadBalancerClient loadBalancerClient,
LoadBalancerClientFactory loadBalancerClientFactory) {
return new FeignBlockingLoadBalancerClient(new Client.Default(null, null), loadBalancerClient,
loadBalancerClientFactory);
}
@Bean
@ConditionalOnMissingBean
@ConditionalOnClass(name = "org.springframework.retry.support.RetryTemplate")
@ConditionalOnBean(LoadBalancedRetryFactory.class)
@ConditionalOnProperty(value = "spring.cloud.loadbalancer.retry.enabled", havingValue = "true",
matchIfMissing = true)
public Client feignRetryClient(LoadBalancerClient loadBalancerClient,
LoadBalancedRetryFactory loadBalancedRetryFactory, LoadBalancerClientFactory loadBalancerClientFactory) {
return new RetryableFeignBlockingLoadBalancerClient(new Client.Default(null, null), loadBalancerClient,
loadBalancedRetryFactory, loadBalancerClientFactory);
}
}
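Core retry flow
How Spring Retry works
Before going further, a quick look at the basic logic of Spring Retry.
Retrying continues only while both of the following conditions hold; otherwise the retry loop exits.
- canRetry must return true.
- isExhaustedOnly must return false.
Each pass through the loop runs these steps:
- doWithRetry: the logic being retried, implemented by the caller.
- registerThrowable: callback invoked after an attempt throws.
- doOnErrorInterceptors: lets configured interceptors apply custom handling to the exception.
- backOffPolicy.backOff(backOffContext): decides whether to wait for a while before the next attempt.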
while (canRetry(retryPolicy, context) && !context.isExhaustedOnly()) {
try {
if (this.logger.isDebugEnabled()) {
this.logger.debug("Retry: count=" + context.getRetryCount());
}
lastException = null;
return retryCallback.doWithRetry(context);
}
catch (Throwable e) {
lastException = e;
try {
registerThrowable(retryPolicy, state, context, e);
}
catch (Exception ex) {
throw new TerminatedRetryException("Could not register throwable",
ex);
}
finally {
doOnErrorInterceptors(retryCallback, context, e);
}
if (canRetry(retryPolicy, context) && !context.isExhaustedOnly()) {
try {
backOffPolicy.backOff(backOffContext);
}
catch (BackOffInterruptedException ex) {
lastException = e;
// back off was prevented by another thread - fail the retry
if (this.logger.isDebugEnabled()) {
this.logger
.debug("Abort retry because interrupted: count="
+ context.getRetryCount());
}
throw ex;
}
}
if (this.logger.isDebugEnabled()) {
this.logger.debug(
"Checking for rethrow: count=" + context.getRetryCount());
}
if (shouldRethrow(retryPolicy, context, state)) {
if (this.logger.isDebugEnabled()) {
this.logger.debug("Rethrow in retry for policy: count="
+ context.getRetryCount());
}
throw RetryTemplate.<E>wrapIfNecessary(e);
}
}
if (state != null && context.hasAttribute(GLOBAL_STATE)) {
break;
}
}
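The loop above boils down to a few moving parts. The following is a minimal plain-Java sketch of that control flow, not Spring Retry's actual implementation: RetryLoopSketch and its nested Context, Policy and BackOff types are hypothetical names introduced only for illustration, and rethrowing the last exception once retries run out is a simplification.

```java
import java.util.concurrent.Callable;

// Hypothetical, simplified model of the retry loop shown above.
// None of these types are Spring Retry classes.
class RetryLoopSketch {

    // Stand-in for the retry context: a failure counter plus the "exhausted" flag.
    static class Context {
        int retryCount = 0;
        boolean exhaustedOnly = false;
        Throwable lastThrowable;
    }

    // Stand-in for the retry policy: decides whether another attempt is allowed
    // and is told about every failure.
    interface Policy {
        boolean canRetry(Context context);
        void registerThrowable(Context context, Throwable t);
    }

    // Stand-in for the back-off policy: how long to wait before the next attempt.
    interface BackOff {
        void backOff() throws InterruptedException;
    }

    static <T> T execute(Callable<T> callback, Policy policy, BackOff backOff) throws Exception {
        Context context = new Context();
        Exception last = null;
        while (policy.canRetry(context) && !context.exhaustedOnly) {
            try {
                return callback.call();                 // doWithRetry
            } catch (Exception e) {
                last = e;
                policy.registerThrowable(context, e);   // registerThrowable
                if (policy.canRetry(context) && !context.exhaustedOnly) {
                    backOff.backOff();                  // wait (or not) before retrying
                }
            }
        }
        if (last != null) {
            throw last; // retries used up: surface the last failure
        }
        throw new IllegalStateException("policy did not allow a first attempt");
    }

    // Example policy: allows at most maxAttempts calls in total.
    static Policy maxAttempts(final int maxAttempts) {
        return new Policy() {
            public boolean canRetry(Context context) {
                return context.retryCount < maxAttempts;
            }
            public void registerThrowable(Context context, Throwable t) {
                context.retryCount++;
                context.lastThrowable = t;
            }
        };
    }
}
```

Passing a no-op BackOff corresponds to retrying immediately; a BackOff that sleeps would add a delay between attempts.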
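Retry client
As noted above, OpenFeign 3.1.1 uses RetryableFeignBlockingLoadBalancerClient as the retrying client by default.
RetryableFeignBlockingLoadBalancerClient
The retry client implements Client and uses delegation to forward each request to the real delegate, which issues the actual HTTP request. Spring Retry applies the retry policy around that call.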
public Response execute(Request request, Request.Options options) throws IOException {
final URI originalUri = URI.create(request.url());
String serviceId = originalUri.getHost();
Assert.state(serviceId != null, "Request URI does not contain a valid hostname: " + originalUri);
// 1. Build the retry policy
final LoadBalancedRetryPolicy retryPolicy = loadBalancedRetryFactory.createRetryPolicy(serviceId,
loadBalancerClient);
// 2. Build the retry template
RetryTemplate retryTemplate = buildRetryTemplate(serviceId, request, retryPolicy);
return retryTemplate.execute(context -> {
Request feignRequest = null;
ServiceInstance retrievedServiceInstance = null;
Set<LoadBalancerLifecycle> supportedLifecycleProcessors = LoadBalancerLifecycleValidator
.getSupportedLifecycleProcessors(
loadBalancerClientFactory.getInstances(serviceId, LoadBalancerLifecycle.class),
RetryableRequestContext.class, ResponseData.class, ServiceInstance.class);
String hint = getHint(serviceId);
DefaultRequest<RetryableRequestContext> lbRequest = new DefaultRequest<>(
new RetryableRequestContext(null, buildRequestData(request), hint));
// On retries the policy will choose the server and set it in the context
// and extract the server and update the request being made
if (context instanceof LoadBalancedRetryContext) {
LoadBalancedRetryContext lbContext = (LoadBalancedRetryContext) context;
ServiceInstance serviceInstance = lbContext.getServiceInstance();
if (serviceInstance == null) {
if (LOG.isDebugEnabled()) {
LOG.debug("Service instance retrieved from LoadBalancedRetryContext: was null. "
+ "Reattempting service instance selection");
}
ServiceInstance previousServiceInstance = lbContext.getPreviousServiceInstance();
lbRequest.getContext().setPreviousServiceInstance(previousServiceInstance);
supportedLifecycleProcessors.forEach(lifecycle -> lifecycle.onStart(lbRequest));
retrievedServiceInstance = loadBalancerClient.choose(serviceId, lbRequest);
if (LOG.isDebugEnabled()) {
LOG.debug(String.format("Selected service instance: %s", retrievedServiceInstance));
}
lbContext.setServiceInstance(retrievedServiceInstance);
}
if (retrievedServiceInstance == null) {
if (LOG.isWarnEnabled()) {
LOG.warn("Service instance was not resolved, executing the original request");
}
org.springframework.cloud.client.loadbalancer.Response<ServiceInstance> lbResponse = new DefaultResponse(
retrievedServiceInstance);
supportedLifecycleProcessors.forEach(lifecycle -> lifecycle
.onComplete(new CompletionContext<ResponseData, ServiceInstance, RetryableRequestContext>(
CompletionContext.Status.DISCARD, lbRequest, lbResponse)));
feignRequest = request;
}
else {
if (LOG.isDebugEnabled()) {
LOG.debug(String.format("Using service instance from LoadBalancedRetryContext: %s",
retrievedServiceInstance));
}
String reconstructedUrl = loadBalancerClient.reconstructURI(retrievedServiceInstance, originalUri)
.toString();
feignRequest = buildRequest(request, reconstructedUrl);
}
}
org.springframework.cloud.client.loadbalancer.Response<ServiceInstance> lbResponse = new DefaultResponse(
retrievedServiceInstance);
Response response = LoadBalancerUtils.executeWithLoadBalancerLifecycleProcessing(delegate, options,
feignRequest, lbRequest, lbResponse, supportedLifecycleProcessors,
retrievedServiceInstance != null);
int responseStatus = response.status();
if (retryPolicy != null && retryPolicy.retryableStatusCode(responseStatus)) {
if (LOG.isDebugEnabled()) {
LOG.debug(String.format("Retrying on status code: %d", responseStatus));
}
byte[] byteArray = response.body() == null ? new byte[] {}
: StreamUtils.copyToByteArray(response.body().asInputStream());
response.close();
throw new LoadBalancerResponseStatusCodeException(serviceId, response, byteArray,
URI.create(request.url()));
}
return response;
}, new LoadBalancedRecoveryCallback<Response, Response>() {
@Override
protected Response createResponse(Response response, URI uri) {
return response;
}
});
}
- Build the retry policy. Because this is the loadbalancer code path, BlockingLoadBalancedRetryFactory#createRetryPolicy is called and returns a BlockingLoadBalancedRetryPolicy.
@Override
public LoadBalancedRetryPolicy createRetryPolicy(String serviceId, ServiceInstanceChooser serviceInstanceChooser) {
return new BlockingLoadBalancedRetryPolicy(loadBalancerFactory.getProperties(serviceId));
}
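- Build the retry template, RetryTemplate:
  - Back-off policy: taken from createBackOffPolicy. When that returns null the template falls back to NoBackOffPolicy, the Spring Retry back-off that retries immediately after a failure with no delay.
  - Retry listeners: these do very little here. Spring Retry's default listeners are returned, so this part of the source can be skipped.
  - Retry policy: the BlockingLoadBalancedRetryPolicy created above is wrapped in an InterceptorRetryPolicy, which delegates to it to actually carry out the retry policy.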
private RetryTemplate buildRetryTemplate(String serviceId, Request request, LoadBalancedRetryPolicy retryPolicy) {
RetryTemplate retryTemplate = new RetryTemplate();
// a. Back-off policy: configures the delay between retries
BackOffPolicy backOffPolicy = this.loadBalancedRetryFactory.createBackOffPolicy(serviceId);
retryTemplate.setBackOffPolicy(backOffPolicy == null ? new NoBackOffPolicy() : backOffPolicy);
// b. Build the retry listeners
RetryListener[] retryListeners = this.loadBalancedRetryFactory.createRetryListeners(serviceId);
if (retryListeners != null && retryListeners.length != 0) {
retryTemplate.setListeners(retryListeners);
}
// c. Retry policy: how many times to retry and how
retryTemplate.setRetryPolicy(retryPolicy == null ? new NeverRetryPolicy()
: new InterceptorRetryPolicy(toHttpRequest(request), retryPolicy, loadBalancerClient, serviceId));
return retryTemplate;
}
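To make the delegation concrete, here is a hedged plain-Java sketch of the pattern that execute follows: a wrapper hands each call to a delegate and turns a retryable status code into an exception, so the retry logic treats it as a failure. HttpCall, RetryingCall and the fixed attempt limit are assumptions made for illustration; the real class relies on RetryTemplate and LoadBalancedRetryPolicy rather than a simple counter.

```java
import java.util.Set;

// Hypothetical stand-in for feign.Client: returns only a status code.
interface HttpCall {
    int execute(String url) throws Exception;
}

// Hypothetical wrapper illustrating delegation plus status-code based retry.
class RetryingCall implements HttpCall {
    private final HttpCall delegate;                 // the client that does the real work
    private final Set<Integer> retryableStatusCodes; // statuses treated as failures
    private final int maxAttempts;

    RetryingCall(HttpCall delegate, Set<Integer> retryableStatusCodes, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.delegate = delegate;
        this.retryableStatusCodes = retryableStatusCodes;
        this.maxAttempts = maxAttempts;
    }

    @Override
    public int execute(String url) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                int status = delegate.execute(url);
                if (retryableStatusCodes.contains(status)) {
                    // As in the source above, a retryable status is raised as an exception
                    throw new IllegalStateException("retryable status " + status);
                }
                return status;
            } catch (Exception e) {
                last = e; // remember the failure; the loop retries if attempts remain
            }
        }
        throw last;
    }
}
```

The wrapper never performs I/O itself, which is why any HTTP client can be swapped in as the delegate without touching the retry logic.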
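Next, how loadbalancer actually retries and how configuration drives it. Two loadbalancer configuration parameters matter here: maxRetriesOnSameServiceInstance and maxRetriesOnNextServiceInstance.
- maxRetriesOnSameServiceInstance: the maximum number of retries on the current instance.
- maxRetriesOnNextServiceInstance: the maximum number of retries on a new instance. The "new" instance may be the very one that just failed; that depends on the choose method of the load-balancing algorithm. The algorithms loadbalancer currently provides pick instances without tracking earlier failures, so a failing instance can be returned again. This outcome is analysed below.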
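Why a "new" instance can turn out to be the failing one is easy to see in a simplified round-robin model. The sketch below is an illustration under assumptions, not the RoundRobinLoadBalancer source: RoundRobinSketch and the string instance names are hypothetical. The chooser only advances a shared counter and knows nothing about which instance a caller used before.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified round-robin chooser shared by all requests.
class RoundRobinSketch {
    private final AtomicInteger position = new AtomicInteger(0);

    // Picks the next instance purely from the shared counter; it has no memory
    // of which instance a given caller used or which one just failed.
    String choose(List<String> instances) {
        if (instances.isEmpty()) {
            return null;
        }
        // The mask keeps the index non-negative after integer overflow.
        int pos = position.incrementAndGet() & Integer.MAX_VALUE;
        return instances.get(pos % instances.size());
    }
}
```

With two instances, if an unrelated request calls choose between a failed attempt and its retry, the counter has advanced twice and the retry lands on the same instance that just failed.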
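Retry interceptor
The key parts are the canRetry and registerThrowable methods of InterceptorRetryPolicy:
- getRetryCount() (// 1 in the code) returns how many retries have happened. On the first attempt it is always 0, and load balancing has not run yet, so serviceInstance is null.
- If the call throws (// 2 in the code), registerThrowable of BlockingLoadBalancedRetryPolicy is invoked.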
@Override
public boolean canRetry(RetryContext context) {
LoadBalancedRetryContext lbContext = (LoadBalancedRetryContext) context;
// 1
if (lbContext.getRetryCount() == 0 && lbContext.getServiceInstance() == null) {
lbContext.setServiceInstance(null);
return true;
}
return policy.canRetryNextServer(lbContext);
}
@Override
public RetryContext open(RetryContext parent) {
return new LoadBalancedRetryContext(parent, request);
}
@Override
public void close(RetryContext context) {
policy.close((LoadBalancedRetryContext) context);
}
// 2
@Override
public void registerThrowable(RetryContext context, Throwable throwable) {
LoadBalancedRetryContext lbContext = (LoadBalancedRetryContext) context;
// this is important as it registers the last exception in the context and also
// increases the retry count
lbContext.registerThrowable(throwable);
// let the policy know about the exception as well
policy.registerThrowable(lbContext, throwable);
}
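What the methods of BlockingLoadBalancedRetryPolicy do:
- canRetrySameServer: checks the retry limit for the instance that just failed.
- canRetryNextServer: checks the retry limit for fetching a new instance.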
public class BlockingLoadBalancedRetryPolicy implements LoadBalancedRetryPolicy {
private final LoadBalancerProperties properties;
private int sameServerCount = 0;
private int nextServerCount = 0;
public BlockingLoadBalancedRetryPolicy(LoadBalancerProperties properties) {
this.properties = properties;
}
public boolean canRetry(LoadBalancedRetryContext context) {
HttpMethod method = context.getRequest().getMethod();
return HttpMethod.GET.equals(method) || properties.getRetry().isRetryOnAllOperations();
}
@Override
public boolean canRetrySameServer(LoadBalancedRetryContext context) {
return sameServerCount < properties.getRetry().getMaxRetriesOnSameServiceInstance() && canRetry(context);
}
@Override
public boolean canRetryNextServer(LoadBalancedRetryContext context) {
// After the failure, we increment first and then check, hence the equality check
return nextServerCount <= properties.getRetry().getMaxRetriesOnNextServiceInstance() && canRetry(context);
}
@Override
public void close(LoadBalancedRetryContext context) {
}
@Override
public void registerThrowable(LoadBalancedRetryContext context, Throwable throwable) {
if (!canRetrySameServer(context) && canRetry(context)) {
// Reset same server since we are moving to a new ServiceInstance
sameServerCount = 0;
nextServerCount++;
if (!canRetryNextServer(context)) {
context.setExhaustedOnly();
}
else {
// We want the service instance to be set by
// `RetryLoadBalancerInterceptor`
// in order to get the entire data of the request
context.setServiceInstance(null);
}
}
else {
sameServerCount++;
}
}
@Override
public boolean retryableStatusCode(int statusCode) {
return properties.getRetry().getRetryableStatusCodes().contains(statusCode);
}
}
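When a request fails, registerThrowable is called. There are two cases:
- Retrying on the failing instance is configured, for example maxRetriesOnSameServiceInstance = 3. While same-instance retries remain, context.setServiceInstance(null) is not executed and the instance stays non-null, so each new call keeps using the serviceInstance obtained for the failed request.
- maxRetriesOnSameServiceInstance = 0. As long as retries on the next instance remain, context.setServiceInstance(null) is always executed, so when doWithRetry runs a new serviceInstance is fetched from the load balancer.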
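The two cases can be checked with a small simulation. The sketch below is a hypothetical model of the counters in BlockingLoadBalancedRetryPolicy combined with the first-attempt check in InterceptorRetryPolicy. It assumes every attempt fails and the request is always eligible for retry (for example a GET); RetryBudgetSketch and its field names are made up for illustration.

```java
// Hypothetical model of the retry counters, for a request that always fails.
class RetryBudgetSketch {
    final int maxSame; // models maxRetriesOnSameServiceInstance
    final int maxNext; // models maxRetriesOnNextServiceInstance
    int sameServerCount = 0;
    int nextServerCount = 0;
    int retryCount = 0;
    boolean exhausted = false;
    boolean instanceChosen = false; // models serviceInstance != null

    int attempts = 0;   // how many calls were made
    int selections = 0; // how many times an instance was (re)selected

    RetryBudgetSketch(int maxSame, int maxNext) {
        this.maxSame = maxSame;
        this.maxNext = maxNext;
    }

    boolean canRetrySameServer() {
        return sameServerCount < maxSame;
    }

    boolean canRetryNextServer() {
        return nextServerCount <= maxNext;
    }

    boolean canRetry() {
        if (retryCount == 0 && !instanceChosen) {
            return true; // the first attempt is always allowed
        }
        return canRetryNextServer();
    }

    void registerThrowable() {
        retryCount++;
        if (!canRetrySameServer()) {
            sameServerCount = 0;
            nextServerCount++;
            if (!canRetryNextServer()) {
                exhausted = true;       // no budget left: stop retrying
            } else {
                instanceChosen = false; // force a new selection on the next attempt
            }
        } else {
            sameServerCount++;
        }
    }

    // Runs the loop with every attempt failing and returns this object for inspection.
    RetryBudgetSketch runAllFailing() {
        while (canRetry() && !exhausted) {
            if (!instanceChosen) {
                selections++;
                instanceChosen = true;
            }
            attempts++;
            registerThrowable();
        }
        return this;
    }
}
```

In this model, maxSame = 3 with maxNext = 0 makes 4 calls on a single selected instance, while maxSame = 0 with maxNext = 1 makes 2 calls and selects an instance before each one. In general the model makes (maxSame + 1) * (maxNext + 1) calls in total.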

Load balancer
loadBalancerClient.choose obtains a serviceInstance from the load balancer.
BlockingLoadBalancerClient#choose looks up the ReactiveLoadBalancer that the developer configured for the service and lets it pick the instance.
public <T> ServiceInstance choose(String serviceId, Request<T> request) {
ReactiveLoadBalancer<ServiceInstance> loadBalancer = loadBalancerClientFactory.getInstance(serviceId);
if (loadBalancer == null) {
return null;
}
Response<ServiceInstance> loadBalancerResponse = Mono.from(loadBalancer.choose(request)).block();
if (loadBalancerResponse == null) {
return null;
}
return loadBalancerResponse.getServer();
}

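Take round-robin as an example. In RoundRobinLoadBalancer#choose:
- The instance list delivered by the registry is obtained through ServiceInstanceListSupplier. This is a good place for logging or monitoring: it is where nacos and loadbalancer meet, and printing the service list fetched from nacos often shows whether a problem lies in the nacos service list or in loadbalancer's instance matching.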
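One way to add that logging is a decorator around the supplier. The sketch below shows the idea in plain Java only: LoggingInstanceSupplier is a hypothetical class built on java.util.function.Supplier, not a Spring Cloud type, and wiring an equivalent wrapper around the real ServiceInstanceListSupplier bean is left out.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical decorator: reports the instance list every time it is fetched,
// then hands it on unchanged.
class LoggingInstanceSupplier implements Supplier<List<String>> {
    private final String serviceId;
    private final Supplier<List<String>> delegate; // e.g. the registry-backed supplier
    private final Consumer<String> log;            // where the report goes

    LoggingInstanceSupplier(String serviceId, Supplier<List<String>> delegate, Consumer<String> log) {
        this.serviceId = serviceId;
        this.delegate = delegate;
        this.log = log;
    }

    @Override
    public List<String> get() {
        List<String> instances = delegate.get();
        // Record exactly what the registry handed to the load balancer
        log.accept("service=" + serviceId + " instances=" + instances);
        return instances;
    }
}
```

Comparing the logged list with what the registry console shows helps separate a stale or empty registry list from a selection problem inside the load balancer.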
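Connecting Nacos to loadbalancer
The main classes to look at:
- DiscoveryClient: nacos's NacosDiscoveryClient implements DiscoveryClient to plug into Spring Cloud's service discovery.
- CompositeDiscoveryClientAutoConfiguration: part of Spring Cloud's service discovery abstraction; it builds a CompositeDiscoveryClient and marks it as @Primary.
- BlockingSupportConfiguration in LoadBalancerClientConfiguration: injects the CompositeDiscoveryClient as the delegate of DiscoveryClientServiceInstanceListSupplier, where it serves as the service discovery client.
DiscoveryClientServiceInstanceListSupplier calls getInstances on CompositeDiscoveryClient to fetch the service list from the service discovery client (nacos).
Nacos implements DiscoveryClient, the service registration and discovery contract provided by Spring Cloud.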

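Loadbalancer's DiscoveryClientServiceInstanceListSupplier, which implements the ServiceInstanceListSupplier interface, calls NacosDiscoveryClient and caches the service list in serviceInstances for the load balancer to use.
Through ServiceInstanceListSupplier you can get the full list of service instances that nacos has synced to loadbalancer.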
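The chain from registry to load balancer can be pictured with a small plain-Java model. Everything here is hypothetical (DiscoverySketch, CompositeDiscoverySketch, DiscoveryBackedSupplierSketch), and the rule that the composite returns the first non-empty answer is an assumption of this sketch rather than a statement about the Spring Cloud source.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a discovery client (for example the Nacos one).
interface DiscoverySketch {
    List<String> getInstances(String serviceId);
}

// Hypothetical composite: asks each delegate in order and returns the first
// non-empty answer (an assumption of this sketch).
class CompositeDiscoverySketch implements DiscoverySketch {
    private final List<DiscoverySketch> delegates;

    CompositeDiscoverySketch(List<DiscoverySketch> delegates) {
        this.delegates = delegates;
    }

    @Override
    public List<String> getInstances(String serviceId) {
        for (DiscoverySketch delegate : delegates) {
            List<String> instances = delegate.getInstances(serviceId);
            if (instances != null && !instances.isEmpty()) {
                return instances;
            }
        }
        return Collections.emptyList();
    }
}

// Hypothetical supplier: the load balancer's view of the instance list.
class DiscoveryBackedSupplierSketch {
    private final DiscoverySketch discovery;
    private final String serviceId;

    DiscoveryBackedSupplierSketch(DiscoverySketch discovery, String serviceId) {
        this.discovery = discovery;
        this.serviceId = serviceId;
    }

    // Fetches the current list for this service from the discovery layer.
    List<String> get() {
        return discovery.getInstances(serviceId);
    }
}
```

The load balancer only ever sees the supplier, so replacing the registry means replacing a delegate inside the composite and nothing else.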
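The logic below takes serviceInstances from DiscoveryClientServiceInstanceListSupplier and hands the list to the load-balancing algorithm, which decides which instance receives the request.
supplier.get(request) returns a Flux; calling next on it fetches the service list from nacos.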
public Mono<Response<ServiceInstance>> choose(Request request) {
ServiceInstanceListSupplier supplier = serviceInstanceListSupplierProvider
.getIfAvailable(NoopServiceInstanceListSupplier::new);
return supplier.get(request).next()
.map(serviceInstances -> processInstanceResponse(supplier, serviceInstances));
}
private Response<ServiceInstance> processInstanceResponse(ServiceInstanceListSupplier supplier,
List<ServiceInstance> serviceInstances) {
Response<ServiceInstance> serviceInstanceResponse = getInstanceResponse(serviceInstances);
if (supplier instanceof SelectedInstanceCallback && serviceInstanceResponse.hasServer()) {
((SelectedInstanceCallback) supplier).selectedServiceInstance(serviceInstanceResponse.getServer());
}
return serviceInstanceResponse;
}
Addendum
