1. Problem Description
When using Spring Batch partitioning (partition) with a skip policy configured, we found that if the transaction throws an exception while writing data, the framework automatically opens a new transaction and retries once.
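For context, here is a minimal sketch of the kind of worker-step configuration that exhibits this behavior, using Spring Batch's Java builder API. MyItem, the bean names, and the numbers are placeholders, not taken from the original project (the XML equivalent is skip-policy / skippable-exception-classes on the chunk element):

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Minimal sketch; MyItem and all names/numbers are placeholders.
@Configuration
public class WorkerStepConfig {

    @Bean
    public Step workerStep(StepBuilderFactory steps,
                           ItemReader<MyItem> reader,
                           ItemWriter<MyItem> writer) {
        return steps.get("workerStep")
                .<MyItem, MyItem>chunk(100)      // chunkSize: records per transaction
                .reader(reader)
                .writer(writer)
                .faultTolerant()                 // step now uses FaultTolerantChunkProcessor
                .skip(RuntimeException.class)    // the skippable exception class
                .skipLimit(10)                   // the skip limit
                .build();
    }
}
```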
2. Root Cause
1. When the tasklet executes, the chunkProcessor used when a skip policy is configured and the one used when it is not are not the same instance; the choice is wired up once, when the service starts (see the schematic sketch below).
2. A chunkProcessor with a skip policy configured will always retry at least once, no matter which exceptions are declared skippable or how large the skip limit is, because that is simply how the code is implemented; a source-code walkthrough follows.
With a skip policy configured: (screenshot)
Without a skip policy: (screenshot)
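The divergence happens at step-building time, not at run time. A schematic sketch of that decision (a paraphrase of what the step builders do, not the literal Spring Batch source; skipPolicyConfigured is a stand-in for the builder's internal state):

```java
// Schematic paraphrase: a plain chunk step wires a SimpleChunkProcessor, while a
// fault-tolerant step wires a FaultTolerantChunkProcessor. Both are created once,
// at startup, when the step bean is built.
ChunkProcessor<I> chunkProcessor;
if (skipPolicyConfigured) {                        // e.g. .faultTolerant().skip(...)
    chunkProcessor = new FaultTolerantChunkProcessor<>(
            getProcessor(), getWriter(), batchRetryTemplate);  // retry/skip aware
}
else {
    chunkProcessor = new SimpleChunkProcessor<>(
            getProcessor(), getWriter());          // exceptions propagate immediately
}
```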
3. Source-Code Analysis of the chunkProcessor with a Skip Policy
3.1 A Brief Introduction to partition Usage
1. partition is essentially a data splitter layered on top of the reader and writer. You implement org.springframework.batch.core.partition.support.Partitioner and override its partition method; gridSize bounds how many groups the data is split into. The Partitioner can be custom-written or the default one can be used (a sketch follows this list).
2. In addition, the chunk-completion-policy on the chunk configuration controls how many records one transaction processes, i.e. the chunkSize.
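A minimal custom Partitioner sketch (the class name and record count are assumptions for illustration): it slices an ID range into gridSize execution contexts, each of which drives one worker step execution.

```java
import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

// Minimal sketch: split the ID range [0, total) into gridSize slices; each slice
// becomes one worker step execution with its own ExecutionContext.
public class RangePartitioner implements Partitioner {

    private final long total = 1_000_000L;   // assumed total record count

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> partitions = new HashMap<>();
        long sliceSize = (total + gridSize - 1) / gridSize;   // ceiling division
        for (int i = 0; i < gridSize; i++) {
            ExecutionContext ctx = new ExecutionContext();
            ctx.putLong("minId", i * sliceSize);
            ctx.putLong("maxId", Math.min((i + 1) * sliceSize, total));
            partitions.put("partition" + i, ctx);
        }
        return partitions;
    }
}
```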
3.2 Source-Code Analysis of FaultTolerantChunkProcessor
Once the data has been partitioned, execution reaches the RetryTemplate.doExecute method, shown below:
```java
protected <T, E extends Throwable> T doExecute(RetryCallback<T, E> retryCallback,
        RecoveryCallback<T> recoveryCallback, RetryState state)
        throws E, ExhaustedRetryException {

    RetryPolicy retryPolicy = this.retryPolicy;
    BackOffPolicy backOffPolicy = this.backOffPolicy;

    // Allow the retry policy to initialise itself...
    RetryContext context = open(retryPolicy, state);
    if (this.logger.isTraceEnabled()) {
        this.logger.trace("RetryContext retrieved: " + context);
    }

    // Make sure the context is available globally for clients who need
    // it...
    RetrySynchronizationManager.register(context);

    Throwable lastException = null;
    boolean exhausted = false;
    try {
        // Give clients a chance to enhance the context...
        boolean running = doOpenInterceptors(retryCallback, context);
        if (!running) {
            throw new TerminatedRetryException(
                    "Retry terminated abnormally by interceptor before first attempt");
        }

        // Get or Start the backoff context...
        BackOffContext backOffContext = null;
        Object resource = context.getAttribute("backOffContext");
        if (resource instanceof BackOffContext) {
            backOffContext = (BackOffContext) resource;
        }
        if (backOffContext == null) {
            backOffContext = backOffPolicy.start(context);
            if (backOffContext != null) {
                context.setAttribute("backOffContext", backOffContext);
            }
        }

        /*
         * We allow the whole loop to be skipped if the policy or context already
         * forbid the first try. This is used in the case of external retry to allow a
         * recovery in handleRetryExhausted without the callback processing (which
         * would throw an exception).
         */
        while (canRetry(retryPolicy, context) && !context.isExhaustedOnly()) {
            try {
                if (this.logger.isDebugEnabled()) {
                    this.logger.debug("Retry: count=" + context.getRetryCount());
                }
                // Reset the last exception, so if we are successful
                // the close interceptors will not think we failed...
                lastException = null;
                return retryCallback.doWithRetry(context);
            }
            catch (Throwable e) {
                lastException = e;
                try {
                    registerThrowable(retryPolicy, state, context, e);
                }
                catch (Exception ex) {
                    throw new TerminatedRetryException("Could not register throwable",
                            ex);
                }
                finally {
                    doOnErrorInterceptors(retryCallback, context, e);
                }
                if (canRetry(retryPolicy, context) && !context.isExhaustedOnly()) {
                    try {
                        backOffPolicy.backOff(backOffContext);
                    }
                    catch (BackOffInterruptedException ex) {
                        lastException = e;
                        // back off was prevented by another thread - fail the retry
                        if (this.logger.isDebugEnabled()) {
                            this.logger
                                    .debug("Abort retry because interrupted: count="
                                            + context.getRetryCount());
                        }
                        throw ex;
                    }
                }
                if (this.logger.isDebugEnabled()) {
                    this.logger.debug(
                            "Checking for rethrow: count=" + context.getRetryCount());
                }
                if (shouldRethrow(retryPolicy, context, state)) {
                    if (this.logger.isDebugEnabled()) {
                        this.logger.debug("Rethrow in retry for policy: count="
                                + context.getRetryCount());
                    }
                    throw RetryTemplate.<E>wrapIfNecessary(e);
                }
            }

            /*
             * A stateful attempt that can retry may rethrow the exception before now,
             * but if we get this far in a stateful retry there's a reason for it,
             * like a circuit breaker or a rollback classifier.
             */
            if (state != null && context.hasAttribute(GLOBAL_STATE)) {
                break;
            }
        }

        if (state == null && this.logger.isDebugEnabled()) {
            this.logger.debug(
                    "Retry failed last attempt: count=" + context.getRetryCount());
        }
        exhausted = true;
        return handleRetryExhausted(recoveryCallback, context, state);
    }
    catch (Throwable e) {
        throw RetryTemplate.<E>wrapIfNecessary(e);
    }
    finally {
        close(retryPolicy, context, state, lastException == null || exhausted);
        doCloseInterceptors(retryCallback, context, lastException);
        RetrySynchronizationManager.clear();
    }
}
```
On the first pass over the data, execution takes the line **return retryCallback.doWithRetry(context);**. This invokes the doWithRetry method of the retryCallback defined in FaultTolerantChunkProcessor's write method; the doWrite call inside it is what ultimately invokes our consumer's consumeData method.
That first call throws an exception, but the exception is caught, so RetryTemplate.doExecute keeps going and reaches **return handleRetryExhausted(recoveryCallback, context, state);**, which invokes the recover method of the recoveryCallback defined in the same write method.
Inside recover, the scan method contains the second invocation of the writer.
When the second call also fails, the exception is caught again, but the loop then makes one more pass with a freshly built chunk whose inputs and outputs are both null; this pass is effectively a no-op and returns at a later check.
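This first-attempt-then-recover flow can be reproduced in isolation with a plain RetryTemplate (a self-contained sketch, not Spring Batch code): with a policy allowing a single attempt, the callback runs once, and on failure doExecute falls through to handleRetryExhausted, which invokes the recovery callback, mirroring write's retryCallback and recoveryCallback pair.

```java
import org.springframework.retry.RecoveryCallback;
import org.springframework.retry.RetryCallback;
import org.springframework.retry.policy.SimpleRetryPolicy;
import org.springframework.retry.support.RetryTemplate;

public class RetryFlowDemo {

    public static void main(String[] args) {
        RetryTemplate template = new RetryTemplate();
        // maxAttempts = 1: the callback gets exactly one try, like the writer path
        template.setRetryPolicy(new SimpleRetryPolicy(1));

        RetryCallback<String, IllegalStateException> work = ctx -> {
            System.out.println("first attempt (retryCallback), count=" + ctx.getRetryCount());
            throw new IllegalStateException("writer failed");   // simulated write failure
        };
        RecoveryCallback<String> recover = ctx -> {
            // reached via handleRetryExhausted -- the analogue of the scan/recover path
            System.out.println("recover (recoveryCallback), count=" + ctx.getRetryCount());
            return "recovered";
        };

        System.out.println(template.execute(work, recover));   // prints "recovered"
    }
}
```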
Several flags are consulted along the way, for example:
1. The scanning attribute of the UserData inner class in FaultTolerantChunkProcessor records whether the chunk's user data has already been scanned: true means it has been scanned, false means it has not yet been scanned.
2. In org.springframework.batch.core.step.item.Chunk, the busy attribute is a boolean indicating whether the chunk is currently being built: true means it is still being built, false means building has completed.
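Chunk and its busy flag are public API, so the second flag can be observed directly (a toy illustration, unrelated to any real job):

```java
import org.springframework.batch.core.step.item.Chunk;

// Toy illustration of the Chunk busy flag: true while the chunk is being
// assembled, false once it is complete and ready to be written.
public class ChunkBusyDemo {

    public static void main(String[] args) {
        Chunk<String> chunk = new Chunk<>();
        chunk.setBusy(true);            // building started
        chunk.add("record-1");
        chunk.add("record-2");
        chunk.setBusy(false);           // building finished
        System.out.println("busy=" + chunk.isBusy() + ", items=" + chunk.getItems());
    }
}
```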
4. Conclusion
When using partition, as soon as a skip policy is configured, any exception thrown on the first write attempt causes the chunk to be reprocessed in a new transaction. To avoid this, remove the skip policy; a failure during data processing then propagates immediately and the batch job stops.
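Concretely, under the same assumed names as the sketch in section 1, dropping the fault-tolerant settings restores fail-fast behavior:

```java
// Same worker step as the earlier sketch, minus the skip policy: a writer
// exception now propagates and fails the step on the first attempt.
return steps.get("workerStep")
        .<MyItem, MyItem>chunk(100)
        .reader(reader)
        .writer(writer)
        .build();               // no faultTolerant()/skip(...): no second transaction
```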
Breakpoint screenshots: (not reproduced here)