org.apache.flink.util.FlinkException: Could not stop with a savepoint job
Problem description
------------------------------------------------------------
The program finished with the following exception:
org.apache.flink.util.FlinkException: Could not stop with a savepoint job "e139a2eba7f8dc0b07fab65e84421ee4".
at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:581)
at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1002)
at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:569)
at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1069)
at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
Caused by: java.util.concurrent.TimeoutException
at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:579)
... 6 more
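Check the JobManager (JM) log to see what caused this savepoint to fail. If the savepoint timed out, find out which task was slow to complete it. Note that a savepoint can be slower than a checkpoint.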
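To make the "which task was slow" step concrete, here is a minimal sketch in Python. It assumes you have already collected per-task statistics for the failed savepoint (for example from the checkpoint page of the Flink Web UI) and reduced them to a list of dicts. The key names `name`, `num_subtasks`, `num_acknowledged_subtasks` and `end_to_end_duration` are assumptions chosen for illustration, and the sample values are made up, not taken from the job above.

```python
def slowest_tasks(task_stats, top_n=3):
    """Rank tasks so the most likely culprits come first.

    Tasks that still have unacknowledged subtasks are listed before
    fully acknowledged ones; within each group, longer durations first.
    """
    def sort_key(stat):
        pending = stat["num_subtasks"] - stat["num_acknowledged_subtasks"]
        return (pending > 0, stat["end_to_end_duration"])

    return sorted(task_stats, key=sort_key, reverse=True)[:top_n]


# Hypothetical sample data (durations in milliseconds).
sample = [
    {"name": "Source", "num_subtasks": 4,
     "num_acknowledged_subtasks": 4, "end_to_end_duration": 1200},
    {"name": "Window", "num_subtasks": 4,
     "num_acknowledged_subtasks": 2, "end_to_end_duration": 58000},
    {"name": "Sink", "num_subtasks": 4,
     "num_acknowledged_subtasks": 4, "end_to_end_duration": 9000},
]

for stat in slowest_tasks(sample):
    pending = stat["num_subtasks"] - stat["num_acknowledged_subtasks"]
    print(stat["name"], stat["end_to_end_duration"], "pending:", pending)
```

A task that never acknowledged all of its subtasks is usually the first place to look, which is why the sketch sorts on that before duration.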