Flink CDC监听数据槽位满位报错
flink服务启动失败,报错原因:
javascript
Caused by: org.postgresql.util.PSQLException: ERROR: all replication slots are in use
Hint: Free one or increase max_replication_slots.
日志:
javascript
类名 :org.apache.flink.runtime.source.coordinator.SourceCoordinator
方法名: lambda$runInEventLoop$10(SourceCoordinator.java:478)
内容 :Uncaught exception in the SplitEnumerator for Source org.apache.flink.util.FlinkRuntimeException: Fail to get or create slot for global stream split, the slot name is 130000fa4e20085ba54852927bad24cd811e10. Due to:
at com.ververica.cdc.connectors.postgres.source.enumerator.PostgresSourceEnumerator.createSlotForGlobalStreamSplit(PostgresSourceEnumerator.java:76)
at com.ververica.cdc.connectors.postgres.source.enumerator.PostgresSourceEnumerator.start(PostgresSourceEnumerator.java:50)
at org.apache.flink.runtime.source.coordinator.SourceCoordinator.lambda$start$1(SourceCoordinator.java:233)
at org.apache.flink.runtime.source.coordinator.SourceCoordinator.lambda$runInEventLoop$10(SourceCoordinator.java:469)
at org.apache.flink.util.ThrowableCatchingRunnable.run(ThrowableCatchingRunnable.java:40)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.postgresql.util.PSQLException: ERROR: all replication slots are in use
Hint: Free one or increase max_replication_slots.
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2676)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2366)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:356)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:496)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:413)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:333)
at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:319)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:295)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:290)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.createReplicationSlot(PostgresReplicationConnection.java:442)
at com.ververica.cdc.connectors.postgres.source.enumerator.PostgresSourceEnumerator.createSlotForGlobalStreamSplit(PostgresSourceEnumerator.java:71)
... 11 more
while {}. Triggering job failover.
排查后问题后发现:flink每一个任务会创建一个slot进行数据解析,重启服务,以前的slot的active为变为false,但不会删除,导致槽位满了,新增slot无法加入,导致服务启动失败
解决方案:手动删除无用槽位。
- 1、查看当前使用的复制槽及其状态
sql
SELECT * FROM pg_replication_slots;
- 2、删除复制槽
sql
// slot_name是你希望删除的复制槽的名称
SELECT * FROM pg_drop_replication_slot('slot_name');