Flink SQL Job Out-of-Memory Error

Exception

  • Exception details
java.lang.OutOfMemoryError: Java heap space
    at java.base/java.util.Arrays.copyOf(Arrays.java:3537)
    at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:100)
    at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:130)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.client.RequestConverters.bulk(RequestConverters.java:252)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.client.RestHighLevelClient$$Lambda$1735/0x00007ff77baa5898.apply(Unknown Source)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.client.RestHighLevelClient.internalPerformRequestAsync(RestHighLevelClient.java:1763)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.client.RestHighLevelClient.performRequestAsync(RestHighLevelClient.java:1732)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.client.RestHighLevelClient.performRequestAsyncAndParseEntity(RestHighLevelClient.java:1698)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.client.RestHighLevelClient.bulkAsync(RestHighLevelClient.java:549)
    at org.apache.flink.streaming.connectors.elasticsearch7.Elasticsearch7ApiCallBridge.lambda$createBulkProcessorBuilder$0(Elasticsearch7ApiCallBridge.java:87)
    at org.apache.flink.streaming.connectors.elasticsearch7.Elasticsearch7ApiCallBridge$$Lambda$1519/0x00007ff77b9150f0.accept(Unknown Source)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.action.bulk.Retry$RetryHandler.execute(Retry.java:204)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.action.bulk.Retry.withBackoff(Retry.java:59)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.action.bulk.BulkRequestHandler.execute(BulkRequestHandler.java:62)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.action.bulk.BulkProcessor.execute(BulkProcessor.java:454)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.action.bulk.BulkProcessor.internalAdd(BulkProcessor.java:389)
    at org.apache.flink.elasticsearch7.shaded.org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:361)
    at org.apache.flink.streaming.connectors.elasticsearch7.Elasticsearch7BulkProcessorIndexer.add(Elasticsearch7BulkProcessorIndexer.java:82)
    at org.apache.flink.streaming.connectors.elasticsearch.table.RowElasticsearchSinkFunction.processUpsert(RowElasticsearchSinkFunction.java:100)
    at org.apache.flink.streaming.connectors.elasticsearch.table.RowElasticsearchSinkFunction.process(RowElasticsearchSinkFunction.java:82)
    at org.apache.flink.streaming.connectors.elasticsearch.table.RowElasticsearchSinkFunction.process(RowElasticsearchSinkFunction.java:43)
    at org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkBase.invoke(ElasticsearchSinkBase.java:318)
    at org.apache.flink.table.runtime.operators.sink.SinkOperator.processElement(SinkOperator.java:65)
    at org.apache.flink.streaming.runtime.io.RecordProcessorUtils$$Lambda$938/0x00007ff77b6c0bc8.accept(Unknown Source)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:75)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:50)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:29)
    at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:52)
    at org.apache.flink.table.runtime.operators.sink.SinkUpsertMaterializer.addRow(SinkUpsertMaterializer.java:170)
    at org.apache.flink.table.runtime.operators.sink.SinkUpsertMaterializer.processElement(SinkUpsertMaterializer.java:147)
    at org.apache.flink.streaming.runtime.io.RecordProcessorUtils.lambda$getRecordProcessor$0(RecordProcessorUtils.java:64)
    at org.apache.flink.streaming.runtime.io.RecordProcessorUtils$$Lambda$849/0x00007ff77b6428b8.accept(Unknown Source)
  • In my case, the cause was a Flink SQL query that used GROUP BY on a large field, which made the required memory surge.

Solutions

Option 1

  • Temporary fix: give the TaskManager more memory by adjusting the following parameters:

taskmanager.memory.task.heap.size

taskmanager.memory.process.size

taskmanager:
  memory:
    task:
      heap:
        size: 3000m
    process:
      size: 6728m
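  • Also, if you have confirmed that the job does not use Managed Memory, you can reduce the Managed Memory setting, which leaves more memory for the heap. It can be set either as a fixed size or as a fraction; choose one of the two: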

taskmanager.memory.managed.size: 256mb

taskmanager.memory.managed.fraction: 0.1

Option 2

  • Permanent fix: the real solution is to rewrite the Flink SQL so that it does not GROUP BY the large field.
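
A minimal sketch of what such a rewrite might look like. Everything here is hypothetical and not taken from my actual job: the table `source_events`, the small key column `doc_id`, and the large text column `content`. It also assumes that `doc_id` already identifies a row well enough that `content` does not need to be part of the grouping key; if that does not hold for your data, this rewrite changes the result.

```sql
-- Before (problematic pattern): the large column is part of the grouping key.
SELECT
  doc_id,
  content,                 -- large text field (hypothetical)
  COUNT(*) AS cnt
FROM source_events
GROUP BY doc_id, content;

-- After (sketch): group only by the small key and carry the large column
-- through an aggregate instead. LAST_VALUE is a built-in Flink SQL aggregate;
-- this assumes the most recent content per doc_id is what the sink needs.
SELECT
  doc_id,
  LAST_VALUE(content) AS content,
  COUNT(*) AS cnt
FROM source_events
GROUP BY doc_id;
```

If the large field really has to define the groups, one alternative worth trying is to group by a short digest such as `MD5(content)` instead of the raw value. Either way, check the heap usage of the rewritten job before relying on it, since how much it helps depends on the data and on the rest of the pipeline.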