Flink error: The implementation of the XxxFunction is not serializable
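This error usually occurs because XxxFunction calls a non-static method of its enclosing class, and that enclosing class is not serializable. The sample code below reproduces it: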
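import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class FlinkExceptionTest {
    public static void main(String[] args) throws Exception {
        // 1. Create the execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);
        // 2. Create the data stream
        DataStream<String> source = env.fromElements("1", "2", "3").returns(String.class).uid("source").name("source");
        // 3. Apply the custom transformation
        new FlinkJobBase() {
            @Override
            public SingleOutputStreamOperator<Integer> customTransform(DataStream<String> dataStream) {
                return dataStream.flatMap(new FlatMapFunction<String, Integer>() {
                    @Override
                    public void flatMap(String value, Collector<Integer> out) throws Exception {
                        // get() is an instance method of the enclosing FlinkJobBase subclass
                        out.collect(get(value));
                    }
                });
            }
        }.customTransform(source).print();
        // 4. Execute the job
        env.execute("FlinkExceptionTest");
    }
}

abstract class FlinkJobBase {
    // Convert a string to an Integer
    public Integer get(String str) {
        return Integer.parseInt(str);
    }

    // Custom transformation
    public abstract SingleOutputStreamOperator<Integer> customTransform(DataStream<String> dataStream);
}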
Running the code produces the following error:
Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The implementation of the FlatMapFunction is not serializable. The implementation accesses fields of its enclosing class, which is a common reason for non-serializability. A common solution is to make the function a proper (non-inner) class, or a static inner class.
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:164)
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:69)
at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:2052)
at org.apache.flink.streaming.api.datastream.DataStream.clean(DataStream.java:203)
at org.apache.flink.streaming.api.datastream.DataStream.flatMap(DataStream.java:613)
at com.xxxxxx.abcd.FlinkExceptionTest$1.customTransform(FlinkExceptionTest.java:19)
at com.xxxxxx.abcd.FlinkExceptionTest.main(FlinkExceptionTest.java:26)
Caused by: java.io.NotSerializableException: com.xxxxxx.abcd.FlinkExceptionTest$1
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:624)
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:143)
... 6 more
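The exception is an org.apache.flink.api.common.InvalidProgramException reporting that the FlatMapFunction implementation is not serializable. The message says the implementation accesses fields of its enclosing class, which is a common cause of non-serializability, and suggests making the function a proper (non-inner) class or a static inner class. In this example, the anonymous FlatMapFunction calls get(), an instance method of the enclosing anonymous FlinkJobBase subclass (FlinkExceptionTest$1 in the "Caused by" line), so it holds a reference to that instance, and that class does not implement Serializable.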
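To see the mechanism in isolation, here is a minimal plain-Java sketch with no Flink dependency. `CaptureDemo`, `StringToInt`, and `JobBase` are hypothetical names invented for this illustration; they are not Flink classes. The sketch uses ordinary Java serialization, which is also what the `ObjectOutputStream` frames in the stack trace show Flink doing when it checks the function.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CaptureDemo {
    // Returns null when obj serializes, otherwise the name of the class
    // that Java serialization reported as not serializable.
    static String failingClass(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("inner:  " + failingClass(new JobBase().innerFunction()));
        System.out.println("static: " + failingClass(new JobBase.StaticFunction()));
    }
}

// Hypothetical stand-in for a Flink function interface; it is Serializable, as Flink functions are.
interface StringToInt extends Serializable {
    Integer apply(String value);
}

// Hypothetical stand-in for FlinkJobBase. It does NOT implement Serializable.
class JobBase {
    Integer get(String str) {
        return Integer.parseInt(str);
    }

    // Anonymous inner class: calling get() goes through the hidden reference
    // to the enclosing JobBase instance, so that instance must be serialized too.
    StringToInt innerFunction() {
        return new StringToInt() {
            @Override
            public Integer apply(String value) {
                return get(value);
            }
        };
    }

    // Static nested class: it holds no reference to any JobBase instance.
    static class StaticFunction implements StringToInt {
        @Override
        public Integer apply(String value) {
            return Integer.parseInt(value);
        }
    }
}
```

Running it should report `JobBase` as the non-serializable class for the inner function and `null` (no failure) for the static nested one, which matches the advice in the error message.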
Three ways to fix it are summarized below:
Method 1. Make the called method get() static
abstract class FlinkJobBase {
    // Convert a string to an Integer
    public static Integer get(String str) {
        return Integer.parseInt(str);
    }

    // Custom transformation
    public abstract SingleOutputStreamOperator<Integer> customTransform(DataStream<String> dataStream);
}
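The anonymous FlatMapFunction is still declared inside an instance method, so it is worth asking why a static get() is enough. Our understanding is that Flink's ClosureCleaner (the class at the top of the stack trace) sets the hidden enclosing-instance reference to null when the function never uses it, and a call to a static method does not use it. The plain-Java sketch below emulates that idea with reflection. `CleanerSketch`, `Parser`, and `StaticJobBase` are hypothetical names, and `dropOuterReference` is a simplified assumption about the behavior, not Flink's actual implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;

class CleanerSketch {
    // Simplified emulation: null out every synthetic enclosing-instance field
    // (javac names them this$0, this$1, ...). This sketch skips the check,
    // which a real cleaner needs, that the function never reads the field.
    static void dropOuterReference(Object func) {
        for (Field field : func.getClass().getDeclaredFields()) {
            if (field.getName().startsWith("this$")) {
                try {
                    field.setAccessible(true);
                    field.set(func, null);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
    }

    // True when obj can be written with standard Java serialization.
    static boolean isSerializable(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Parser func = new StaticJobBase().function();
        System.out.println("before cleaning: " + isSerializable(func));
        dropOuterReference(func);
        System.out.println("after cleaning:  " + isSerializable(func));
        System.out.println("parse(\"42\"):     " + func.parse("42"));
    }
}

// Hypothetical stand-in for a serializable Flink function interface.
interface Parser extends Serializable {
    Integer parse(String value);
}

// Hypothetical stand-in for FlinkJobBase with a static get(); not Serializable.
class StaticJobBase {
    static Integer get(String str) {
        return Integer.parseInt(str);
    }

    Parser function() {
        return new Parser() {
            @Override
            public Integer parse(String value) {
                return get(value); // static call: the enclosing instance is not needed
            }
        };
    }
}
```

After the outer reference is dropped, the function serializes and still parses correctly, because a static call never touches the enclosing instance. With an instance method, as in the failing example, the reference is in use and cannot be dropped safely.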
Method 2. Make the enclosing class of get() implement Serializable
import java.io.Serializable;

abstract class FlinkJobBase implements Serializable {
    // Convert a string to an Integer
    public Integer get(String str) {
        return Integer.parseInt(str);
    }

    // Custom transformation
    public abstract SingleOutputStreamOperator<Integer> customTransform(DataStream<String> dataStream);
}
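One consequence of this approach is that the enclosing object is serialized together with the function. The following plain-Java sketch (no Flink dependency) round-trips such a function through Java serialization to show that it survives. `RoundTripDemo`, `IntParser`, and `SerializableJobBase` are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RoundTripDemo {
    // Serializes obj to bytes and reads it back, returning the deserialized copy.
    static <T extends Serializable> T roundTrip(T obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        IntParser copy = roundTrip(new SerializableJobBase().function());
        System.out.println(copy.parse("7"));
    }
}

// Hypothetical stand-in for a serializable Flink function interface.
interface IntParser extends Serializable {
    Integer parse(String value);
}

// Hypothetical stand-in for FlinkJobBase after the Method 2 change.
// Every non-transient field declared here would travel with the function.
class SerializableJobBase implements Serializable {
    Integer get(String str) {
        return Integer.parseInt(str);
    }

    IntParser function() {
        return new IntParser() {
            @Override
            public Integer parse(String value) {
                return get(value); // instance call: the enclosing object is serialized too
            }
        };
    }
}
```

If the enclosing class later gains fields that are not themselves serializable, they would need to be marked transient or the same error would come back, so this approach fits best when the enclosing class stays small.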
Method 3. Move the called method get() into the anonymous inner class FlatMapFunction
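abstract class FlinkJobBase {
    // Custom transformation
    public abstract SingleOutputStreamOperator<Integer> customTransform(DataStream<String> dataStream);
}

// In customTransform, get() now lives inside the anonymous FlatMapFunction:
return dataStream.flatMap(new FlatMapFunction<String, Integer>() {
    // Convert a string to an Integer
    private Integer get(String str) {
        return Integer.parseInt(str);
    }

    @Override
    public void flatMap(String value, Collector<Integer> out) throws Exception {
        out.collect(get(value));
    }
});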
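Any one of the three approaches resolves the error; choose whichever fits your scenario. After the change, the program produces the following output: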
1
2
3
Process finished with exit code 0