Having walked through the KafkaSource source code in Flink 1.14.*, let's now look at the source code of the Kafka sink.
1. How to add a sink to a Flink job
First, pick a class that extends the abstract class `FlinkKafkaProducerBase`, then attach an instance of it to the `DataStream`, as follows:
```java
DataStream<Tuple2<String, Integer>> dataStream = /* the data produced by the source */;
dataStream.addSink(/* an instance of a class that extends FlinkKafkaProducerBase */);
```
2. The source code of FlinkKafkaProducerBase
```java
@Internal
public abstract class FlinkKafkaProducerBase<IN> extends RichSinkFunction<IN> implements CheckpointedFunction {

    public FlinkKafkaProducerBase(String defaultTopicId, KeyedSerializationSchema<IN> serializationSchema, Properties producerConfig, FlinkKafkaPartitioner<IN> customPartitioner) {
        // omitted
    }

    // This is the key method: it holds the logic that actually sends each record
    public void invoke(IN next, Context context) throws Exception {
        byte[] serializedKey = this.schema.serializeKey(next);
        byte[] serializedValue = this.schema.serializeValue(next);
        String targetTopic = this.schema.getTargetTopic(next);
        // (excerpt: the lookup of `partitions` for targetTopic is omitted here)
        ProducerRecord<byte[], byte[]> record;
        if (this.flinkKafkaPartitioner == null) {
            // No custom partitioner: the Kafka client chooses the partition
            record = new ProducerRecord<>(targetTopic, serializedKey, serializedValue);
        } else {
            // Custom partitioner: it picks the target partition explicitly
            record = new ProducerRecord<>(targetTopic, this.flinkKafkaPartitioner.partition(next, serializedKey, serializedValue, targetTopic, partitions), serializedKey, serializedValue);
        }
        this.producer.send(record, this.callback);
    }
}
```
Here, `invoke` is a method declared in the `SinkFunction` interface, which the parent class `RichSinkFunction` implements.
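The partitioner branch inside `invoke` can be illustrated with a small standalone sketch. It does not use the real Flink or Kafka classes: `SketchRecord`, `SketchPartitioner`, and `InvokeSketch` are hypothetical stand-ins introduced only for illustration, loosely mirroring `ProducerRecord` and `FlinkKafkaPartitioner`. It shows that without a custom partitioner the record carries no partition (the choice is left to the Kafka client), while with one the partition is fixed before sending.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for Kafka's ProducerRecord (NOT the real class).
final class SketchRecord {
    final String topic;
    final Integer partition; // null means "no partition chosen here"
    final byte[] key;
    final byte[] value;

    SketchRecord(String topic, Integer partition, byte[] key, byte[] value) {
        this.topic = topic;
        this.partition = partition;
        this.key = key;
        this.value = value;
    }
}

// Hypothetical stand-in for FlinkKafkaPartitioner (NOT the real class).
interface SketchPartitioner<T> {
    int partition(T element, byte[] key, byte[] value, String topic, int[] partitions);
}

final class InvokeSketch {
    // Mirrors the if/else in invoke: the partition is set only when a custom partitioner exists.
    static <T> SketchRecord buildRecord(T element, byte[] key, byte[] value, String topic,
                                        int[] partitions, SketchPartitioner<T> partitioner) {
        if (partitioner == null) {
            return new SketchRecord(topic, null, key, value);
        }
        int chosen = partitioner.partition(element, key, value, topic, partitions);
        return new SketchRecord(topic, chosen, key, value);
    }

    public static void main(String[] args) {
        byte[] key = "k".getBytes(StandardCharsets.UTF_8);
        byte[] value = "hello".getBytes(StandardCharsets.UTF_8);
        int[] partitions = {0, 1, 2};

        SketchRecord a = buildRecord("hello", key, value, "demo-topic", partitions, null);
        System.out.println(a.partition); // prints "null"

        // Illustrative partitioner: pick a partition from the element's length.
        SketchPartitioner<String> byLength = (e, k, v, t, parts) -> parts[e.length() % parts.length];
        SketchRecord b = buildRecord("hello", key, value, "demo-topic", partitions, byLength);
        System.out.println(b.partition); // prints "2" (5 % 3 == 2)
    }
}
```

In the real class the resulting record is then handed to `producer.send(record, callback)`; the sketch stops at record construction so that it runs without a Kafka broker.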
3. Where SinkFunction's invoke gets called
```java
@Public
public class DataStream<T> {

    public DataStreamSink<T> addSink(SinkFunction<T> sinkFunction) {
        this.transformation.getOutputType();
        if (sinkFunction instanceof InputTypeConfigurable) {
            ((InputTypeConfigurable) sinkFunction).setInputType(this.getType(), this.getExecutionConfig());
        }
        // This creates the sink operator that wraps the user's SinkFunction
        StreamSink<T> sinkOperator = new StreamSink<>(this.clean(sinkFunction));
        DataStreamSink<T> sink = new DataStreamSink<>(this, sinkOperator);
        this.getExecutionEnvironment().addOperator(sink.getLegacyTransformation());
        return sink;
    }
}
```
Just as with the Kafka source, the `sinkFunction` is assigned to the `userFunction` field of `AbstractUdfStreamOperator`.
```java
public class StreamSink<IN> extends AbstractUdfStreamOperator<Object, SinkFunction<IN>> implements OneInputStreamOperator<IN, Object> {

    public StreamSink(SinkFunction<IN> sinkFunction) {
        super(sinkFunction);
        this.chainingStrategy = ChainingStrategy.ALWAYS;
    }

    public void processElement(StreamRecord<IN> element) throws Exception {
        this.sinkContext.element = element;
        this.userFunction.invoke(element.getValue(), this.sinkContext);
    }
}
```
```java
@PublicEvolving
public abstract class AbstractUdfStreamOperator<OUT, F extends Function> extends AbstractStreamOperator<OUT> implements OutputTypeConfigurable<OUT> {

    private static final long serialVersionUID = 1L;

    // The user function (here, the SinkFunction) wrapped by this operator
    protected final F userFunction;

    public AbstractUdfStreamOperator(F userFunction) {
        this.userFunction = Objects.requireNonNull(userFunction);
        this.checkUdfCheckpointingPreconditions();
    }
}
```
`StreamSink` also implements the `OneInputStreamOperator` interface. At runtime, Flink calls `StreamSink`'s `processElement` method, which in turn triggers the `invoke` method of the implementing class.
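The delegation chain described above (operator holds the user function, runtime calls `processElement`, which calls `invoke`) can be sketched in plain Java. This is a simplified model, not Flink's code: `MiniSinkFunction`, `MiniUdfOperator`, `MiniStreamSink`, and `DelegationSketch` are hypothetical names standing in for `SinkFunction`, `AbstractUdfStreamOperator`, `StreamSink`, and the runtime's element loop. The sink context is left out, and unlike the real `invoke`, the stand-in declares no checked exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for SinkFunction (the sink context is omitted).
interface MiniSinkFunction<IN> {
    void invoke(IN value);
}

// Hypothetical stand-in for AbstractUdfStreamOperator: it only stores the user function.
abstract class MiniUdfOperator<F> {
    protected final F userFunction;

    MiniUdfOperator(F userFunction) {
        this.userFunction = Objects.requireNonNull(userFunction);
    }
}

// Hypothetical stand-in for StreamSink: each element is forwarded to the user function.
final class MiniStreamSink<IN> extends MiniUdfOperator<MiniSinkFunction<IN>> {
    MiniStreamSink(MiniSinkFunction<IN> sinkFunction) {
        super(sinkFunction);
    }

    void processElement(IN element) {
        userFunction.invoke(element);
    }
}

final class DelegationSketch {
    // Stand-in for the runtime loop that feeds records into the operator.
    static <IN> void run(List<IN> input, MiniStreamSink<IN> operator) {
        for (IN element : input) {
            operator.processElement(element);
        }
    }

    public static void main(String[] args) {
        List<String> sent = new ArrayList<>();
        // The "sink" here just collects elements instead of sending them to Kafka.
        MiniStreamSink<String> sink = new MiniStreamSink<String>(sent::add);
        run(Arrays.asList("a", "b", "c"), sink);
        System.out.println(sent); // prints "[a, b, c]"
    }
}
```

The point of the model is that the operator never knows what the function does with an element; swapping the collecting lambda for a Kafka-producing implementation would change nothing in `MiniStreamSink`.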
Below is the class relationship diagram for `FlinkKafkaProducerBase`: