flink处理函数--副输出功能

背景

在flink中,如果你想要访问记录的处理时间或者事件时间,注册定时器,或者是将记录输出到多个输出流中,你都需要处理函数的帮助,本文就来通过一个例子来讲解下副输出

副输出

本文还是基于streaming-with-flink这本书的例子作为演示,它实现一个把温度低于32度的记录输出到副输出的功能,正常的记录还是从主输出中输出.代码如下:

java 复制代码
package wikiedits.processfunc.job;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.OutputTag;

import wikiedits.processfunc.pojo.SensorReading;
import wikiedits.processfunc.process.FreezingMonitor;
import wikiedits.processfunc.source.SensorSource;

public class SideOutPutJob {

    public static void main(String[] args) throws Exception {

        StreamExecutionEnvironment see = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<SensorReading> readings = see.addSource(new SensorSource());

        SingleOutputStreamOperator<SensorReading> monitoredReadings = readings.process(new FreezingMonitor());
        // 打印附输出
        monitoredReadings.getSideOutput(new OutputTag<String>("freezing-alarms"){}).print();
        // 打印主输出
        monitoredReadings.print();
        see.execute();
    }
}


package wikiedits.processfunc.process;

import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

import wikiedits.processfunc.pojo.SensorReading;

public class FreezingMonitor extends ProcessFunction<SensorReading, SensorReading> {

    private OutputTag<String> freezingAlarmOutput = new OutputTag<String>("freezing-alarms") {};


    @Override
    public void processElement(SensorReading value, Context ctx, Collector<SensorReading> out) throws Exception {
        if (value.temperature < 32.0) {
            ctx.output(freezingAlarmOutput, "freezing alarm for " + value.id + " :" + value.temperature);
        }
        out.collect(value);
    }

}
package wikiedits.processfunc.source;

/*
 * Copyright 2015 Fabian Hueske / Vasia Kalavri
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *  http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
import wikiedits.processfunc.pojo.SensorReading;

import java.util.Calendar;
import java.util.Random;

/**
 * Flink SourceFunction to generate SensorReadings with random temperature values.
 *
 * Each parallel instance of the source simulates 10 sensors which emit one sensor reading every 100 ms.
 *
 * Note: This is a simple data-generating source function that does not checkpoint its state.
 * In case of a failure, the source does not replay any data.
 */
public class SensorSource extends RichParallelSourceFunction<SensorReading> {

    // flag indicating whether source is still running
    private boolean running = true;

    /** run() continuously emits SensorReadings by emitting them through the SourceContext. */
    @Override
    public void run(SourceContext<SensorReading> srcCtx) throws Exception {

        // initialize random number generator
        Random rand = new Random();
        // look up index of this parallel task
        int taskIdx = this.getRuntimeContext().getIndexOfThisSubtask();

        // initialize sensor ids and temperatures
        String[] sensorIds = new String[10];
        double[] curFTemp = new double[10];
        for (int i = 0; i < 10; i++) {
            sensorIds[i] = "sensor_" + (taskIdx * 10 + i);
            curFTemp[i] = 65 + (rand.nextGaussian() * 20);
        }

        while (running) {

            // get current time
            long curTime = Calendar.getInstance().getTimeInMillis();

            // emit SensorReadings
            for (int i = 0; i < 10; i++) {
                // update current temperature
                curFTemp[i] += rand.nextGaussian() * 0.5;
                // emit reading
                srcCtx.collect(new SensorReading(sensorIds[i], curTime, curFTemp[i]));
            }

            // wait for 100 ms
            Thread.sleep(3000);
        }
    }

    /** Cancels this SourceFunction. */
    @Override
    public void cancel() {
        this.running = false;
    }
}

程序运行结果:

相关推荐
郝学胜-神的一滴1 分钟前
深入浅出 C++20:新特性与实践
开发语言·c++·程序人生·算法·c++20
郑洁文4 分钟前
豆瓣网影视数据分析与应用
大数据·python·数据挖掘·数据分析
Jelena技术达人15 分钟前
淘宝/天猫按图搜索(拍立淘)item_search_img API接口实战指南
算法·图搜索算法
Adorable老犀牛26 分钟前
阿里云-基于通义灵码实现高效 AI 编码 | 8 | 上手实操:LeetCode学习宝典,通义灵码赋能算法高效突破
学习·算法·leetcode
望获linux31 分钟前
【实时Linux实战系列】规避缺页中断:mlock/hugetlb 与页面预热
java·linux·服务器·数据库·chrome·算法
菜就多练,以前是以前,现在是现在33 分钟前
Codeforces Round 1048 (Div. 2)
数据结构·c++·算法
计算机编程-吉哥1 小时前
大数据毕业设计-大数据-基于大数据的热门游戏推荐与可视化系统(高分计算机毕业设计选题·定制开发·真正大数据)
大数据·毕业设计·计算机毕业设计选题·机器学习毕业设计·大数据毕业设计·大数据毕业设计选题推荐·大数据毕设项目
林木辛1 小时前
LeetCode 热题 160.相交链表(双指针)
算法·leetcode·链表
野生的编程萌新1 小时前
【C++深学日志】从0开始的C++生活
c语言·开发语言·c++·算法
DDC楼宇自控与IBMS集成系统解读1 小时前
IBMS智能化集成系统:构建建筑全场景协同管控中枢
大数据·网络·人工智能·能耗监测系统·ibms智能化集成系统·楼宇自控系统·智能照明系统