Flink custom data source development workflow

1 Implement SourceFunction or ParallelSourceFunction

import org.apache.flink.streaming.api.functions.source.SourceFunction;

Override the run() and cancel() methods: run() emits records through the SourceContext until cancel() is called.
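The run()/cancel() contract can be sketched without any Flink dependency. In the example below, Emitter is a hypothetical stand-in for Flink's SourceContext, and CounterSource and runThenCancel are illustrative names, not part of the Flink API. It only shows the pattern: run() loops on a flag, and cancel(), called from another thread, flips that flag so the loop exits.

```java
import java.util.ArrayList;
import java.util.List;

public class RunCancelSketch {

    /** Hypothetical stand-in for Flink's SourceContext: it only receives records. */
    interface Emitter<T> {
        void collect(T record);
    }

    /** Same shape as a SourceFunction: run() loops until cancel() flips the flag. */
    static class CounterSource {
        // volatile so the write made by cancel() on another thread is visible to run()
        private volatile boolean isRunning = true;

        void run(Emitter<Long> out) throws InterruptedException {
            long next = 0;
            while (isRunning) {
                out.collect(next++);
                Thread.sleep(10); // throttle inside the loop, once per record
            }
        }

        void cancel() {
            isRunning = false;
        }
    }

    /** Runs the source on its own thread, cancels it after the delay, returns what it emitted. */
    static List<Long> runThenCancel(long millis) throws InterruptedException {
        List<Long> seen = new ArrayList<>(); // written only by the worker, read after join()
        CounterSource source = new CounterSource();
        Thread worker = new Thread(() -> {
            try {
                source.run(seen::add);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        Thread.sleep(millis);
        source.cancel();
        worker.join(); // returns only because run() observed the flag and exited
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Long> seen = runThenCancel(100);
        System.out.println("emitted " + seen.size() + " records before cancel()");
    }
}
```

Without volatile on the flag, the loop thread is not guaranteed to see the update made by cancel(), which is why the flag is declared that way here.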

2 AccessSource code

package com.zyb.flink.basic.source;
import com.zyb.flink.basic.bean.Access;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import java.util.Random;

public class AccessSource implements SourceFunction<Access> {
    // volatile: cancel() is called from a different thread than run()
    private volatile boolean isRunning = true;

    @Override
    public void run(SourceContext<Access> ctx) throws Exception {
        Random random = new Random();
        String[] domains = {"pk1.com","pk2.com","pk3.com","pk4.com","pk5."};

        // Emit one random Access record every 2 seconds until cancel() is called
        while (isRunning){
            long time = System.currentTimeMillis();
            ctx.collect(new Access(time,domains[random.nextInt(domains.length)],random.nextInt(1000)));
            Thread.sleep(2000);
        }
    }

    @Override
    public void cancel() {
        isRunning = false;
    }
}

3 Access code

package com.zyb.flink.basic.bean;

public class Access {
    private long time;
    private String domain;
    private double traffic;

    @Override
    public String toString() {
        return "Access{" +
                "time=" + time +
                ", domain='" + domain + '\'' +
                ", traffic=" + traffic +
                '}';
    }

    public Access() {
    }

    public Access(long time, String domain, double traffic) {
        this.time = time;
        this.domain = domain;
        this.traffic = traffic;
    }

    public long getTime() {
        return time;
    }

    public void setTime(long time) {
        this.time = time;
    }

    public String getDomain() {
        return domain;
    }

    public void setDomain(String domain) {
        this.domain = domain;
    }

    public double getTraffic() {
        return traffic;
    }

    public void setTraffic(double traffic) {
        this.traffic = traffic;
    }
}

4 Test code

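A job that exercises AccessSource could look like the following minimal sketch. It assumes a Flink 1.x release in which SourceFunction is available and the Flink streaming dependency is on the classpath. The class name AccessSourceApp and the job name are hypothetical.

```java
package com.zyb.flink.basic.source;

import com.zyb.flink.basic.bean.Access;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical driver class: wires AccessSource into a job and prints each record
public class AccessSourceApp {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // A plain SourceFunction is non-parallel, so this source runs with parallelism 1
        DataStreamSource<Access> source = env.addSource(new AccessSource());
        System.out.println("source parallelism: " + source.getParallelism());

        // Write every Access record to stdout via its toString()
        source.print();

        env.execute("AccessSourceApp");
    }
}
```

To run the source with a parallelism greater than 1, it would need to implement ParallelSourceFunction instead. Setting a higher parallelism on a plain SourceFunction source is rejected by Flink.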