A Look at PowerJob's MapReduceProcessor

This article walks through PowerJob's MapReduceProcessor.

MapReduceProcessor

public interface MapReduceProcessor extends MapProcessor {

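    /**
     * The reduce method is called after all tasks have finished.
     * @param context the task context
     * @param taskResults the execution results of each sub-task
     * @return the result produced by reduce becomes the final result of the job
     */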
    ProcessResult reduce(TaskContext context, List<TaskResult> taskResults);
}

MapReduceProcessor extends MapProcessor and adds a reduce method.

TaskResult

tech/powerjob/worker/core/processor/TaskResult.java

@Data
public class TaskResult {

    private String taskId;
    private boolean success;
    private String result;

}

TaskResult defines three properties: taskId, success, and result.
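To illustrate how a reduce implementation might consume these fields, here is a minimal sketch. It is not PowerJob code: the nested TaskResult and ProcessResult classes are hypothetical stand-ins that mirror only the fields shown in this article, so that the example compiles without the PowerJob dependency, and the TaskContext argument is left out. The aggregation rule (the job succeeds only if every sub-task succeeded) is one possible choice, not something the framework prescribes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReduceSketch {

    // Hypothetical stand-in for PowerJob's TaskResult (same three fields).
    static final class TaskResult {
        final String taskId;
        final boolean success;
        final String result;

        TaskResult(String taskId, boolean success, String result) {
            this.taskId = taskId;
            this.success = success;
            this.result = result;
        }
    }

    // Hypothetical stand-in for PowerJob's ProcessResult (success flag + message).
    static final class ProcessResult {
        final boolean success;
        final String msg;

        ProcessResult(boolean success, String msg) {
            this.success = success;
            this.msg = msg;
        }
    }

    // Same shape as MapReduceProcessor.reduce, minus the TaskContext parameter.
    // Assumed policy: the whole job fails if any sub-task failed.
    static ProcessResult reduce(List<TaskResult> taskResults) {
        List<String> failedIds = new ArrayList<>();
        for (TaskResult r : taskResults) {
            if (!r.success) {
                failedIds.add(r.taskId);
            }
        }
        int total = taskResults.size();
        int ok = total - failedIds.size();
        String msg = "success=" + ok + "/" + total + ", failed=" + failedIds;
        return new ProcessResult(failedIds.isEmpty(), msg);
    }

    public static void main(String[] args) {
        List<TaskResult> results = Arrays.asList(
                new TaskResult("0.1", true, "batch-1 done"),
                new TaskResult("0.2", false, "timeout"),
                new TaskResult("0.3", true, "batch-3 done"));
        ProcessResult finalResult = reduce(results);
        System.out.println(finalResult.success + " " + finalResult.msg);
    }
}
```

In a real processor the same loop would live inside the reduce(TaskContext, List<TaskResult>) override, and the returned ProcessResult would become the final result of the job.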

handleLastTask

tech/powerjob/worker/core/processor/runnable/HeavyProcessorRunnable.java

    private void handleLastTask(String taskId, Long instanceId, TaskContext taskContext, ExecuteType executeType) {
        final BasicProcessor processor = processorBean.getProcessor();
        ProcessResult processResult;
        Stopwatch stopwatch = Stopwatch.createStarted();
        log.debug("[ProcessorRunnable-{}] the last task(taskId={}) start to process.", instanceId, taskId);

        List<TaskResult> taskResults = workerRuntime.getTaskPersistenceService().getAllTaskResult(instanceId, task.getSubInstanceId());
        try {
            switch (executeType) {
                case BROADCAST:

                    if (processor instanceof BroadcastProcessor) {
                        BroadcastProcessor broadcastProcessor = (BroadcastProcessor) processor;
                        processResult = broadcastProcessor.postProcess(taskContext, taskResults);
                    } else {
                        processResult = BroadcastProcessor.defaultResult(taskResults);
                    }
                    break;
                case MAP_REDUCE:

                    if (processor instanceof MapReduceProcessor) {
                        MapReduceProcessor mapReduceProcessor = (MapReduceProcessor) processor;
                        processResult = mapReduceProcessor.reduce(taskContext, taskResults);
                    } else {
                        processResult = new ProcessResult(false, "not implement the MapReduceProcessor");
                    }
                    break;
                default:
                    processResult = new ProcessResult(false, "IMPOSSIBLE OR BUG");
            }
        } catch (Throwable e) {
            processResult = new ProcessResult(false, e.toString());
            log.warn("[ProcessorRunnable-{}] execute last task(taskId={}) failed.", instanceId, taskId, e);
        }

        TaskStatus status = processResult.isSuccess() ? TaskStatus.WORKER_PROCESS_SUCCESS : TaskStatus.WORKER_PROCESS_FAILED;
        reportStatus(status, suit(processResult.getMsg()), null, taskContext.getWorkflowContext().getAppendedContextData());

        log.info("[ProcessorRunnable-{}] the last task execute successfully, using time: {}", instanceId, stopwatch);
    }

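The handleLastTask method of HeavyProcessorRunnable first loads taskResults via workerRuntime.getTaskPersistenceService().getAllTaskResult. For the MAP_REDUCE execute type, it then calls mapReduceProcessor.reduce if the processor implements MapReduceProcessor; otherwise it produces a failed ProcessResult. Any Throwable raised along the way is caught and also turned into a failed ProcessResult, and the final status is reported through reportStatus.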

getAllTaskResult

tech/powerjob/worker/persistence/TaskPersistenceService.java

    public List<TaskResult> getAllTaskResult(Long instanceId, Long subInstanceId) {
        try {
            return execute(() -> taskDAO.getAllTaskResult(instanceId, subInstanceId));
        }catch (Exception e) {
            log.error("[TaskPersistenceService] getTaskId2ResultMap for instance(id={}) failed.", instanceId, e);
        }
        return Lists.newLinkedList();
    }

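The getAllTaskResult method of TaskPersistenceService queries the task_info table by instanceId and subInstanceId (select task_id, status, result from task_info where instance_id = ? and sub_instance_id = ?), and in the end returns only the tasks whose status is WORKER_PROCESS_SUCCESS or WORKER_PROCESS_FAILED. If the query throws, the error is logged and an empty list is returned.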
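The filtering step described above can be sketched with in-memory rows instead of a real database. Everything here is an assumption made for illustration: TaskRow, the trimmed-down Status enum (RUNNING is a hypothetical placeholder for any unfinished state), and the nested TaskResult are stand-ins, and deriving success from "status is WORKER_PROCESS_SUCCESS" is an inferred mapping rather than code quoted from the DAO.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FinishedTaskFilterSketch {

    // Hypothetical subset of task states; RUNNING stands for any unfinished state.
    enum Status { RUNNING, WORKER_PROCESS_SUCCESS, WORKER_PROCESS_FAILED }

    // Hypothetical model of one row selected from task_info (task_id, status, result).
    static final class TaskRow {
        final String taskId;
        final Status status;
        final String result;

        TaskRow(String taskId, Status status, String result) {
            this.taskId = taskId;
            this.status = status;
            this.result = result;
        }
    }

    // Hypothetical stand-in for PowerJob's TaskResult.
    static final class TaskResult {
        final String taskId;
        final boolean success;
        final String result;

        TaskResult(String taskId, boolean success, String result) {
            this.taskId = taskId;
            this.success = success;
            this.result = result;
        }
    }

    // Keep only rows in a terminal state and convert them to TaskResult.
    static List<TaskResult> toTaskResults(List<TaskRow> rows) {
        List<TaskResult> out = new ArrayList<>();
        for (TaskRow row : rows) {
            boolean finished = row.status == Status.WORKER_PROCESS_SUCCESS
                    || row.status == Status.WORKER_PROCESS_FAILED;
            if (!finished) {
                continue; // unfinished tasks never reach reduce
            }
            // Assumed mapping: success is true only for WORKER_PROCESS_SUCCESS.
            boolean success = row.status == Status.WORKER_PROCESS_SUCCESS;
            out.add(new TaskResult(row.taskId, success, row.result));
        }
        return out;
    }

    public static void main(String[] args) {
        List<TaskRow> rows = Arrays.asList(
                new TaskRow("0.1", Status.WORKER_PROCESS_SUCCESS, "ok"),
                new TaskRow("0.2", Status.RUNNING, null),
                new TaskRow("0.3", Status.WORKER_PROCESS_FAILED, "boom"));
        for (TaskResult r : toTaskResults(rows)) {
            System.out.println(r.taskId + " success=" + r.success + " result=" + r.result);
        }
    }
}
```

One consequence worth noting: because unfinished rows are dropped, the list handed to reduce can be shorter than the number of sub-tasks that were mapped.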

Summary

MapReduceProcessor extends MapProcessor and adds a reduce method. The handleLastTask method of HeavyProcessorRunnable first loads taskResults via workerRuntime.getTaskPersistenceService().getAllTaskResult, and then, for a MapReduceProcessor, calls back into mapReduceProcessor.reduce. The getAllTaskResult method queries the task_info table by instanceId and subInstanceId and returns the tasks whose status is WORKER_PROCESS_SUCCESS or WORKER_PROCESS_FAILED. The task_info table exists only on worker nodes; by default it is stored in H2 (~/powerjob/worker/h2/{uuid}/powerjob_worker_db.mv.db).
