Rule chain system
Rule nodes fall into six categories:
Filter nodes
Filter and route messages based on various conditions.
Filter nodes are the routing and conditional logic components of ThingsBoard's rule engine that examine messages and determine how they should be routed to downstream nodes based on various criteria.
Enrichment nodes
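Enrich the message with additional information (added to its metadata).
These nodes can enrich a message with attributes, the latest time-series values, historical time-series data, and entity details fetched from various sources, including the message originator, related entities, the current tenant, or the customer.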
Transformation nodes
Modify existing messages or create new ones.
Transformation nodes are the data processing and manipulation components of ThingsBoard's rule engine that modify the content, structure, or format of incoming messages.
Action nodes
https://thingsboard.io/docs/user-guide/rule-engine-2-0/nodes/action/
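Perform various actions based on the configuration and the message.
- Create alarm: creates a new alarm for the message originator or updates an existing active alarm.
- Clear alarm: clears an existing active alarm of the message originator.
- Calculated fields: triggers calculated field processing for time-series or attribute data without persisting the raw data to the database.
- Delete relation: deletes the relation from the selected entity to the message originator, by type and direction.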
External nodes
https://thingsboard.io/docs/user-guide/rule-engine-2-0/nodes/external/
External nodes are the integration components of ThingsBoard's rule engine that send messages to third-party services and external systems.
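- AI request: sends a request with a customizable prompt and optional file attachments to a large language model and returns the AI-generated response as the outgoing message data.
- MQTT: publishes the incoming message data to an external MQTT broker using QoS 1 (at least once), with support for dynamic topic patterns, several authentication methods, and TLS/SSL encryption.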
Flow nodes
https://thingsboard.io/docs/user-guide/rule-engine-2-0/nodes/flow/
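Flow nodes control how messages move between different parts of the rule engine. They can transfer messages between queues, either to enforce sequential processing or to separate workloads, forward messages to other rule chains, and return results to the calling rule chain.
- Rule chain: forwards the incoming message to the specified rule chain.
- Checkpoint: transfers the incoming message to the specified queue.
In the rule chain context
Getting device attributes
Use the Enrichment "originator attributes" node. It adds the device attributes to the message metadata; note that the keys carry a scope prefix.
The prefix for server-side attributes is ss_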
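As a small illustration of the prefix rule, the lookup below reads a server-side attribute from a plain map that stands in for the message metadata. `AttributeKeys` and `serverAttribute` are hypothetical helper names, not ThingsBoard API; only the `ss_` prefix comes from the note above.

```java
import java.util.Map;

/** Hypothetical helper: resolves server-side attribute keys in enriched metadata. */
class AttributeKeys {
    // Prefix added to server-side attributes, as described above.
    static final String SERVER_SCOPE_PREFIX = "ss_";

    /** Returns the value stored under "ss_" + name, or null if it is absent. */
    static String serverAttribute(Map<String, String> metadata, String name) {
        return metadata.get(SERVER_SCOPE_PREFIX + name);
    }
}
```

In a real node the map would be the key/value view of the message metadata; the point is only that the attribute name alone does not match the stored key.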
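Rule chain processing: source code analysis
Key concepts
Actor model
ThingsBoard's in-house actor framework follows the Actor model.
In computer science, the actor model is a model of concurrent computation. An "actor" is a programming abstraction treated as the basic unit of concurrent computation: when an actor receives a message, it can make local decisions, create more actors, send more messages, and decide how to respond to the next message it receives. Actors may modify their own private state, but can only affect each other indirectly through messages (which avoids lock-based synchronization).
The actor model's philosophy is "everything is an actor", much like "everything is an object" in object-oriented programming. An actor is a computational entity that runs concurrently with other actors and, in response to a message it receives, can:
- send a finite number of messages to other actors;
- create a finite number of new actors;
- designate the behavior to be used for the next message it receives.
These actions carry no assumption of sequential ordering, so they can be carried out in parallel.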
https://zh.wikipedia.org/zh-cn/演员模型
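The three core concepts of the Actor model
Actor
An actor is the basic computational entity of the model. Each actor is an independent, autonomous unit that can:
- Encapsulate private state: each actor maintains its own internal state, which other actors cannot access directly.
- Receive messages: messages from other actors arrive through its mailbox.
- Send messages: it can asynchronously send a finite number of messages to other actors.
- Create new actors: it can dynamically create a finite number of new actors.
- Change behavior: after receiving a message, it can decide how to respond to the next one, that is, change its own behavior.
These operations have no predefined execution order and can run in parallel.
Message passing
Messages are the only way actors communicate. They have the following properties:
- Asynchronous: the sender does not wait for the receiver to process the message, so sender and receiver are fully decoupled.
- Immutable: messages are usually immutable data structures, so sender and receiver never share mutable state.
- Addressed: a receiver is identified by its "mailing address"; an actor can only communicate with actors whose address it holds.
Mailbox
Every actor has a mailbox that stores the messages it receives. The actor takes messages out of the mailbox one at a time, in order, which keeps state changes controlled and absorbs the asynchrony of message delivery.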
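The Actor model in ThingsBoard
An actor has three core parts: state, behavior, and a mailbox.
ThingsBoard abstracts every functional module of the system as an independent actor.
- Each actor keeps its own variables (state), and that state is managed only by the actor itself. Actors share no memory and communicate purely by message passing.
- The mailbox is the communication bridge between actors. Internally it stores messages in a FIFO (first in, first out) queue. A sending actor uses the `tell` method to deliver a message asynchronously to the receiving actor's mailbox; the receiver takes messages from the mailbox in order and runs the corresponding logic (behavior) to change its own state.
- ThingsBoard actors are organized in a strict hierarchy. The root is the `App Actor`, which manages all tenant actors; each `Tenant Actor` in turn manages the device actors and rule chain actors of that tenant.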
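The state / mailbox / `tell` triad can be sketched in a few lines. This is a deliberately simplified stand-in, not ThingsBoard's `TbActorMailbox`: the class name `CounterActor` is hypothetical, and messages are drained on the calling thread, whereas a real actor system would hand the drain loop to an executor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical minimal actor: private state, a FIFO mailbox, and tell(). */
class CounterActor {
    private final ConcurrentLinkedQueue<String> mailbox = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean busy = new AtomicBoolean(false);
    // Private state: only touched by whichever thread currently holds the busy flag.
    private final List<String> processed = new ArrayList<>();

    /** Enqueue a message, then try to drain the mailbox. */
    void tell(String msg) {
        mailbox.add(msg);
        tryProcessQueue();
    }

    private void tryProcessQueue() {
        // The busy flag guarantees that at most one thread drains at a time.
        while (!mailbox.isEmpty() && busy.compareAndSet(false, true)) {
            try {
                String msg;
                while ((msg = mailbox.poll()) != null) {
                    processed.add(msg); // the "behavior": here it just records the message
                }
            } finally {
                busy.set(false);
            }
        }
    }

    List<String> processed() {
        return processed;
    }
}
```

The outer loop re-checks the mailbox after releasing the flag, so a message enqueued while another thread was finishing its drain is not left behind.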
\[N_ThingsBoard 业务源码分析\]
Rule chain messages arrive through the cluster queue subscription. Call stack:
org.thingsboard.server.service.queue.DefaultTbCoreConsumerService#forwardToDeviceActor
org.thingsboard.server.actors.ActorSystemContext#tell
org.thingsboard.server.actors.TbActorMailbox#tell(org.thingsboard.server.common.msg.TbActorMsg)
org.thingsboard.server.actors.TbActorMailbox#enqueue
org.thingsboard.server.actors.TbActorMailbox#tryProcessQueue
org.thingsboard.server.actors.TbActorMailbox#processMailbox
org.thingsboard.server.actors.app.AppActor#doProcess
which eventually reaches this method:
org.thingsboard.server.actors.ruleChain.RuleChainActor#doProcess
java
@Override
protected boolean doProcess(TbActorMsg msg) {
switch (msg.getMsgType()) {
case COMPONENT_LIFE_CYCLE_MSG:
onComponentLifecycleMsg((ComponentLifecycleMsg) msg);
break;
case QUEUE_TO_RULE_ENGINE_MSG: // queue -> rule chain
processor.onQueueToRuleEngineMsg((QueueToRuleEngineMsg) msg);
break;
case RULE_TO_RULE_CHAIN_TELL_NEXT_MSG: // rule node -> rule chain (tellNext)
processor.onTellNext((RuleNodeToRuleChainTellNextMsg) msg);
break;
case RULE_CHAIN_TO_RULE_CHAIN_MSG:
processor.onRuleChainToRuleChainMsg((RuleChainToRuleChainMsg) msg);
break;
case RULE_CHAIN_INPUT_MSG:
processor.onRuleChainInputMsg((RuleChainInputMsg) msg);
break;
case RULE_CHAIN_OUTPUT_MSG:
processor.onRuleChainOutputMsg((RuleChainOutputMsg) msg);
break;
case PARTITION_CHANGE_MSG:
processor.onPartitionChangeMsg((PartitionChangeMsg) msg);
break;
case STATS_PERSIST_TICK_MSG:
onStatsPersistTick(id);
break;
default:
return false;
}
return true;
}
org.thingsboard.server.actors.ruleChain.RuleChainActorMessageProcessor#onQueueToRuleEngineMsg
java
/**
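* Entry point for messages arriving from the queue (dispatched by RuleChainActor.doProcess)
* Two cases:
* 1. relationTypes is empty → a brand-new message; start from firstNode
* 2. relationTypes is non-empty → a retried/resumed message; continue from the ruleNodeId carried by the message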
*/
void onQueueToRuleEngineMsg(QueueToRuleEngineMsg envelope) {
TbMsg msg = envelope.getMsg();
// Check whether the message is still valid (its callback has not been cancelled)
if (!checkMsgValid(msg)) {
return;
}
log.trace("[{}][{}] Processing message [{}]: {}", entityId, firstId, msg.getId(), msg);
if (envelope.getRelationTypes() == null || envelope.getRelationTypes().isEmpty()) {
// First entry into the rule chain: useRuleNodeIdFromMsg=true, but msg.getRuleNodeId()==null → route to firstNode
onTellNext(msg, true);
} else {
// The message re-enters via tellNext after a node processed it, or is resumed from a queue retry; it carries the target nodeId and relationTypes
onTellNext(msg, envelope.getMsg().getRuleNodeId(), envelope.getRelationTypes(), envelope.getFailureMessage());
}
}
First entry
java
/**
* Routes the message to a concrete RuleNode
* @param useRuleNodeIdFromMsg true = look up the target node from msg.getRuleNodeId() (firstNode is used when it is null)
*/
private void onTellNext(TbMsg msg, boolean useRuleNodeIdFromMsg) {
try {
// Check that the rule chain component is ACTIVE; otherwise a RuleNodeException is thrown
checkComponentStateActive(msg);
RuleNodeId targetId = useRuleNodeIdFromMsg ? msg.getRuleNodeId() : null;
RuleNodeCtx targetCtx;
if (targetId == null) {
// No target node specified → start from the entry node firstNode
// Also reset ruleNodeId on the msg and bind it to the current ruleChainId
targetCtx = firstNode; // in the chain inspected here this is org.thingsboard.rule.engine.filter.TbMsgTypeSwitchNode
msg = msg.copy()
.ruleChainId(entityId)
.resetRuleNodeId()
.build();
} else {
// A target node ID is already known (for example, resuming after tellNext)
targetCtx = nodeActors.get(targetId);
}
if (targetCtx != null) {
log.trace("[{}][{}] Pushing message to target rule node", entityId, targetId);
// Build a DefaultTbContext and tell a RuleChainToRuleNodeMsg to the target RuleNodeActor
pushMsgToNode(targetCtx, msg, NA_RELATION_TYPE);
} else {
// The target node does not exist (it may have been deleted): ack as success so the batch is not blocked
log.trace("[{}][{}] Rule node does not exist. Probably old message", entityId, targetId);
msg.getCallback().onSuccess();
}
} catch (RuleNodeException rne) {
msg.getCallback().onFailure(rne);
} catch (Exception e) {
msg.getCallback().onFailure(new RuleEngineException(e.getMessage(), e));
}
}
Node re-entry
org.thingsboard.server.actors.ruleChain.RuleChainActorMessageProcessor#onTellNext(org.thingsboard.server.common.msg.TbMsg, org.thingsboard.server.common.data.id.RuleNodeId, java.util.Set<java.lang.String>, java.lang.String)
java
/**
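* Core routing logic: look up nodeRoutes by originatorNodeId, then filter the downstream nodes by relationType
* Three branches:
* 0 downstream → terminal node; invoke callback onSuccess or onFailure (processing ends)
* 1 downstream → pushToTarget directly (in-memory delivery within the same partition, or enqueue across partitions)
* N downstream → wrap the callback in MultipleTbQueueTbMsgCallbackWrapper; each downstream copy reports back independently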
*/
private void onTellNext(TbMsg msg, RuleNodeId originatorNodeId, Set<String> relationTypes, String failureMessage) {
try {
checkComponentStateActive(msg);
EntityId entityId = msg.getOriginator();
// Resolve the partition the message belongs to (decides whether cross-partition enqueueing is needed)
TopicPartitionInfo tpi = systemContext.resolve(tenantId, entityId, msg);
List<RuleNodeRelation> ruleNodeRelations = nodeRoutes.get(originatorNodeId);
if (ruleNodeRelations == null) {
log.warn("[{}][{}][{}] No outbound relations (null). Probably rule node does not exist. Probably old message.", tenantId, entityId, msg.getId());
ruleNodeRelations = Collections.emptyList();
}
// Keep only the outbound connections whose type matches relationTypes (case-insensitive)
List<RuleNodeRelation> relationsByTypes = ruleNodeRelations.stream()
.filter(r -> contains(relationTypes, r.getType()))
.collect(Collectors.toList());
int relationsCount = relationsByTypes.size();
if (relationsCount == 0) {// 0 downstream
// The current node is a leaf (no matching downstream)
log.trace("[{}][{}][{}] No outbound relations to process", tenantId, entityId, msg.getId());
if (relationTypes.contains(TbNodeConnectionType.FAILURE)) {
// Reached a leaf via the Failure relationType → the whole message has failed
// → callback.onFailure() → packCtx.onFailure() → pendingCount--
RuleNodeCtx ruleNodeCtx = nodeActors.get(originatorNodeId);
if (ruleNodeCtx != null) {
msg.getCallback().onFailure(new RuleNodeException(failureMessage, ruleChainName, ruleNodeCtx.getSelf()));
} else {
log.debug("[{}] Failure during message processing by Rule Node [{}]. Enable and see debug events for more info", entityId, originatorNodeId.getId());
msg.getCallback().onFailure(new RuleEngineException("Failure during message processing by Rule Node [" + originatorNodeId.getId().toString() + "]"));
}
} else {
// Reached a leaf via Success or another relationType → the message ends normally
// → callback.onSuccess() → packCtx.onSuccess() → pendingCount--
// → if pendingCount==0 then latch.countDown(), and the await() in processMsgs is released
msg.getCallback().onSuccess();
}
} else if (relationsCount == 1) {// 1 downstream
for (RuleNodeRelation relation : relationsByTypes) {
log.trace("[{}][{}][{}] Pushing message to single target: [{}]", tenantId, entityId, msg.getId(), relation.getOut());
// Single downstream: route directly; the callback is not copied
pushToTarget(tpi, msg, relation.getOut(), relation.getType());
}
} else {// N downstream
// Multiple downstreams (for example Success→A and Success→B at the same time):
// wrap the original callback in MultipleTbQueueTbMsgCallbackWrapper; it fires only after every copy has reported back
MultipleTbQueueTbMsgCallbackWrapper callbackWrapper = new MultipleTbQueueTbMsgCallbackWrapper(relationsCount, msg.getCallback());
log.trace("[{}][{}][{}] Pushing message to multiple targets: [{}]", tenantId, entityId, msg.getId(), relationsByTypes);
for (RuleNodeRelation relation : relationsByTypes) {
EntityId target = relation.getOut();
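// Multiple downstreams always go through the queue (each branch gets a copy of msg with a new UUID, so branches never mutate the same object concurrently)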
putToQueue(tpi, msg, callbackWrapper, target);
}
}
} catch (RuleNodeException rne) {
msg.getCallback().onFailure(rne);
} catch (Exception e) {
log.warn("[" + tenantId + "]" + "[" + entityId + "]" + "[" + msg.getId() + "]" + " onTellNext failure", e);
msg.getCallback().onFailure(new RuleEngineException("onTellNext - " + e.getMessage(), e));
}
}
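The three-way branch above can be reduced to a small, self-contained sketch. `RelationRouting`, `Relation`, and the returned strings are hypothetical stand-ins for illustration only; the case-insensitive match mirrors the comment in the listing, and the sketch does not model partitions or callbacks.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical sketch of the 0 / 1 / N downstream decision in onTellNext. */
class RelationRouting {
    /** Stand-in for RuleNodeRelation: a typed edge to a target node. */
    static final class Relation {
        final String type;
        final String target;

        Relation(String type, String target) {
            this.type = type;
            this.target = target;
        }
    }

    /** Case-insensitive membership test for a relation type. */
    static boolean contains(Set<String> relationTypes, String type) {
        return relationTypes.stream().anyMatch(t -> t.equalsIgnoreCase(type));
    }

    /** Returns a textual description of what the rule chain would do. */
    static String route(List<Relation> outbound, Set<String> relationTypes) {
        List<Relation> matched = outbound.stream()
                .filter(r -> contains(relationTypes, r.type))
                .collect(Collectors.toList());
        if (matched.isEmpty()) {
            // Leaf node: the message ends here, as failure or as success.
            return relationTypes.contains("Failure") ? "FAIL" : "ACK";
        }
        if (matched.size() == 1) {
            return "PUSH:" + matched.get(0).target; // direct delivery
        }
        // Several targets: one copy per branch goes through the queue.
        return "ENQUEUE:" + matched.stream().map(r -> r.target).collect(Collectors.joining(","));
    }
}
```

For example, a node with two outgoing `Success` connections yields `ENQUEUE:A,B`, while the same node emitting `Failure` with no `Failure` connection yields `FAIL`.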
Single downstream
org.thingsboard.server.actors.ruleChain.RuleChainActorMessageProcessor#pushToTarget
java
/**
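* Dispatch to a single downstream
* Same partition: direct in-memory delivery (pushMsgToNode / RuleChainToRuleChainMsg), no serialization overhead
* Other partition: serialize and write to the queue; the consumerLoop of the target partition picks it up and continues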
*/
private void pushToTarget(TopicPartitionInfo tpi, TbMsg msg, EntityId target, String fromRelationType) {
if (tpi.isMyPartition()) {
switch (target.getEntityType()) {
case RULE_NODE:
// Same partition: tell directly → RuleNodeActor.onRuleChainToRuleNodeMsg()
pushMsgToNode(nodeActors.get(new RuleNodeId(target.getId())), msg, fromRelationType);
break;
case RULE_CHAIN:
// Jump to another rule chain: tell TenantActor → target RuleChainActor.onRuleChainToRuleChainMsg()
parent.tell(new RuleChainToRuleChainMsg(new RuleChainId(target.getId()), entityId, msg, fromRelationType));
break;
}
} else {
// The target is in another partition (cluster mode): write to the queue for that partition's consumer
putToQueue(tpi, msg, new TbQueueTbMsgCallbackWrapper(msg.getCallback()), target);
}
}
Multiple downstreams
org.thingsboard.server.actors.ruleChain.RuleChainActorMessageProcessor#putToQueue(org.thingsboard.server.common.msg.queue.TopicPartitionInfo, org.thingsboard.server.common.msg.TbMsg, org.thingsboard.server.queue.TbQueueCallback, org.thingsboard.server.common.data.id.EntityId)
java
/**
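* Multi-downstream case: copy the message for the target (new UUID), serialize it, and write it to the queue
* Downstream is a RULE_NODE → record the target ruleNodeId; after consumption, execution resumes at that node
* Downstream is a RULE_CHAIN → record the target ruleChainId; after consumption, execution starts at the target chain's firstNode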
*/
private void putToQueue(TopicPartitionInfo tpi, TbMsg msg, TbQueueCallback callbackWrapper, EntityId target) {
switch (target.getEntityType()) {
case RULE_NODE:
putToQueue(tpi, msg.copy()
.id(UUID.randomUUID())
.ruleChainId(entityId)
.ruleNodeId(new RuleNodeId(target.getId()))
.build(), callbackWrapper);
break;
case RULE_CHAIN:
putToQueue(tpi, msg.copy()
.id(UUID.randomUUID())
.ruleChainId(new RuleChainId(target.getId()))
.resetRuleNodeId()
.build(), callbackWrapper);
break;
}
}
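The fan-out relies on one shared callback that fires the original callback only once all copies have reported back. A minimal countdown sketch follows; `CountdownCallback` is a hypothetical name, and it assumes the wrapper does nothing more than count successes, which is a simplification of `MultipleTbQueueTbMsgCallbackWrapper` (failure handling is left out).

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: run a completion action after N independent successes. */
class CountdownCallback {
    private final AtomicInteger remaining;
    private final Runnable onAllDone;

    CountdownCallback(int count, Runnable onAllDone) {
        this.remaining = new AtomicInteger(count);
        this.onAllDone = onAllDone;
    }

    /** Called once per branch; only the last caller triggers the original callback. */
    void onSuccess() {
        if (remaining.decrementAndGet() == 0) {
            onAllDone.run();
        }
    }
}
```

Because the decrement is atomic, the branches may report from different threads and the completion action still runs exactly once.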
org.thingsboard.server.actors.ruleChain.RuleChainActorMessageProcessor#putToQueue(org.thingsboard.server.common.msg.queue.TopicPartitionInfo, org.thingsboard.server.common.msg.TbMsg, org.thingsboard.server.queue.TbQueueCallback)
java
/**
* Serialize the message into a ToRuleEngineMsg proto and write it to the target topic/partition through clusterService
* Consumer side: TbRuleEngineQueueConsumerManager.processMsgs() → submitMessage()
*/
private void putToQueue(TopicPartitionInfo tpi, TbMsg newMsg, TbQueueCallback callbackWrapper) {
ToRuleEngineMsg toQueueMsg = ToRuleEngineMsg.newBuilder()
.setTenantIdMSB(tenantId.getId().getMostSignificantBits())
.setTenantIdLSB(tenantId.getId().getLeastSignificantBits())
.setTbMsgProto(TbMsg.toProto(newMsg))
.build();
clusterService.pushMsgToRuleEngine(tpi, newMsg.getId(), toQueueMsg, callbackWrapper);
}
Messages routed from the rule chain to a rule node
java
/**
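* Entry point for messages the rule chain routes to this node (main processing path)
* Triggered by RuleChainActorMessageProcessor.pushToTarget() through an actor tell
*
* Flow:
* 1. Singleton partition check: if this node is configured with singletonMode and the current service does not own its partition
* → call putToNodePartition() to serialize and re-enqueue the message so it reaches the right service instance
* 2. If this node's partition is local:
* a. callback.onProcessingStart(info): record this node as the "last visited node" for timeout logs
* b. checkComponentStateActive: if the node is not active (init failed / stopped), fail via callback.onFailure
* c. Count the rule nodes this message has passed through; above the tenant's configured limit, fail via callback.onFailure
* d. tbNode.onMsg(ctx, tbMsg): invoke the concrete rule node implementation to run the business logic
* The implementation reports its result via ctx.tellSuccess / ctx.tellNext / ctx.tellFailure / ctx.ack
* e. If tbNode.onMsg() throws, call ctx.tellFailure() to route along the FAILURE connection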
*/
void onRuleChainToRuleNodeMsg(RuleChainToRuleNodeMsg msg) throws Exception {
if (!isMyNodePartition()) {
// Singleton mode (singletonMode=true) and this instance does not own the partition → forward to the right partition
putToNodePartition(msg.getMsg());
} else {
// Record the last visited node, used for diagnostics on timeout/failure
msg.getMsg().getCallback().onProcessingStart(info);
// Fail immediately if the node is not ready; tbNode is not invoked
checkComponentStateActive(msg.getMsg());
TbMsg tbMsg = msg.getMsg();
// Increment the per-message node execution counter to stop messages from looping forever
int ruleNodeCount = tbMsg.getAndIncrementRuleNodeCounter();
var tenantProfileConfiguration = getTenantProfileConfiguration();
int maxRuleNodeExecutionsPerMessage = tenantProfileConfiguration.getMaxRuleNodeExecsPerMessage();
if (maxRuleNodeExecutionsPerMessage == 0 || ruleNodeCount < maxRuleNodeExecutionsPerMessage) {
// Report API usage statistics (used for quota control)
apiUsageClient.report(tenantId, tbMsg.getCustomerId(), ApiUsageRecordKey.RE_EXEC_COUNT);
// In Debug mode, persist the input event (shown in the rule chain debug UI)
persistDebugInputIfAllowed(msg.getMsg(), msg.getFromRelationType());
try {
// Core call: run the concrete rule node logic (Save Timeseries / Filter / Transform, etc.)
// The node reports its result back to the rule chain via ctx.tellSuccess/tellNext/tellFailure/ack
tbNode.onMsg(msg.getCtx(), msg.getMsg());
} catch (Exception e) {
// The node threw → route along the FAILURE connection, which may end in callback.onFailure
msg.getCtx().tellFailure(msg.getMsg(), e);
}
} else {
// Maximum node executions exceeded: mark the message as failed (prevents infinite loops)
tbMsg.getCallback().onFailure(new RuleNodeException("Message is processed by more than " + maxRuleNodeExecutionsPerMessage + " rule nodes!", ruleChainName, ruleNode));
}
}
}
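The loop guard in the listing boils down to a post-increment counter compared against a limit, where a limit of 0 means "unlimited". A minimal sketch, with the hypothetical names `ExecutionLimit` and `mayExecute`:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of the per-message rule node execution limit. */
class ExecutionLimit {
    /**
     * Increments the counter and reports whether this execution is still allowed.
     * The value before the increment is compared, and max == 0 disables the limit.
     */
    static boolean mayExecute(AtomicInteger counter, int max) {
        int executedSoFar = counter.getAndIncrement();
        return max == 0 || executedSoFar < max;
    }
}
```

With `max = 2`, the first two calls return `true` and the third returns `false`, which is the point at which the real code fails the message instead of invoking the node.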
Owned by a node in another partition: forward to the correct partition
org.thingsboard.server.actors.ruleChain.RuleNodeActorMessageProcessor#putToNodePartition
java
/**
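* Routes the message to the node's partition on the correct service instance (cross-service forwarding for singleton nodes)
*
* Triggered only when singletonMode=true and the current service does not own the node's partition.
* Typical case: in a cluster, a rule node is configured as a global singleton and runs on one service instance only;
* other instances that receive a message must forward it there.
* Flow:
* 1. Copy the message and set ruleNodeId to this node's ID (so the target service routes to the right node)
* 2. Resolve the TPI (target partition) for the node ID; that partition belongs to the instance running the singleton
* 3. Serialize and push to the Rule Engine queue (after consumption the target instance enters onRuleChainToRuleNodeMsg again)
* 4. ack(source): release the callback count of the current message (it is done on this instance; the target instance tracks it from here)
* // After processing, the message is routed back; see RuleChainActorMessageProcessor.pushToTarget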
*/
private void putToNodePartition(TbMsg source) {
// Copy the message and record this node as ruleNodeId (the target instance uses it to find the matching RuleNodeActor)
TbMsg tbMsg = TbMsg.newMsg(source, source.getQueueName(), source.getRuleChainId(), entityId);
// Resolve the partition using the node ID as originator; this yields the partition of the instance that runs the singleton node
TopicPartitionInfo tpi = systemContext.resolve(ServiceType.TB_RULE_ENGINE, tbMsg.getQueueName(), tenantId, ruleNode.getId());
TransportProtos.ToRuleEngineMsg toQueueMsg = TransportProtos.ToRuleEngineMsg.newBuilder()
.setTenantIdMSB(tenantId.getId().getMostSignificantBits())
.setTenantIdLSB(tenantId.getId().getLeastSignificantBits())
.setTbMsgProto(TbMsg.toProto(tbMsg))
.build();
// Push to the queue; the service instance that owns the target partition consumes it
systemContext.getClusterService().pushMsgToRuleEngine(tpi, tbMsg.getId(), toQueueMsg, null);
// The original message is finished on this instance (it has been forwarded): ack to release the packCtx count
defaultCtx.ack(source);
}
Messages a node sends to itself
java
/**
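* Handles delayed messages a node sends to itself (ctx.tellSelf(msg, delayMs))
* Typically used for scheduled retries or delayed execution (for example, re-evaluating after waiting for a device response)
* Same logic as onRuleChainToRuleNodeMsg, but without the partition check (the message is always local to this node)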
*/
public void onRuleToSelfMsg(RuleNodeToSelfMsg msg) throws Exception {
checkComponentStateActive(msg.getMsg());
TbMsg tbMsg = msg.getMsg();
int ruleNodeCount = tbMsg.getAndIncrementRuleNodeCounter();
var tenantProfileConfiguration = getTenantProfileConfiguration();
int maxRuleNodeExecutionsPerMessage = tenantProfileConfiguration.getMaxRuleNodeExecsPerMessage();
if (maxRuleNodeExecutionsPerMessage == 0 || ruleNodeCount < maxRuleNodeExecutionsPerMessage) {
apiUsageClient.report(tenantId, tbMsg.getCustomerId(), ApiUsageRecordKey.RE_EXEC_COUNT);
persistDebugInputIfAllowed(msg.getMsg(), "Self");
try {
tbNode.onMsg(defaultCtx, msg.getMsg());
} catch (Exception e) {
defaultCtx.tellFailure(msg.getMsg(), e);
}
} else {
tbMsg.getCallback().onFailure(new RuleNodeException("Message is processed by more than " + maxRuleNodeExecutionsPerMessage + " rule nodes!", ruleChainName, ruleNode));
}
}
Persisting debug data in Debug mode
org.thingsboard.server.actors.ruleChain.DefaultTbContext#tellNext(org.thingsboard.server.common.msg.TbMsg, java.util.Set<java.lang.String>)
java
/**
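* Core output method of a rule node: routes the message to downstream nodes by the given relation types
* Called by the rule node implementation (TbNode) once its business logic is done
*
* Flow:
* 1. persistDebugOutput: in Debug mode, persist the output event (shown in the debug UI)
* 2. callback.onProcessingEnd(ruleNodeId): tell TbMsgPackProcessingContext that this node has finished
* (updates the last-visited-node info used in timeout logs; the message is not counted as completed yet and stays in pendingMap)
* 3. tell RuleNodeToRuleChainTellNextMsg to RuleChainActor:
* RuleChainActorMessageProcessor.onTellNext() finds the downstream nodes by relationTypes and keeps routing
* With no downstream node → callback.onSuccess() fires, the message leaves pendingMap, and the packCtx count drops by one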
*/
@Override
public void tellNext(TbMsg msg, Set<String> relationTypes) {
RuleNode ruleNode = nodeCtx.getSelf();
persistDebugOutput(msg, relationTypes);
// Mark the current node as finished and update the diagnostic info (does not change the packCtx count)
msg.getCallback().onProcessingEnd(ruleNode.getId());
// Hand the routing decision back to the rule chain actor, which finds the downstream nodes and dispatches to them
nodeCtx.getChainActor().tell(new RuleNodeToRuleChainTellNextMsg(ruleNode.getRuleChainId(), ruleNode.getId(), relationTypes, msg, null));
}
In-memory and Kafka queues
QueueFactory selection (injected by Spring at startup)
| Factory | Condition |
|---|---|
| InMemoryMonolithQueueFactory | queue.type=in-memory + service.type=monolith |
| KafkaMonolithQueueFactory | queue.type=kafka + service.type=monolith |
| KafkaTbRuleEngineQueueFactory | queue.type=kafka + service.type=tb-rule-engine |
Custom nodes
https://thingsboard.io/docs/user-guide/contribution/rule-node-development/
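Core concepts
In ThingsBoard:
- Every piece of data is wrapped in a `TbMsg`
- Each node processes one message at a time
- The output is sent to the next node (along a relation)
| ThingsBoard | Analogy |
|---|---|
| Rule Chain | Workflow DAG |
| Rule Node | Function node |
| TbMsg | Message object |
| relationType | Branch condition |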
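Node lifecycle
text
init() → initialization (parse the configuration)
onMsg() → message handling (where the core logic lives)
destroy() → teardown
Core logic (onMsg)
This is where the business logic actually goes: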
java
@Override
public void onMsg(TbContext ctx, TbMsg msg) {
// process msg
}
ThingsBoard does not continue the flow automatically; the node must explicitly call:
ctx.tellSuccess(msg);
or:
ctx.tellNext(msg, "True");
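There are two further ways to move a message on:
ctx.tellSelf // only for "wake this node up after a delay"; one per node (a single node instance has only one)
ctx.enqueueForTellNext // goes through the message queue (Kafka or the in-memory queue); typically used to trigger downstream nodes; N per trigger batch (proportional to the number of devices, consumed asynchronously)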
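The rule that every path through `onMsg` must end in an explicit `tell*` call can be shown with stand-in types. Everything here is hypothetical scaffolding: `Ctx` imitates only the two `TbContext` methods used, messages are plain strings, and `ThresholdNodeSketch` is not a real ThingsBoard node.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a filter-style node that always ends in one tell* call. */
class ThresholdNodeSketch {
    /** Stand-in for the small part of TbContext used here. */
    interface Ctx {
        void tellNext(String msg, String relationType);
        void tellFailure(String msg, Throwable error);
    }

    /** Mimics onMsg: parse the payload, then leave through True, False, or Failure. */
    static void onMsg(Ctx ctx, String msg, double threshold) {
        try {
            double value = Double.parseDouble(msg);
            ctx.tellNext(msg, value > threshold ? "True" : "False");
        } catch (NumberFormatException e) {
            // Without this call the message would never leave the node.
            ctx.tellFailure(msg, e);
        }
    }

    /** Test double that records the relation each message left through. */
    static class RecordingCtx implements Ctx {
        final List<String> out = new ArrayList<>();

        @Override
        public void tellNext(String msg, String relationType) {
            out.add(relationType);
        }

        @Override
        public void tellFailure(String msg, Throwable error) {
            out.add("Failure");
        }
    }
}
```

Feeding `"30"`, `"10"`, and `"abc"` with a threshold of 25 leaves through `True`, `False`, and `Failure` respectively.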
Node definition (@RuleNode annotation)
This declares the node's metadata, much like registering a front-end component.
java
/**
* Copyright © 2016-2026 The Thingsboard Authors
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.thingsboard.rule.engine.api;
import org.thingsboard.server.common.data.msg.TbNodeConnectionType;
import org.thingsboard.server.common.data.plugin.ComponentClusteringMode;
import org.thingsboard.server.common.data.plugin.ComponentScope;
import org.thingsboard.server.common.data.plugin.ComponentType;
import org.thingsboard.server.common.data.rule.RuleChainType;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
public @interface RuleNode {
ComponentType type();
String name();
String nodeDescription();
String nodeDetails();
Class<? extends NodeConfiguration> configClazz();
ComponentClusteringMode clusteringMode() default ComponentClusteringMode.ENABLED;
boolean hasQueueName() default false;
boolean inEnabled() default true;
boolean outEnabled() default true;
ComponentScope scope() default ComponentScope.TENANT;
String[] relationTypes() default {TbNodeConnectionType.SUCCESS, TbNodeConnectionType.FAILURE};
String[] uiResources() default {};
String configDirective() default "";
String icon() default "";
String iconUrl() default "";
String docUrl() default "";
boolean customRelations() default false; // allow custom relation types; otherwise only Success or Failure
boolean ruleChainNode() default false;
RuleChainType[] ruleChainTypes() default {RuleChainType.CORE, RuleChainType.EDGE};
int version() default 0;
}
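| Field | Meaning |
|---|---|
| type | Node category (affects UI grouping) |
| name | Node name |
| relationTypes | Output branches (very important) |
| configClazz | Configuration class |
| configDirective | UI configuration directive (optional) |
relationTypes = "the kinds of exits this node can produce"
For example:
- True / False
- Success / Failure
Core capabilities
Reading the input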
java
msg.getData() // JSON payload
msg.getMetaData() // metadata
msg.getOriginator() // originator (for example, the device)
Using platform services
java
ctx.getDeviceService() // device service
ctx.getTelemetryService() // telemetry service
ctx.getAlarmService() // alarm service
See org.thingsboard.rule.engine.api.TbContext
java
RuleNodeId getSelfId();
RuleNode getSelf();
String getRuleChainName();
String getQueueName();
TenantId getTenantId();
AttributesService getAttributesService();
CustomerService getCustomerService();
TenantService getTenantService();
UserService getUserService();
AssetService getAssetService();
DeviceService getDeviceService();
DeviceProfileService getDeviceProfileService();
AssetProfileService getAssetProfileService();
DeviceCredentialsService getDeviceCredentialsService();
DeviceStateManager getDeviceStateManager();
String getDeviceStateNodeRateLimitConfig();
TbClusterService getClusterService();
DashboardService getDashboardService();
RuleEngineAlarmService getAlarmService();
AlarmCommentService getAlarmCommentService();
RuleChainService getRuleChainService();
RuleEngineRpcService getRpcService();
RuleEngineTelemetryService getTelemetryService();
TimeseriesService getTimeseriesService();
RelationService getRelationService();
EntityViewService getEntityViewService();
ResourceService getResourceService();
TbResourceDataCache getTbResourceDataCache();
OtaPackageService getOtaPackageService();
RuleEngineDeviceProfileCache getDeviceProfileCache();
RuleEngineAssetProfileCache getAssetProfileCache();
EdgeService getEdgeService();
EdgeEventService getEdgeEventService();
QueueService getQueueService();
QueueStatsService getQueueStatsService();
ListeningExecutor getMailExecutor();
ListeningExecutor getSmsExecutor();
ListeningExecutor getDbCallbackExecutor();
ListeningExecutor getExternalCallExecutor();
ListeningExecutor getNotificationExecutor();
ExecutorProvider getPubSubRuleNodeExecutorProvider();
MailService getMailService(boolean isSystem);
SmsService getSmsService();
SmsSenderFactory getSmsSenderFactory();
NotificationCenter getNotificationCenter();
NotificationTargetService getNotificationTargetService();
NotificationTemplateService getNotificationTemplateService();
NotificationRequestService getNotificationRequestService();
NotificationRuleService getNotificationRuleService();
OAuth2ClientService getOAuth2ClientService();
DomainService getDomainService();
MobileAppService getMobileAppService();
MobileAppBundleService getMobileAppBundleService();
SlackService getSlackService();
CalculatedFieldService getCalculatedFieldService();
RuleEngineCalculatedFieldQueueService getCalculatedFieldQueueService();
JobService getJobService();
JobManager getJobManager();
ApiKeyService getApiKeyService();
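Service calls are usually asynchronous, so:
- Do not block; use callbacks
Data from the same device always goes to the same node instance (as long as the partition assignment does not change), so:
- Per-device caching and state management are safe
Output messages
Creating a new message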
java
TbMsg newMsg = TbMsg.transformMsg(msg, newData);
ctx.tellNext(newMsg, "Success");
A sample project
Clone the repository and navigate to the repo folder:
sh
git clone -b release-4.3 https://github.com/thingsboard/rule-node-examples
cd rule-node-examples
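Custom implementation of a "device timer" feature
Surprisingly, TB rule chains have no built-in scheduled trigger. Switching a device on and off on a schedule, for example, is a very common scenario that cannot be implemented out of the box.
I considered many options. The trigger has to fire on time, but without polling the database. An external scheduler such as xxl-job could fire it from outside, but that does not fit TB's design: it is hard to tie in with TB's internal entities and data, and it extends poorly.
The generator node is similar: it periodically emits a message downstream, so its implementation is a useful reference.
Two core questions:
- Where is the timer stored or registered so that, like an interrupt, waiting costs no CPU?
- How do we make sure nothing is lost when the cluster changes or the service restarts?
TbMsgGeneratorNode as a reference
Normally the rule engine is event-driven: a device sends a message and the rule chain processes it. The generator node instead creates messages "out of nothing": at the fixed interval you configure, it generates a message and drives the downstream logic of the rule chain.
Its configuration has two parts: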
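- Period (configured in seconds, `periodInSeconds`): how often a message is generated (for example, 60 means once per minute).
- Generator script: a piece of JavaScript that defines the content of each generated message. The script must return an object containing msg (the payload), metadata, and msgType (the message type).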
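When the service restarts, or a new cluster node triggers repartitioning, how is the rule chain state rebuilt?
TbMsgGeneratorNode's answer is to rely on onPartitionChangeMsg to re-initialize: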
RuleNodeActorMessageProcessor.onPartitionChangeMsg():
├── the node does not belong to this partition → stop() → tbNode.destroy()
└── the node belongs to this partition
├── already running → tbNode.onPartitionChangeMsg(ctx, msg)
└── stopped → start() → tbNode.init(ctx, config) ← rebuilt here after a restart
org.thingsboard.server.actors.ruleChain.RuleNodeActorMessageProcessor#onPartitionChangeMsg
java
@Override
public void onPartitionChangeMsg(PartitionChangeMsg msg) throws Exception {
log.debug("[{}][{}] onPartitionChangeMsg: [{}]", tenantId, entityId, msg);
if (tbNode != null) {
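if (!isMyNodePartition()) {// the node no longer belongs to this partition. ctx.getSelfId() (the rule node's own ID) is used as the originator, so the rule node ID maps to one fixed partition: only one instance in the cluster owns the rule node, and it never fires twice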
stop(null);
tbNode = null;
} else {
tbNode.onPartitionChangeMsg(defaultCtx, msg);
}
} else if (isMyNodePartition()) {// the node belongs to this partition
start(null);
}
}
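In both entry points (init and onPartitionChangeMsg), TbMsgGeneratorNode schedules its next tick again,
which is simply a tellSelf: the node sends itself another delayed message.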
org.thingsboard.rule.engine.debug.TbMsgGeneratorNode#scheduleTickMsg
java
private void scheduleTickMsg(TbContext ctx, TbMsg msg) {
long curTs = System.currentTimeMillis();
if (lastScheduledTs == 0L) {
lastScheduledTs = curTs;
}
lastScheduledTs = lastScheduledTs + delay;
long curDelay = Math.max(0L, (lastScheduledTs - curTs));
TbMsg tickMsg = ctx.newMsg(queueName, TbMsgType.GENERATOR_NODE_SELF_MSG, ctx.getSelfId(),
getCustomerIdFromMsg(msg), TbMsgMetaData.EMPTY, TbMsg.EMPTY_STRING);
nextTickId = tickMsg.getId();
ctx.tellSelf(tickMsg, curDelay); // note: the node sends itself a delayed message; it is only held until the delay elapses and is lost if the service restarts
log.trace("[{}] Scheduled tick msg with delay {}, msg: {}, config: {}", originatorId, curDelay, tickMsg, config);
}
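The delay arithmetic in `scheduleTickMsg` keeps ticks on a fixed grid instead of drifting by the processing time of each tick. The same calculation in isolation, with the clock passed in so it can be checked deterministically (`TickScheduler` and `nextDelay` are hypothetical names for this sketch):

```java
/** Hypothetical sketch of the drift-free delay calculation used for self-ticks. */
class TickScheduler {
    private final long periodMs;
    private long lastScheduledTs = 0L;

    TickScheduler(long periodMs) {
        this.periodMs = periodMs;
    }

    /** Returns the delay until the next tick, given the current time in milliseconds. */
    long nextDelay(long nowMs) {
        if (lastScheduledTs == 0L) {
            lastScheduledTs = nowMs; // first call: anchor the grid at "now"
        }
        lastScheduledTs += periodMs; // next slot on the fixed grid
        // A slot that is already in the past fires immediately instead of with a negative delay.
        return Math.max(0L, lastScheduledTs - nowMs);
    }
}
```

With a 1000 ms period, a call at t=10000 returns 1000, a call at t=11050 returns 950 (the 50 ms of lateness is absorbed), and a call at t=13500 returns 0 because the slot at 13000 has already passed.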
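So, to answer the two questions above:
- Use `ctx.tellSelf(tickMsg, curDelay)` to register the timer as a delayed message to the node itself
- Rebuild the tick messages in the node's onPartitionChangeMsg callback
Full implementation: TbDeviceSchedulerNode
With the key problems solved, the same pattern can be copied. One lesson from experience: when the tick messages are rebuilt, also emit one compensation message for a trigger that was missed in the meantime.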
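Since `tellSelf` offers no way to cancel a pending message, the implementation invalidates old ticks by token instead. The idea in isolation: keep the ID of the latest tick per key, and treat any arriving tick whose ID differs as stale. `TickRegistry` and its methods are hypothetical names for this sketch; the full node keeps the same information in its `activeTickIds` map.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: cancel delayed ticks by invalidating their ID. */
class TickRegistry {
    // key (for example deviceId:triggerIndex) -> ID of the only tick still considered valid
    private final ConcurrentHashMap<String, UUID> active = new ConcurrentHashMap<>();

    /** Registers a new tick for the key; any earlier tick for that key becomes stale. */
    UUID schedule(String key) {
        UUID id = UUID.randomUUID();
        active.put(key, id);
        return id;
    }

    /** Cancels whatever tick is pending for the key. */
    void cancel(String key) {
        active.remove(key);
    }

    /** True only if the arriving tick is the most recently scheduled one for the key. */
    boolean isCurrent(String key, UUID tickId) {
        return tickId.equals(active.get(key));
    }
}
```

A stale tick still arrives when its delay elapses; it is simply acknowledged and dropped, so cancellation costs one map lookup per tick.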
java
/**
* Device schedule trigger node (per-device, exact-time version).
*
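* <p>On startup it scans once for the devices that belong to this rule chain and registers an independent delayed message (tellSelf) for every trigger of every device.
* Each tick fires exactly when its cron expression matches and re-registers the next occurrence afterwards. There is no global periodic scan at runtime.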
*
* <p>Example scheduleConfig attribute (stored as a device server attribute):
* <pre>{@code
* {
* "enabled": true,
* "triggers": [
* { "cron": "0 8 * * 1-5", "type": "open", "param": { "level": 100, "mode": "auto" } },
* { "cron": "0 22 * * *", "type": "close", "param": { "level": 0 } }
* ]
* }
* }</pre>
*
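* <p>Usage:
* The rule chain routes ATTRIBUTES_UPDATED (server scope, scheduleConfig key) to this node; the node acks the message after handling it and produces no output. Other attribute updates are passed through with tellSuccess, so other nodes are unaffected.
* The node cancels the old ticks and registers new ones; no restart is needed.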
*
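* <p>Cancellation: tellSelf has no cancel API, so a tick is cancelled by invalidating its UUID. A new tick overwrites the value in activeTickIds,
* and when the old tick arrives its UUID no longer matches, so it is silently dropped.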
*
*/
@Slf4j
@RuleNode(
type = ComponentType.ACTION,
name = "device schedule trigger",
configClazz = TbDeviceSchedulerNodeConfig.class,
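nodeDescription = "Registers an independent cron-based timer per device and emits a message of a custom type at the exact trigger time",
nodeDetails = "On startup, scans once for the devices whose profile belongs to this rule chain and registers per-device timer ticks.<br/>" +
"The schedule configuration is stored in a device server attribute (key configurable, default scheduleConfig).<br/>" +
"Format: {\"enabled\":true,\"triggers\":[{\"cron\":\"0 8 * * 1-5\",\"type\":\"open\"},...]}.<br/>" +
"cron uses 5 fields (minute hour day-of-month month day-of-week); type is the output connection name and is fully custom.<br/>" +
"Routing ATTRIBUTES_UPDATED messages to this node updates the schedule in real time (no restart needed).<br/>" +
"After a service restart, the last missed trigger of the current day is compensated automatically; downstream nodes may skip it when isReconcile=true.",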
inEnabled = true,
icon = "alarm"
)
public class TbDeviceSchedulerNode implements TbNode {
private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");
// Internal metadata keys carried by tick messages
private static final String META_TICK_KEY = "__tickKey";
private static final String META_DEVICE_ID = "__tickDeviceId";
private static final String META_CUSTOMER_ID = "__tickCustomerId";
private static final String META_CRON = "__tickCron";
private static final String META_TRIGGER_TYPE = "__tickTriggerType";
private static final String META_SCHED_TIME = "__tickScheduledTime";
private static final String META_PARAM = "__tickParam";
private TbDeviceSchedulerNodeConfig config;
private final AtomicBoolean initialized = new AtomicBoolean(false);
/** key = deviceId:triggerIndex; value = UUID of the currently valid tick. An old tick whose UUID does not match is silently dropped. */
private final ConcurrentHashMap<String, UUID> activeTickIds = new ConcurrentHashMap<>();
/** DeviceProfile IDs that belong to this rule chain, collected once at startup */
private volatile Set<DeviceProfileId> targetProfileIds = null;
@Override
public void init(TbContext ctx, TbNodeConfiguration configuration) throws TbNodeException {
config = TbNodeUtils.convert(configuration, TbDeviceSchedulerNodeConfig.class);
startIfMine(ctx);
}
@Override
public void onPartitionChangeMsg(TbContext ctx, PartitionChangeMsg msg) {
if (ctx.isLocalEntity(ctx.getSelfId())) {
if (initialized.compareAndSet(false, true)) {
startAsync(ctx);
}
} else {
initialized.set(false);
activeTickIds.clear();
targetProfileIds = null;
}
}
@Override
public void onMsg(TbContext ctx, TbMsg msg) {
if (!initialized.get()) {
ctx.ack(msg);
return;
}
String type = msg.getType();
if (TbMsgType.GENERATOR_NODE_SELF_MSG.name().equals(type)) {
handleTick(ctx, msg);
} else if (TbMsgType.ATTRIBUTES_UPDATED.name().equals(type)) {
handleAttributeUpdate(ctx, msg);
} else {
ctx.tellSuccess(msg);
}
}
@Override
public void destroy() {
initialized.set(false);
activeTickIds.clear();
}
// -------- Startup --------
private void startIfMine(TbContext ctx) {
if (ctx.isLocalEntity(ctx.getSelfId())) {
initialized.set(true);
startAsync(ctx);
}
}
/** Asynchronous: collect profiles → load devices and register ticks → reconcile (compensate missed triggers) */
private void startAsync(TbContext ctx) {
ctx.getDbCallbackExecutor().execute(() -> {
try {
refreshTargetProfiles(ctx);
loadAndScheduleAll(ctx);
reconcileAll(ctx);
} catch (Exception e) {
log.warn("[{}] Device scheduler startup failed", ctx.getSelfId(), e);
}
});
}
// -------- Tick handling --------
private void handleTick(TbContext ctx, TbMsg msg) {
String tickKey = msg.getMetaData().getValue(META_TICK_KEY);
if (tickKey == null) {
ctx.ack(msg);
return;
}
UUID expected = activeTickIds.get(tickKey);
if (expected == null || !msg.getId().equals(expected)) {
log.debug("[{}] Stale/cancelled tick discarded, key={}, msgId={}", ctx.getSelfId(), tickKey, msg.getId());
ctx.ack(msg);
return;
}
String deviceIdStr = msg.getMetaData().getValue(META_DEVICE_ID);
String customerIdStr = msg.getMetaData().getValue(META_CUSTOMER_ID);
String cron = msg.getMetaData().getValue(META_CRON);
String triggerType = msg.getMetaData().getValue(META_TRIGGER_TYPE);
String scheduledTime = msg.getMetaData().getValue(META_SCHED_TIME);
String paramJson = msg.getMetaData().getValue(META_PARAM);
log.debug("[{}] Tick fired: key={}, device={}, type={}, scheduledTime={}",
ctx.getSelfId(), tickKey, deviceIdStr, triggerType, scheduledTime);
ctx.getDbCallbackExecutor().execute(() -> {
try {
DeviceId deviceId = new DeviceId(UUID.fromString(deviceIdStr));
CustomerId customerId = parseCustomerId(customerIdStr);
emitTrigger(ctx, deviceId, customerId, triggerType, cron, scheduledTime, paramJson, false);
rescheduleAfterFire(ctx, tickKey, deviceIdStr, customerIdStr, cron, triggerType, paramJson);
} catch (Exception e) {
log.warn("[{}] Tick processing failed for key {}", ctx.getSelfId(), tickKey, e);
}
});
ctx.ack(msg);
}
/** After firing, register the next cron occurrence */
private void rescheduleAfterFire(TbContext ctx, String tickKey, String deviceIdStr,
String customerIdStr, String cron, String triggerType, String paramJson) {
if (!initialized.get()) return;
LocalDateTime now = LocalDateTime.now();
LocalDateTime next = nextCronTime(cron, now);
if (next == null) return;
long delayMs = ChronoUnit.MILLIS.between(now, next);
if (delayMs <= 0) {
log.warn("[{}] Non-positive delay for key={}, cron='{}', next={}, skipped", ctx.getSelfId(), tickKey, cron, next);
return;
}
TbMsgMetaData meta = buildTickMeta(tickKey, deviceIdStr, customerIdStr, cron, triggerType, next, paramJson);
TbMsg tick = ctx.newMsg(null, TbMsgType.GENERATOR_NODE_SELF_MSG,
ctx.getSelfId(), null, meta, TbMsg.EMPTY_STRING);
activeTickIds.put(tickKey, tick.getId());
ctx.tellSelf(tick, delayMs);
log.debug("[{}] Rescheduled: key={}, type={}, next={}, delayMs={}", ctx.getSelfId(), tickKey, triggerType, next, delayMs);
}
// -------- ATTRIBUTES_UPDATED handling --------
private void handleAttributeUpdate(TbContext ctx, TbMsg msg) {
try {
if (!EntityType.DEVICE.equals(msg.getOriginator().getEntityType())) {
ctx.tellSuccess(msg);
return;
}
JsonNode data = JacksonUtil.toJsonNode(msg.getData());
if (data == null || !data.has(config.getScheduleAttributeKey())) {
ctx.tellSuccess(msg); // attribute update without schedule config: pass through unchanged
return;
}
DeviceId deviceId = new DeviceId(msg.getOriginator().getId());
String deviceIdStr = deviceId.getId().toString();
String customerIdStr = msg.getCustomerId() != null ? msg.getCustomerId().getId().toString() : null;
cancelDeviceTicks(deviceIdStr);
log.debug("[{}] Cancelled existing ticks for device={}", ctx.getSelfId(), deviceIdStr);
DeviceScheduleConfig schedule = parseScheduleConfig(data.get(config.getScheduleAttributeKey()));
if (schedule != null && schedule.isEnabled() && schedule.getTriggers() != null) {
List<ScheduleTrigger> triggers = schedule.getTriggers();
log.debug("[{}] Re-registering {} trigger(s) for device={}", ctx.getSelfId(), triggers.size(), deviceIdStr);
for (int i = 0; i < triggers.size(); i++) {
scheduleDeviceTrigger(ctx, deviceIdStr, customerIdStr, triggers.get(i), i);
}
} else {
log.debug("[{}] Schedule disabled or empty for device={}, no ticks registered", ctx.getSelfId(), deviceIdStr);
}
} catch (Exception e) {
log.warn("[{}] Failed to handle attribute update", ctx.getSelfId(), e);
}
ctx.ack(msg);
}
// -------- One-time load at startup --------
private void loadAndScheduleAll(TbContext ctx) {
Set<DeviceProfileId> profiles = targetProfileIds;
if (profiles == null || profiles.isEmpty()) {
log.debug("[{}] No target profiles found, skipping schedule load", ctx.getSelfId());
return;
}
log.debug("[{}] Loading schedules for {} profile(s)", ctx.getSelfId(), profiles.size());
for (DeviceProfileId profileId : profiles) {
forEachDevice(ctx, profileId, device -> loadDeviceSchedule(ctx, device));
}
log.debug("[{}] Schedule load complete, active tick count={}", ctx.getSelfId(), activeTickIds.size());
}
private void loadDeviceSchedule(TbContext ctx, Device device) {
try {
Optional<AttributeKvEntry> attrOpt = ctx.getAttributesService()
.find(ctx.getTenantId(), device.getId(), AttributeScope.SERVER_SCOPE, config.getScheduleAttributeKey())
.get(30, TimeUnit.SECONDS); // bounded wait, same limit as forEachDevice; a timeout is caught below
if (attrOpt.isEmpty()) return;
String json = attrOpt.get().getJsonValue().or(() -> attrOpt.get().getStrValue()).orElse(null);
if (json == null) return;
DeviceScheduleConfig schedule = JacksonUtil.fromString(json, DeviceScheduleConfig.class);
if (schedule == null || !schedule.isEnabled() || schedule.getTriggers() == null) return;
String deviceIdStr = device.getId().getId().toString();
String customerIdStr = device.getCustomerId() != null ? device.getCustomerId().getId().toString() : null;
List<ScheduleTrigger> triggers = schedule.getTriggers();
for (int i = 0; i < triggers.size(); i++) {
scheduleDeviceTrigger(ctx, deviceIdStr, customerIdStr, triggers.get(i), i);
}
} catch (Exception e) {
log.warn("[{}][{}] Failed to load device schedule", ctx.getSelfId(), device.getId(), e);
}
}
// -------- Schedule registration --------
private void scheduleDeviceTrigger(TbContext ctx, String deviceIdStr, String customerIdStr,
ScheduleTrigger trigger, int index) {
if (!initialized.get()) return;
String tickKey = deviceIdStr + ":" + index;
LocalDateTime now = LocalDateTime.now();
LocalDateTime next = nextCronTime(trigger.getCron(), now);
if (next == null) return;
long delayMs = ChronoUnit.MILLIS.between(now, next);
if (delayMs <= 0) return;
String paramJson = trigger.getParam() != null ? trigger.getParam().toString() : null;
TbMsgMetaData meta = buildTickMeta(tickKey, deviceIdStr, customerIdStr,
trigger.getCron(), trigger.getType(), next, paramJson);
TbMsg tick = ctx.newMsg(null, TbMsgType.GENERATOR_NODE_SELF_MSG,
ctx.getSelfId(), null, meta, TbMsg.EMPTY_STRING);
activeTickIds.put(tickKey, tick.getId());
ctx.tellSelf(tick, delayMs);
log.debug("[{}] Scheduled tick: key={}, type={}, next={}, delayMs={}", ctx.getSelfId(), tickKey, trigger.getType(), next, delayMs);
}
private void cancelDeviceTicks(String deviceIdStr) {
String prefix = deviceIdStr + ":";
activeTickIds.keySet().removeIf(key -> key.startsWith(prefix));
}
// -------- Trigger message --------
private void emitTrigger(TbContext ctx, DeviceId deviceId, CustomerId customerId,
String triggerType, String cron, String scheduledTime, String paramJson, boolean isReconcile) {
TbMsgMetaData metadata = new TbMsgMetaData();
metadata.putValue("triggerType", triggerType);
metadata.putValue("scheduledCron", cron);
metadata.putValue("scheduledTime", scheduledTime);
metadata.putValue("isReconcile", String.valueOf(isReconcile));
String data = (paramJson != null && !paramJson.isBlank()) ? paramJson : TbMsg.EMPTY_JSON_OBJECT;
TbMsg triggerMsg = ctx.newMsg(null, TbMsgType.NA, deviceId,
customerId, metadata, data);
ctx.enqueueForTellNext(triggerMsg, TbNodeConnectionType.SUCCESS,
() -> log.trace("[{}][{}] Schedule trigger enqueued: type={}", ctx.getSelfId(), deviceId, triggerType),
t -> log.warn("[{}][{}] Failed to enqueue schedule trigger", ctx.getSelfId(), deviceId, t));
}
// -------- reconcile --------
private void reconcileAll(TbContext ctx) {
Set<DeviceProfileId> profiles = targetProfileIds;
if (profiles == null || profiles.isEmpty()) return;
LocalDateTime now = LocalDateTime.now().truncatedTo(ChronoUnit.MINUTES);
for (DeviceProfileId profileId : profiles) {
forEachDevice(ctx, profileId, device -> reconcileDevice(ctx, device, now));
}
}
private void reconcileDevice(TbContext ctx, Device device, LocalDateTime now) {
try {
Optional<AttributeKvEntry> attrOpt = ctx.getAttributesService()
.find(ctx.getTenantId(), device.getId(), AttributeScope.SERVER_SCOPE, config.getScheduleAttributeKey())
.get(30, TimeUnit.SECONDS); // bounded wait, same limit as forEachDevice; a timeout is caught below
if (attrOpt.isEmpty()) return;
String json = attrOpt.get().getJsonValue().or(() -> attrOpt.get().getStrValue()).orElse(null);
if (json == null) return;
DeviceScheduleConfig schedule = JacksonUtil.fromString(json, DeviceScheduleConfig.class);
if (schedule == null || !schedule.isEnabled() || schedule.getTriggers() == null) return;
// Walk forward minute by minute from today 00:00 and keep the most recent trigger time that has already passed
LocalDateTime cursor = now.toLocalDate().atStartOfDay();
ScheduleTrigger lastTrigger = null;
LocalDateTime lastTime = null;
while (!cursor.isAfter(now)) {
for (ScheduleTrigger trigger : schedule.getTriggers()) {
if (matchesCron(trigger.getCron(), cursor)) {
lastTrigger = trigger;
lastTime = cursor;
}
}
cursor = cursor.plusMinutes(1);
}
if (lastTrigger != null) {
log.debug("[{}][{}] Reconcile: emitting type={}, lastTime={}", ctx.getSelfId(), device.getId(), lastTrigger.getType(), lastTime);
String paramJson = lastTrigger.getParam() != null ? lastTrigger.getParam().toString() : null;
emitTrigger(ctx, device.getId(), device.getCustomerId(),
lastTrigger.getType(), lastTrigger.getCron(), lastTime.format(FMT), paramJson, true);
} else {
log.debug("[{}][{}] Reconcile: no trigger found today before {}", ctx.getSelfId(), device.getId(), now);
}
} catch (Exception e) {
log.warn("[{}][{}] Reconcile device failed", ctx.getSelfId(), device.getId(), e);
}
}
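// -------- Cron utilities --------
/**
* Returns the next fire time of the cron expression strictly after {@code after}.
* A seconds field of 0 is prepended to the 5-field expression before parsing.
* Callers pass in the same {@code now} they use to compute the delay, so repeated now() calls cannot introduce drift.
*/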
private LocalDateTime nextCronTime(String cron, LocalDateTime after) {
if (cron == null || cron.isBlank()) return null;
try {
return CronExpression.parse("0 " + cron).next(after);
} catch (Exception e) {
log.warn("Invalid cron expression: '{}'", cron, e);
return null;
}
}
private boolean matchesCron(String cron, LocalDateTime time) {
if (cron == null || cron.isBlank()) return false;
try {
CronExpression expr = CronExpression.parse("0 " + cron);
LocalDateTime next = expr.next(time.minusSeconds(1));
return time.equals(next);
} catch (Exception e) {
log.warn("Invalid cron expression: '{}'", cron, e);
return false;
}
}
// -------- Schedule configuration data classes --------
@Data
@JsonIgnoreProperties(ignoreUnknown = true)
public static class DeviceScheduleConfig {
private boolean enabled;
private List<ScheduleTrigger> triggers = Collections.emptyList();
}
@Data
@JsonIgnoreProperties(ignoreUnknown = true)
public static class ScheduleTrigger {
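/** 5-field cron: minute hour day-of-month month day-of-week, e.g. "0 8 * * 1-5" */
private String cron;
/** Output connection name, fully user-defined, e.g. "open", "close", "setLevel50"; carried in the triggerType metadata of the trigger message */
private String type;
/** Trigger parameter, any JSON structure, passed through as the data of the trigger message; may be null */
private JsonNode param;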
}
}
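The tick handling above rests on one bookkeeping idea: each tick key (`<deviceId>:<triggerIndex>`) maps to the ID of the only tick message that is still valid, so a delayed self-message that was cancelled or superseded can be recognized and dropped when it finally arrives. The sketch below isolates that idea with JDK classes only. `TickRegistry` and its method names are hypothetical and are not part of the node or of ThingsBoard; message IDs are simulated with random UUIDs. `rescheduleIfCurrent` shows one way to avoid re-registering a tick that was cancelled while it was being processed.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the node's activeTickIds bookkeeping. */
final class TickRegistry {
    // tickKey ("<deviceId>:<triggerIndex>") -> ID of the only tick that is still valid
    private final Map<String, UUID> activeTickIds = new ConcurrentHashMap<>();

    /** Registers a new tick for the key; any older tick for the same key becomes stale. */
    UUID schedule(String tickKey) {
        UUID id = UUID.randomUUID();
        activeTickIds.put(tickKey, id);
        return id;
    }

    /** Drops every tick of one device by key prefix, mirroring cancelDeviceTicks. */
    void cancelDevice(String deviceId) {
        String prefix = deviceId + ":";
        activeTickIds.keySet().removeIf(key -> key.startsWith(prefix));
    }

    /** A delivered tick is processed only if its ID is still the registered one. */
    boolean isCurrent(String tickKey, UUID msgId) {
        return msgId.equals(activeTickIds.get(tickKey));
    }

    /** Swaps in a follow-up tick only if the fired tick is still registered; returns null otherwise. */
    UUID rescheduleIfCurrent(String tickKey, UUID firedId) {
        UUID next = UUID.randomUUID();
        return activeTickIds.replace(tickKey, firedId, next) ? next : null;
    }

    public static void main(String[] args) {
        TickRegistry registry = new TickRegistry();
        UUID first = registry.schedule("dev-1:0");
        UUID second = registry.schedule("dev-1:0"); // e.g. the schedule attribute was updated
        System.out.println(registry.isCurrent("dev-1:0", first));   // false: superseded
        System.out.println(registry.isCurrent("dev-1:0", second));  // true
        registry.cancelDevice("dev-1");
        System.out.println(registry.rescheduleIfCurrent("dev-1:0", second)); // null: cancelled meanwhile
    }
}
```

The map never cancels the delayed message itself; it only decides whether the message is honored on arrival, which is why the stale-tick check in `handleTick` is needed.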
Helper method code
java
/** Pages through the DeviceIds of a profile, loads each batch of Devices, then invokes the action for every device */
private void forEachDevice(TbContext ctx, DeviceProfileId profileId, java.util.function.Consumer<Device> action) {
PageLink pageLink = new PageLink(config.getMaxDevicesPerPage());
PageData<DeviceId> idPage;
do {
idPage = ctx.getDeviceService().findDeviceIdsByTenantIdAndDeviceProfileId(
ctx.getTenantId(), profileId, pageLink);
List<DeviceId> batch = idPage.getData();
if (!batch.isEmpty()) {
try {
List<Device> devices = ctx.getDeviceService()
.findDevicesByTenantIdAndIdsAsync(ctx.getTenantId(), batch)
.get(30, TimeUnit.SECONDS);
for (Device device : devices) {
try {
action.accept(device);
} catch (Exception e) {
log.warn("[{}][{}] Device action failed", ctx.getSelfId(), device.getId(), e);
}
}
} catch (Exception e) {
log.warn("[{}] Failed to load devices for profile {}", ctx.getSelfId(), profileId, e);
}
}
pageLink = pageLink.nextPageLink();
} while (idPage.hasNext());
}
private TbMsgMetaData buildTickMeta(String tickKey, String deviceIdStr, String customerIdStr,
String cron, String triggerType, LocalDateTime scheduledTime, String paramJson) {
TbMsgMetaData meta = new TbMsgMetaData();
meta.putValue(META_TICK_KEY, tickKey);
meta.putValue(META_DEVICE_ID, deviceIdStr);
if (customerIdStr != null) meta.putValue(META_CUSTOMER_ID, customerIdStr);
meta.putValue(META_CRON, cron);
meta.putValue(META_TRIGGER_TYPE, triggerType);
meta.putValue(META_SCHED_TIME, scheduledTime.format(FMT));
if (paramJson != null) meta.putValue(META_PARAM, paramJson);
return meta;
}
private DeviceScheduleConfig parseScheduleConfig(JsonNode node) {
if (node == null) return null;
try {
String json = node.isTextual() ? node.asText() : node.toString();
return JacksonUtil.fromString(json, DeviceScheduleConfig.class);
} catch (Exception e) {
log.warn("Failed to parse scheduleConfig: {}", node, e);
return null;
}
}
private CustomerId parseCustomerId(String str) {
if (str == null || str.isBlank()) return null;
try {
return new CustomerId(UUID.fromString(str));
} catch (Exception e) {
return null;
}
}
private void refreshTargetProfiles(TbContext ctx) {
RuleChainId myChainId = ctx.getSelf().getRuleChainId();
if (myChainId == null) {
targetProfileIds = null;
return;
}
Set<DeviceProfileId> ids = new HashSet<>();
PageLink pl = new PageLink(200);
PageData<DeviceProfile> page;
do {
page = ctx.getDeviceProfileService().findDeviceProfiles(ctx.getTenantId(), pl);
for (DeviceProfile profile : page.getData()) {
if (myChainId.equals(profile.getDefaultRuleChainId())) {
ids.add(profile.getId());
}
}
pl = pl.nextPageLink();
} while (page.hasNext());
targetProfileIds = ids.isEmpty() ? null : ids;
log.debug("[{}] Target device profiles: {}", ctx.getSelfId(), ids);
}
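Both `forEachDevice` and `refreshTargetProfiles` use the same loop shape: fetch a page, process its items, advance the page link, and repeat while the page reports more data. The sketch below shows that shape in isolation, including the per-item try/catch that keeps one failing device from stopping the scan. `PagingSketch`, `Page`, `forEachItem`, and `inMemory` are hypothetical, simplified stand-ins and not the ThingsBoard `PageData`/`PageLink` types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntFunction;

/** Hypothetical, simplified illustration of the do/while paging loop. */
final class PagingSketch {
    /** Minimal page holder: the items of one page plus a "more pages follow" flag. */
    static final class Page<T> {
        final List<T> data;
        final boolean hasNext;
        Page(List<T> data, boolean hasNext) {
            this.data = data;
            this.hasNext = hasNext;
        }
    }

    /** Visits every item of every page; returns how many items failed without stopping the scan. */
    static <T> int forEachItem(IntFunction<Page<T>> fetchPage, Consumer<T> action) {
        int failures = 0;
        int pageIndex = 0;
        Page<T> page;
        do {
            page = fetchPage.apply(pageIndex);
            for (T item : page.data) {
                try {
                    action.accept(item);
                } catch (RuntimeException e) {
                    failures++; // count and continue with the next item
                }
            }
            pageIndex++;
        } while (page.hasNext);
        return failures;
    }

    /** In-memory page source, used only for the demo. */
    static <T> IntFunction<Page<T>> inMemory(List<T> all, int pageSize) {
        return index -> {
            int from = Math.min(index * pageSize, all.size());
            int to = Math.min(from + pageSize, all.size());
            return new Page<>(all.subList(from, to), to < all.size());
        };
    }

    public static void main(String[] args) {
        IntFunction<Page<String>> source = inMemory(Arrays.asList("a", "b", "c", "d", "e"), 2);
        List<String> seen = new ArrayList<>();
        int failures = forEachItem(source, item -> {
            if (item.equals("c")) throw new IllegalStateException("boom");
            seen.add(item);
        });
        System.out.println(seen + " failures=" + failures); // [a, b, d, e] failures=1
    }
}
```

A do/while loop fits here because the first page must always be fetched before it is known whether another one follows.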