Chat State and Smooth Operation

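An AI chat page looks like nothing more than "send a message, play the voice reply", but the hard parts in production are:

Streaming text (SSE) and speech synthesis (WebSocket) must run in parallel and stay coordinated
On weak networks the page must recover without dropping sentences, replaying audio, or freezing
Interaction must be locked during playback so that stray taps cannot corrupt the session
After the page is hidden or navigated back to, it must resume cleanly with no stale state

This article describes the engineering measures taken to address these problems.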

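1) Make interaction state explicit to avoid concurrency bugs

Instead of leaving "is interaction allowed?" as implicit logic, the page maintains several explicit state flags: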

data() {
	return {
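		mode: 'hold', // current mode: auto/hold/text (voice mode by default)
		waiting: false,
		// Lock interaction while AI speech is playing (prevents switching modes or recording again before playback finishes)
		audioOutputLock: false,
		...
		// Reply in progress: from sending the request until the stream has ended and playback has finished; hold-to-talk is disabled meanwhile to keep the conversation in order
		replyInProgress: false,
		streamEndedReceived: false, // whether the stream-end signal has arrived for this reply
		receivedAnyAudioThisReply: false, // whether any WS audio has arrived for this reply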

Mode switching starts with a hard gate: if the conditions are not met, the request is rejected outright:

switchMode(mode) {
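	// Do not allow switching while speech is still playing or being generated
	if (this.waiting || this.isTyping || this.typeNow || this.audioOutputLock) {
		this.$prompt({ text: 'Xiaole is still replying, so the mode cannot be switched right now' });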
		return
	}

Gates like this are the chat page's first line of defense for stability.
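
The same gate could be factored into small pure helpers so that mode switching and the record button apply one consistent rule. This is a minimal sketch rather than the page's actual code: `modeSwitchBlocked` and `recordingBlocked` are hypothetical names, while the flag names come from the snippets above.

```javascript
// Hypothetical helpers; only the flag names are taken from the article.

// Mirrors the condition in switchMode: blocked while generating or playing.
function modeSwitchBlocked(state) {
  return Boolean(
    state.waiting || state.isTyping || state.typeNow || state.audioOutputLock
  );
}

// Recording is stricter (an assumption based on the replyInProgress comment):
// it also stays blocked for the whole reply, not only during playback.
function recordingBlocked(state) {
  return modeSwitchBlocked(state) || Boolean(state.replyInProgress);
}
```

Keeping the rule in one place means a new flag only has to be added once, instead of in every handler that needs the gate.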

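2) Dual-channel architecture: SSE for text, WebSocket for speech

The page uses two channels rather than one:

SSE: receives incremental text from the LLM
WebSocket: handles TTS session control and delivers audio chunks

The text side is requested with fetchEventSource: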

await fetchEventSource(urls, {
	signal: controller.signal,
	method: 'POST',
	headers: {
		'Accept': 'text/event-stream',
		'Authorization': 'Bearer ' + getToken(),
		'Content-Type': 'application/json',
		'Tenantname': getTenantName()
	},
	body: JSON.stringify(list),

The WebSocket runs separately, on a dedicated speech channel:

const url = config.wsUrl + '/happyplanet/websocket/speech?' + this.getWsIdentityQuery();
this.ws = uni.connectSocket({
	url: url,

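The benefit of this separation is that text and speech can each be controlled independently, so when one side becomes unstable it is easier to recover.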
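
The article does not show how SSE text is handed over to the speech channel. One plausible bridge, sketched below purely as an assumption, buffers incremental text and emits whole sentences that could then be passed to something like addText. `createSentenceSplitter` and the terminator set are hypothetical.

```javascript
// Hypothetical bridge between the two channels: buffer SSE text deltas and
// emit whole sentences, so the TTS side receives speakable units.
function createSentenceSplitter(onSentence) {
  let buffer = '';
  // Assumed terminators (CJK and Latin). "." is left out on purpose so that
  // decimals such as "3.5" are not split; tune this for the real product.
  const terminator = /[。！？!?\n]/;

  return {
    // Call with each incremental text fragment from the SSE stream.
    push(delta) {
      buffer += delta;
      let idx = buffer.search(terminator);
      while (idx !== -1) {
        const sentence = buffer.slice(0, idx + 1).trim();
        buffer = buffer.slice(idx + 1);
        if (sentence) onSentence(sentence);
        idx = buffer.search(terminator);
      }
    },
    // Call when the stream ends to emit trailing text that has no terminator.
    flush() {
      const rest = buffer.trim();
      buffer = '';
      if (rest) onSentence(rest);
    },
  };
}
```

With this shape, the SSE message handler would call `push` for every delta and `flush` once the stream ends, so the last fragment is not lost.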

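3) Key design: speech sending uses "queue + ACK + timeout fallback"

Many chat-page freezes originate here: text is sent to TTS, no acknowledgment comes back, and the queue stays blocked indefinitely.

The page uses a complete reliability mechanism:

speechTextQueue holds the pending text
speechSending guards against re-entrant sends
An item is dequeued only after the textadded acknowledgment arrives
An ACK timeout triggers recovery by reconnecting

Enqueue: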

addText(text) {
	if (!text) return;
	...
	this.speechTextQueue.push(this.makeSpeechQueueItem(text));
	this.flushSpeechQueue();
},

Send the head of the queue:

if (this.speechSending) return;
const head = this.getSpeechQueueHead();
...
this.speechSending = true;
this.speechInflightId = head.id;
...
this.ws.send({
	data: JSON.stringify(message),

An item counts as confirmed only once textadded is received:

if (status === 'textadded') {
	...
	const shouldShift =
		(head && this.speechInflightId && head.id === this.speechInflightId) ||
		(head && !this.speechInflightId && this.speechCurrentText && head.text === this.speechCurrentText);
	if (shouldShift) {
		this.speechTextQueue.shift();
	}
	...
	this.flushSpeechQueue();
	return;
}

An ACK timeout acts as a fallback, so the queue cannot get stuck on its first item:

this.speechAckTimeoutTimer = setTimeout(() => {
	// Fallback for an occasionally lost acknowledgment: keeps the queue from getting stuck on the first item
	if (!this.speechSending) return;
	...
	this.speechStarted = false;
	this.speechStartInFlight = false;
	this.tryReconnect();
}, this.SPEECH_ACK_TIMEOUT_MS);
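
The fragments above can be hard to follow in isolation, so here is a compact, self-contained sketch of the same "queue + ACK + timeout" idea. It is not the page's code: `createSpeechQueue`, the injected `send` and `onTimeout` callbacks, the 3000 ms default, and the choice to resend the head after recovery are all assumptions made for illustration.

```javascript
// Sketch of "queue + ACK + timeout". `send`, `onTimeout`, and the timer
// functions are injected so the logic can run outside uni-app.
function createSpeechQueue({
  send,
  onTimeout,
  ackTimeoutMs = 3000, // assumed value; the article does not give the real one
  setTimer = setTimeout,
  clearTimer = clearTimeout,
}) {
  const queue = [];
  let nextId = 1;
  let inflightId = null; // id of the item currently awaiting its ACK
  let ackTimer = null;

  function flush() {
    // Re-entrancy guard: only one item may be in flight at a time.
    if (inflightId !== null || queue.length === 0) return;
    const head = queue[0];
    inflightId = head.id;
    send(head);
    ackTimer = setTimer(() => {
      // ACK never arrived: clear the in-flight mark but keep the item queued,
      // then let the caller recover (for example by reconnecting).
      ackTimer = null;
      inflightId = null;
      onTimeout(head);
    }, ackTimeoutMs);
  }

  return {
    add(text) {
      if (!text) return;
      queue.push({ id: nextId++, text });
      flush();
    },
    // Call when the server acknowledges the in-flight item.
    ack() {
      if (inflightId === null) return; // stray or duplicate ACK: ignore
      clearTimer(ackTimer);
      ackTimer = null;
      inflightId = null;
      queue.shift(); // dequeue only after confirmation
      flush();
    },
    // Call after recovery; resends the unconfirmed head (an assumption).
    resume: flush,
    size: () => queue.length,
  };
}
```

Resending the head after a timeout favors "never drop a sentence" over "never repeat one"; if the server did receive the text and only the ACK was lost, some form of deduplication would still be needed.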

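4) When the start acknowledgment is unreliable, release the queue with a fallback

In some environments the server never returns status: started, so addText would never be sent.

The page handles this with a fallback timer: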

speechStartAckFallbackTimer: null,
SPEECH_START_ACK_FALLBACK_MS: 2000,

armSpeechStartAckFallback() {
	...
	this.speechStartAckFallbackTimer = setTimeout(() => {
		...
		console.warn('[WS] No started acknowledgment from the server; releasing the queue after timeout as a fallback (including the first addText)');
		this.speechStarted = true;
		this.speechStartInFlight = false;
		this.flushSpeechQueue();
	}, this.SPEECH_START_ACK_FALLBACK_MS);
},

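This is a very practical engineering fallback that noticeably reduces the chance of the first sentence getting stuck in production.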
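
One detail worth making explicit is that the real acknowledgment and the fallback timer race each other, and only the first should take effect. The sketch below shows that "open once" behavior in isolation. `createStartGate` and its callback are hypothetical; the 2000 ms default matches SPEECH_START_ACK_FALLBACK_MS above.

```javascript
// Sketch: open a gate when the `started` ACK arrives, or after a fallback
// delay if it never does. Whichever happens first wins; the other is a no-op.
function createStartGate({
  onOpen,
  fallbackMs = 2000,
  setTimer = setTimeout,
  clearTimer = clearTimeout,
}) {
  let opened = false;
  let timer = null;

  function open(reason) {
    if (opened) return; // late ACK or stale timer: nothing left to do
    opened = true;
    if (timer !== null) {
      clearTimer(timer);
      timer = null;
    }
    onOpen(reason);
  }

  return {
    // Call right after sending the start message.
    arm() {
      if (opened || timer !== null) return;
      timer = setTimer(() => {
        timer = null;
        open('fallback');
      }, fallbackMs);
    },
    // Call when the server's `started` status arrives.
    ack() {
      open('ack');
    },
    isOpen: () => opened,
  };
}
```

Passing the reason to `onOpen` makes it easy to log how often the fallback path is actually taken, which helps decide whether the timeout is tuned well.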

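5) Audio end is decided by "idle-window confirmation", not "buffer empty means finished"

A common problem with audio streams is that tail packets arrive late:

If the page unlocks as soon as the player runs empty, it can misjudge the end; the user then starts talking or switches modes too early, and the session gets out of order.

The page uses idle-window detection: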

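audioIdleMs: 900, // idle window after the player runs empty; too small risks a false end while tail packets are still arriving, too large delays lifting the "please listen to the end first" lock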

onPlayEnd(() => {
	...
	this.audioEndTimer = setTimeout(() => {
		const idle = Date.now() - (this.audioLastChunkAt || 0);
		if (idle >= this.audioIdleMs) {
			// Really finished: allow the next action
			this.audioOutputLock = false;
			...
			if (this.streamEndedReceived) this.replyInProgress = false;

This strategy strikes a more stable balance between unlocking promptly and avoiding false end detection.
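
The decision itself can be expressed as a pure function of timestamps, which makes the threshold easy to reason about and test. The sketch below is an illustration under assumptions: `checkAudioEnd` is a hypothetical name, and the "check again later" branch is a guess, since the original snippet elides what happens when the window has not yet elapsed.

```javascript
// Sketch of the idle-window decision as a pure function.
// Returns whether to unlock now, or how long to wait before checking again.
function checkAudioEnd({ now, lastChunkAt, idleMs, streamEnded }) {
  const idle = now - (lastChunkAt || 0);
  if (idle < idleMs) {
    // A chunk arrived recently: a tail packet may still be on its way.
    return { unlock: false, replyDone: false, recheckInMs: idleMs - idle };
  }
  // Quiet for a full window: release the audio lock. The reply as a whole
  // is finished only if the text stream has also ended.
  return { unlock: true, replyDone: Boolean(streamEnded), recheckInMs: 0 };
}

// Example: 500 ms since the last chunk with a 900 ms window.
const verdict = checkAudioEnd({
  now: 1000,
  lastChunkAt: 500,
  idleMs: 900,
  streamEnded: true,
});
console.log(verdict.unlock, verdict.recheckInMs); // false 400
```

Separating `unlock` from `replyDone` mirrors the two flags in the article: audioOutputLock can be released while replyInProgress stays set until the stream end has been received.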

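6) Page lifecycle needs strict cleanup: release everything on leave

In a uni-app + keep-alive environment, beforeDestroy is not guaranteed to fire for the chat page.

So the page runs the same teardown from several lifecycle hooks: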

// With keep-alive, the page is deactivated first instead of triggering beforeDestroy
deactivated() {
	this.teardownOnLeave();
},

onHide() {
	this.teardownOnLeave();
},

onUnload() {
	this.teardownOnLeave();
},

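teardownOnLeave() stops recording, aborts the streaming request, stops the heartbeat, closes the WebSocket, clears timers, and resets state, which prevents sessions from getting mixed up after returning to the page.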
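
Because several hooks can fire for a single leave, the teardown has to tolerate being called more than once, and one failing step should not prevent the rest from running. The article does not show how teardownOnLeave() handles this, so the following is only a sketch of one way to do it; `createTeardown` and the step names are hypothetical.

```javascript
// Sketch of a teardown that is safe to call from several lifecycle hooks.
// Each step runs in isolation so one failure cannot skip the rest.
function createTeardown(steps) {
  let done = false;
  return {
    run() {
      if (done) return []; // already torn down by another hook
      done = true;
      const errors = [];
      for (const [name, step] of steps) {
        try {
          step();
        } catch (err) {
          errors.push({ name, err });
        }
      }
      return errors;
    },
    // Call when the page becomes active again, so the next leave runs too.
    reset() {
      done = false;
    },
  };
}

// Example wiring with placeholder steps (names are illustrative only).
const log = [];
const teardown = createTeardown([
  ['stopRecording', () => log.push('stopRecording')],
  ['closeSocket', () => { throw new Error('socket already closed'); }],
  ['clearTimers', () => log.push('clearTimers')],
]);
const failures = teardown.run();
console.log(log.join(','), failures.length); // stopRecording,clearTimers 1
```

Returning the collected errors instead of throwing keeps the lifecycle hooks simple while still leaving room to report failed steps.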

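7) The most reusable lessons from this page

Model the chat as a state machine rather than a pile of event handlers (replyInProgress / audioOutputLock / speechSending)
The sending path must have ACKs and timeout recovery
Confirm audio completion with an idle window; do not treat "buffer empty" as finished
Tear down thoroughly on page leave, covered by multiple lifecycle hooks
UI interactions (mode switching, the record button) must be gated by the underlying state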