Amid the wave of digitalization, AI applications are popping up everywhere. Recently I was tasked with building an H5 AI chat page for my boss. At first I planned to get it done quickly with an off-the-shelf UniApp plugin, but after digging in I found none of them met the project's requirements, so I set out to build it myself. Below I share the key techniques and solutions from the development process, in the hope they will be useful to others.
1. Conquering SSE Streaming Data
This was my first AI chat project, and it was only through some searching that I learned about SSE streaming responses. SSE (Server-Sent Events) lets the server push updates to the client in real time, which is a natural fit for AI chat: the reply streams in piece by piece, giving the user a fluid conversational experience.
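On the wire, an SSE response is just `text/event-stream` text: each message is one or more `data:` lines followed by a blank line. A minimal sketch of splitting such a chunk into its payloads (the function name is mine, not from the project; libraries like fetch-event-source do this parsing for you):

```javascript
// Parse a text/event-stream chunk into its `data:` payloads.
// Messages are separated by a blank line; each `data:` field carries payload text.
function parseSSEChunk(chunk) {
  return chunk
    .split("\n\n")                           // blank line ends a message
    .filter((msg) => msg.trim().length > 0)
    .map((msg) =>
      msg
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).trim()) // strip the "data:" prefix
        .join("\n")                          // multi-line data fields are joined
    );
}

console.log(parseSSEChunk('data: hello\n\ndata: {"text":"hi"}\n\n'));
// → [ 'hello', '{"text":"hi"}' ]
```

Each payload is typically a JSON string, which is why the `onmessage` handler below starts with `JSON.parse(event.data)`.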
My first idea was to receive the data with the native EventSource API, but after reading a pile of documentation I learned that EventSource cannot send POST requests. Since the project has to send parameters to the server in the request body, that option was out. I then searched for Vue-compatible libraries and, after comparing a few, settled on fetch-event-source. Here is the implementation:
```typescript
const fetchAskDataFunc = (length: number, currenStr: string = currenContentStr.value) => {
  abortController = new AbortController();
  const signal = abortController.signal;
  isStreaming.value = true;
  fetchEventSource(`${import.meta.env.VITE_APP_AI_BASE_URL}/ali/ai/streamAsk`, {
    signal,
    method: "POST",
    // retryInterval: 2000,
    headers: {
      "Content-Type": "application/json",
      Accept: "text/event-stream",
      "Cache-Control": "no-cache",
      Authorization: getToken,
    },
    body: JSON.stringify({
      question: currenStr,
      sessionId: sessionId.value,
      accountUid: getToken,
    }),
    openWhenHidden: true,
    onmessage: (event) => {
      const data = JSON.parse(event.data);
      sessionId.value = data.sessionId;
      currenContentArr.value[length] = {
        type: "resutl",
        content: data.thoughts[1].response,
        text: data.text,
        finishReason: data.finishReason,
        userContent: currenStr,
        resultContentDom: "resultContent" + length,
        thinkContentDom: "thinkContent" + length,
        timeNum: timeNum.value,
        dataType: "streamAsk",
        ...data,
      };
      if (data.text) {
        isThink.value = false;
        timerObj && clearInterval(timerObj);
      }
    },
    onerror: (error) => {
      timerObj && clearInterval(timerObj);
      isThink.value = false;
      console.error("Fetch event source error:", error);
    },
    onclose() {
      timerObj && clearInterval(timerObj);
      isThink.value = false;
      isStreaming.value = false;
      // post-completion logic can go here
    },
  });
};
```
In this code, fetchEventSource issues the POST request; the onmessage callback processes each chunk of the server's data stream and updates the conversation content, while onerror and onclose handle request errors and stream shutdown respectively.
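The abortController created at the top of the function is what makes a "stop generating" button possible: fetch-based requests (including fetch-event-source) tear the connection down the moment the signal fires. A tiny standalone sketch of the mechanism:

```javascript
// An AbortController's signal can be passed to fetch/fetchEventSource;
// calling abort() cancels the request and notifies the signal's listeners.
const abortController = new AbortController();
const signal = abortController.signal;

let stopped = false;
signal.addEventListener("abort", () => {
  stopped = true; // e.g. reset isStreaming / isThink flags here
});

abortController.abort(); // what a "stop" button handler would call
console.log(signal.aborted, stopped); // → true true
```

This is exactly why the code above recreates the controller on every request: an aborted controller cannot be reused.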
2. Breaking Through the Speech Recognition Barrier
Initially I planned to do speech-to-text on the front end. But my boss is from Fujian and speaks a heavy dialect, so I worried the front-end recognition accuracy would be poor and dropped that idea. I then tried capturing audio with the browser's built-in navigator.mediaDevices.getUserMedia({ audio: true }), but after sending the audio to the backend it could never be recognized; after some digging, it seemed the wav format setting was not taking effect (there may be a fix, but I never found it). So I switched to the recorder-core library, which got speech recognition working. Here is the core code:
```typescript
/*
 * @Author: Robin LEI
 * @Date: 2025-04-03 10:32:08
 * @LastEditTime: 2025-04-03 10:53:52
 * @FilePath: \uniapp\插件模板\前端页面模板\uniapp-ai-mobile\src\hooks\useRecord.ts
 */
import { ref, onUnmounted } from 'vue';
import Recorder from 'recorder-core';
import 'recorder-core/src/engine/wav';

// Legacy-browser compatibility shims
navigator.getUserMedia = navigator.getUserMedia ||
  navigator.webkitGetUserMedia ||
  navigator.mozGetUserMedia ||
  navigator.msGetUserMedia;

export function useRecorder() {
  const recorder = ref(null);
  const isRecording = ref(false);
  const audioBlob = ref(null);

  const requestPermission = async () => {
    try {
      if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
        const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
        recorder.value = Recorder({
          type: 'wav',
          sampleRate: 16000,
          bitRate: 16,
          stream
        });
      } else if (navigator.getUserMedia) {
        // Legacy browsers: await (rather than return) this promise,
        // so the open() call below still runs afterwards
        await new Promise((resolve, reject) => {
          navigator.getUserMedia({ audio: true }, (stream) => {
            recorder.value = Recorder({
              type: 'wav',
              sampleRate: 16000,
              bitRate: 16,
              stream
            });
            resolve(true);
          }, (error) => {
            console.error('Permission request failed:', error);
            reject(error);
          });
        });
      } else {
        console.error('This browser does not support audio recording');
        return false;
      }
      // Wait for the open() call to complete
      await new Promise((resolve, reject) => {
        recorder.value.open(() => {
          resolve(true);
        }, (error) => {
          console.error('Failed to open the recorder:', error);
          reject(error);
        });
      });
      return true;
    } catch (error) {
      console.error('Permission request failed:', error);
      return false;
    }
  };

  const startRecording = async () => {
    if (isRecording.value) return;
    const hasPermission = await requestPermission();
    if (hasPermission) {
      try {
        recorder.value.start();
        isRecording.value = true;
      } catch (error) {
        console.error('Failed to start recording:', error);
      }
    }
  };

  const stopRecording = () => {
    if (!isRecording.value) return;
    isRecording.value = false;
    return recorder.value;
  };

  onUnmounted(() => {
    if (recorder.value) {
      recorder.value.destroy();
      recorder.value = null;
    }
  });

  return {
    isRecording,
    audioBlob,
    requestPermission,
    startRecording,
    stopRecording,
  };
}
```
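For reference, the 16 kHz / 16-bit mono settings used above fix how much raw PCM data each second of speech produces, which is worth knowing when sizing uploads to the recognition endpoint (the helper name is mine, for illustration only):

```javascript
// Raw PCM data rate: sampleRate samples/s × (bitDepth / 8) bytes × channels.
// (A WAV file adds only a 44-byte header on top of this.)
function pcmBytesPerSecond(sampleRate, bitDepth, channels = 1) {
  return sampleRate * (bitDepth / 8) * channels;
}

// 16 kHz, 16-bit mono — the recorder-core config used above
console.log(pcmBytesPerSecond(16000, 16)); // → 32000 (≈31 KB per second)
```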
To satisfy my manager's request for a cancel feature, I added gesture-control logic: press and hold to record, slide up and away to cancel. Here is the relevant code:
```typescript
let timeOutEvent: any = 0; // long-press detection timer

/**
 * @description: press-and-hold to record (touch devices)
 * @param {*} event
 * @return {*}
 */
const gtouchstart = (event) => {
  // the long-press fires once the finger has been down for 500 ms
  timeOutEvent = setTimeout(() => {
    longPress();
  }, 500);
  return false;
};

/**
 * @description: on PC, click to start recording and click again to finish
 * @return {*}
 */
const gtouchstartPc = async () => {
  isVoice.value = !isVoice.value;
  // await record.requestPermission();
  if (isPcRecording.value) {
    record.startRecording();
  } else {
    stopRecording();
  }
  isPcRecording.value = !isPcRecording.value;
  return false;
};

// Finger released: if released within 500 ms, clear the timer so the
// long-press never fires and a normal click can still be handled
const showDeleteButton = () => {
  clearTimeout(timeOutEvent);
  isVoice.value = false;
  stopRecording();
  return false;
};

// Finger moved: if it has left the footer area, flag the recording as
// cancelled; any movement also clears the pending long-press timer
const gtouchmove = (event) => {
  const currentX = event.touches[0].clientX;
  const currentY = event.touches[0].clientY;
  const FooterDomRect = FooterDom.value.getBoundingClientRect();
  if (
    currentX < FooterDomRect.left ||
    currentX > FooterDomRect.right ||
    currentY < FooterDomRect.top ||
    currentY > FooterDomRect.bottom
  ) {
    isCancelVoice.value = true;
  } else {
    isCancelVoice.value = false;
  }
  clearTimeout(timeOutEvent);
  timeOutEvent = 0;
};

// What actually runs once the long-press threshold is reached
const longPress = () => {
  timeOutEvent = 0;
  startRecording();
};

// Start recording
const startRecording = async () => {
  isCancelVoice.value = false;
  isVoice.value = true;
  record.startRecording();
};

// Stop recording
const stopRecording = () => {
  const recorder = record.stopRecording();
  if (isCancelVoice.value) {
    // the user slid off the button: discard the recording
    recorder.stop(
      (blob) => {
        console.log("Recording cancelled");
      },
      (error) => {
        Toast.clear();
        console.error("Error while stopping the recording:", error);
      }
    );
    return;
  }
  Toast.loading({
    message: "Recognizing...",
    forbidClick: true,
    duration: 0,
  });
  try {
    recorder.stop(
      (blob) => {
        const audioBlob = blob;
        const formDataObj = new FormData();
        formDataObj.append("voice", audioBlob);
        service({
          url: "/ali/ai/recognize",
          method: "post",
          data: formDataObj,
        })
          .then((res) => {
            if (res.data && !isPc.value) {
              emits("pushContentFunc", res.data);
            } else if (res.data) {
              contentStr.value = res.data;
              InputFocusFunc();
            }
            Toast.clear();
          })
          .finally(() => {
            Toast.clear();
          });
      },
      (error) => {
        Toast.clear();
        console.error("Error while stopping the recording:", error);
      }
    );
  } catch (error) {
    Toast.clear();
    console.error("Exception while stopping recording:", error);
  }
};

const stopSSEFunc = () => {
  emits("stopSSEFunc");
};
```
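The cancel logic in gtouchmove boils down to a point-in-rectangle test against the footer's bounding box. Extracted as a pure function (my naming, for illustration), it is easy to unit-test:

```javascript
// True when the touch point has left the rect — i.e. the finger dragged
// off the record button, which the UI treats as "cancel recording".
function isOutsideRect(x, y, rect) {
  return x < rect.left || x > rect.right || y < rect.top || y > rect.bottom;
}

// A hypothetical footer bounding box on a 375px-wide screen
const footer = { left: 0, right: 375, top: 600, bottom: 700 };
console.log(isOutsideRect(100, 650, footer)); // → false: still on the button
console.log(isOutsideRect(100, 400, footer)); // → true: slid up → cancel
```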
3. Auto-Scrolling Streaming Output with Gesture Control
My manager never asked for this, but I noticed how Tencent Yuanbao auto-scrolls while streaming text and lets a drag gesture take over, and I found it useful enough to replicate. My first approach was to drive auto-scroll from the container's scrollTop and scrollHeight and use touchmove to detect a gesture and pause the scrolling. In practice, however, touchmove sometimes failed to fire, so I added touchstart and touchend as backup controls. Here is the code:
```typescript
const messagesRef = ref<HTMLElement>(); // container that auto-scrolls as results stream in
const messageRefs = ref<any[]>([]); // per-message refs, for scrolling a user message to the top
const lastTouchY = ref(0); // last touch Y position
const isScroStop = ref<boolean>(false); // whether auto-scroll is paused
const isUp = ref<boolean>(false); // whether to show the scroll-to-bottom button
let timer: any = null;

const initScrollToBottomFunc = () => {
  !isUp.value && !isScroStop.value && scrollToBottomFunc();
};

let time = 0;
let storeTime = 0;
const getTimeFunc = () => {
  timer = setInterval(() => {
    storeTime = time;
  }, 1000);
};
getTimeFunc();

watch(
  () => currenContentArr.value,
  () => {
    if (storeTime === time) {
      initScrollToBottomFunc();
    }
    storeTime++;
    if (dataType.value === 2) {
      const index = currenContentArr.value.length - 1;
      nextTick(() => {
        initChartFunc(currenContentArr.value[index].content, "chartRef" + index);
      });
    }
    if (currenContentArr.value.length == 0) {
      arrDom = [];
    }
  },
  {
    deep: true,
  }
);

/**
 * @description: scroll to the bottom
 * @return {*}
 */
const scrollToBottomFunc = (type = "") => {
  if (type === "click") {
    isScroStop.value = false;
  }
  nextTick(() => {
    const messagesContainer = messagesRef.value;
    if (messagesContainer) {
      messagesContainer.scrollTop = messagesContainer.scrollHeight;
    }
  });
};

/**
 * @description: pin the message to the top of the view (not working yet)
 * @param {*} id
 * @return {*}
 */
const scrollTopFunc = async (id) => {
  // await nextTick();
  // const newUserMessage = messageRefs.value[id];
  // if (newUserMessage) {
  //   newUserMessage.scrollIntoView({ behavior: "smooth", block: "start" });
  // }
};

const handleScrollFunc = () => {
  const element = messagesRef.value;
  if (element) {
    const scrollHeight = element.scrollHeight; // total scrollable height
    const scrollTop = element.scrollTop; // current scroll position
    const clientHeight = element.clientHeight; // visible height
    // are we at (or within 5px of) the bottom?
    if (scrollTop + clientHeight + 5 >= scrollHeight) {
      isUp.value = false;
      isScroStop.value = false;
      // bottom-reached logic (e.g. loading more data) can go here
    } else if (isScroStop.value) {
      isUp.value = true;
    }
  }
};

/**
 * @description: called on new input, or when the floating down-arrow is clicked
 * @return {*}
 */
const inputContentFunc = () => {
  isScroStop.value = true;
};
defineExpose({ scrollTopFunc, inputContentFunc });

/**
 * @description: mouse wheel scrolled up
 * @return {*}
 */
const handleScrollTopFunc = (event) => {
  if (event.deltaY < 0) {
    isScroStop.value = true;
  }
};

/**
 * @description: touch-move (swipe) on the message list
 * @return {*}
 */
const handleTouchMoveFunc = (event) => {
  const messagesContainer = messagesRef.value;
  if (!messagesContainer) return;
  const currentTouchY = event.touches[0].clientY;
  if (currentTouchY > 0 && messagesContainer.scrollTop > 0) {
    isScroStop.value = true;
  }
  lastTouchY.value = event.touches[0].clientY;
};

const startX = ref(0);
const startY = ref(0);
const threshold = 10; // swipe threshold; adjust as needed

const handleTouchStart = (event: TouchEvent) => {
  isScroStop.value = true;
  const touch = event.touches[0];
  startX.value = touch.clientX;
  startY.value = touch.clientY;
};

const handleTouchEnd = (event: TouchEvent) => {
  const touch = event.changedTouches[0];
  const deltaX = touch.clientX - startX.value;
  const deltaY = touch.clientY - startY.value;
  const isSliding = Math.abs(deltaX) > threshold || Math.abs(deltaY) > threshold;
  if (isSliding) {
    // a mostly vertical swipe pauses auto-scroll; horizontal swipes are ignored
    if (Math.abs(deltaY) >= Math.abs(deltaX)) {
      isScroStop.value = true;
    }
  } else {
    // a plain tap resumes auto-scroll
    isScroStop.value = false;
  }
};

const initFunc = () => {
  const element = messagesRef.value;
  if (element) {
    // attach the scroll listener
    element.addEventListener("scroll", handleScrollFunc);
  }
};
```
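The heart of handleScrollFunc is the bottom test with a small tolerance: scrollTop can be fractional on some browsers, so it may never exactly equal scrollHeight - clientHeight. As a standalone predicate (the name is mine), the check looks like this:

```javascript
// "At bottom" if the visible window's lower edge is within `tolerance`
// pixels of the full scroll height — mirrors the 5px slack used above.
function isAtBottom(scrollTop, clientHeight, scrollHeight, tolerance = 5) {
  return scrollTop + clientHeight + tolerance >= scrollHeight;
}

console.log(isAtBottom(995, 600, 1600)); // → true: within 5px of the bottom
console.log(isAtBottom(500, 600, 1600)); // → false: the user scrolled up
```

When the predicate is true, auto-scroll resumes and the floating down-arrow is hidden; when false while the user is dragging, the arrow is shown instead.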
With that, a basic AI chat page is complete. In a follow-up post I'll cover parsing the SSE response stream and rendering ECharts charts. If you're interested, you can download and run the code from GitHub: github.com/xknk/uniapp... — and if you find the project useful, a Star would be much appreciated! I hope this article helps anyone building an AI chat page, and I welcome discussion in the comments.