[HarmonyOS Development] Chapter 24: AI - Core Speech Kit (Basic Speech Services)

Table of Contents

[1 Introduction](#1 Introduction)

[1.1 Use Cases](#1.1 Use Cases)

[1.2 Constraints and Limitations](#1.2 Constraints and Limitations)

[2 Text-to-Speech](#2 Text-to-Speech)

[2.1 Use Cases](#2.1 Use Cases)

[2.2 Constraints and Limitations](#2.2 Constraints and Limitations)

[2.3 Development Steps](#2.3 Development Steps)

[2.4 Setting the Playback Strategy](#2.4 Setting the Playback Strategy)

[2.4.1 Setting the Word Playback Mode](#2.4.1 Setting the Word Playback Mode)

[2.4.2 Setting the Number Playback Strategy](#2.4.2 Setting the Number Playback Strategy)

[2.4.3 Inserting Silent Pauses](#2.4.3 Inserting Silent Pauses)

[2.4.4 Specifying Chinese Character Pronunciation](#2.4.4 Specifying Chinese Character Pronunciation)

[2.5 Development Example](#2.5 Development Example)

[3 Speech Recognition](#3 Speech Recognition)

[3.1 Use Cases](#3.1 Use Cases)

[3.2 Constraints and Limitations](#3.2 Constraints and Limitations)

[3.3 Development Steps](#3.3 Development Steps)

[3.4 Development Example](#3.4 Development Example)


1 Introduction

Core Speech Kit (basic speech services) integrates fundamental speech AI capabilities, including text-to-speech (TextToSpeech) and speech recognition (SpeechRecognizer), enabling users to interact with a device by converting between live speech and text.

1.1 Use Cases

  • Text-to-speech: synthesizes a text of up to 10,000 characters into speech and plays it back.
  • Speech recognition: converts audio (up to 60 s in short-speech mode, up to 8 h in long-speech mode) into text; the input can be a PCM audio file or a live voice stream.

1.2 Constraints and Limitations

| AI capability | Constraints |
| --- | --- |
| Text-to-speech | Supported languages: Chinese (Simplified Chinese, Traditional Chinese, and English in a Chinese context). Supported voice: the 聆小珊 female voice. Text length: up to 10,000 characters. |
| Speech recognition | Supported language: Mandarin Chinese. Supported model type: offline. Audio duration: up to 60 s in short-speech mode, up to 8 h in long-speech mode. |

2 Text-to-Speech

Core Speech Kit can synthesize a Chinese text of up to 10,000 characters (Simplified Chinese, Traditional Chinese, digits, and English in a Chinese context) into speech and play it back in Chinese with the 聆小珊 female voice.

Developers can configure the playback strategy, including word playback, number playback, silent pauses, and Chinese character pronunciation.

2.1 Use Cases

On phones, tablets, and similar devices in an offline state, the system accessibility application (screen reader) uses the text-to-speech capability to provide playback for visually impaired users or in scenarios where reading is inconvenient.

2.2 Constraints and Limitations

This capability is currently not supported on the emulator.

2.3 Development Steps

  1. To use text-to-speech, import the relevant classes into the project.

    import { textToSpeech } from '@kit.CoreSpeechKit';
    import { BusinessError } from '@kit.BasicServicesKit';

  2. Call the createEngine API to create a textToSpeechEngine instance. createEngine supports two calling forms; the callback form is shown below (a Promise-based sketch follows it), and other forms are described in the API reference.

    let ttsEngine: textToSpeech.TextToSpeechEngine;

    // Set engine creation parameters
    let extraParam: Record<string, Object> = {"style": 'interaction-broadcast', "locate": 'CN', "name": 'EngineName'};
    let initParamsInfo: textToSpeech.CreateEngineParams = {
      language: 'zh-CN',
      person: 0,
      online: 1,
      extraParams: extraParam
    };

    // Call the createEngine method
    textToSpeech.createEngine(initParamsInfo, (err: BusinessError, textToSpeechEngine: textToSpeech.TextToSpeechEngine) => {
      if (!err) {
        console.info('Succeeded in creating engine.');
        // Receive the created engine instance
        ttsEngine = textToSpeechEngine;
      } else {
        console.error(`Failed to create engine. Code: ${err.code}, message: ${err.message}.`);
      }
    });
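
For reference, a minimal sketch of the other calling form, which returns a Promise (assuming the Promise overload described in the API reference; the error handling mirrors the callback example above):

    // Promise-based form: a minimal sketch, assuming the Promise overload from the API reference
    textToSpeech.createEngine(initParamsInfo).then((engine: textToSpeech.TextToSpeechEngine) => {
      // Receive the created engine instance
      ttsEngine = engine;
      console.info('Succeeded in creating engine.');
    }).catch((err: BusinessError) => {
      console.error(`Failed to create engine. Code: ${err.code}, message: ${err.message}.`);
    });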

  3. After obtaining the TextToSpeechEngine instance, instantiate a SpeakParams object and a SpeakListener object, pass in the text to synthesize and play (originalText), and call the speak API to start playback.

    // Set the callbacks for speak
    let speakListener: textToSpeech.SpeakListener = {
      // Called when playback starts
      onStart(requestId: string, response: textToSpeech.StartResponse) {
        console.info(`onStart, requestId: ${requestId} response: ${JSON.stringify(response)}`);
      },
      // Called when synthesis and playback complete
      onComplete(requestId: string, response: textToSpeech.CompleteResponse) {
        console.info(`onComplete, requestId: ${requestId} response: ${JSON.stringify(response)}`);
      },
      // Called when playback stops
      onStop(requestId: string, response: textToSpeech.StopResponse) {
        console.info(`onStop, requestId: ${requestId} response: ${JSON.stringify(response)}`);
      },
      // Returns the audio stream
      onData(requestId: string, audio: ArrayBuffer, response: textToSpeech.SynthesisResponse) {
        console.info(`onData, requestId: ${requestId} sequence: ${JSON.stringify(response)} audio: ${JSON.stringify(audio)}`);
      },
      // Error callback
      onError(requestId: string, errorCode: number, errorMessage: string) {
        console.error(`onError, requestId: ${requestId} errorCode: ${errorCode} errorMessage: ${errorMessage}`);
      }
    };
    // Set the listener
    ttsEngine.setListener(speakListener);
    let originalText: string = 'Hello HarmonyOS';
    // Set playback parameters
    let extraParam: Record<string, Object> = {"queueMode": 0, "speed": 1, "volume": 2, "pitch": 1, "languageContext": 'zh-CN',
      "audioType": "pcm", "soundChannel": 3, "playType": 1 };
    let speakParams: textToSpeech.SpeakParams = {
      requestId: '123456', // A requestId can be used only once per instance; do not reuse it
      extraParams: extraParam
    };
    // Call the speak method
    // Developers can adjust the playback strategy by modifying speakParams
    ttsEngine.speak(originalText, speakParams);

  4. (Optional) To stop synthesis and playback, call the stop API.

    ttsEngine.stop();

  5. (Optional) To check whether the text-to-speech service is busy, call the isBusy API.

    ttsEngine.isBusy();

  6. (Optional) To query the supported language and voice information, call the listVoices API.

listVoices supports two calling forms; the callback form is shown below (a Promise-based sketch follows it), and other forms are described in the API reference.

// Declare and initialize the string voiceInfo in the component
@State voiceInfo: string = "";

// Set the query parameters
let voicesQuery: textToSpeech.VoiceQuery = {
  requestId: '12345678', // A requestId can be used only once per instance; do not reuse it
  online: 1
};
// Call the listVoices method; the result is returned via callback
ttsEngine.listVoices(voicesQuery, (err: BusinessError, voiceInfo: textToSpeech.VoiceInfo[]) => {
  if (!err) {
    // Receive the currently supported language and voice information
    this.voiceInfo = JSON.stringify(voiceInfo);
    console.info(`Succeeded in listing voices, voiceInfo is ${this.voiceInfo}`);
  } else {
    console.error(`Failed to list voices. Code: ${err.code}, message: ${err.message}`);
  }
});
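
Similarly, a minimal sketch of the Promise-based form of listVoices (assuming the Promise overload from the API reference):

// Promise-based form: a minimal sketch, assuming the Promise overload from the API reference
ttsEngine.listVoices(voicesQuery).then((voiceInfo: textToSpeech.VoiceInfo[]) => {
  console.info(`Succeeded in listing voices, voiceInfo is ${JSON.stringify(voiceInfo)}`);
}).catch((err: BusinessError) => {
  console.error(`Failed to list voices. Code: ${err.code}, message: ${err.message}`);
});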

2.4 Setting the Playback Strategy

In some scenarios, the playback strategy the model chooses automatically may not match actual needs, so this section describes how to set the playback strategy explicitly.

Note

The values listed below are the valid values; using a value outside this range may cause the playback to differ from expectations and produce incorrect results.

2.4.1 Setting the Word Playback Mode

Text format: [hN] (N=0/1/2)

Valid values of N:

| Value | Description |
| --- | --- |
| 0 | Automatically determine the word playback mode. Default value. |
| 1 | Spell the word out letter by letter. |
| 2 | Play the word back as a whole word. |

Text example:

"hello[h1] world"

Here "hello" is pronounced as a whole word, while "world" and all following words are spelled out letter by letter.

2.4.2 Setting the Number Playback Strategy

Format: [nN] (N=0/1/2)

Valid values of N:

| Value | Description |
| --- | --- |
| 0 | Automatically determine the number handling strategy. Default value. |
| 1 | Read digit by digit, like a phone number. |
| 2 | Read as a numeric value. Numbers longer than 18 digits are not supported and fall back to digit-by-digit playback. |

Text example:

"[n2]123[n1]456[n0]"

Here, 123 is read as a numeric value, 456 is read digit by digit, and numbers in the text that follows are handled automatically.

2.4.3 Inserting Silent Pauses

Format: [pN]

Description: N is an unsigned integer, in ms.

Text example:

"你好[p500]小艺"

When this sentence is played, a 500 ms silent pause is inserted after "你好".

2.4.4 Specifying Chinese Character Pronunciation

The tone of a Chinese character is specified by appending a single digit 1-5 to its pinyin, representing the first (阴平), second (阳平), third (上声), fourth (去声), and neutral (轻声) tones respectively.

Format: [=MN]

Description: M is the pinyin; N is the tone.

Valid values of N:

| Value | Description |
| --- | --- |
| 1 | First tone (阴平) |
| 2 | Second tone (阳平) |
| 3 | Third tone (上声) |
| 4 | Fourth tone (去声) |
| 5 | Neutral tone (轻声) |

Text example:

"着[=zhuo2]手"

2.5 Development Example

Tap the button to play a piece of text.

import { textToSpeech } from '@kit.CoreSpeechKit';
import { BusinessError } from '@kit.BasicServicesKit';

let ttsEngine: textToSpeech.TextToSpeechEngine;
@Entry
@Component
struct Index {
  @State createCount: number = 0;
  @State result: boolean = false;
  @State voiceInfo: string = "";
  @State text: string = "";
  @State textContent: string = "";
  @State utteranceId: string = "123456";
  @State originalText: string = "\n\t\t古人学问无遗力,少壮工夫老始成;\n\t\t" +
    "纸上得来终觉浅,绝知此事要躬行。\n\t\t";
  @State illegalText: string = "";

  build() {
    Column() {
      Scroll() {
        Column() {
          TextArea({ placeholder: 'Please enter tts original text', text: `${this.originalText}` })
            .margin(20)
            .focusable(false)
            .border({ width: 5, color: 0x317AE7, radius: 10, style: BorderStyle.Dotted })
            .onChange((value: string) => {
              this.originalText = value;
              console.info(`original text: ${this.originalText}`);
            })

          Button() {
            Text("CreateEngineByCallback")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.createCount++;
            console.info(`CreateTtsEngine:createCount:${this.createCount}`);
            this.createByCallback();
          })

          Button() {
            Text("speak")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.createCount++;
            this.speak();
          })

          Button() {
            Text("listVoicesCallback")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.listVoicesCallback();
          })

          Button() {
            Text("stop")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            // Stop playback
            console.info("Stop button clicked.");
            ttsEngine.stop();
          })

          Button() {
            Text("isBusy")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            // Query the playback status
            let isBusy = ttsEngine.isBusy();
            console.info(`isBusy: ${isBusy}`);
          })

          Button() {
            Text("shutdown")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AA7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            // Release the engine
            ttsEngine.shutdown();
          })
        }
        .layoutWeight(1)
      }
      .width('100%')
      .height('100%')
    }
  }

  // Create the engine; returned via callback
  private createByCallback() {
    // Set engine creation parameters
    let extraParam: Record<string, Object> = {"style": 'interaction-broadcast', "locate": 'CN', "name": 'EngineName'};
    let initParamsInfo: textToSpeech.CreateEngineParams = {
      language: 'zh-CN',
      person: 0,
      online: 1,
      extraParams: extraParam
    };

    // Call the createEngine method
    textToSpeech.createEngine(initParamsInfo, (err: BusinessError, textToSpeechEngine: textToSpeech.TextToSpeechEngine) => {
      if (!err) {
        console.info('Succeeded in creating engine.');
        // Receive the created engine instance
        ttsEngine = textToSpeechEngine;
      } else {
        console.error(`Failed to create engine. Code: ${err.code}, message: ${err.message}.`);
      }
    });
  };

  // Call the speak method to start playback
  private speak() {
    let speakListener: textToSpeech.SpeakListener = {
      // Called when playback starts
      onStart(requestId: string, response: textToSpeech.StartResponse) {
        console.info(`onStart, requestId: ${requestId} response: ${JSON.stringify(response)}`);
      },
      // Called when playback completes
      onComplete(requestId: string, response: textToSpeech.CompleteResponse) {
        console.info(`onComplete, requestId: ${requestId} response: ${JSON.stringify(response)}`);
      },
      // Called when playback stops; triggered when the stop method is called and completes
      onStop(requestId: string, response: textToSpeech.StopResponse) {
        console.info(`onStop, requestId: ${requestId} response: ${JSON.stringify(response)}`);
      },
      // Returns the audio stream
      onData(requestId: string, audio: ArrayBuffer, response: textToSpeech.SynthesisResponse) {
        console.info(`onData, requestId: ${requestId} sequence: ${JSON.stringify(response)} audio: ${JSON.stringify(audio)}`);
      },
      // Error callback; triggered when an error occurs during playback
      onError(requestId: string, errorCode: number, errorMessage: string) {
        console.error(`onError, requestId: ${requestId} errorCode: ${errorCode} errorMessage: ${errorMessage}`);
      }
    };
    // Set the listener
    ttsEngine.setListener(speakListener);
    // Set playback parameters
    let extraParam: Record<string, Object> = {"queueMode": 0, "speed": 1, "volume": 2, "pitch": 1, "languageContext": 'zh-CN', "audioType": "pcm", "soundChannel": 3, "playType": 1};
    let speakParams: textToSpeech.SpeakParams = {
      requestId: '123456-a', // A requestId can be used only once per instance; do not reuse it
      extraParams: extraParam
    };
    // Call the speak method
    ttsEngine.speak(this.originalText, speakParams);
  };

  // Query language and voice information; returned via callback
  private listVoicesCallback() {
    // Set the query parameters
    let voicesQuery: textToSpeech.VoiceQuery = {
      requestId: '123456-b', // A requestId can be used only once per instance; do not reuse it
      online: 1
    };

    // Call the listVoices method; the query result is returned via callback
    ttsEngine.listVoices(voicesQuery, (err: BusinessError, voiceInfo: textToSpeech.VoiceInfo[]) => {
      if (!err) {
        // Receive the currently supported language and voice information
        this.voiceInfo = JSON.stringify(voiceInfo);
        console.info(`Succeeded in listing voices, voiceInfo is ${this.voiceInfo}`);
      } else {
        console.error(`Failed to list voices. Code: ${err.code}, message: ${err.message}`);
      }
    });
  };
}

3 Speech Recognition

Converts Chinese audio (Chinese, plus English in a Chinese context; up to 60 s in short-speech mode, up to 8 h in long-speech mode) into text. The audio can be a PCM file or a live voice stream.

3.1 Use Cases

On phones, tablets, and similar devices in an offline state, this provides audio-to-text conversion for hearing-impaired users or scenarios where listening to audio is inconvenient.

3.2 Constraints and Limitations

This capability is currently not supported on the emulator.

3.3 Development Steps

  1. To use speech recognition, import the relevant classes into the project.

    import { speechRecognizer } from '@kit.CoreSpeechKit';
    import { BusinessError } from '@kit.BasicServicesKit';

  2. Call the createEngine method to initialize the engine and create a SpeechRecognitionEngine instance.

createEngine supports two calling forms; the callback form is shown below (a Promise-based sketch follows it), and other forms are described in the API reference.

let asrEngine: speechRecognizer.SpeechRecognitionEngine;
let sessionId: string = '123456';
// Create the engine; returned via callback
// Set engine creation parameters
let extraParam: Record<string, Object> = {"locate": "CN", "recognizerMode": "short"};
let initParamsInfo: speechRecognizer.CreateEngineParams = {
  language: 'zh-CN',
  online: 1,
  extraParams: extraParam
};
// Call the createEngine method
speechRecognizer.createEngine(initParamsInfo, (err: BusinessError, speechRecognitionEngine: speechRecognizer.SpeechRecognitionEngine) => {
  if (!err) {
    console.info('Succeeded in creating engine.');
    // Receive the created engine instance
    asrEngine = speechRecognitionEngine;
  } else {
    console.error(`Failed to create engine. Code: ${err.code}, message: ${err.message}.`);
  }
});
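
For reference, a minimal sketch of the Promise-based calling form (assuming the Promise overload from the API reference):

// Promise-based form: a minimal sketch, assuming the Promise overload from the API reference
speechRecognizer.createEngine(initParamsInfo).then((engine: speechRecognizer.SpeechRecognitionEngine) => {
  // Receive the created engine instance
  asrEngine = engine;
  console.info('Succeeded in creating engine.');
}).catch((err: BusinessError) => {
  console.error(`Failed to create engine. Code: ${err.code}, message: ${err.message}.`);
});
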
  3. After obtaining the SpeechRecognitionEngine instance, instantiate a RecognitionListener object and call the setListener method to set the callbacks that receive speech recognition events.

    // Create the callback object
    let setListener: speechRecognizer.RecognitionListener = {
      // Called when recognition starts successfully
      onStart(sessionId: string, eventMessage: string) {
        console.info(`onStart, sessionId: ${sessionId} eventMessage: ${eventMessage}`);
      },
      // Event callback
      onEvent(sessionId: string, eventCode: number, eventMessage: string) {
        console.info(`onEvent, sessionId: ${sessionId} eventCode: ${eventCode} eventMessage: ${eventMessage}`);
      },
      // Recognition result callback, covering both intermediate and final results
      onResult(sessionId: string, result: speechRecognizer.SpeechRecognitionResult) {
        console.info(`onResult, sessionId: ${sessionId} result: ${JSON.stringify(result)}`);
      },
      // Called when recognition completes
      onComplete(sessionId: string, eventMessage: string) {
        console.info(`onComplete, sessionId: ${sessionId} eventMessage: ${eventMessage}`);
      },
      // Error callback; error codes are returned through this method
      // e.g. error code 1002200006: the recognition engine is busy and currently recognizing
      // For more error codes, see the error code reference
      onError(sessionId: string, errorCode: number, errorMessage: string) {
        console.error(`onError, sessionId: ${sessionId} errorCode: ${errorCode} errorMessage: ${errorMessage}`);
      }
    }
    // Set the callbacks
    asrEngine.setListener(setListener);

  4. Set the parameters for starting recognition, one set for audio-file transcription and one for microphone transcription, and call the startListening method to start recognition.

    // Start recognition
    private startListeningForWriteAudio() {
      // Set the parameters for starting recognition
      let recognizerParams: speechRecognizer.StartParams = {
        sessionId: this.sessionId,
        audioInfo: { audioType: 'pcm', sampleRate: 16000, soundChannel: 1, sampleBit: 16 } // For audioInfo options, see AudioInfo
      }
      // Call the start-recognition method
      asrEngine.startListening(recognizerParams);
    };

    private startListeningForRecording() {
      let audioParam: speechRecognizer.AudioInfo = { audioType: 'pcm', sampleRate: 16000, soundChannel: 1, sampleBit: 16 }
      let extraParam: Record<string, Object> = {
        "recognitionMode": 0,
        "vadBegin": 2000,
        "vadEnd": 3000,
        "maxAudioDuration": 20000
      }
      let recognizerParams: speechRecognizer.StartParams = {
        sessionId: this.sessionId,
        audioInfo: audioParam,
        extraParams: extraParam
      }
      console.info('startListening start');
      asrEngine.startListening(recognizerParams);
    };

  5. Pass in the audio stream by calling the writeAudio method. To read from an audio file, prepare a PCM-format audio file in advance; a chunked-read sketch follows this step.

    let uint8Array: Uint8Array = new Uint8Array();
    // The audio stream can be obtained in two ways: 1. from a live recording; 2. by reading an audio file
    // (for reading from an audio file, see the demo in section 3.4)
    // Write the audio stream; only chunk lengths of 640 or 1280 are supported
    asrEngine.writeAudio(sessionId, uint8Array);
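
As referenced above, a minimal sketch of streaming a PCM file in 1280-byte chunks, assuming a local path filePath and fileIo imported from @kit.CoreFileKit (the complete version, with pacing between chunks, appears in section 3.4):

    // A minimal sketch: feed a PCM file to the engine in 1280-byte chunks
    // (filePath is an assumed local path; import { fileIo } from '@kit.CoreFileKit')
    let file = fileIo.openSync(filePath, fileIo.OpenMode.READ_ONLY);
    let buf: ArrayBuffer = new ArrayBuffer(1280);
    let offset: number = 0;
    while (fileIo.readSync(file.fd, buf, { offset: offset }) === 1280) {
      asrEngine.writeAudio(sessionId, new Uint8Array(buf));
      offset += 1280;
    }
    fileIo.closeSync(file);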

  6. (Optional) To query the languages supported by the speech recognition service, call the listLanguages method.

listLanguages supports two calling forms; the Promise form is shown below, and other forms are described in the API reference.

// Set the query parameters
let languageQuery: speechRecognizer.LanguageQuery = {
  sessionId: sessionId
};
// Call the listLanguages method
asrEngine.listLanguages(languageQuery).then((res: Array<string>) => {
  console.info(`Succeeded in listing languages, result: ${JSON.stringify(res)}.`);
}).catch((err: BusinessError) => {
  console.error(`Failed to list languages. Code: ${err.code}, message: ${err.message}.`);
});
  7. (Optional) To end recognition, call the finish method.

    // End recognition
    asrEngine.finish(sessionId);

  8. (Optional) To cancel recognition, call the cancel method.

    // Cancel recognition
    asrEngine.cancel(sessionId);
  9. (Optional) To release the speech recognition engine resources, call the shutdown method.

    // Release the recognition engine resources
    asrEngine.shutdown();

  10. Add the ohos.permission.MICROPHONE permission to the module.json5 file so that the microphone works properly. For detailed steps, see the section on declaring permissions.

    //...
    "requestPermissions": [
      {
        "name": "ohos.permission.MICROPHONE",
        "reason": "$string:reason",
        "usedScene": {
          "abilities": [
            "EntryAbility"
          ],
          "when": "inuse"
        }
      }
    ],
    //...

3.4 Development Example

Tap the buttons to convert a piece of audio to text. The index.ets file is as follows:

import { speechRecognizer } from '@kit.CoreSpeechKit';
import { BusinessError } from '@kit.BasicServicesKit';
import { fileIo } from '@kit.CoreFileKit';
import { hilog } from '@kit.PerformanceAnalysisKit';
import AudioCapturer from './AudioCapturer';

const TAG = 'CoreSpeechKitDemo';

let asrEngine: speechRecognizer.SpeechRecognitionEngine;

@Entry
@Component
struct Index {
  @State createCount: number = 0;
  @State result: boolean = false;
  @State voiceInfo: string = "";
  @State sessionId: string = "123456";
  private mAudioCapturer = new AudioCapturer();

  build() {
    Column() {
      Scroll() {
        Column() {
          Button() {
            Text("CreateEngineByCallback")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.createCount++;
            hilog.info(0x0000, TAG, `CreateAsrEngine:createCount:${this.createCount}`);
            this.createByCallback();
          })

          Button() {
            Text("setListener")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.setListener();
          })

          Button() {
            Text("startRecording")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.startRecording();
          })

          Button() {
            Text("writeAudio")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.writeAudio();
          })

          Button() {
            Text("queryLanguagesCallback")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.queryLanguagesCallback();
          })

          Button() {
            Text("finish")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            // End recognition
            hilog.info(0x0000, TAG, "finish click:-->");
            asrEngine.finish(this.sessionId);
          })

          Button() {
            Text("cancel")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            // Cancel recognition
            hilog.info(0x0000, TAG, "cancel click:-->");
            asrEngine.cancel(this.sessionId);
          })

          Button() {
            Text("shutdown")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#0x317AA7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            // Release the engine
            asrEngine.shutdown();
          })
        }
        .layoutWeight(1)
      }
      .width('100%')
      .height('100%')

    }
  }

  // Create the engine; returned via callback
  private createByCallback() {
    // Set engine creation parameters
    let extraParam: Record<string, Object> = {"locate": "CN", "recognizerMode": "short"};
    let initParamsInfo: speechRecognizer.CreateEngineParams = {
      language: 'zh-CN',
      online: 1,
      extraParams: extraParam
    };

    // Call the createEngine method
    speechRecognizer.createEngine(initParamsInfo, (err: BusinessError, speechRecognitionEngine:
      speechRecognizer.SpeechRecognitionEngine) => {
      if (!err) {
        hilog.info(0x0000, TAG, 'Succeeded in creating engine.');
        // Receive the created engine instance
        asrEngine = speechRecognitionEngine;
      } else {
        // Error code 1002200001: engine creation failed because the language or mode is unsupported, initialization timed out, or a required resource is missing
        // Error code 1002200006: the engine is busy, typically when multiple applications call the recognition engine at the same time
        // Error code 1002200008: the engine has already been destroyed
        hilog.error(0x0000, TAG, `Failed to create engine. Code: ${err.code}, message: ${err.message}.`);
      }
    });
  }

  // Query language information; returned via callback
  private queryLanguagesCallback() {
    // Set the query parameters
    let languageQuery: speechRecognizer.LanguageQuery = {
      sessionId: '123456'
    };
    // Call the listLanguages method
    asrEngine.listLanguages(languageQuery, (err: BusinessError, languages: Array<string>) => {
      if (!err) {
        // Receive the currently supported languages
        hilog.info(0x0000, TAG, `Succeeded in listing languages, result: ${JSON.stringify(languages)}`);
      } else {
        hilog.error(0x0000, TAG, `Failed to list languages. Code: ${err.code}, message: ${err.message}.`);
      }
    });
  };

  // Start recognition
  private startListeningForWriteAudio() {
    // Set the parameters for starting recognition
    let recognizerParams: speechRecognizer.StartParams = {
      sessionId: this.sessionId,
      audioInfo: { audioType: 'pcm', sampleRate: 16000, soundChannel: 1, sampleBit: 16 } // For audioInfo options, see AudioInfo
    }
    // Call the start-recognition method
    asrEngine.startListening(recognizerParams);
  };

  private startListeningForRecording() {
    let audioParam: speechRecognizer.AudioInfo = { audioType: 'pcm', sampleRate: 16000, soundChannel: 1, sampleBit: 16 }
    let extraParam: Record<string, Object> = {
      "recognitionMode": 0,
      "vadBegin": 2000,
      "vadEnd": 3000,
      "maxAudioDuration": 20000
    }
    let recognizerParams: speechRecognizer.StartParams = {
      sessionId: this.sessionId,
      audioInfo: audioParam,
      extraParams: extraParam
    }
    hilog.info(0x0000, TAG, 'startListening start');
    asrEngine.startListening(recognizerParams);
  };



  // Write the audio stream
  private async writeAudio() {
    this.startListeningForWriteAudio();
    let ctx = getContext(this);
    let filenames: string[] = fileIo.listFileSync(ctx.filesDir);
    if (filenames.length <= 0) {
      hilog.error(0x0000, TAG, `Failed to read from file: no audio file found in ${ctx.filesDir}.`);
      return;
    }
    let filePath: string = `${ctx.filesDir}/${filenames[0]}`;
    let file = fileIo.openSync(filePath, fileIo.OpenMode.READ_WRITE);
    try {
      let buf: ArrayBuffer = new ArrayBuffer(1280);
      let offset: number = 0;
      while (1280 == fileIo.readSync(file.fd, buf, {
        offset: offset
      })) {
        let uint8Array: Uint8Array = new Uint8Array(buf);
        // Write the audio stream chunk by chunk, pacing the writes
        asrEngine.writeAudio(this.sessionId, uint8Array);
        await this.countDownLatch(1);
        offset = offset + 1280;
      }
    } catch (err) {
      hilog.error(0x0000, TAG, `Failed to read from file. Code: ${err.code}, message: ${err.message}.`);
    } finally {
      if (null != file) {
        fileIo.closeSync(file);
      }
    }
  }

  // Microphone speech-to-text
  private async startRecording() {
    this.startListeningForRecording();
    // Obtain audio from the recorder
    let data: ArrayBuffer;
    hilog.info(0x0000, TAG, 'create capture success');
    this.mAudioCapturer.init((dataBuffer: ArrayBuffer) => {
      hilog.info(0x0000, TAG, 'start write');
      hilog.info(0x0000, TAG, 'ArrayBuffer ' + JSON.stringify(dataBuffer));
      data = dataBuffer
      let uint8Array: Uint8Array = new Uint8Array(data);
      hilog.info(0x0000, TAG, 'ArrayBuffer uint8Array ' + JSON.stringify(uint8Array));
      // Write the audio stream for the active session
      asrEngine.writeAudio(this.sessionId, uint8Array);
    });
  };
  // Countdown timing
  public async countDownLatch(count: number) {
    while (count > 0) {
      await this.sleep(40);
      count--;
    }
  }
  // Sleep
  private sleep(ms: number):Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }

  // Set the callbacks
  private setListener() {
    // Create the callback object
    let setListener: speechRecognizer.RecognitionListener = {
      // Called when recognition starts successfully
      onStart(sessionId: string, eventMessage: string) {
        hilog.info(0x0000, TAG, `onStart, sessionId: ${sessionId} eventMessage: ${eventMessage}`);
      },
      // Event callback
      onEvent(sessionId: string, eventCode: number, eventMessage: string) {
        hilog.info(0x0000, TAG, `onEvent, sessionId: ${sessionId} eventCode: ${eventCode} eventMessage: ${eventMessage}`);
      },
      // Recognition result callback, covering both intermediate and final results
      onResult(sessionId: string, result: speechRecognizer.SpeechRecognitionResult) {
        hilog.info(0x0000, TAG, `onResult, sessionId: ${sessionId} result: ${JSON.stringify(result)}`);
      },
      // Called when recognition completes
      onComplete(sessionId: string, eventMessage: string) {
        hilog.info(0x0000, TAG, `onComplete, sessionId: ${sessionId} eventMessage: ${eventMessage}`);
      },
      // Error callback; error codes are returned through this method
      // e.g. error code 1002200002: starting recognition failed, triggered when startListening is called repeatedly
      // For more error codes, see the error code reference
      onError(sessionId: string, errorCode: number, errorMessage: string) {
        hilog.error(0x0000, TAG, `onError, sessionId: ${sessionId} errorCode: ${errorCode} errorMessage: ${errorMessage}`);
      },
    }
    // Set the callbacks
    asrEngine.setListener(setListener);
  };
}

Add an AudioCapturer.ts file to capture the microphone audio stream.

'use strict';
/*
 * Copyright (c) Huawei Technologies Co., Ltd. 2023-2023. All rights reserved.
 */

import {audio} from '@kit.AudioKit';
import { hilog } from '@kit.PerformanceAnalysisKit';

const TAG = 'AudioCapturer';

/**
 * Audio collector tool
 */
export default class AudioCapturer {
  /**
   * Collector object
   */
  private mAudioCapturer = null;

  /**
   * Audio Data Callback Method
   */
  private mDataCallBack: (data: ArrayBuffer) => void = null;

  /**
   * Indicates whether recording data can be obtained.
   */
  private mCanWrite: boolean = true;

  /**
   * Audio stream information
   */
  private audioStreamInfo = {
    samplingRate: audio.AudioSamplingRate.SAMPLE_RATE_16000,
    channels: audio.AudioChannel.CHANNEL_1,
    sampleFormat: audio.AudioSampleFormat.SAMPLE_FORMAT_S16LE,
    encodingType: audio.AudioEncodingType.ENCODING_TYPE_RAW
  }

  /**
   * Audio collector information
   */
  private audioCapturerInfo = {
    source: audio.SourceType.SOURCE_TYPE_MIC,
    capturerFlags: 0
  }

  /**
   * Audio Collector Option Information
   */
  private audioCapturerOptions = {
    streamInfo: this.audioStreamInfo,
    capturerInfo: this.audioCapturerInfo
  }

  /**
   *  Initialize
   * @param dataCallBack
   */
  public async init(dataCallBack: (data: ArrayBuffer) => void) {
    if (null != this.mAudioCapturer) {
      hilog.error(0x0000, TAG, 'AudioCapturerUtil already init');
      return;
    }
    this.mDataCallBack = dataCallBack;
    this.mAudioCapturer = await audio.createAudioCapturer(this.audioCapturerOptions).catch(error => {
      hilog.error(0x0000, TAG, `AudioCapturerUtil init createAudioCapturer failed, code is ${error.code}, message is ${error.message}`);
    });
  }

  /**
   * start recording
   */
  public async start() {
    hilog.info(0x0000, TAG, `AudioCapturerUtil start`);
    let stateGroup = [audio.AudioState.STATE_PREPARED, audio.AudioState.STATE_PAUSED, audio.AudioState.STATE_STOPPED];
    if (stateGroup.indexOf(this.mAudioCapturer.state) === -1) {
      hilog.error(0x0000, TAG, `AudioCapturerUtil start failed`);
      return;
    }
    this.mCanWrite = true;
    await this.mAudioCapturer.start();
    while (this.mCanWrite) {
      let bufferSize = await this.mAudioCapturer.getBufferSize();
      let buffer = await this.mAudioCapturer.read(bufferSize, true);
      this.mDataCallBack(buffer)
    }
  }

  /**
   * stop recording
   */
  public async stop() {
    if (this.mAudioCapturer.state !== audio.AudioState.STATE_RUNNING && this.mAudioCapturer.state !== audio.AudioState.STATE_PAUSED) {
      hilog.error(0x0000, TAG, `AudioCapturerUtil stop Capturer is not running or paused`);
      return;
    }
    this.mCanWrite = false;
    await this.mAudioCapturer.stop();
    if (this.mAudioCapturer.state === audio.AudioState.STATE_STOPPED) {
      hilog.info(0x0000, TAG, `AudioCapturerUtil Capturer stopped`);
    } else {
      hilog.error(0x0000, TAG, `Capturer stop failed`);
    }
  }

  /**
   * release
   */
  public async release() {
    if (this.mAudioCapturer.state === audio.AudioState.STATE_RELEASED || this.mAudioCapturer.state === audio.AudioState.STATE_NEW) {
      hilog.error(0x0000, TAG, `Capturer already released`);
      return;
    }
    await this.mAudioCapturer.release();
    // Check the state before clearing the reference; reading state after nulling it would throw
    if (this.mAudioCapturer.state === audio.AudioState.STATE_RELEASED) {
      hilog.info(0x0000, TAG, `Capturer released`);
    } else {
      hilog.error(0x0000, TAG, `Capturer release failed`);
    }
    this.mAudioCapturer = null;
  }
}

Request the microphone permission in the EntryAbility.ets file.

import { abilityAccessCtrl, AbilityConstant, UIAbility, Want } from '@kit.AbilityKit';
import { hilog } from '@kit.PerformanceAnalysisKit';
import { window } from '@kit.ArkUI';
import { BusinessError } from '@kit.BasicServicesKit';

export default class EntryAbility extends UIAbility {
  onCreate(want: Want, launchParam: AbilityConstant.LaunchParam): void {
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onCreate');
  }

  onDestroy(): void {
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onDestroy');
  }

  onWindowStageCreate(windowStage: window.WindowStage): void {
    // Main window is created, set main page for this ability
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageCreate');

    let atManager = abilityAccessCtrl.createAtManager();
    atManager.requestPermissionsFromUser(this.context, ['ohos.permission.MICROPHONE']).then((data) => {
      hilog.info(0x0000, 'testTag', 'data:' + JSON.stringify(data));
      hilog.info(0x0000, 'testTag', 'data permissions:' + data.permissions);
      hilog.info(0x0000, 'testTag', 'data authResults:' + data.authResults);
    }).catch((err: BusinessError) => {
      hilog.error(0x0000, 'testTag', 'errCode: ' + err.code + 'errMessage: ' + err.message);
    });

    windowStage.loadContent('pages/Index', (err, data) => {
      if (err.code) {
        hilog.error(0x0000, 'testTag', 'Failed to load the content. Cause: %{public}s', JSON.stringify(err) ?? '');
        return;
      }
      hilog.info(0x0000, 'testTag', 'Succeeded in loading the content. Data: %{public}s', JSON.stringify(data) ?? '');
    });
  }

  onWindowStageDestroy(): void {
    // Main window is destroyed, release UI related resources
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageDestroy');
  }

  onForeground(): void {
    // Ability has brought to foreground
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onForeground');
  }

  onBackground(): void {
    // Ability has back to background
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onBackground');
  }
}