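[Audio Testing] 03: Automatic Channel Verification and Whisper-Based Recording Check in WPF
Series: WPF Production-Line Functional Testing in Practice
Goal: after reading this article you should be able to write a WPF audio test program that verifies the left and right channels and uses Whisper to recognize what was recorded.
Keywords: WPF audio testing, NAudio channel control, WaveChannel32 Pan, mciSendString recording, Whisper speech recognition, production testing
1. Introduction: why audio production testing is hard
Audio is one of the most error-prone functional tests on a production line, and one of the hardest to automate. Cold-soldered speakers, swapped channels, and sound-card driver faults are slow and inconsistent to catch by ear, while relying entirely on electroacoustic measurement instruments is expensive and complicated to deploy.
This article describes a software-only automation approach:
- Channel verification: the program plays a random two-digit number on one specified channel, and the operator picks the number heard from three options. A wrong choice means that channel is faulty.
- Recording check: the microphone records the operator reading "1, 2, 3" aloud, a local Whisper model recognizes the content, and the program decides pass/fail automatically.
The whole flow needs no extra hardware. The WPF program drives all three stages (channel playback, recording capture, speech recognition), and the operator only has to speak and click.
2. Basic concepts: five core techniques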
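2.1 NAudio and channel control
NAudio is the mainstream audio library for .NET (NuGet package name NAudio). It provides a complete pipeline from reading audio files and converting formats to device output.
The key class for controlling the playback channel is WaveChannel32, which exposes a Pan property:
| Pan value | Effect |
|---|---|
| -1.0f | Left channel only (right channel muted) |
| 0.0f | Both channels at equal volume (default) |
| 1.0f | Right channel only (left channel muted) |
This property is the core mechanism of the channel test: no system-level setting is needed, and the channel is controlled precisely in code.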
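To make the table concrete, here is a minimal Python sketch of how a pan value can be turned into per-channel gains. It uses a simple linear pan law chosen for illustration; `pan_gains` and `apply_pan` are hypothetical names, and NAudio's internal formula is not reproduced here. Only the two endpoints (-1.0 and 1.0) are meant to match the table.

```python
def pan_gains(pan):
    """Map a pan value in [-1.0, 1.0] to (left_gain, right_gain).

    Linear pan law (an assumption of this sketch): -1.0 sends everything
    to the left channel, 1.0 sends everything to the right channel.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    return (1.0 - pan) / 2.0, (1.0 + pan) / 2.0


def apply_pan(stereo_frames, pan):
    """Scale a list of (left, right) sample pairs by the pan gains."""
    left_gain, right_gain = pan_gains(pan)
    return [(left * left_gain, right * right_gain)
            for left, right in stereo_frames]
```

With `pan = -1.0` the right sample of every frame becomes 0.0, which is exactly the property the channel test relies on: a listener on a correctly wired device hears the digits on one side only.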
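2.2 WaveOutEvent: audio output that is safe off the UI thread
WaveOutEvent is NAudio's event-driven audio output class and is suitable for non-UI threads (for example inside Task.Run). Unlike WaveOut, it does not depend on a window handle, so playing from a background thread causes no thread-affinity problems.
2.3 mciSendString: the native Windows recording API
mciSendString is the Win32 entry point (in winmm.dll) to the Media Control Interface (MCI). Called through P/Invoke, it opens, stops, and saves a recording with plain command strings:
open new type waveaudio alias mic → open the recording device
record mic → start recording
stop mic → stop recording
save mic "path/to/file.wav" → save the recording to a file
close mic → close the device
This API has almost no dependencies on Windows and needs no extra NuGet package.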
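The command sequence above can be tried outside the WPF program. The sketch below builds the five command strings and, on Windows only, sends them through `mciSendStringW` with `ctypes`, turning a non-zero return code into an exception. `recording_commands` and `send_mci` are hypothetical helper names; the `mic` alias and the commands themselves come from the list above.

```python
import sys


def recording_commands(wav_path, alias="mic"):
    """Return the MCI command strings for one record-and-save cycle."""
    return [
        f"open new type waveaudio alias {alias}",
        f"record {alias}",
        f"stop {alias}",
        f'save {alias} "{wav_path}"',
        f"close {alias}",
    ]


def send_mci(command):
    """Send one MCI command string (Windows only) and raise on failure."""
    if sys.platform != "win32":
        raise OSError("MCI is only available on Windows")
    import ctypes  # imported here so the module still loads on other systems

    winmm = ctypes.windll.winmm
    error_code = winmm.mciSendStringW(command, None, 0, None)
    if error_code != 0:
        message = ctypes.create_unicode_buffer(256)
        winmm.mciGetErrorStringW(error_code, message, 256)
        raise OSError(f"MCI failed ({error_code}) for {command!r}: {message.value}")
```

A recording script would send the first two commands, sleep for the recording duration, and then send the remaining three. Checking the return code matters because an unchecked failure (for example, no microphone present) otherwise shows up only later as a missing WAV file.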
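2.4 MMDeviceEnumerator: controlling system volume in software
NAudio's MMDeviceEnumerator (in the NAudio.CoreAudioApi namespace) enumerates and operates Windows audio endpoint devices, including reading and writing the master volume and mute state: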
csharp
var enumerator = new MMDeviceEnumerator();
var device = enumerator.GetDefaultAudioEndpoint(DataFlow.Render, Role.Multimedia);
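device.AudioEndpointVolume.MasterVolumeLevelScalar = 1.0f; // set to maximum volume
The volume must be set to maximum before the test; otherwise the result depends on whatever the system volume happens to be and cannot be trusted.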
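2.5 Whisper: local, offline speech recognition
Whisper is OpenAI's open-source speech recognition model. It supports Chinese and runs on a local CPU with no network connection. For production testing, the Python version of Whisper is wrapped as a long-running process that talks to the C# main program over standard input/output (stdin/stdout): C# sends an audio file path and Python returns the recognized text. This avoids reloading the model for every recognition (one model load takes about 3-5 seconds).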
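3. Design: a three-stage automated test flow
3.1 Complete test flow
Set playback and capture volume to maximum → test the first channel (chosen at random) → test the other channel → record 3 seconds of speech → play the recording back → Whisper recognition → automatic PASS/FAIL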
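3.2 MVVM layering
| Layer | Class | Responsibility |
|---|---|---|
| View | MainWindow.xaml.cs | Main test flow (UI interaction is tightly coupled in this project, so it lives in code-behind) |
| Utility class | NumberSpeaker | Wraps per-channel playback of WAV files |
| External service | Whisper Python process | Speech recognition, communicating over stdin/stdout |
Note: in this test item the UI interaction (option button clicks, recording timer) is tightly coupled to the business logic. Implementing it directly in code-behind instead of forcing a ViewModel split is a reasonable trade-off.
3.3 NuGet dependencies
NAudio 2.x (audio playback, volume control)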
4. Core code walkthrough
4.1 Before the test: set the system volume to maximum
csharp
private void SetVolumeToMax()
{
    using (var enumerator = new MMDeviceEnumerator())
    {
        // Playback (speaker) volume
        var playback = enumerator.GetDefaultAudioEndpoint(DataFlow.Render, Role.Multimedia);
        if (playback?.AudioEndpointVolume != null)
        {
            playback.AudioEndpointVolume.MasterVolumeLevelScalar = 1.0f;
            playback.AudioEndpointVolume.Mute = false;
        }
        // Capture (microphone) volume
        var capture = enumerator.GetDefaultAudioEndpoint(DataFlow.Capture, Role.Multimedia);
        if (capture?.AudioEndpointVolume != null)
        {
            capture.AudioEndpointVolume.MasterVolumeLevelScalar = 1.0f;
            capture.AudioEndpointVolume.Mute = false;
        }
    }
}
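Both endpoints must be set: if the playback volume is low the operator cannot hear clearly, and if the capture volume is low Whisper has nothing to recognize.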
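A quiet recording can also be detected directly, before it is handed to the recognizer. The following is a minimal sketch, assuming the recording is an 8-bit or 16-bit PCM WAV file; `wav_rms_dbfs`, `is_audible`, and the -50 dBFS threshold are hypothetical choices for illustration and would need tuning on real hardware.

```python
import array
import math
import sys
import wave


def wav_rms_dbfs(path):
    """Return the RMS level of a PCM WAV file in dBFS (0 dBFS = full scale)."""
    with wave.open(path, "rb") as wav:
        width = wav.getsampwidth()
        frames = wav.readframes(wav.getnframes())
    if width == 1:
        # 8-bit PCM is unsigned and centred on 128
        samples = [(b - 128) / 128.0 for b in frames]
    elif width == 2:
        # 16-bit PCM is signed little-endian
        data = array.array("h")
        data.frombytes(frames)
        if sys.byteorder == "big":
            data.byteswap()
        samples = [s / 32768.0 for s in data]
    else:
        raise ValueError(f"unsupported sample width: {width} bytes")
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def is_audible(path, threshold_dbfs=-50.0):
    """True when the recording is louder than the (assumed) silence threshold."""
    return wav_rms_dbfs(path) > threshold_dbfs
```

Failing early on a near-silent file gives a clearer error ("microphone level too low") than an empty transcription does.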
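4.2 Core of the channel test: NumberSpeaker
NumberSpeaker is the core utility class. It reads pre-recorded WAV files, one per digit, and selects the channel through WaveChannel32.Pan.
PlaySingleFileAsync: a single playback (on the specified channel)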
csharp
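private async Task PlaySingleFileAsync(string filePath, bool isLeftChannel, CancellationToken token)
{
    using (var reader = new WaveFileReader(filePath))
    using (var channel32 = new WaveChannel32(reader))
    using (var waveOut = new WaveOutEvent())
    {
        // Pan: -1.0f = left channel only, 1.0f = right channel only
        channel32.Pan = isLeftChannel ? -1.0f : 1.0f;
        // WaveChannel32 pads with silence by default, so playback would never end by itself
        channel32.PadWithZeroes = false;
        waveOut.Init(channel32);
        // TaskCompletionSource turns the PlaybackStopped event into an awaitable Task
        var tcs = new TaskCompletionSource<bool>();
        waveOut.PlaybackStopped += (s, e) => tcs.TrySetResult(true);
        waveOut.Play();
        // Wait for playback to finish; the fallback timeout is the clip length plus 500 ms,
        // so a fixed short timeout can no longer cut a digit off, and cancellation ends the wait
        var timeout = reader.TotalTime + TimeSpan.FromMilliseconds(500);
        await Task.WhenAny(tcs.Task, Task.Delay(timeout, token));
        if (waveOut.PlaybackState == PlaybackState.Playing)
            waveOut.Stop();
    }
}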
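PlayNumberAsync: play a two-digit number (looping N times)
The two-digit number is split into two characters, each mapped to one of the files 0.wav through 9.wav in the Audio/AR/ directory (you can record the digits 0-9 yourself), and played for the given number of loops: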
csharp
public async Task PlayNumberAsync(string twoDigitNumber, bool isLeftChannel, int loopCount)
{
    await StopAsync(); // stop any previous playback first
    char d1 = twoDigitNumber[0];
    char d2 = twoDigitNumber[1];
    string file1 = Path.Combine(_audioDirectory, $"{d1}.wav");
    string file2 = Path.Combine(_audioDirectory, $"{d2}.wav");
    _cts = new CancellationTokenSource();
    var token = _cts.Token;
    _playTask = Task.Run(async () =>
    {
        for (int i = 0; i < loopCount; i++)
        {
            if (token.IsCancellationRequested) return;
            await PlaySingleFileAsync(file1, isLeftChannel, token);
            await Task.Delay(100, token); // 100 ms gap between the two digits
            await PlaySingleFileAsync(file2, isLeftChannel, token);
            if (i < loopCount - 1)
                await Task.Delay(1000, token); // 1 s gap between loops
        }
    }, token);
    try { await _playTask; }
    catch (OperationCanceledException) { /* playback interrupted after the operator answered; expected */ }
}
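Why CancellationToken: the operator may pick an answer during the first playback, in which case the remaining loops must stop immediately so that leftover audio does not interfere with the next step.
StopAsync() calls _cts.Cancel() to interrupt the running playback loop.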
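The cancel-on-answer pattern does not depend on NAudio. As an illustration only, here is the same idea as a small Python asyncio sketch: a repeating "playback" loop runs as a task and is cancelled as soon as the operator answers. All names and timings are hypothetical, and `asyncio.sleep` stands in for real audio output.

```python
import asyncio


async def play_loop(played, loop_count, digit_seconds, gap_seconds):
    """Stand-in for the playback loop: sleeping replaces real audio output."""
    for i in range(loop_count):
        played.append(i)                      # record that loop i started
        await asyncio.sleep(digit_seconds)    # "play" the two digits
        if i < loop_count - 1:
            await asyncio.sleep(gap_seconds)  # pause between repetitions


async def run_until_answered(loop_count, answer_after,
                             digit_seconds=0.01, gap_seconds=0.5):
    """Start the loop, then cancel it once the operator has answered."""
    played = []
    task = asyncio.create_task(
        play_loop(played, loop_count, digit_seconds, gap_seconds))
    await asyncio.sleep(answer_after)         # the operator clicks an option
    task.cancel()                             # plays the role of _cts.Cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass                                  # cancellation is the expected path
    return played
```

Calling `asyncio.run(run_until_answered(3, 0.05))` cancels during the first gap, so only the first repetition ever starts. In both languages the key point is the same: the waits inside the loop observe the cancellation, so the loop ends at the next wait instead of running to completion.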
Main channel-test flow (RunSingleChannelTest)
csharp
private async Task RunSingleChannelTest()
{
    // Generate a random two-digit number and three options
    correctAnswer = random.Next(10, 100).ToString();
    currentOptions = GenerateRandomOptions(correctAnswer); // 3 shuffled options, including the correct answer
    string channelName = isLeftChannelActiveNew ? "左声道" : "右声道";
    ChannelHintText.Text = $"正在测试【{channelName}】";
    NewStatusText.Text = "请仔细听声音...";
    // Play the number on the selected channel (options are shown while it plays)
    ShowOptionsForSelection(); // show the 3 buttons first
    await _numberSpeaker.PlayNumberAsync(correctAnswer, isLeftChannelActiveNew, PLAY_LOOP_COUNT);
}
When an option button is clicked:
csharp
private async void OptionBtn_Click(object sender, RoutedEventArgs e)
{
    string selected = (sender as Button)?.Content.ToString();
    // Disable all buttons immediately to prevent repeated clicks
    Option1Btn.IsEnabled = Option2Btn.IsEnabled = Option3Btn.IsEnabled = false;
    // Stop the remaining playback loops whether the answer is right or wrong
    await _numberSpeaker.StopAsync();
    if (selected == correctAnswer)
    {
        currentChannelTestIndex++;
        if (currentChannelTestIndex >= 2)
            StartNewRecordingTest(); // both channels passed, move on to the recording test
        else
        {
            isLeftChannelActiveNew = !isLeftChannelActiveNew; // switch channel
            await RunSingleChannelTest();
        }
    }
    else
    {
        // A wrong choice fails the test immediately.
        // channelName is local to RunSingleChannelTest, so it is derived again here.
        string channelName = isLeftChannelActiveNew ? "左" : "右";
        CloseUIAction?.Invoke(255, new TestResult { ErrorCode = "008002",
            Description = $"{channelName}声道选择错误: 选{selected},正确{correctAnswer}" });
    }
}
4.3 Recording and automatic recognition with Whisper
Recording (mciSendString Win32 API)
csharp
// CharSet.Unicode binds to mciSendStringW so that non-ASCII file paths survive marshalling
[DllImport("winmm.dll", CharSet = CharSet.Unicode)]
private static extern uint mciSendString(string cmd, string ret, uint retLen, IntPtr hWnd);

private void StartRecording()
{
    mciSendString("open new type waveaudio alias mic", null, 0, IntPtr.Zero);
    mciSendString("record mic", null, 0, IntPtr.Zero);
    recordingSeconds = 0;
    recordingTimer.Start(); // DispatcherTimer ticking once per second; stops automatically after 3 s
}

private async Task StopRecording()
{
    recordingTimer.Stop();
    mciSendString("stop mic", null, 0, IntPtr.Zero);
    mciSendString($"save mic \"{tempRecordingFile}\"", null, 0, IntPtr.Zero);
    mciSendString("close mic", null, 0, IntPtr.Zero);
    // Automatic playback, then recognition
    await PlayRecording();
    await ProcessRecordingAndAutoJudge();
}
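Communicating with the Whisper process
The Whisper service is a Python process that stays resident. The file path is sent through its standard input and the recognition result is read from its standard output: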
csharp
// On startup: wait for the "MODEL_LOADED" line, which signals that the model is ready
_whisperServiceProcess = new Process { StartInfo = startInfo };
_whisperServiceProcess.Start();
var readyTask = Task.Run(() => _whisperStdout.ReadLine());
bool ready = await Task.WhenAny(readyTask, Task.Delay(8000)) == readyTask // 8 s timeout
             && readyTask.Result == "MODEL_LOADED";
if (!ready) throw new Exception("Whisper 启动失败");
// On recognition: send the path, then read the result (with a timeout)
await _whisperStdin.WriteLineAsync(audioFilePath);
using (var cts = new CancellationTokenSource(timeout))
{
    result = await ReadLineWithTimeoutAsync(_whisperStdout, cts.Token);
}
// On shutdown: send the EXIT command
_whisperStdin.WriteLine("EXIT");
_whisperServiceProcess.WaitForExit(5000);
Judging the recognition result
csharp
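private bool CheckTranscriptionResult(string text)
{
    if (string.IsNullOrWhiteSpace(text)) return false;
    // The Python service reports failures as "ERROR: ..." lines; never count those as speech
    if (text.StartsWith("ERROR:")) return false;
    // Count each digit once, whichever written form Whisper returns
    var digitForms = new[]
    {
        new[] { "1", "一", "壹" },
        new[] { "2", "二", "贰" },
        new[] { "3", "三", "叁" },
    };
    int matchCount = digitForms.Count(forms => forms.Any(f => text.Contains(f)));
    return matchCount >= 2; // pass when at least 2 of the 3 spoken digits are recognized
}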
The threshold is 2 rather than 3 so that Whisper occasionally missing one digit in a noisy environment does not cause a false failure.
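For trying different thresholds against saved transcripts without rebuilding the WPF program, the same rule can be sketched in Python. This version counts each digit once regardless of its written form and rejects the service's `ERROR:` lines; both behaviours are choices of this sketch, and `DIGIT_FORMS`, `count_digits_heard`, and `is_pass` are hypothetical names.

```python
# Written forms that Whisper might return for each spoken digit
DIGIT_FORMS = {
    1: ("1", "一", "壹"),
    2: ("2", "二", "贰"),
    3: ("3", "三", "叁"),
}


def count_digits_heard(text):
    """Count how many of the digits 1, 2, 3 appear in any written form."""
    return sum(
        1 for forms in DIGIT_FORMS.values()
        if any(form in text for form in forms)
    )


def is_pass(text, threshold=2):
    """Apply the pass rule: at least `threshold` distinct digits recognized."""
    if not text or not text.strip():
        return False
    if text.startswith("ERROR:"):
        # Error lines can contain digits (for example in a file path)
        return False
    return count_digits_heard(text) >= threshold
```

For example, `is_pass("一二三")` and `is_pass("1, 2")` are both true, while `is_pass("一1")` is false because both tokens are the same digit.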
4.4 Whisper Python service script (appendix)
Below is the complete transcribe_whisper.py, ready to deploy on a production-line industrial PC.
python
import sys
import os
import whisper  # pip install openai-whisper
import io

# Force UTF-8 on stdout/stderr to avoid garbled Chinese text
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')

_model = None


def load_model(model_name: str = "base", model_dir: str = None):
    """Warm up the model at startup; print MODEL_LOADED to stdout once it is ready."""
    global _model
    if _model is not None:
        return
    try:
        if model_dir:
            os.makedirs(model_dir, exist_ok=True)
            _model = whisper.load_model(model_name, download_root=model_dir)
        else:
            _model = whisper.load_model(model_name)
        print("MODEL_LOADED", flush=True)  # the C# side waits for this line as the ready signal
    except Exception as e:
        print(f"MODEL_LOAD_ERROR: {str(e)}", file=sys.stderr, flush=True)
        sys.exit(1)


def transcribe_audio(audio_path: str) -> str:
    """Transcribe audio with the already loaded Whisper model and return the text."""
    global _model
    if _model is None:
        return "ERROR: 模型未加载"
    try:
        result = _model.transcribe(
            audio_path,
            language="zh",            # fix the language to Chinese for better accuracy
            fp16=False,               # CPU inference runs in FP32; this avoids the fallback warning
            initial_prompt="一二三",  # prompt that steers the model toward numerals
        )
        return result["text"].strip()
    except Exception as e:
        return f"ERROR: {str(e)}"


def run_service_mode(model_name: str = "base", model_dir: str = None):
    """
    Service mode: a long-running process that talks to C# over stdin/stdout.
    Protocol:
      one line with an audio file path in → one line with the transcription out
      "EXIT" in → "SERVICE_EXIT" out, then the process exits
    """
    load_model(model_name, model_dir)  # load the model (prints MODEL_LOADED)
    while True:
        try:
            line = sys.stdin.readline()
            if not line:  # EOF: the C# process closed the pipe
                break
            line = line.strip()
            if line.upper() == "EXIT":
                print("SERVICE_EXIT", flush=True)
                break
            if not line:
                continue
            if not os.path.exists(line):
                print(f"ERROR: 文件不存在: {line}", flush=True)
                continue
            result = transcribe_audio(line)
            print(result, flush=True)  # flush every reply, or the C# ReadLine never receives it
        except Exception as e:
            print(f"ERROR: {str(e)}", flush=True)


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("用法:", file=sys.stderr)
        print("  服务模式: python transcribe_whisper.py --service [模型名称] [模型目录]", file=sys.stderr)
        print("  单次转录: python transcribe_whisper.py <音频文件> [模型名称] [模型目录]", file=sys.stderr)
        sys.exit(1)
    first_arg = sys.argv[1]
    if first_arg == "--service":
        model_name = sys.argv[2] if len(sys.argv) > 2 else "base"
        model_dir = sys.argv[3] if len(sys.argv) > 3 else None
        run_service_mode(model_name, model_dir)
    else:
        # One-shot transcription mode (for debugging)
        audio_file = first_arg
        if not os.path.exists(audio_file):
            print(f"ERROR: 音频文件不存在: {audio_file}", file=sys.stderr)
            sys.exit(1)
        model_name = sys.argv[2] if len(sys.argv) > 2 else "base"
        model_dir = sys.argv[3] if len(sys.argv) > 3 else None
        load_model(model_name, model_dir)
        print(transcribe_audio(audio_file))
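Notes on the script:
| Design point | Explanation |
|---|---|
| flush=True | Every print must flush explicitly; otherwise the reply stays in Python's output buffer and the C# ReadLine keeps blocking |
| MODEL_LOADED signal | Printed as a fixed string once the model has loaded; the C# side treats the service as ready only after ReadLine returns it |
| initial_prompt="一二三" | The prompt steers the model toward Chinese numerals, which can noticeably improve recognition on short (3-second) clips |
| fp16=False | CPU inference runs in FP32. If fp16 is left on, Whisper prints a warning on CPU and falls back to FP32, so switching it off keeps stderr clean |
| EXIT command | When the C# window closes it writes EXIT to stdin; Python then exits normally instead of lingering as an orphaned process |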
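Deployment preparation:
# 1. Install the dependency (the industrial PC needs network access, or install from offline wheels)
#    Whisper decodes audio files through the ffmpeg command-line tool, so ffmpeg must also be on PATH
pip install openai-whisper
# 2. The first run downloads the model into the given directory (the base model is about 150 MB)
python transcribe_whisper.py --service base "D:\models\whisper_base"
# 3. Debug a one-shot transcription
python transcribe_whisper.py "C:\temp\recording.wav" base "D:\models\whisper_base"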
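Once the script is deployed, the line protocol can be smoke-tested without the WPF application. The sketch below is a minimal Python client for the service mode described above. `WhisperClient` and `service_command` are hypothetical names, and the reads have no timeout, which a production caller would add (as the C# side does).

```python
import subprocess
import sys


def service_command(script="transcribe_whisper.py", model="base", model_dir=None):
    """Build the command line that starts the script in service mode."""
    cmd = [sys.executable, script, "--service", model]
    if model_dir:
        cmd.append(model_dir)
    return cmd


class WhisperClient:
    """Talks to the service: one path per line in, one result per line out."""

    def __init__(self, cmd):
        self.proc = subprocess.Popen(
            cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            text=True, encoding="utf-8", bufsize=1)
        ready = self.proc.stdout.readline().strip()
        if ready != "MODEL_LOADED":
            self.proc.kill()
            raise RuntimeError(f"service did not start: {ready!r}")

    def transcribe(self, audio_path):
        """Send one audio path and return the single reply line."""
        self.proc.stdin.write(audio_path + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        """Send EXIT, return the service's farewell line, and wait for exit."""
        self.proc.stdin.write("EXIT\n")
        self.proc.stdin.flush()
        reply = self.proc.stdout.readline().strip()
        self.proc.wait(timeout=5)
        return reply
```

A typical session is `client = WhisperClient(service_command(model_dir=r"D:\models\whisper_base"))`, followed by `client.transcribe(r"C:\temp\recording.wav")` and `client.close()`. Because each request and reply is exactly one line, a reply that contains a newline would desynchronize the exchange; the script avoids this by stripping the transcription before printing it.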
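5. Complete runnable demo
The demo below implements the full channel-verification logic (NAudio playback plus random question generation). The recording part uses manual confirmation buttons in place of Whisper recognition, so it runs out of the box with no Python environment.
Prerequisites:
- In Visual Studio, create a "WPF App (.NET Framework)" project targeting .NET Framework 4.8
- Install from NuGet: NAudio (version 2.x)
- Under the project output directory, create an Audio/AR/ folder and put in 0.wav through 9.wav, ten audio files of spoken Chinese digits (generate them with a TTS tool or record them from the Windows speech synthesizer)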
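Before the first run it is worth confirming that all ten digit files are present and readable, since a missing file only surfaces when that digit happens to be drawn. A minimal pre-flight sketch in Python (`check_digit_wavs` is a hypothetical helper; it accepts only files that Python's `wave` module can open, that is, uncompressed PCM WAV):

```python
import os
import wave


def check_digit_wavs(audio_dir):
    """Return a list of problems; an empty list means 0.wav to 9.wav look usable."""
    problems = []
    for digit in range(10):
        path = os.path.join(audio_dir, f"{digit}.wav")
        if not os.path.isfile(path):
            problems.append(f"missing: {path}")
            continue
        try:
            with wave.open(path, "rb") as wav:
                if wav.getnframes() == 0:
                    problems.append(f"empty: {path}")
        except (wave.Error, EOFError) as exc:
            # Not a RIFF/WAVE file, or a compressed format wave cannot read
            problems.append(f"unreadable: {path} ({exc})")
    return problems
```

Running `check_digit_wavs(r"bin\Debug\Audio\AR")` (the path is an example) and printing the returned list gives a quick checklist of what still needs to be recorded.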
PropertyNotifyObject.cs (same as in the series introduction)
csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Windows.Input;
namespace AudioTestDemo
{
public class PropertyNotifyObject : INotifyPropertyChanged, IDisposable
{
private readonly Dictionary<string, object> _values = new Dictionary<string, object>();
public event PropertyChangedEventHandler PropertyChanged;
protected void OnPropertyChanged(string name)
=> PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(name));
public T GetPropertyValue<T>(string name)
=> _values.TryGetValue(name, out var v) ? (T)v : default(T);
public void SetPropertyValue<T>(string name, T value)
{
_values[name] = value;
OnPropertyChanged(name);
}
public void Dispose() { }
}
public class RelayCommand : ICommand
{
private readonly Action _execute;
public RelayCommand(Action execute) { _execute = execute; }
public bool CanExecute(object p) => true;
public void Execute(object p) => _execute();
public event EventHandler CanExecuteChanged
{
add => CommandManager.RequerySuggested += value;
remove => CommandManager.RequerySuggested -= value;
}
}
}
NumberSpeaker.cs (channel playback wrapper)
csharp
using NAudio.Wave;
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
namespace AudioTestDemo
{
/// <summary>
/// Plays pre-recorded digit audio files in a loop on the specified channel
/// </summary>
public class NumberSpeaker : IDisposable
{
private readonly string _audioDir;
private CancellationTokenSource _cts;
private Task _playTask;
private bool _disposed;
public NumberSpeaker(string audioDir)
{
_audioDir = audioDir;
}
/// <summary>
/// Plays a two-digit number on the specified channel, looping loopCount times
/// </summary>
public async Task PlayNumberAsync(string twoDigitNumber, bool isLeftChannel, int loopCount)
{
await StopAsync();
string file1 = Path.Combine(_audioDir, $"{twoDigitNumber[0]}.wav");
string file2 = Path.Combine(_audioDir, $"{twoDigitNumber[1]}.wav");
if (!File.Exists(file1) || !File.Exists(file2))
throw new FileNotFoundException($"找不到数字音频文件: {file1} / {file2}");
_cts = new CancellationTokenSource();
var token = _cts.Token;
_playTask = Task.Run(async () =>
{
for (int i = 0; i < loopCount; i++)
{
if (token.IsCancellationRequested) return;
await PlaySingleAsync(file1, isLeftChannel, token);
await Task.Delay(100, token);
await PlaySingleAsync(file2, isLeftChannel, token);
if (i < loopCount - 1)
await Task.Delay(1000, token);
}
}, token);
try { await _playTask; }
catch (OperationCanceledException) { }
}
private async Task PlaySingleAsync(string filePath, bool isLeftChannel, CancellationToken token)
{
using (var reader = new WaveFileReader(filePath))
using (var channel32 = new WaveChannel32(reader))
using (var waveOut = new WaveOutEvent())
{
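// -1.0f = left channel only, 1.0f = right channel only
channel32.Pan = isLeftChannel ? -1.0f : 1.0f;
// WaveChannel32 pads with silence by default; turn that off so playback ends with the file
channel32.PadWithZeroes = false;
waveOut.Init(channel32);
var tcs = new TaskCompletionSource<bool>();
waveOut.PlaybackStopped += (s, e) => tcs.TrySetResult(true);
waveOut.Play();
// Fallback timeout: clip length plus 500 ms, so a long digit is never cut off
var timeout = reader.TotalTime + TimeSpan.FromMilliseconds(500);
await Task.WhenAny(tcs.Task, Task.Delay(timeout, token));
if (waveOut.PlaybackState == PlaybackState.Playing)
waveOut.Stop();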
}
}
public async Task StopAsync()
{
_cts?.Cancel();
if (_playTask != null)
{
try { await _playTask; } catch { }
}
_cts?.Dispose();
_cts = null;
_playTask = null;
}
public void Dispose()
{
_cts?.Cancel();
_cts?.Dispose();
_disposed = true;
}
}
}
MainWindow.xaml
xml
<Window x:Class="AudioTestDemo.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
Title="音频功能测试" Height="520" Width="680"
Background="#FF161B31"
WindowStartupLocation="CenterScreen"
ContentRendered="Window_ContentRendered"
KeyDown="Window_KeyDown">
<Window.Resources>
<BooleanToVisibilityConverter x:Key="BoolToVis" />
<Style TargetType="Button" x:Key="OptionBtnStyle">
<Setter Property="Width" Value="160" />
<Setter Property="Height" Value="60" />
<Setter Property="FontSize" Value="22" />
<Setter Property="FontWeight" Value="Bold" />
<Setter Property="Foreground" Value="White" />
<Setter Property="Background" Value="#FF1E90FF" />
<Setter Property="BorderThickness" Value="0" />
<Setter Property="Margin" Value="10,0" />
</Style>
</Window.Resources>
<Grid Margin="30">
<Grid.RowDefinitions>
<RowDefinition Height="Auto" />
<RowDefinition Height="Auto" />
<RowDefinition Height="Auto" />
<RowDefinition Height="*" />
<RowDefinition Height="Auto" />
</Grid.RowDefinitions>
<!-- Channel hint -->
<TextBlock x:Name="ChannelHintText" Grid.Row="0"
Text="准备测试..." FontSize="20" Foreground="#AAFFFFFF"
HorizontalAlignment="Center" Margin="0,10,0,6" />
<!-- Main status text -->
<TextBlock x:Name="StatusText" Grid.Row="1"
Text="" FontSize="16"
Foreground="White" TextWrapping="Wrap"
HorizontalAlignment="Center" TextAlignment="Center"
Margin="0,0,0,20" />
<!-- Playback status -->
<TextBlock x:Name="PlayStatusText" Grid.Row="2"
Text="" FontSize="14" Foreground="#FFD700"
HorizontalAlignment="Center" Margin="0,0,0,10" />
<!-- Options area -->
<StackPanel x:Name="OptionsPanel" Grid.Row="3"
Orientation="Horizontal"
HorizontalAlignment="Center"
VerticalAlignment="Center"
Visibility="Collapsed">
<Button x:Name="Option1Btn" Style="{StaticResource OptionBtnStyle}" Click="OptionBtn_Click" />
<Button x:Name="Option2Btn" Style="{StaticResource OptionBtnStyle}" Click="OptionBtn_Click" />
<Button x:Name="Option3Btn" Style="{StaticResource OptionBtnStyle}" Click="OptionBtn_Click" />
</StackPanel>
<!-- Recording area -->
<StackPanel x:Name="RecordingPanel" Grid.Row="3"
HorizontalAlignment="Center"
VerticalAlignment="Center"
Visibility="Collapsed">
<TextBlock x:Name="RecordingHintText"
Text="请点击按钮或按 Enter 开始录音,然后说“一、二、三”"
FontSize="16" Foreground="White"
TextAlignment="Center" TextWrapping="Wrap"
Margin="0,0,0,16" />
<TextBlock x:Name="RecordingStatusText"
Text="" FontSize="15"
Foreground="White" TextAlignment="Center"
TextWrapping="Wrap" Margin="0,0,0,20" />
<Button x:Name="StartRecordBtn"
Content="开始录音"
Width="160" Height="48" FontSize="16"
Foreground="White" Background="#FF2D8A2D"
BorderThickness="0"
HorizontalAlignment="Center"
Click="StartRecordBtn_Click" />
</StackPanel>
<!-- Bottom: PASS/FAIL result buttons -->
<StackPanel Grid.Row="4" Orientation="Horizontal" HorizontalAlignment="Center" Margin="0,10,0,0">
<Button x:Name="PassBtn" Content="PASS"
Visibility="Collapsed"
Width="120" Height="44" FontSize="16"
Foreground="White" Background="#FF2D8A2D"
BorderThickness="0" Margin="10,0"
Click="PassBtn_Click" />
<Button x:Name="FailBtn" Content="FAIL"
Visibility="Collapsed"
Width="120" Height="44" FontSize="16"
Foreground="White" Background="#FF8A2D2D"
BorderThickness="0" Margin="10,0"
Click="FailBtn_Click" />
</StackPanel>
</Grid>
</Window>
MainWindow.xaml.cs
csharp
using NAudio.CoreAudioApi;
using NAudio.Wave;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices;
using System.Threading.Tasks;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Input;
using System.Windows.Media;
using System.Windows.Threading;
namespace AudioTestDemo
{
public partial class MainWindow : Window
{
// ── Configuration ─────────────────────────────────────
private const int PLAY_LOOP_COUNT = 3;
private const int RECORDING_SECONDS = 3;
// ── Recording Win32 API ───────────────────────────────
[DllImport("winmm.dll", CharSet = CharSet.Unicode)] // Unicode so that non-ASCII temp paths survive marshalling
private static extern uint mciSendString(string cmd, string ret, uint retLen, IntPtr hWnd);
private readonly string _tempWav = Path.Combine(Path.GetTempPath(), "audio_test.wav");
// ── State ─────────────────────────────────────────────
private NumberSpeaker _speaker;
private Random _random = new Random();
private string _correctAnswer;
private List<string> _options;
private bool _isLeftChannel;
private int _channelIndex; // 0 = first channel tested, 1 = second channel tested
private bool _isTesting = true;
// Recording timer
private DispatcherTimer _recordTimer;
private int _recordedSeconds;
public MainWindow()
{
InitializeComponent();
string audioDir = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Audio", "AR");
_speaker = new NumberSpeaker(audioDir);
_recordTimer = new DispatcherTimer { Interval = TimeSpan.FromSeconds(1) };
_recordTimer.Tick += RecordTimer_Tick;
}
private async void Window_ContentRendered(object sender, EventArgs e)
{
SetVolumeToMax();
_isLeftChannel = _random.Next(2) == 0; // pick at random which channel is tested first
_channelIndex = 0;
_isTesting = true;
await RunChannelTest();
}
// ── Channel test ──────────────────────────────────────
private async Task RunChannelTest()
{
if (!_isTesting) return;
_correctAnswer = _random.Next(10, 100).ToString();
_options = GenerateOptions(_correctAnswer);
string name = _isLeftChannel ? "左声道" : "右声道";
ChannelHintText.Text = $"正在测试【{name}】";
StatusText.Text = "请仔细听声音...";
OptionsPanel.Visibility = Visibility.Collapsed;
RecordingPanel.Visibility = Visibility.Collapsed;
ShowOptions();
await _speaker.PlayNumberAsync(_correctAnswer, _isLeftChannel, PLAY_LOOP_COUNT);
}
private void ShowOptions()
{
Option1Btn.Content = _options[0];
Option2Btn.Content = _options[1];
Option3Btn.Content = _options[2];
var blue = new SolidColorBrush(Color.FromRgb(30, 144, 255));
Option1Btn.Background = Option2Btn.Background = Option3Btn.Background = blue;
Option1Btn.IsEnabled = Option2Btn.IsEnabled = Option3Btn.IsEnabled = true;
OptionsPanel.Visibility = Visibility.Visible;
StatusText.Text = "请选择您听到的数字";
}
private async void OptionBtn_Click(object sender, RoutedEventArgs e)
{
var btn = sender as Button;
if (btn == null) return;
Option1Btn.IsEnabled = Option2Btn.IsEnabled = Option3Btn.IsEnabled = false;
string selected = btn.Content.ToString();
string channelName = _isLeftChannel ? "左" : "右";
if (selected == _correctAnswer)
{
btn.Background = Brushes.Green;
StatusText.Text = "✓ 正确!";
await _speaker.StopAsync();
_channelIndex++;
if (_channelIndex >= 2)
{
// Both channels passed, move on to the recording test
await Task.Delay(800);
StartRecordingTest();
}
else
{
_isLeftChannel = !_isLeftChannel;
await Task.Delay(600);
await RunChannelTest();
}
}
else
{
btn.Background = Brushes.Red;
StatusText.Text = $"✗ 错误!正确答案是: {_correctAnswer}";
await _speaker.StopAsync();
await Task.Delay(1500);
ShowResult(false, $"{channelName}声道答案错误:选{selected},正确{_correctAnswer}");
}
}
private List<string> GenerateOptions(string correct)
{
var set = new HashSet<string> { correct };
while (set.Count < 3)
set.Add(_random.Next(10, 100).ToString());
return set.OrderBy(_ => _random.Next()).ToList();
}
// ── Recording test ────────────────────────────────────
private void StartRecordingTest()
{
OptionsPanel.Visibility = Visibility.Collapsed;
RecordingPanel.Visibility = Visibility.Visible;
ChannelHintText.Text = "录音测试";
RecordingHintText.Text = "请按 Enter 或点击按钮开始录音,然后说“一、二、三”";
RecordingStatusText.Text = "";
StartRecordBtn.IsEnabled = true;
}
private void StartRecordBtn_Click(object sender, RoutedEventArgs e)
{
StartRecordBtn.IsEnabled = false;
mciSendString("open new type waveaudio alias mic", null, 0, IntPtr.Zero);
mciSendString("record mic", null, 0, IntPtr.Zero);
_recordedSeconds = 0;
_recordTimer.Start();
RecordingStatusText.Text = "录音中...请说“一、二、三”";
}
private async void RecordTimer_Tick(object sender, EventArgs e)
{
_recordedSeconds++;
if (_recordedSeconds < RECORDING_SECONDS)
{
RecordingStatusText.Text = $"录音中... ({_recordedSeconds}/{RECORDING_SECONDS}秒)";
}
else
{
_recordTimer.Stop();
mciSendString("stop mic", null, 0, IntPtr.Zero);
mciSendString($"save mic \"{_tempWav}\"", null, 0, IntPtr.Zero);
mciSendString("close mic", null, 0, IntPtr.Zero);
RecordingStatusText.Text = "录音完成,正在回放...";
// Play the recording back
await PlayRecording();
// The demo uses manual confirmation (a real production line would call Whisper here)
RecordingStatusText.Text = "请确认是否能听到您说的“一、二、三”";
RecordingStatusText.Foreground = Brushes.White;
PassBtn.Visibility = Visibility.Visible;
FailBtn.Visibility = Visibility.Visible;
StartRecordBtn.IsEnabled = true;
}
}
private async Task PlayRecording()
{
if (!File.Exists(_tempWav)) return;
mciSendString("close all", null, 0, IntPtr.Zero);
mciSendString($"play \"{_tempWav}\"", null, 0, IntPtr.Zero);
await Task.Delay(RECORDING_SECONDS * 1000 + 500);
}
private void Window_KeyDown(object sender, KeyEventArgs e)
{
if (e.Key == Key.Enter && StartRecordBtn.IsEnabled)
StartRecordBtn_Click(StartRecordBtn, null);
}
// ── Result ────────────────────────────────────────────
private void PassBtn_Click(object sender, RoutedEventArgs e)
=> ShowResult(true, "操作员确认:录音正常");
private void FailBtn_Click(object sender, RoutedEventArgs e)
=> ShowResult(false, "操作员确认:录音异常");
private void ShowResult(bool pass, string reason)
{
_isTesting = false;
PassBtn.Visibility = FailBtn.Visibility = Visibility.Collapsed;
OptionsPanel.Visibility = RecordingPanel.Visibility = Visibility.Collapsed;
ChannelHintText.Text = pass ? "PASS" : "FAIL";
ChannelHintText.Foreground = pass ? Brushes.LimeGreen : Brushes.OrangeRed;
ChannelHintText.FontSize = 52;
StatusText.Text = reason;
StatusText.Foreground = pass ? Brushes.LimeGreen : Brushes.OrangeRed;
}
// ── Helpers ───────────────────────────────────────────
private static void SetVolumeToMax()
{
try
{
using (var enumerator = new MMDeviceEnumerator())
{
var playback = enumerator.GetDefaultAudioEndpoint(DataFlow.Render, Role.Multimedia);
if (playback?.AudioEndpointVolume != null)
{
playback.AudioEndpointVolume.MasterVolumeLevelScalar = 1.0f;
playback.AudioEndpointVolume.Mute = false;
}
var capture = enumerator.GetDefaultAudioEndpoint(DataFlow.Capture, Role.Multimedia);
if (capture?.AudioEndpointVolume != null)
{
capture.AudioEndpointVolume.MasterVolumeLevelScalar = 1.0f;
capture.AudioEndpointVolume.Mute = false;
}
}
}
catch { /* a failure to set the volume must not stop the test flow */ }
}
protected override void OnClosed(EventArgs e)
{
base.OnClosed(e);
_isTesting = false;
_speaker?.Dispose();
_recordTimer.Stop();
mciSendString("close all", null, 0, IntPtr.Zero);
if (File.Exists(_tempWav)) { try { File.Delete(_tempWav); } catch { } }
}
}
}
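What you see when it runs:
| Stage | On screen | Operator action |
|---|---|---|
| Channel 1 test | Channel name plus 3 option buttons while the number plays | Listen, then click the number heard |
| Correct choice | Button turns green, next channel starts | Automatic |
| Wrong choice | Button turns red, correct answer is shown | Automatic FAIL |
| Recording test | Recording screen | Press Enter to record, then say "一、二、三" |
| Playback | Recording is played back automatically | Confirm whether your own voice is audible |
| Result | Large PASS/FAIL text at the top | None |
Hooking up Whisper: replace the manual PASS/FAIL buttons that RecordTimer_Tick shows after recording completes with await ProcessRecordingAndAutoJudge(), following the Whisper process communication code in section 4.3. This removes the manual judgement step entirely. A later article will combine machine learning and AI models in a better demo that judges audio quality automatically.
6. Summary and next article
Key points of this article:
- WaveChannel32.Pan selects the channel: -1.0f is left only and 1.0f is right only. It is precise, has no system-wide side effects, and is a straightforward way to run channel tests with NAudio
- CancellationToken interrupts playback: the remaining loops are cancelled as soon as the operator answers, which makes the test feel responsive
- mciSendString is a lightweight recorder: a native Win32 API with no dependencies, suited to simple capture scenarios
- Whisper runs as a resident process: the model is loaded once, later requests go through the stdin/stdout pipe, and no call pays the cold-start delay
Next article: [CPU Testing] CPU throttling detection in WPF: PerformanceCounter + Prime95 stress test
Questions are welcome in the comments. If this was useful, please give it a like; the series will keep being updated 😃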