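I. Introduction to PyAudio
1 Overview
PyAudio is the Python binding for PortAudio and is used for audio capture and playback.
Common uses:
- Recording from a microphone
- Playing audio
- Real-time audio streams (speech recognition, noise reduction, VAD)
- Preprocessing for AI speech tasks (ASR / acoustic features)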
2 Installation
pip install pyaudio
python
import pyaudio
p = pyaudio.PyAudio()
- PyAudio(): the entry point to the audio system
- Stream: the audio stream (both recording and playback go through it)
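Since both objects hold native PortAudio resources, it helps to release them even when something fails halfway. Below is a minimal sketch of one way to do that. `managed_stream` is a hypothetical helper, not part of PyAudio; it only assumes that `p` behaves like a `pyaudio.PyAudio` instance (`open()`, `terminate()`) and that the stream has `stop_stream()` and `close()`.

```python
from contextlib import contextmanager


@contextmanager
def managed_stream(p, **open_kwargs):
    """Open a stream on `p` and always release it, even on errors.

    `p` is expected to be a pyaudio.PyAudio instance; this helper is
    illustrative and not part of the PyAudio API.
    """
    try:
        stream = p.open(**open_kwargs)
        try:
            yield stream
        finally:
            # Stop first, then close the stream.
            stream.stop_stream()
            stream.close()
    finally:
        # Release PortAudio itself last.
        p.terminate()


# Possible usage (requires pyaudio and a microphone):
# with managed_stream(pyaudio.PyAudio(), format=pyaudio.paInt16,
#                     channels=1, rate=16000, input=True) as stream:
#     data = stream.read(1024)
```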
II. Examples
Example 1: Playing a WAV file
python
import pyaudio
import wave
wf = wave.open("output.wav", 'rb')
p = pyaudio.PyAudio()
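# Open an output stream that matches the file's own format
stream = p.open(
    format=p.get_format_from_width(wf.getsampwidth()),
    channels=wf.getnchannels(),
    rate=wf.getframerate(),
    output=True
)
# Feed the sound card 1024 frames at a time until the file is exhausted
data = wf.readframes(1024)
while data:
    stream.write(data)
    data = wf.readframes(1024)
# Release the stream, PortAudio, and the file
stream.stop_stream()
stream.close()
p.terminate()
wf.close()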
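Key points:
- output=True opens the stream for playback
- write() sends data to the sound card
- A WAV file carries its own sample rate and channel count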
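If there is no output.wav at hand yet, a test file can be generated with the standard library alone. The sketch below writes a short sine tone and reads the header back, which is the same information the playback example passes to p.open(). The helper names `write_test_tone` and `describe_wav`, and the tone parameters, are my own choices for illustration.

```python
import math
import struct
import wave


def write_test_tone(path, freq=440.0, seconds=1.0, rate=16000):
    """Write a mono 16-bit sine tone to `path` (illustrative helper)."""
    n_frames = int(rate * seconds)
    samples = (
        int(0.3 * 32767 * math.sin(2 * math.pi * freq * i / rate))
        for i in range(n_frames)
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wf.setframerate(rate)
        # "<h" packs each sample as a little-endian signed 16-bit integer
        wf.writeframes(b"".join(struct.pack("<h", s) for s in samples))


def describe_wav(path):
    """Return (channels, sample_width_bytes, rate, n_frames) from the header."""
    with wave.open(path, "rb") as wf:
        return (wf.getnchannels(), wf.getsampwidth(),
                wf.getframerate(), wf.getnframes())
```

Calling `write_test_tone("output.wav")` once should produce a file the playback example can open.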
Example 2: Recording from the microphone
What it does: records 5 seconds of audio and saves it as a WAV file
python
import pyaudio
import wave
FORMAT = pyaudio.paInt16   # 16-bit samples
CHANNELS = 1
RATE = 16000
CHUNK = 1024               # frames per read
RECORD_SECONDS = 5         # record for 5 seconds
OUTPUT_FILE = "record.wav"
p = pyaudio.PyAudio()
stream = p.open(
    format=FORMAT,
    channels=CHANNELS,
    rate=RATE,
    input=True,
    frames_per_buffer=CHUNK
)
frames = []
# Read chunk by chunk until about RECORD_SECONDS of audio is collected
for _ in range(int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)
stream.stop_stream()
stream.close()
p.terminate()
# Write the raw PCM data out with a matching WAV header
wf = wave.open(OUTPUT_FILE, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(pyaudio.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
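One detail of the loop above: int(RATE / CHUNK * RECORD_SECONDS) rounds down to whole chunks, so with 16000 Hz and 1024-frame chunks the recording is 78 reads, a little under 5 seconds. The sketch below isolates that arithmetic. `record_frames` and `recorded_duration` are hypothetical helpers, and `stream` only needs a read(num_frames) method like a PyAudio input stream has.

```python
def record_frames(stream, rate, chunk, seconds):
    """Read roughly `seconds` of audio from `stream` in `chunk`-frame reads.

    `stream` only needs a read(num_frames) method. Illustrative helper,
    not part of PyAudio.
    """
    n_reads = int(rate / chunk * seconds)  # a partial final chunk is dropped
    return [stream.read(chunk) for _ in range(n_reads)]


def recorded_duration(rate, chunk, seconds):
    """Seconds actually captured after rounding down to whole chunks."""
    return int(rate / chunk * seconds) * chunk / rate
```

If the exact length matters, one option is to pick a chunk size that divides RATE * RECORD_SECONDS evenly, for example 1000 frames here.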
Example 3: Real-time recording + real-time playback (echo)
Often used to check whether the microphone is working
python
import pyaudio
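p = pyaudio.PyAudio()
stream = p.open(
    format=pyaudio.paInt16,
    channels=1,
    rate=16000,
    input=True,
    output=True,
    frames_per_buffer=1024
)
try:
    # Loop until Ctrl+C: play back each chunk as soon as it is captured
    while True:
        data = stream.read(1024)
        stream.write(data)
except KeyboardInterrupt:
    pass
finally:
    stream.stop_stream()
    stream.close()
    p.terminate()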
Example 4: Listing audio devices (very important)
python
import pyaudio
p = pyaudio.PyAudio()
for i in range(p.get_device_count()):
    info = p.get_device_info_by_index(i)
    # index, device name, number of input channels (0 = output-only device)
    print(i, info["name"], info["maxInputChannels"])
The output looks like this:
text
0 Microsoft 音效對應表 - Input 2
1 麥克風 (BT-BT) 1
2 立體聲混音 (Realtek High Definition 2
3 Microsoft 音效對應表 - Output 0
4 喇叭 (BT-BT) 0
5 主要音效擷取驅動程式 2
6 麥克風 (BT-BT) 1
7 立體聲混音 (Realtek High Definition Audio) 2
8 主要音效驅動程式 0
9 喇叭 (BT-BT) 0
10 喇叭 (BT-BT) 0
11 立體聲混音 (Realtek High Definition Audio) 2
12 麥克風 (BT-BT) 1
13 Speakers (Realtek HD Audio output) 0
14 線路輸入 (Realtek HD Audio Line input) 2
15 立體聲混音 (Realtek HD Audio Stereo input) 2
16 麥克風 (Realtek HD Audio Mic input) 2
17 喇叭 (BT-BT) 0
18 麥克風 (BT-BT) 1
Recording from a specific device:
python
stream = p.open(
    format=pyaudio.paInt16,
    channels=1,
    rate=16000,
    input=True,
    input_device_index=2
)
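Device indices can differ between machines and may change when devices are plugged in or removed, so a hard-coded index is fragile. A rough sketch of looking the index up by name instead is shown here. `find_input_device` is a hypothetical helper; it uses only `get_device_count()` and `get_device_info_by_index()` from the listing example and assumes `p` is a `pyaudio.PyAudio` instance.

```python
def find_input_device(p, name_part):
    """Return the index of the first input-capable device whose name
    contains `name_part`, or None if there is no match.

    Illustrative helper, not part of PyAudio.
    """
    for i in range(p.get_device_count()):
        info = p.get_device_info_by_index(i)
        # maxInputChannels == 0 means the device cannot record
        if info["maxInputChannels"] > 0 and name_part in info["name"]:
            return i
    return None
```

The result can then be passed as input_device_index, after checking that it is not None.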
Example 5: Converting PCM to numpy
python
import numpy as np
audio_bytes = stream.read(1024)
audio_np = np.frombuffer(audio_bytes, dtype=np.int16)
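Many downstream tools expect floating-point samples rather than raw int16, so a common next step is to scale into the range [-1.0, 1.0). The following is a small sketch under that assumption; `pcm16_to_float32` is a name I made up, and the exact input format a given model wants should be checked against its own documentation.

```python
import numpy as np


def pcm16_to_float32(audio_bytes, channels=1):
    """Convert interleaved 16-bit PCM bytes to float32 in [-1.0, 1.0).

    Returns an array of shape (n_frames, channels). Illustrative helper.
    """
    samples = np.frombuffer(audio_bytes, dtype=np.int16)
    # One row per frame, one column per channel, then scale by 2**15
    return samples.reshape(-1, channels).astype(np.float32) / 32768.0
```

For the mono stream used in these examples, `pcm16_to_float32(audio_bytes)[:, 0]` gives a flat 1-D signal.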
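III. Quick reference for common parameters
- format: sample format (paInt16 is the most common)
- channels: 1 = mono, 2 = stereo
- rate: sample rate (16000 / 44100)
- frames_per_buffer: buffer size, in frames
- input: whether to record
- output: whether to play back
- stream_callback: enables callback mode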
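The examples so far all use blocking read() and write(). For comparison, here is a rough sketch of the echo test in callback mode, where PortAudio calls a function for every buffer instead. `make_echo_callback` and `run_echo` are hypothetical names; the callback signature and the (data, flag) return value follow PyAudio's callback interface, and `run_echo` needs real audio hardware to do anything.

```python
import time


def make_echo_callback(continue_flag):
    """Build a callback that plays back whatever was just captured.

    PyAudio calls it as callback(in_data, frame_count, time_info, status)
    and expects (out_data, flag) back. Illustrative helper.
    """
    def callback(in_data, frame_count, time_info, status):
        return (in_data, continue_flag)
    return callback


def run_echo(seconds=5, rate=16000, chunk=1024):
    """Echo the microphone to the speakers for `seconds` (needs hardware)."""
    import pyaudio  # imported here so the callback stays usable without it

    p = pyaudio.PyAudio()
    stream = p.open(
        format=pyaudio.paInt16, channels=1, rate=rate,
        input=True, output=True, frames_per_buffer=chunk,
        stream_callback=make_echo_callback(pyaudio.paContinue),
    )
    try:
        time.sleep(seconds)  # audio is handled on a separate thread meanwhile
    finally:
        stream.stop_stream()
        stream.close()
        p.terminate()
```

The callback should return quickly; heavy processing is usually better pushed to a queue and handled elsewhere.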
Typical combinations
- Speech recognition: PyAudio + numpy + whisper
- Real-time VAD: PyAudio + vad
- Wake-word detection
- Audio visualization: PyAudio + matplotlib
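The list above does not name a specific VAD library. As a stand-in, the sketch below uses a plain energy threshold, which is much cruder than a real VAD model but shows where such a check would sit on each chunk returned by stream.read(). `rms_int16` and `is_speech` are hypothetical helpers, and the threshold of 500 is an assumed starting point that needs tuning per microphone.

```python
import array
import math


def rms_int16(audio_bytes):
    """Root-mean-square amplitude of native-endian 16-bit PCM bytes."""
    samples = array.array("h", audio_bytes)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(audio_bytes, threshold=500.0):
    """Crude energy gate: True when the chunk is louder than `threshold`.

    This is a simple loudness check, not a trained VAD; the default
    threshold is an assumption and depends on the microphone and gain.
    """
    return rms_int16(audio_bytes) > threshold
```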
Written on January 28, 2026