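Preface
Audio format conversion is one of the most frequently used core features of an audio processing tool. This article explains how to build a modern front-end UI with Qt 5.15 and QML, and how to combine it with FFmpeg's audio/video libraries (libavcodec, libavformat and libswresample) to encode and export audio in mainstream formats such as MP3, AAC, FLAC, OGG and WAV.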

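1. Overall architecture
ConvertPage.qml (front-end UI: parameter selection + action trigger)
↓ calls
MediaAnalyzer::convertCurrentAudio() (C++ business layer, Q_INVOKABLE)
↓ delegates to
AudioEncoder::encodePcmToFile() (core encoder, works directly with the FFmpeg API)
The input is raw PCM data that has already been decoded (prepared in advance by the global decoding pipeline), so the user only has to choose the target format parameters and can then convert with a single click.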
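2. Front-end UI design
2.1 Parameter layout
The page is split into two Rectangle cards: encoding parameters and output path.
The first row of the parameter card holds format / encoder / bitrate; the second row holds sample rate / channel count.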
```qml
// Supported output formats
property var formats: ["mp3", "aac", "flac", "ogg", "wav"]

// Format → encoder mapping
function codecForFormat(format) {
    if (format === "mp3") return "libmp3lame"
    if (format === "aac") return "aac"
    if (format === "flac") return "flac"
    if (format === "ogg") return "libvorbis"
    if (format === "wav") return "pcm_s16le"
    return "auto"
}
```
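If the back end ever needs to resolve an encoder name itself, the same table can be mirrored in C++. The following is only a minimal sketch, not code from the project: the free function `codecForFormat` and its use of `std::string` are assumptions for illustration, and only the format/encoder pairs come from the QML above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical C++ mirror of the QML codecForFormat() helper.
// Unknown formats map to "auto", which the encoder in this article treats as
// "use the container's default audio encoder".
std::string codecForFormat(const std::string &format)
{
    static const std::unordered_map<std::string, std::string> kCodecs = {
        {"mp3", "libmp3lame"},
        {"aac", "aac"},
        {"flac", "flac"},
        {"ogg", "libvorbis"},
        {"wav", "pcm_s16le"},
    };
    const auto it = kCodecs.find(format);
    return it != kCodecs.end() ? it->second : std::string("auto");
}
```

A quick check such as `assert(codecForFormat("ogg") == "libvorbis");` confirms the lookup; the lookup is case-sensitive, so the caller is expected to lower-case the format first.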
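2.2 Control selection
| Parameter | Control | Notes |
|---|---|---|
| Format | ToolComboBox | Drop-down; switching it automatically updates the encoder and the default output path |
| Encoder | TextField | Lets the user override the encoder name manually |
| Bitrate | ToolSpinBox | 32 to 512 kbps, step 16 |
| Sample rate | ToolComboBox | 8000 to 96000 Hz, defaults to the rate of the current PCM |
| Channels | ToolComboBox | 1/2/4/6 channels |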
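
The ranges in the table can also be enforced in code before the values reach the encoder. Below is a small sketch under stated assumptions: the helpers `snapBitrateKbps` and `isSupportedChannelCount` are hypothetical and not part of the original project; only the limits (32 to 512 kbps, step 16) and the channel counts 1/2/4/6 are taken from the table.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: clamp a bitrate to 32..512 kbps and snap it to the
// nearest multiple of 16 counted from 32 (halves round up).
int snapBitrateKbps(int requested)
{
    const int kMin = 32;
    const int kMax = 512;
    const int kStep = 16;
    const int clamped = std::max(kMin, std::min(requested, kMax));
    const int steps = (clamped - kMin + kStep / 2) / kStep;
    return kMin + steps * kStep;
}

// Hypothetical helper: the channel counts offered by the UI.
bool isSupportedChannelCount(int channels)
{
    return channels == 1 || channels == 2 || channels == 4 || channels == 6;
}
```

With these definitions, a request of 100 kbps snaps to 96 and anything above 512 is capped at 512.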
Switching the format also updates the encoder field:
```qml
ToolComboBox {
    id: formatBox
    model: page.formats
    onCurrentTextChanged: {
        codecField.text = page.codecForFormat(currentText)
        page.refreshDefaultPath()
    }
}
```
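The article does not show `refreshDefaultPath()`. One plausible behaviour is to swap the file extension of the current output path so that it matches the selected format. The sketch below illustrates that idea in plain C++; the function name `pathForFormat` and the exact rules (a leading dot in a file name is not treated as an extension) are assumptions, not the project's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: replace (or append) the extension of currentPath so
// that it matches the given format, e.g. "song.wav" + "mp3" -> "song.mp3".
std::string pathForFormat(const std::string &currentPath, const std::string &format)
{
    assert(!format.empty());
    const std::size_t slash = currentPath.find_last_of("/\\");
    const std::size_t dot = currentPath.find_last_of('.');
    const std::size_t nameStart = (slash == std::string::npos) ? 0 : slash + 1;
    // A dot only counts as an extension separator when it sits inside the
    // file name and is not its first character.
    const bool hasExtension = dot != std::string::npos && dot > nameStart;
    const std::string stem = hasExtension ? currentPath.substr(0, dot) : currentPath;
    return stem + "." + format;
}
```

For instance, `pathForFormat("/tmp/song.wav", "mp3")` yields `/tmp/song.mp3`, while a dot inside a directory name is left alone.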
2.3 Triggering the conversion
The "Start conversion" button is enabled only while mediaAnalyzer is not busy and mediaAnalyzer.pcmSize > 0. When clicked, it passes all parameters to the C++ layer:
```qml
ActionButton {
    text: "Start conversion"
    accent: true
    enabled: !mediaAnalyzer.busy && mediaAnalyzer.pcmSize > 0
    onClicked: mediaAnalyzer.convertCurrentAudio(
        outputPathField.text.trim(),
        formatBox.currentText,
        codecField.text.trim(),
        bitrateBox.value,
        parseInt(sampleRateBox.currentText, 10),
        parseInt(channelsBox.currentText, 10))
}
```
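The encoder shown later treats a sample rate or channel count of 0 or less as "keep the value of the input PCM", so non-numeric combo-box text can safely map to 0. Here is a sketch of such parsing on the C++ side; `parsePositiveIntOrZero` and its upper bound of 1,000,000 are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse text such as "44100" into a positive integer.
// Returns 0 when the text is empty, not fully numeric, non-positive or
// implausibly large, so the caller can fall back to the input PCM's value.
int parsePositiveIntOrZero(const std::string &text)
{
    char *end = nullptr;
    const long value = std::strtol(text.c_str(), &end, 10);
    const bool parsedAll = end != text.c_str() && *end == '\0';
    if (!parsedAll || value <= 0 || value > 1000000)
        return 0;
    return static_cast<int>(value);
}
```

Unlike JavaScript's `parseInt`, this version rejects trailing text such as "48000 Hz" instead of accepting the numeric prefix; whether that strictness is desirable depends on what the combo box actually contains.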
3. Back-end implementation
3.1 Business-layer entry point
MediaAnalyzer::convertCurrentAudio() forwards the parameters and manages state:
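```cpp
bool MediaAnalyzer::convertCurrentAudio(const QString &filePath,
                                        const QString &format,
                                        const QString &codecName,
                                        int bitrateKbps,
                                        int sampleRate,
                                        int channels)
{
    if (m_pcmData.isEmpty()) {
        setStatus(QStringLiteral("Decode PCM first"));
        return false;
    }

    QVariantMap result;  // filled by the encoder on success
    QString errorText;   // filled by the encoder on failure

    setBusy(true);
    // currentPcmInfo describes the decoded PCM (sample rate, channels, format);
    // its declaration is not shown in this excerpt.
    const bool ok = m_audioEncoder.encodePcmToFile(
        filePath, m_pcmData, currentPcmInfo,
        format, codecName, bitrateKbps, sampleRate, channels,
        &result, &errorText);

    // Update the status information and UI feedback
    if (ok)
        setConvertInfo(result);
    else
        setStatus(errorText);
    setBusy(false);
    return ok;
}
```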
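As the entry point grows more return paths, pairing `setBusy(true)` with `setBusy(false)` by hand becomes easy to get wrong. One common alternative is a scope guard. The sketch below is a design suggestion rather than the project's code: `BusyGuard` and `convertWithGuard` are hypothetical names, and a plain `bool` stands in for the real busy property.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical scope guard: sets the busy flag on construction and clears it
// on destruction, so every return path resets the flag.
class BusyGuard
{
public:
    explicit BusyGuard(std::function<void(bool)> setBusy)
        : m_setBusy(std::move(setBusy))
    {
        assert(static_cast<bool>(m_setBusy));
        m_setBusy(true);
    }
    ~BusyGuard() { m_setBusy(false); }
    BusyGuard(const BusyGuard &) = delete;
    BusyGuard &operator=(const BusyGuard &) = delete;

private:
    std::function<void(bool)> m_setBusy;
};

// Stand-in for the conversion entry point: it returns early on failure, yet
// the guard still clears the busy flag.
bool convertWithGuard(bool &busyFlag, bool encodeSucceeds)
{
    BusyGuard guard([&busyFlag](bool busy) { busyFlag = busy; });
    if (!encodeSucceeds)
        return false;  // guard resets busyFlag here
    return true;       // and here
}
```

In the real class the lambda would simply forward to `setBusy()`.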
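3.2 Core encoder logic
AudioEncoder::encodePcmToFile() is where the actual interaction with FFmpeg happens. The flow has three stages: initialization → encoding loop → finalization. The code relies on the AVChannelLayout API and swr_alloc_set_opts2(), both available from FFmpeg 5.1 onward.
① Initialization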
```cpp
// 1. Create the output container
avformat_alloc_output_context2(&formatContext, nullptr, format, outputPath);

// 2. Find the encoder (by name first, otherwise the container's default audio encoder)
const AVCodec *codec = avcodec_find_encoder_by_name(codecName);
if (!codec)
    codec = avcodec_find_encoder(formatContext->oformat->audio_codec);

// 3. Allocate and configure the encoder context
codecContext = avcodec_alloc_context3(codec);
codecContext->sample_rate = closestSupportedSampleRate(codec, targetSampleRate);
codecContext->sample_fmt = chooseSampleFormat(codec);
codecContext->bit_rate = targetBitrate;

// 4. Set up the SwrContext resampler
//    Input:  sample rate / format / channels of the raw PCM
//    Output: sample rate / format / channels required by the encoder
swr_alloc_set_opts2(&swrContext,
                    &outputLayout, codecContext->sample_fmt, codecContext->sample_rate,
                    &inputLayout, inputSpec.format, inputSampleRate,
                    0, nullptr);
swr_init(swrContext);
```
Here chooseSampleFormat() picks the best sample format from the encoder's list of supported formats by priority (FLTP first, then S16P, then S16, ...).
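
The priority-based selection can be sketched without FFmpeg as follows. This is an illustration under assumptions, not the author's implementation: the `SampleFormat` enum stands in for `AVSampleFormat`, the function takes a `std::vector` rather than an `AVCodec`, and the fallback to the first supported format is a guess at what happens after the priorities listed above are exhausted.

```cpp
#include <cassert>
#include <vector>

// Stand-in for FFmpeg's AVSampleFormat so the sketch compiles on its own.
enum class SampleFormat { None, U8, S16, S32, Flt, S16P, S32P, FltP };

// Return the first preferred format that the encoder supports; if none of
// the preferred formats is available, fall back to the encoder's first
// supported format (assumption).
SampleFormat chooseSampleFormat(const std::vector<SampleFormat> &supported)
{
    if (supported.empty())
        return SampleFormat::None;
    static const SampleFormat kPreferred[] = {
        SampleFormat::FltP, SampleFormat::S16P, SampleFormat::S16};
    for (SampleFormat wanted : kPreferred) {
        for (SampleFormat candidate : supported) {
            if (candidate == wanted)
                return wanted;
        }
    }
    return supported.front();
}
```

Given an encoder that lists S32P, FLTP and S16P, this picks FLTP; an encoder that only lists S32 gets S32.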
② Encoding loop
The PCM is read chunk by chunk, resampled with swr_convert and then sent to the encoder:
```cpp
while (inputOffsetFrames < totalInputFrames) {
    const int remaining = static_cast<int>(
        std::min<qint64>(totalInputFrames - inputOffsetFrames, frameSamples));

    // Resample: interleaved input PCM → the encoder's target format.
    // inputData points at the PCM bytes starting at inputOffsetFrames.
    const int converted = swr_convert(swrContext,
                                      frame->data, frameSamples,
                                      inputData, remaining);
    frame->nb_samples = converted;
    frame->pts = encodedFrames;
    encodedFrames += converted;
    inputOffsetFrames += remaining;

    // Encode and write to the container
    avcodec_send_frame(codecContext, frame);
    while (avcodec_receive_packet(codecContext, packet) == 0) {
        av_packet_rescale_ts(packet, codecContext->time_base, stream->time_base);
        packet->stream_index = stream->index;
        av_interleaved_write_frame(formatContext, packet);
    }
}
```
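To make the bookkeeping in this loop concrete, the sketch below reproduces only the arithmetic: how the input is cut into encoder-sized chunks, which pts each chunk receives, and what rescaling a timestamp between two time bases amounts to. It assumes a 1:1 sample-rate ratio (so every chunk yields the same number of output samples); `FramePlan`, `planFrames` and `rescaleTimestamp` are hypothetical names, and the rescaling is a simplified round-to-nearest for non-negative values, not FFmpeg's `av_rescale_q`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct FramePlan {
    std::int64_t pts;  // timestamp in samples (time base 1/sample_rate)
    int samples;       // number of samples in this chunk
};

// Cut totalInputFrames into chunks of at most frameSamples samples.
std::vector<FramePlan> planFrames(std::int64_t totalInputFrames, int frameSamples)
{
    assert(frameSamples > 0);
    std::vector<FramePlan> plan;
    std::int64_t offset = 0;
    while (offset < totalInputFrames) {
        const int n = static_cast<int>(
            std::min<std::int64_t>(totalInputFrames - offset, frameSamples));
        plan.push_back(FramePlan{offset, n});
        offset += n;
    }
    return plan;
}

// Convert ts from time base srcNum/srcDen to dstNum/dstDen, rounding to the
// nearest integer. Overflow is not handled; this is for illustration only.
std::int64_t rescaleTimestamp(std::int64_t ts, int srcNum, int srcDen,
                              int dstNum, int dstDen)
{
    const std::int64_t num = ts * srcNum * dstDen;
    const std::int64_t den = static_cast<std::int64_t>(srcDen) * dstNum;
    assert(ts >= 0 && den > 0);
    return (num + den / 2) / den;
}
```

For 2500 input samples and a frame size of 1024 the plan contains three chunks (1024, 1024, 452) with pts 0, 1024 and 2048; 44100 ticks at 1/44100 rescale to 1000 ticks at 1/1000.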
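After the loop, the buffers of both the resampler and the encoder still have to be drained:
```cpp
// Drain the samples still buffered in the SwrContext
int converted = 0;
while ((converted = swr_convert(swrContext, frame->data, frameSamples, nullptr, 0)) > 0) {
    frame->nb_samples = converted;
    frame->pts = encodedFrames;
    encodedFrames += converted;
    encodeFrame(frame);
}
// Drain the packets still buffered in the encoder
encodeFrame(nullptr);
```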
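One caveat: when input and output sample rates differ, the resampler rarely returns exactly `frameSamples` samples per call, while encoders with a fixed `frame_size` expect every frame except the last to be full. A common remedy is to place a FIFO between resampler and encoder; FFmpeg offers `AVAudioFifo` for that purpose. The sketch below shows the idea with a standard-library stand-in (mono float samples only); the class `SampleFifo` is hypothetical and is not part of the project or of FFmpeg.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Minimal stand-in for an audio FIFO (mono, float samples).
class SampleFifo
{
public:
    explicit SampleFifo(std::size_t frameSize) : m_frameSize(frameSize)
    {
        assert(frameSize > 0);
    }

    // Append whatever the resampler produced, of any length.
    void push(const std::vector<float> &samples)
    {
        m_buffer.insert(m_buffer.end(), samples.begin(), samples.end());
    }

    // Pop exactly one full frame; returns false if too little is buffered.
    bool popFrame(std::vector<float> &frame)
    {
        if (m_buffer.size() < m_frameSize)
            return false;
        const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(m_frameSize);
        frame.assign(m_buffer.begin(), m_buffer.begin() + n);
        m_buffer.erase(m_buffer.begin(), m_buffer.begin() + n);
        return true;
    }

    // End of stream: hand out the final, possibly shorter frame.
    std::vector<float> drain()
    {
        std::vector<float> tail(m_buffer.begin(), m_buffer.end());
        m_buffer.clear();
        return tail;
    }

    std::size_t size() const { return m_buffer.size(); }

private:
    std::size_t m_frameSize;
    std::deque<float> m_buffer;
};
```

In the encoding loop, resampled output would be pushed into the FIFO, full frames popped and encoded while available, and the tail drained once after the resampler itself has been flushed.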
③ Finalization
```cpp
av_write_trailer(formatContext);
// Release all FFmpeg resources
releaseEncoderResources(&formatContext, &codecContext, &swrContext, ...);
```
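The full listing calls `releaseEncoderResources()` on every error path. An alternative worth considering is to wrap each resource in a `std::unique_ptr` with a custom deleter so that cleanup happens automatically. The sketch below demonstrates the pattern with a stand-in type, since it has to compile without FFmpeg: `FakeFrame`, `fakeFrameAlloc` and `fakeFrameFree` are hypothetical placeholders for something like `AVFrame`, `av_frame_alloc()` and a wrapper around `av_frame_free()`.

```cpp
#include <cassert>
#include <memory>

// Stand-in resource with a live-object counter for demonstration.
struct FakeFrame { int id; };
static int g_liveFrames = 0;

FakeFrame *fakeFrameAlloc()
{
    ++g_liveFrames;
    return new FakeFrame{g_liveFrames};
}

void fakeFrameFree(FakeFrame *frame)
{
    --g_liveFrames;
    delete frame;
}

// Custom deleter so unique_ptr calls the matching free function.
struct FakeFrameDeleter {
    void operator()(FakeFrame *frame) const { fakeFrameFree(frame); }
};
using FramePtr = std::unique_ptr<FakeFrame, FakeFrameDeleter>;

// Every return path releases the frame without an explicit cleanup call.
bool encodeSomething(bool failEarly)
{
    FramePtr frame(fakeFrameAlloc());
    if (!frame || failEarly)
        return false;  // frame is freed here
    return true;       // and here
}

int liveFrameCount() { return g_liveFrames; }
```

After any call to `encodeSomething()`, `liveFrameCount()` is back to 0 regardless of which branch returned.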
Full code of encodePcmToFile
```cpp
bool AudioEncoder::encodePcmToFile(const QString &filePath,
                                   const QByteArray &pcm,
                                   const QVariantMap &pcmInfo,
                                   const QString &format,
                                   const QString &codecName,
                                   int bitrateKbps,
                                   int sampleRate,
                                   int channels,
                                   QVariantMap *encodeInfo,
                                   QString *errorText) const
{
    // --- Validate the input PCM and the output path ---
    if (pcm.isEmpty()) {
        if (errorText)
            *errorText = QStringLiteral("No PCM data to convert.");
        return false;
    }
    const QString outputPath = normalizedOutputPath(filePath, format);
    if (outputPath.isEmpty()) {
        if (errorText)
            *errorText = QStringLiteral("Invalid output path.");
        return false;
    }
    const QFileInfo outputInfo(outputPath);
    if (!outputInfo.absoluteDir().exists()) {
        if (errorText)
            *errorText = QStringLiteral("The output directory does not exist.");
        return false;
    }
    const int inputSampleRate = pcmInfo.value(QStringLiteral("sampleRate")).toInt();
    const int inputChannels = pcmInfo.value(QStringLiteral("channels")).toInt();
    const SampleSpec inputSpec = sampleSpecFromInfo(pcmInfo);
    if (inputSampleRate <= 0 || inputChannels <= 0 || inputSpec.format == AV_SAMPLE_FMT_NONE) {
        if (errorText)
            *errorText = QStringLiteral("The current PCM format is invalid.");
        return false;
    }
    const int inputFrameSize = inputChannels * inputSpec.bytesPerSample;
    if (inputFrameSize <= 0 || pcm.size() % inputFrameSize != 0) {
        if (errorText)
            *errorText = QStringLiteral("PCM byte count does not match the format.");
        return false;
    }

    // --- Resolve target parameters (values <= 0 fall back to the input) ---
    const int targetSampleRate = sampleRate > 0 ? sampleRate : inputSampleRate;
    const int targetChannels = channels > 0 ? channels : inputChannels;
    const int targetBitrate = qBound(32, bitrateKbps, 1024) * 1000;
    const QString requestedFormat = format.trimmed().toLower();
    const QString requestedCodec = codecName.trimmed();

    AVFormatContext *formatContext = nullptr;
    AVCodecContext *codecContext = nullptr;
    SwrContext *swrContext = nullptr;
    AVFrame *frame = nullptr;
    AVPacket *packet = nullptr;
    AVChannelLayout inputLayout;
    AVChannelLayout outputLayout;
    memset(&inputLayout, 0, sizeof(inputLayout));
    memset(&outputLayout, 0, sizeof(outputLayout));

    // --- Create the output container ---
    int ret = avformat_alloc_output_context2(&formatContext,
                                             nullptr,
                                             requestedFormat.isEmpty() ? nullptr : requestedFormat.toUtf8().constData(),
                                             outputPath.toUtf8().constData());
    if (ret < 0 || !formatContext) {
        if (errorText)
            *errorText = ret < 0 ? FFmpegUtils::errorString(ret) : QStringLiteral("Failed to create the output container.");
        return false;
    }

    // --- Find the encoder: by name first, then the container default ---
    const AVCodec *codec = nullptr;
    if (!requestedCodec.isEmpty() && requestedCodec != QStringLiteral("auto"))
        codec = avcodec_find_encoder_by_name(requestedCodec.toUtf8().constData());
    if (!codec)
        codec = avcodec_find_encoder(formatContext->oformat->audio_codec);
    if (!codec) {
        if (errorText)
            *errorText = QStringLiteral("No audio encoder is available for this format.");
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }
    codecContext = avcodec_alloc_context3(codec);
    if (!codecContext) {
        if (errorText)
            *errorText = QStringLiteral("Failed to create the encoder context.");
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }

    // --- Configure and open the encoder ---
    av_channel_layout_default(&inputLayout, inputChannels);
    av_channel_layout_default(&outputLayout, targetChannels);
    codecContext->codec_id = codec->id;
    codecContext->codec_type = AVMEDIA_TYPE_AUDIO;
    codecContext->sample_rate = sampleRateSupported(codec, targetSampleRate)
                                    ? targetSampleRate
                                    : closestSupportedSampleRate(codec, targetSampleRate);
    codecContext->sample_fmt = chooseSampleFormat(codec);
    codecContext->bit_rate = targetBitrate;
    codecContext->time_base = AVRational{1, codecContext->sample_rate};
    ret = av_channel_layout_copy(&codecContext->ch_layout, &outputLayout);
    if (ret < 0) {
        if (errorText)
            *errorText = FFmpegUtils::errorString(ret);
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }
    if (formatContext->oformat->flags & AVFMT_GLOBALHEADER)
        codecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
    ret = avcodec_open2(codecContext, codec, nullptr);
    if (ret < 0) {
        if (errorText)
            *errorText = FFmpegUtils::errorString(ret);
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }

    // --- Create the output stream ---
    AVStream *stream = avformat_new_stream(formatContext, nullptr);
    if (!stream) {
        if (errorText)
            *errorText = QStringLiteral("Failed to create the output audio stream.");
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }
    stream->time_base = AVRational{1, codecContext->sample_rate};
    ret = avcodec_parameters_from_context(stream->codecpar, codecContext);
    if (ret < 0) {
        if (errorText)
            *errorText = FFmpegUtils::errorString(ret);
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }

    // --- Set up the resampler: input PCM spec -> encoder spec ---
    ret = swr_alloc_set_opts2(&swrContext,
                              &codecContext->ch_layout,
                              codecContext->sample_fmt,
                              codecContext->sample_rate,
                              &inputLayout,
                              inputSpec.format,
                              inputSampleRate,
                              0,
                              nullptr);
    if (ret < 0 || !swrContext) {
        if (errorText)
            *errorText = ret < 0 ? FFmpegUtils::errorString(ret) : QStringLiteral("Failed to create the resampler context.");
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }
    ret = swr_init(swrContext);
    if (ret < 0) {
        if (errorText)
            *errorText = FFmpegUtils::errorString(ret);
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }

    // --- Open the output file and write the header ---
    if ((formatContext->oformat->flags & AVFMT_NOFILE) == 0) {
        ret = avio_open(&formatContext->pb, outputPath.toUtf8().constData(), AVIO_FLAG_WRITE);
        if (ret < 0) {
            if (errorText)
                *errorText = FFmpegUtils::errorString(ret);
            releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
            return false;
        }
    }
    ret = avformat_write_header(formatContext, nullptr);
    if (ret < 0) {
        if (errorText)
            *errorText = FFmpegUtils::errorString(ret);
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }

    // --- Allocate the reusable frame and packet ---
    // frame_size is 0 for encoders that accept any frame length; use 1024 then.
    const int frameSamples = codecContext->frame_size > 0 ? codecContext->frame_size : 1024;
    frame = av_frame_alloc();
    packet = av_packet_alloc();
    if (!frame || !packet) {
        if (errorText)
            *errorText = QStringLiteral("Failed to allocate encoding buffers.");
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }
    frame->nb_samples = frameSamples;
    frame->format = codecContext->sample_fmt;
    ret = av_channel_layout_copy(&frame->ch_layout, &codecContext->ch_layout);
    if (ret < 0) {
        if (errorText)
            *errorText = FFmpegUtils::errorString(ret);
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }
    ret = av_frame_get_buffer(frame, 0);
    if (ret < 0) {
        if (errorText)
            *errorText = FFmpegUtils::errorString(ret);
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }

    // Sends one frame (or nullptr to flush) and writes every packet produced.
    auto encodeFrame = [&](AVFrame *frameToEncode) -> bool {
        int sendRet = avcodec_send_frame(codecContext, frameToEncode);
        if (sendRet < 0) {
            if (errorText)
                *errorText = FFmpegUtils::errorString(sendRet);
            return false;
        }
        while (true) {
            int receiveRet = avcodec_receive_packet(codecContext, packet);
            if (receiveRet == AVERROR(EAGAIN) || receiveRet == AVERROR_EOF)
                return true;
            if (receiveRet < 0) {
                if (errorText)
                    *errorText = FFmpegUtils::errorString(receiveRet);
                return false;
            }
            av_packet_rescale_ts(packet, codecContext->time_base, stream->time_base);
            packet->stream_index = stream->index;
            receiveRet = av_interleaved_write_frame(formatContext, packet);
            av_packet_unref(packet);
            if (receiveRet < 0) {
                if (errorText)
                    *errorText = FFmpegUtils::errorString(receiveRet);
                return false;
            }
        }
    };

    // --- Main loop: resample one chunk of input PCM and encode it ---
    qint64 inputOffsetFrames = 0;
    const qint64 totalInputFrames = pcm.size() / inputFrameSize;
    qint64 encodedFrames = 0;
    while (inputOffsetFrames < totalInputFrames) {
        const int remaining = static_cast<int>(std::min<qint64>(totalInputFrames - inputOffsetFrames, frameSamples));
        ret = av_frame_make_writable(frame);
        if (ret < 0) {
            if (errorText)
                *errorText = FFmpegUtils::errorString(ret);
            releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
            return false;
        }
        const int outCapacity = frameSamples;
        const uint8_t *inputData[1] = {
            reinterpret_cast<const uint8_t *>(pcm.constData() + inputOffsetFrames * inputFrameSize)
        };
        // Note: when the input and output sample rates differ, `converted` can
        // be smaller than frameSamples. Encoders with a fixed frame_size expect
        // every frame except the last to be full, so such setups usually buffer
        // the resampler output (for example in an AVAudioFifo) before encoding.
        const int converted = swr_convert(swrContext,
                                          frame->data,
                                          outCapacity,
                                          inputData,
                                          remaining);
        if (converted < 0) {
            if (errorText)
                *errorText = FFmpegUtils::errorString(converted);
            releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
            return false;
        }
        frame->nb_samples = converted;
        frame->pts = encodedFrames;
        encodedFrames += converted;
        inputOffsetFrames += remaining;
        const bool encodeOk = converted == 0 || encodeFrame(frame);
        if (!encodeOk) {
            releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
            return false;
        }
    }

    // --- Drain the samples still buffered inside the resampler ---
    while (true) {
        ret = av_frame_make_writable(frame);
        if (ret < 0) {
            if (errorText)
                *errorText = FFmpegUtils::errorString(ret);
            releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
            return false;
        }
        const int converted = swr_convert(swrContext, frame->data, frameSamples, nullptr, 0);
        if (converted <= 0) {
            if (converted < 0) {
                if (errorText)
                    *errorText = FFmpegUtils::errorString(converted);
                releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
                return false;
            }
            break;
        }
        frame->nb_samples = converted;
        frame->pts = encodedFrames;
        encodedFrames += converted;
        const bool encodeOk = encodeFrame(frame);
        if (!encodeOk) {
            releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
            return false;
        }
    }

    // --- Flush the encoder and finish the file ---
    if (!encodeFrame(nullptr)) {
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }
    ret = av_write_trailer(formatContext);
    if (ret < 0) {
        if (errorText)
            *errorText = FFmpegUtils::errorString(ret);
        releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
        return false;
    }

    // --- Collect the result information before releasing the contexts ---
    QVariantMap result;
    const QFileInfo savedFile(outputPath);
    result.insert(QStringLiteral("outputPath"), outputPath);
    result.insert(QStringLiteral("format"), QString::fromUtf8(formatContext->oformat->name));
    result.insert(QStringLiteral("codec"), QString::fromUtf8(codec->name));
    result.insert(QStringLiteral("sampleRate"), codecContext->sample_rate);
    result.insert(QStringLiteral("channels"), codecContext->ch_layout.nb_channels);
    result.insert(QStringLiteral("bitrate"), QStringLiteral("%1 kbps").arg(codecContext->bit_rate / 1000));
    result.insert(QStringLiteral("sampleFormat"), FFmpegUtils::sampleFormatName(codecContext->sample_fmt));
    result.insert(QStringLiteral("inputSize"), FFmpegUtils::formatBytes(pcm.size()));
    result.insert(QStringLiteral("outputSize"), savedFile.exists() ? FFmpegUtils::formatBytes(savedFile.size()) : QStringLiteral("-"));
    releaseEncoderResources(&formatContext, &codecContext, &swrContext, &frame, &packet, &inputLayout, &outputLayout);
    if (encodeInfo)
        *encodeInfo = result;
    return true;
}
```
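Two of the up-front checks in the listing are plain arithmetic and easy to test in isolation: the PCM byte count must be a whole number of sample frames, and the bitrate is clamped to 32..1024 kbps before being converted to bits per second. The sketch below restates them as free functions; `pcmFrameCount` and `targetBitrateBps` are hypothetical names, and returning -1 for an inconsistent buffer is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of sample frames in an interleaved PCM buffer, or -1 if the byte
// count is not a whole multiple of channels * bytesPerSample.
std::int64_t pcmFrameCount(std::int64_t pcmBytes, int channels, int bytesPerSample)
{
    if (pcmBytes <= 0 || channels <= 0 || bytesPerSample <= 0)
        return -1;
    const std::int64_t frameSize = static_cast<std::int64_t>(channels) * bytesPerSample;
    if (pcmBytes % frameSize != 0)
        return -1;
    return pcmBytes / frameSize;
}

// Mirrors qBound(32, bitrateKbps, 1024) * 1000 from the listing.
int targetBitrateBps(int bitrateKbps)
{
    return std::max(32, std::min(bitrateKbps, 1024)) * 1000;
}
```

One second of 16-bit stereo audio at 44.1 kHz is 176400 bytes, which gives 44100 frames.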
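3.3 Encoding result
When encoding finishes, a QVariantMap describing the output file is returned:
| Field | Content |
|---|---|
| outputPath | Output file path |
| format | Container format name |
| codec | Encoder actually used |
| sampleRate | Output sample rate |
| channels | Output channel count |
| bitrate | Target bitrate |
| sampleFormat | Sample format used by the encoder |
| inputSize / outputSize | Input PCM size / output file size |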
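
The size fields are produced by `FFmpegUtils::formatBytes()`, whose implementation is not shown in the article. The sketch below is one possible version, given purely as an assumption: the units (B, KB, MB, GB with a factor of 1024) and the two-decimal formatting are choices made for this example, not the project's actual output format.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical human-readable size formatter (1 KB = 1024 B).
std::string formatBytes(std::int64_t bytes)
{
    assert(bytes >= 0);
    static const char *kUnits[] = {"B", "KB", "MB", "GB"};
    double value = static_cast<double>(bytes);
    int unit = 0;
    while (value >= 1024.0 && unit < 3) {
        value /= 1024.0;
        ++unit;
    }
    char buffer[32];
    if (unit == 0)
        std::snprintf(buffer, sizeof(buffer), "%lld B", static_cast<long long>(bytes));
    else
        std::snprintf(buffer, sizeof(buffer), "%.2f %s", value, kUnits[unit]);
    return std::string(buffer);
}
```

With this version, 1536 bytes is rendered as "1.50 KB" and 1048576 bytes as "1.00 MB".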
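4. Resampling strategy
Encoding may involve two kinds of conversion:
- Sample-rate conversion: the target sample rate chosen by the user may not be in the encoder's list of supported rates; closestSupportedSampleRate() then picks the nearest supported value.
- Sample-format conversion: encoders differ in the input formats they accept (for example, FFmpeg's native aac encoder takes FLTP, while libmp3lame takes planar formats such as S16P, S32P or FLTP); SwrContext performs the format / channel / sample-rate conversion in one step.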
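
The nearest-rate selection described in the first bullet can be sketched as below. This is not the project's `closestSupportedSampleRate()`: the name `closestRate`, the `std::vector` parameter (instead of an `AVCodec`), and the tie-breaking rule (the earlier entry wins) are assumptions. Treating an empty list as "any rate is accepted" mirrors FFmpeg's convention that a missing supported-rates list means the rates are unrestricted or unknown.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Pick the supported rate closest to target. An empty list is treated as
// "no restriction", so the target is returned unchanged.
int closestRate(const std::vector<int> &supported, int target)
{
    if (supported.empty())
        return target;
    int best = supported.front();
    for (int rate : supported) {
        if (std::abs(rate - target) < std::abs(best - target))
            best = rate;
    }
    return best;
}
```

For the list 8000, 16000, 44100, 48000, a request of 22050 Hz resolves to 16000 Hz and a request of 47000 Hz resolves to 48000 Hz.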
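5. Summary
| Feature | Description |
|---|---|
| Supported formats | mp3 / aac / flac / ogg / wav |
| Encoders | libmp3lame / aac / flac / libvorbis / pcm_s16le |
| Parameter control | Bitrate 32 to 512 kbps, sample rate 8 kHz to 96 kHz, channels 1/2/4/6 |
| Resampling | FFmpeg SwrContext handles format / sample-rate / channel conversion |
| UI framework | Qt Quick / QML dark theme, with ToolComboBox + ToolSpinBox as unified controls |

The whole flow illustrates the collaboration pattern of "declarative front-end UI + FFmpeg encoding in C++": QML collects the parameters and handles the interaction, the C++ layer encapsulates all FFmpeg details, and the two are bridged by Q_INVOKABLE methods.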