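Saving images with Python and matplotlib at an exact pixel size (e.g. 64x64) and batch-saving without running out of memory

When saving images with Python and matplotlib, getting an exact pixel size such as 64x64 depends on setting the figure size and resolution parameters correctly. Running out of memory during batch saving, in turn, usually comes from holding too many images in memory at once. Some solutions follow: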

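1. Make sure the saved image is 64x64 pixels

Setting the figure size and resolution

In matplotlib, the figsize and dpi parameters control the size and resolution of the image. Specifically:

figsize sets the physical size of the figure, in inches.
dpi sets the resolution, in pixels (dots) per inch.

To save a 64x64 pixel image, set them like this: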

figsize = (64 / 100, 64 / 100)  # figure size of 0.64 x 0.64 inches
dpi = 100  # 100 pixels per inch

The resulting size of the image in pixels is then:

\text{width} = \text{figsize.width} \times \text{dpi} = 0.64 \times 100 = 64

\text{height} = \text{figsize.height} \times \text{dpi} = 0.64 \times 100 = 64
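
The arithmetic above can be wrapped in a small helper so that other target sizes or dpi values do not have to be worked out by hand. This is a minimal sketch: `figsize_for_pixels` and `pixels_for_figsize` are hypothetical names introduced here for illustration, not matplotlib functions.

```python
def figsize_for_pixels(width_px, height_px, dpi=100):
    """Return the (width, height) in inches that gives the requested pixel size at this dpi."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return (width_px / dpi, height_px / dpi)


def pixels_for_figsize(figsize, dpi=100):
    """Inverse check: figsize times dpi, rounded to whole pixels."""
    width_in, height_in = figsize
    return (round(width_in * dpi), round(height_in * dpi))


figsize = figsize_for_pixels(64, 64, dpi=100)
print(figsize)                               # (0.64, 0.64)
print(pixels_for_figsize(figsize, dpi=100))  # (64, 64)
```

The returned tuple can be passed directly as the figsize argument, with the same dpi value given to the figure.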

Example code

import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

# Load the audio file
y, sr = librosa.load('example.wav', sr=16000)

# Compute the mel spectrogram and convert power to decibels
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
log_mel = librosa.power_to_db(mel, ref=np.max)

# Figure size and resolution: 0.64 in x 100 dpi = 64 px per side
figsize = (64 / 100, 64 / 100)
dpi = 100

# Create a frameless figure whose single axes fills the whole canvas
fig = plt.figure(figsize=figsize, dpi=dpi, frameon=False)
ax = fig.add_axes([0., 0., 1., 1.])
ax.set_axis_off()

# Draw the spectrogram
librosa.display.specshow(log_mel, sr=sr, x_axis='time', y_axis='mel', ax=ax)

# Save at the figure's own dpi. bbox_inches='tight' is left out on purpose:
# it re-crops the canvas, so the output may no longer be exactly 64x64
fig.savefig('mel_spectrogram_64x64.png', dpi=dpi)
plt.close(fig)
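
To confirm that a saved file really has the intended dimensions, the width and height can be read straight from the PNG header with the standard library alone. The following is a sketch under that assumption; `png_size` is a hypothetical helper name, not part of matplotlib or librosa.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) in pixels, read from the IHDR chunk of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} does not look like a PNG file")
    # Bytes 16-23 hold width and height as big-endian unsigned 32-bit integers
    return struct.unpack(">II", header[16:24])
```

Calling png_size('mel_spectrogram_64x64.png') on the file written by the example is expected to give (64, 64); any other result suggests that the figure size, the dpi, or a cropping option changed the output.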

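2. Avoiding out-of-memory errors during batch saving

Memory management with a thread pool

In batch processing, running out of memory usually comes from loading and processing too much data at the same time, or from figures that are never closed: pyplot keeps a reference to every figure it creates until that figure is closed. The following measures help:

Limit the size of the thread pool: fewer concurrent threads means fewer audio clips and figures in memory at once.
Process in batches: split the file list into batches and handle them one at a time rather than loading everything up front.
Free memory explicitly: after each task, close the figure, delete large variables that are no longer needed with del, and call gc.collect().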
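
The first two measures can be combined in a few lines. The sketch below is only an illustration under stated assumptions: `chunked`, `process_in_batches`, and the `batch_size` parameter are hypothetical names, and `worker` is assumed to return something small (such as an output path) rather than the image data itself.

```python
import gc
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_in_batches(items, worker, batch_size=32, max_workers=2):
    """Run `worker` over `items` one batch at a time with a small thread pool."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        for batch in chunked(items, batch_size):
            # extend() consumes the map iterator, so the whole batch
            # finishes before the next batch is submitted
            results.extend(executor.map(worker, batch))
            # Collect garbage between batches
            gc.collect()
    return results
```

With this shape, at most batch_size tasks are queued and at most max_workers of them run at any moment, which bounds how many audio clips and figures exist simultaneously.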

Example code

import gc
import os
from concurrent.futures import ThreadPoolExecutor

import librosa
import librosa.display
import numpy as np
from matplotlib.figure import Figure

def save_mel_as_image(filepath, output_dir):
    try:
        y, sr = librosa.load(filepath, sr=16000)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
        log_mel = librosa.power_to_db(mel, ref=np.max)

        figsize = (64 / 100, 64 / 100)
        dpi = 100

        # Build the figure with the object-oriented API instead of pyplot:
        # pyplot's global state is not thread-safe, and a Figure that pyplot
        # never registered can be freed once nothing references it
        fig = Figure(figsize=figsize, dpi=dpi, frameon=False)
        ax = fig.add_axes([0., 0., 1., 1.])
        ax.set_axis_off()

        librosa.display.specshow(log_mel, sr=sr, x_axis='time', y_axis='mel', ax=ax)

        filename = os.path.splitext(os.path.basename(filepath))[0] + ".png"
        # No bbox_inches='tight', which could change the 64x64 output size
        fig.savefig(os.path.join(output_dir, filename), dpi=dpi)

        # Release the large arrays and the figure before the next file
        del y, mel, log_mel, fig, ax
        gc.collect()
    except Exception as e:
        print(f"Error processing file {filepath}: {e}")

def batch_convert_to_images(input_dir, output_dir, max_workers=4):
    file_list = [os.path.join(input_dir, f) for f in os.listdir(input_dir)
                 if f.lower().endswith(('.wav', '.mp3'))]
    os.makedirs(output_dir, exist_ok=True)

    # The with block waits for all submitted tasks before returning
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        executor.map(lambda f: save_mel_as_image(f, output_dir), file_list)

if __name__ == "__main__":
    batch_convert_to_images("data/yasuoji/OK", "data/yasuoji/ok_mel_images", max_workers=2)
    batch_convert_to_images("data/yasuoji/NG", "data/yasuoji/ng_mel_images", max_workers=2)

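3. Making images saved from MATLAB match the Python output

If an image saved from MATLAB differs from the one Python produces, the cause is usually that the two tools use different defaults when computing the spectrogram and saving the figure. To bring them in line:

Check MATLAB's save settings: make sure the output resolution and figure size match the Python settings.
Use equivalent functions and parameters: in MATLAB, audioread loads the audio and melSpectrogram (Audio Toolbox) computes a mel spectrogram. melcepst from the VOICEBOX toolbox returns mel-cepstral coefficients rather than a mel spectrogram, so it does not correspond to librosa.feature.melspectrogram. Sample rate, window length, hop size, and number of mel bands also have to match, and audioread does not resample to 16 kHz the way librosa.load(..., sr=16000) does.

MATLAB example code

% Load the audio file and mix it down to mono
[y, sr] = audioread('example.wav');
y = mean(y, 2);

% Compute the mel spectrogram (melSpectrogram requires Audio Toolbox)
mel = melSpectrogram(y, sr, 'NumBands', 128);

% Convert power to decibels; eps avoids log10(0)
log_mel = 10 * log10(mel + eps);

% Save the image
figure;
imagesc(log_mel);
axis xy;   % low frequencies at the bottom
axis off;
set(gca, 'Position', [0 0 1 1]);
set(gcf, 'Units', 'pixels', 'Position', [0 0 64 64]);
set(gcf, 'PaperPositionMode', 'auto');
% -r0 saves at screen resolution; some systems clamp very small windows,
% so check the pixel size of the saved file
print(gcf, 'mel_spectrogram_64x64.png', '-dpng', '-r0');
close;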

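Summary

Image size and resolution: set figsize and dpi so that figsize times dpi gives the required 64x64 pixels.
Out-of-memory errors: limit the thread pool size, process in batches, and free memory explicitly during batch runs.
Consistency: use the same processing and save settings in MATLAB and Python so that both produce matching images.

Together, these measures address both the inconsistent image size and the out-of-memory problem.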
