A Fashion Recommendation System Using Image Features and Transfer Learning with VGG16

Preface

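Series column: [Deep Learning: Algorithm Projects in Practice] ✨︎
The series covers healthcare, finance, retail, food and beverage, sports and fitness, transportation, environmental science, social media, and text and image processing, among other fields. It discusses a range of deep neural network ideas, such as convolutional neural networks, recurrent neural networks, generative adversarial networks, gated recurrent units, long short-term memory, natural language processing, deep reinforcement learning, large language models, and transfer learning.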

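A fashion recommendation system based on image feature extraction analyzes the visual content of fashion items (such as clothing and accessories) and recommends similar or complementary products to users.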

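The dataset used in this article consists of images of women's fashion items, including a variety of clothing and accessories. Each image represents a unique item, categorized by type (e.g., dresses, tops, skirts), style (e.g., casual, formal, sporty), and other attributes such as color and pattern. The images were collected in a uniform format to simplify feature extraction and analysis. The main goal is to develop a fashion recommendation system that analyzes an input image of a fashion item and recommends similar items from the dataset based on visual similarity.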

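Extract features: use a pre-trained CNN model (e.g., VGG16 or ResNet) to extract a comprehensive feature vector from every image in the dataset, capturing aspects such as texture, color, and shape.
Measure similarity: use a similarity measure (e.g., cosine similarity) to quantitatively compare the features of the input image with those of the dataset images.
Recommend items: based on the similarity scores, identify and recommend the N items most visually similar to the input item.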
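
To make the similarity step concrete, here is a minimal sketch of cosine similarity on toy vectors, assuming NumPy is available. The function name `cosine_similarity` and the 3-dimensional vectors are made up for illustration; they only stand in for real CNN features.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D vectors (1.0 means same direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical feature vectors standing in for CNN outputs
query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]   # same direction as the query, different length
orthogonal = [0.0, 3.0, 0.0]       # shares no component with the query

print(cosine_similarity(query, same_direction))
print(cosine_similarity(query, orthogonal))
```

Because the measure depends only on direction, two feature vectors that differ only in overall magnitude are treated as identical, which is one reason it is a common choice for comparing CNN features.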

1. Dataset

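Building a fashion recommendation system from image features involves several key steps that combine computer vision and machine learning techniques. The detailed process is as follows: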

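Collect a dataset of diverse fashion items. It should cover a variety of colors, patterns, styles, and categories.
Make sure all images have a consistent format (e.g., JPEG, PNG) and resolution.
Apply a preprocessing function to prepare the images for feature extraction.
Choose a pre-trained CNN model such as VGG16, ResNet, or InceptionV3. These models are pre-trained on large datasets such as ImageNet and can extract powerful feature representations from images.
Pass each image through the CNN model to extract its features.
Define a metric for measuring the similarity between feature vectors.
Rank the dataset images by their similarity to the input image and recommend the top N most similar items.
Implement a final function that wraps the whole process: preprocessing the input image, extracting features, computing similarities, and outputting recommendations.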
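
The format-consistency step in the list above can be checked with a few lines of code. The following is a minimal sketch using only the standard library; `count_extensions` is a hypothetical helper that is not part of the original workflow, and the demo runs on a throwaway directory of empty placeholder files rather than on the real dataset.

```python
import tempfile
from collections import Counter
from pathlib import Path

def count_extensions(directory):
    """Count the files in `directory` by lower-cased file extension."""
    return Counter(
        p.suffix.lower() for p in Path(directory).iterdir() if p.is_file()
    )

# Demo on a temporary directory with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    for name in ["a.jpg", "b.JPG", "c.png", "d.jpeg"]:
        (Path(tmp) / name).touch()
    ext_counts = count_extensions(tmp)

print(ext_counts)
```

Running the same helper on the extracted image directory would show at a glance whether the dataset mixes several formats.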

So the process starts with collecting a dataset of images of fashion apparel. You can download the dataset here.

1.1 Importing the necessary libraries

Now, let's start building the fashion recommendation system by importing the necessary Python libraries:

import os
from zipfile import ZipFile

from PIL import Image
import matplotlib.pyplot as plt

import glob
import numpy as np

from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

1.2 Loading the dataset

zip_file_path = '/content/women fashion.zip'
extraction_directory = '/content/women_fashion/'

if not os.path.exists(extraction_directory):
    os.makedirs(extraction_directory)

with ZipFile(zip_file_path, 'r') as zip_ref:
    zip_ref.extractall(extraction_directory)

extracted_files = os.listdir(extraction_directory)
print(extracted_files[:10])

Output:

['women fashion', '__MACOSX']

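The code above extracts a zip archive named "women fashion.zip", located at "/content/women fashion.zip" on Google Colab, into the directory "/content/women_fashion/". We first check whether the extraction directory exists and, if it does not, create it with os.makedirs(). Then, using Python's ZipFile class, we open the zip file in read mode and extract its contents into that directory.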
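
As a variation on the extraction step, the `__MACOSX` entry visible in the output above can be skipped during extraction instead of being ignored afterwards. This is a minimal sketch using only the standard library; `extract_without_macos_metadata` is a hypothetical helper name, and it assumes the metadata lives under a top-level `__MACOSX/` folder in the archive.

```python
import os
from zipfile import ZipFile

def extract_without_macos_metadata(zip_path, target_dir):
    """Extract an archive, skipping entries under the __MACOSX folder."""
    os.makedirs(target_dir, exist_ok=True)  # no error if the directory already exists
    with ZipFile(zip_path, 'r') as zip_ref:
        # keep every archive member except the macOS metadata entries
        members = [name for name in zip_ref.namelist()
                   if not name.startswith('__MACOSX/')]
        zip_ref.extractall(target_dir, members=members)
    return members
```

With the paths used in this article, the call would look like `extract_without_macos_metadata('/content/women fashion.zip', '/content/women_fashion/')`.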

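The zip file contains a directory named women fashion and some metadata used by macOS (__MACOSX). Let's ignore the macOS metadata and focus on the women fashion directory, listing its contents to see what kinds of images we have and how many: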

# correcting the path to include the 'women fashion' directory and listing its contents
extraction_directory_updated = os.path.join(extraction_directory, 'women fashion')

# list the files in the updated directory
extracted_files_updated = os.listdir(extraction_directory_updated)
extracted_files_updated[:10], len(extracted_files_updated)

Output:

(['black floral saree.jpg',
  'black, sequined dress with thin shoulder straps.jpg',
  'dark blue, knee-length dress with thin straps.jpg',
  'classic black slip dress with a midi length.jpg',
  'black off-shoulder dress with belt.jpg',
  'white, intricately detailed top and a flowing dark blue skirt.jpg',
  'Women-off-the-shoulder-sexy-embroidery-fashion-party-dress-1.png',
  'fitted, off-the-shoulder white dress with horizontal ribbed texture.jpg',
  'one-shoulder, fitted dress that features sequin embellishments and sheer panels.jpg',
  'fitted, short, yellow dress with short sleeves.jpeg'],
 97)

Now, let's look at the first image in the dataset:

# function to load and display an image
def display_image(file_path):
    image = Image.open(file_path)
    plt.imshow(image)
    plt.axis('off')
    plt.show()

# display the first image to understand its characteristics
first_image_path = os.path.join(extraction_directory_updated, extracted_files_updated[0])
display_image(first_image_path)

Next, we create a list of the file paths of all images, which will be used later to extract features from each image in the dataset:

# directory path containing your images
image_directory = '/content/women_fashion/women fashion'

image_paths_list = [file for file in glob.glob(os.path.join(image_directory, '*.*')) if file.endswith(('.jpg', '.png', '.jpeg', '.webp'))]

# print the list of image file paths
print(image_paths_list)

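In the code above, the glob module generates a list of file paths for the images stored in the directory. The glob.glob function searches for files matching the given pattern, here *.*, which matches every file in the directory whose name contains a dot. A list comprehension then filters these files, keeping only those with specific image extensions (.jpg, .png, .jpeg, .webp). This ensures that image_paths_list contains only paths to image files and excludes any other file types that may be present in the directory.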
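
One caveat of the extension filter is that `str.endswith` is case-sensitive, so a file named `photo.JPG` would be skipped. The following is a minimal sketch of a case-insensitive alternative using `pathlib`; `list_image_paths` and `IMAGE_EXTENSIONS` are hypothetical names introduced here for illustration.

```python
from pathlib import Path

# Extensions treated as images, compared in lower case
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.webp'}

def list_image_paths(directory):
    """Return sorted paths of files whose extension (in any case) is an image type."""
    return sorted(
        str(p) for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Sorting the result also makes the order of the paths deterministic, which helps when feature vectors are later matched back to file names by position.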

2. Building the model (VGG16)

Now we extract features from all the fashion images:

base_model = VGG16(weights='imagenet', include_top=False)
model = Model(inputs=base_model.input, outputs=base_model.output)

def preprocess_image(img_path):
    img = image.load_img(img_path, target_size=(224, 224))
    img_array = image.img_to_array(img)
    img_array_expanded = np.expand_dims(img_array, axis=0)
    return preprocess_input(img_array_expanded)

def extract_features(model, preprocessed_img):
    features = model.predict(preprocessed_img)
    flattened_features = features.flatten()
    normalized_features = flattened_features / np.linalg.norm(flattened_features)
    return normalized_features

all_features = []
all_image_names = []

for img_path in image_paths_list:
    preprocessed_img = preprocess_image(img_path)
    features = extract_features(model, preprocessed_img)
    all_features.append(features)
    all_image_names.append(os.path.basename(img_path))

Output:

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58889256/58889256 [==============================] - 0s 0us/step
1/1 [==============================] - 1s 715ms/step
1/1 [==============================] - 0s 486ms/step
1/1 [==============================] - 0s 463ms/step
... (similar progress lines for the remaining images omitted)

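The VGG16 model is loaded without its top classification layers (include_top=False), which makes it suitable for feature extraction rather than classification.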

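The code above implements feature extraction with VGG16, a popular convolutional neural network pre-trained on the ImageNet dataset, to extract visual features from the images listed in image_paths_list. Each path in image_paths_list goes through a series of steps: the image is loaded and resized to 224×224 pixels to match VGG16's expected input size, converted to a NumPy array, and preprocessed into the input format the model expects.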

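The preprocessed image is then fed into the VGG16 model to extract features, which are flattened and normalized to produce a consistent feature vector for each image. These feature vectors (all_features) and their corresponding image file names (all_image_names) are stored, giving us a structured dataset for the next step of building the recommendation system.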
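
The flatten-and-normalize step can be tried in isolation, without loading the model. This is a minimal sketch assuming NumPy; `flatten_and_normalize` is a hypothetical helper, the `eps` guard is an addition not present in the original code, and the random array only stands in for a real model output.

```python
import numpy as np

def flatten_and_normalize(feature_map, eps=1e-12):
    """Flatten a feature map to 1-D and scale it to unit L2 norm."""
    flat = np.asarray(feature_map, dtype=float).flatten()
    # eps avoids a division by zero if the feature map is all zeros
    return flat / max(np.linalg.norm(flat), eps)

# Random stand-in for the model output; (1, 7, 7, 512) is the shape that VGG16
# without its top layers produces for a single 224x224 image
rng = np.random.default_rng(0)
fake_feature_map = rng.random((1, 7, 7, 512))

vector = flatten_and_normalize(fake_feature_map)
print(vector.shape)
print(np.linalg.norm(vector))
```

Each image therefore ends up as a vector of 7 × 7 × 512 = 25088 values with unit length, so comparing two images by cosine similarity reduces to a dot product.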

3. Model evaluation

Now I'll write a function that recommends fashion images based on image features:

from scipy.spatial.distance import cosine

def recommend_fashion_items_cnn(input_image_path, all_features, all_image_names, model, top_n=5):
    # pre-process the input image and extract features
    preprocessed_img = preprocess_image(input_image_path)
    input_features = extract_features(model, preprocessed_img)

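    # calculate the cosine similarity between the input and every dataset image
    similarities = [1 - cosine(input_features, other_feature) for other_feature in all_features]

    # rank from most to least similar, skip the input image itself, keep the top N
    # (comparing base names works whether all_image_names holds file names or full paths)
    input_name = os.path.basename(input_image_path)
    ranked_indices = np.argsort(similarities)[::-1]
    similar_indices = [idx for idx in ranked_indices
                       if os.path.basename(all_image_names[idx]) != input_name][:top_n]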

    # display the input image
    plt.figure(figsize=(15, 10))
    plt.subplot(1, top_n + 1, 1)
    plt.imshow(Image.open(input_image_path))
    plt.title("Input Image")
    plt.axis('off')

    # display similar images
    for i, idx in enumerate(similar_indices[:top_n], start=1):
        image_path = os.path.join('/content/women_fashion/women fashion', all_image_names[idx])
        plt.subplot(1, top_n + 1, i + 1)
        plt.imshow(Image.open(image_path))
        plt.title(f"Recommendation {i}")
        plt.axis('off')

    plt.tight_layout()
    plt.show()

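In the code above, we define the recommend_fashion_items_cnn function, which recommends fashion items similar to a given input image using deep-learning-based feature extraction. It uses the VGG16 model to extract a high-dimensional feature vector that captures the visual essence of the image.

For the given input image, the function preprocesses it, extracts its features, and computes the cosine similarity between that feature vector and the feature vectors of the dataset images (all_features). It ranks the images by similarity and selects the top N most similar ones as recommendations, explicitly excluding the input image itself by filtering its index out of the list of similar indices.

Finally, the function visualizes the result by displaying the input image alongside its recommendations.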
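
The ranking logic can also be expressed in vectorized form. The following is a minimal sketch, assuming NumPy and assuming that every feature vector is already L2-normalized, in which case the dot product equals the cosine similarity. The function `top_n_similar` and the 2-dimensional toy features are hypothetical and only illustrate the idea.

```python
import numpy as np

def top_n_similar(query_index, feature_matrix, top_n=3):
    """Indices of the top_n rows most similar to row `query_index`, best first.

    Rows are assumed to be L2-normalized, so dot product == cosine similarity.
    """
    similarities = feature_matrix @ feature_matrix[query_index]
    similarities[query_index] = -np.inf   # never recommend the query itself
    return np.argsort(similarities)[::-1][:top_n]

# Hypothetical unit-length features for four items
features = np.array([
    [1.0, 0.0],
    [0.8, 0.6],
    [0.0, 1.0],
    [0.6, 0.8],
])

result = top_n_similar(0, features, top_n=2)
print(result)
```

With the real data, the matrix could be built with `np.vstack(all_features)`, which replaces the Python loop over individual vectors with a single matrix-vector product.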

Now let's see how to use this function to recommend images that are similar in style to an input image:

input_image_path = '/content/women_fashion/women fashion/dark, elegant, sleeveless dress that reaches down to about mid-calf.jpg'
recommend_fashion_items_cnn(input_image_path, all_features, image_paths_list, model, top_n=4)

You just pass an image path as input, and the function displays images of similar fashion items as output.

4. Summary

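So, this is how you can build a fashion recommendation system from image features using Python. Such a system uses computer vision and machine learning techniques to analyze the visual aspects of fashion products (such as color, texture, and style) and recommend similar or complementary products to users.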