pyinstaller打包pytorch和transformers程序

记录使用pyinstaller打包含有pytorch和transformers库的程序时遇到的问题和解决方法。

环境和版本信息

操作系统:Windows 11

Python:3.10.12

pyinstaller:5.13.0

torch:2.2.2

transformers:4.40.1

打包过程和问题

打包命令:pyinstaller -w -F mainwindow.py

问题1:transformers找不到相关python包的元数据metadata

打包完成后在mainwindow.py所在目录下会生成一个dist文件夹,里面是打包生成的exe文件,这里是mainwindow.exe,直接双击执行该程序,会出现如下错误:

text 复制代码
Traceback (most recent call last):
  File "transformers\utils\versions.py", line 102, in require_version
  File "importlib\metadata\__init__.py", line 996, in version
  File "importlib\metadata\__init__.py", line 969, in distribution
  File "importlib\metadata\__init__.py", line 548, in from_name
importlib.metadata.PackageNotFoundError: No package metadata was found for tqdm

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mainwindow.py", line 10, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "task_thread.py", line 3, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "image_deduplicate.py", line 15, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\__init__.py", line 26, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\dependency_versions_check.py", line 57, in <module>
  File "transformers\utils\versions.py", line 117, in require_version_core
  File "transformers\utils\versions.py", line 104, in require_version
importlib.metadata.PackageNotFoundError: No package metadata was found for The 'tqdm>=4.27' distribution was not found and is required by this application. 
Try: `pip install transformers -U` or `pip install -e '.[dev]'` if you're working with git main

从错误信息可以看出错误原因是找不到包tqdm的元数据,应该是打包时没有把这些数据一起打包过去,看了下pyinstaller的使用说明,有个参数可以解决这个问题:

纯文本 复制代码
--copy-metadata PACKAGENAME
                        Copy metadata for the specified package. This option can be used multiple times.

在打包命令中加了该参数并指定tqdm包后,又出现了其他没有找到元数据的包,重复多次后,才将所有这些包的元数据都添加进去,这些包包括:tqdm、regex、requests、packaging、filelock、numpy、huggingface-hub、safetensors、pyyaml,最后的打包命令如下:

powershell 复制代码
pyinstaller -w -F --copy-metadata tqdm --copy-metadata regex --copy-metadata requests --copy-metadata packaging --copy-metadata filelock --copy-metadata numpy --copy-metadata huggingface-hub --copy-metadata safetensors --copy-metadata pyyaml mainwindow.py

但是像这样在打包命令中一个一个加参数比较麻烦,其实在这个过程中可以发现,执行打包命令时,会先生成一个spec文件,这个文件是打包时pyinstaller根据传递给它的参数生成的一个python文件,里面说明了把一个py文件打包成exe程序需要执行的操作。在打包命令中添加--copy-metadata会相应地在生成spec文件中添加如下代码:

python 复制代码
from PyInstaller.utils.hooks import copy_metadata

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')

所以可以先使用pyinstaller -w -F mainwindow.py先生成spec文件,再对这个文件进行修改。使用spec文件打包只需要执行命令pyinstaller mainwindow.spec即可,不需要添加其他参数,因为参数对应的操作已经编码在spec文件中了。

问题2:没有transformers/__init__.py文件

上述问题解决后,再执行打包后的exe程序,又会出现以下错误:

text 复制代码
Traceback (most recent call last):
  File "transformers\utils\import_utils.py", line 1510, in _get_module
  File "importlib\__init__.py", line 126, in import_module
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\models\auto\processing_auto.py", line 28, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\processing_utils.py", line 46, in <module>
  File "transformers\utils\import_utils.py", line 1539, in direct_transformers_import
  File "<frozen importlib._bootstrap_external>", line 879, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1016, in get_code
  File "<frozen importlib._bootstrap_external>", line 1073, in get_data
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\yuany\\AppData\\Local\\Temp\\_MEI264762\\transformers\\__init__.py'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "mainwindow.py", line 10, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "task_thread.py", line 3, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "image_deduplicate.py", line 15, in <module>
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "transformers\utils\import_utils.py", line 1501, in __getattr__
  File "transformers\utils\import_utils.py", line 1500, in __getattr__
  File "transformers\utils\import_utils.py", line 1512, in _get_module
RuntimeError: Failed to import transformers.models.auto.processing_auto because of the following error (look up to see its traceback):
[Errno 2] No such file or directory: 'C:\\Users\\yuany\\AppData\\Local\\Temp\\_MEI264762\\transformers\\__init__.py'

这个错误说明没有把transformers相关的文件打包进去,可以通过在spec文件中添加如下代码解决该问题:

python 复制代码
from PyInstaller.utils.hooks import copy_metadata, collect_data_files 

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])

现在可以通过执行命令pyinstaller mainwindow.spec再次打包成exe程序。

问题3:找不到PyTorch和Tokenizers

再次启动生成的exe程序,又会出现如下错误:

text 复制代码
None of PyTorch,TensorFlow >= 2.0,or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.

ImportError:
CLIPTokenizerFast requires the Tokenizers library but it was not found in your environment. You can install it with:
pip install tokenizers

说明找不到pytorch和tokenizers库,以通过在spec文件中添加如下代码解决该问题:

python 复制代码
from PyInstaller.utils.hooks import copy_metadata, collect_data_files

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
datas += copy_metadata('tokenizers')
datas += copy_metadata('torch') 
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('tokenizers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('torch', include_py_files=True, includes=['**/*.py'])

总结

最终的spec文件如下,记得需要把datas传入到AnalysisEXE中:

python 复制代码
# -*- mode: python ; coding: utf-8 -*-
from PyInstaller.utils.hooks import copy_metadata, collect_data_files

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
datas += copy_metadata('tokenizers')
datas += copy_metadata('torch')
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('tokenizers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('torch', include_py_files=True, includes=['**/*.py'])


block_cipher = None


a = Analysis(
    ['mainwindow.py'],
    pathex=[],
    binaries=[],
    datas=datas,
    hiddenimports=[],
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    win_no_prefer_redirects=False,
    win_private_assemblies=False,
    cipher=block_cipher,
    noarchive=False,
)
pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.zipfiles,
    a.datas,
    [],
    name='mainwindow',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    upx_exclude=[],
    runtime_tmpdir=None,
    console=False,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)
相关推荐
梧桐树04293 小时前
python常用内建模块:collections
python
Dream_Snowar3 小时前
速通Python 第三节
开发语言·python
四口鲸鱼爱吃盐4 小时前
Pytorch | 从零构建GoogleNet对CIFAR10进行分类
人工智能·pytorch·分类
蓝天星空5 小时前
Python调用open ai接口
人工智能·python
jasmine s5 小时前
Pandas
开发语言·python
郭wes代码5 小时前
Cmd命令大全(万字详细版)
python·算法·小程序
leaf_leaves_leaf5 小时前
win11用一条命令给anaconda环境安装GPU版本pytorch,并检查是否为GPU版本
人工智能·pytorch·python
夜雨飘零15 小时前
基于Pytorch实现的说话人日志(说话人分离)
人工智能·pytorch·python·声纹识别·说话人分离·说话人日志
404NooFound5 小时前
Python轻量级NoSQL数据库TinyDB
开发语言·python·nosql
天天要nx5 小时前
D102【python 接口自动化学习】- pytest进阶之fixture用法
python·pytest