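Packaging PyTorch and transformers programs with PyInstaller

Notes on the problems encountered, and how they were solved, when using PyInstaller to package a program that depends on the pytorch and transformers libraries.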

Environment and versions

Operating system: Windows 11

Python: 3.10.12

pyinstaller: 5.13.0

torch: 2.2.2

transformers: 4.40.1

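Packaging process and problems

Packaging command: pyinstaller -w -F mainwindow.py

Problem 1: transformers cannot find the metadata of required Python packages

When packaging finishes, a dist folder is created in the directory that contains mainwindow.py. It holds the generated executable, here mainwindow.exe. Double-clicking that program produces the following error: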

Traceback (most recent call last):
  File "transformers\utils\versions.py", line 102, in require_version
  File "importlib\metadata\__init__.py", line 996, in version
  File "importlib\metadata\__init__.py", line 969, in distribution
  File "importlib\metadata\__init__.py", line 548, in from_name
importlib.metadata.PackageNotFoundError: No package metadata was found for tqdm

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mainwindow.py", line 10, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "task_thread.py", line 3, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "image_deduplicate.py", line 15, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\__init__.py", line 26, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\dependency_versions_check.py", line 57, in <module>
  File "transformers\utils\versions.py", line 117, in require_version_core
  File "transformers\utils\versions.py", line 104, in require_version
importlib.metadata.PackageNotFoundError: No package metadata was found for The 'tqdm>=4.27' distribution was not found and is required by this application. 
Try: `pip install transformers -U` or `pip install -e '.[dev]'` if you're working with git main

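The error message shows that the metadata for the tqdm package cannot be found, most likely because it was not bundled during packaging. The PyInstaller usage notes list an option that solves this: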

--copy-metadata PACKAGENAME
                        Copy metadata for the specified package. This option can be used multiple times.
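
The traceback above fails inside importlib.metadata, so the same standard-library call can be used to list every package with missing metadata in one pass instead of one rebuild per package. The following is a minimal sketch under stated assumptions, not part of the original workflow: missing_metadata is a hypothetical helper, and the candidate list is only an example. Placed at the very top of mainwindow.py, before transformers is imported, it would report what the frozen program lacks.

```python
from importlib import metadata


def missing_metadata(packages):
    """Return the names in `packages` whose distribution metadata cannot be found."""
    missing = []
    for name in packages:
        try:
            # Same lookup that fails in the traceback: importlib.metadata.version()
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Example candidates only; extend the list with whatever the program depends on.
    candidates = ["tqdm", "regex", "requests", "packaging", "filelock", "numpy"]
    print("metadata missing for:", missing_metadata(candidates))
```

In a windowed (-w) build there is no console, so the result would have to be written to a log file or shown in a dialog rather than printed.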

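After adding this option for tqdm, the same error appeared for other packages. It took several rounds to add the metadata for all of them: tqdm, regex, requests, packaging, filelock, numpy, huggingface-hub, safetensors, and pyyaml. The final packaging command is: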

pyinstaller -w -F --copy-metadata tqdm --copy-metadata regex --copy-metadata requests --copy-metadata packaging --copy-metadata filelock --copy-metadata numpy --copy-metadata huggingface-hub --copy-metadata safetensors --copy-metadata pyyaml mainwindow.py
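
A command this long is easy to mistype. As a rough sketch, it could be assembled from a list of package names instead; build_command and METADATA_PACKAGES are hypothetical names introduced here for illustration, and only the -w, -F and --copy-metadata options shown above are used.

```python
import shlex

# Packages whose metadata had to be copied, as listed above.
METADATA_PACKAGES = [
    "tqdm", "regex", "requests", "packaging", "filelock",
    "numpy", "huggingface-hub", "safetensors", "pyyaml",
]


def build_command(script, packages, flags=("-w", "-F")):
    """Return the pyinstaller argument list with one --copy-metadata per package."""
    args = ["pyinstaller", *flags]
    for name in packages:
        args += ["--copy-metadata", name]
    args.append(script)
    return args


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(build_command("mainwindow.py", METADATA_PACKAGES)))
```

The returned list can also be handed directly to subprocess.run(args, check=True) to start the build from a script.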

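Adding options to the command one at a time like this is tedious. Running the packaging command first generates a spec file: a Python file that PyInstaller produces from the arguments passed to it, describing the steps needed to package a .py file into an exe. Each --copy-metadata option on the command line adds corresponding code to the generated spec file: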

from PyInstaller.utils.hooks import copy_metadata

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')

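So you can run pyinstaller -w -F mainwindow.py once to generate the spec file and then edit that file. To package from a spec file, just run pyinstaller mainwindow.spec. No other options are needed, because the operations they stand for are already encoded in the spec file.

Problem 2: the transformers/__init__.py file is missing

With the previous problem solved, running the packaged exe produces the following error: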
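
Since a spec file is ordinary Python, the repeated datas += copy_metadata(...) lines can also be written as a loop over a list of names. This is a minimal sketch with assumptions: gather and METADATA_PACKAGES are hypothetical names that do not come from PyInstaller, and the collector is passed in as a parameter so that the helper works with copy_metadata or any other function returning a list of (source, destination) pairs.

```python
METADATA_PACKAGES = [
    "tqdm", "regex", "requests", "packaging", "filelock",
    "numpy", "huggingface-hub", "safetensors", "pyyaml",
]


def gather(packages, collector):
    """Concatenate the (source, destination) entries `collector` returns for each package."""
    datas = []
    for name in packages:
        datas += collector(name)
    return datas


# In mainwindow.spec this would be used as:
#   from PyInstaller.utils.hooks import copy_metadata
#   datas = gather(METADATA_PACKAGES, copy_metadata)
```

Keeping the names in one list means a newly reported package only has to be added in one place.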

Traceback (most recent call last):
  File "transformers\utils\import_utils.py", line 1510, in _get_module
  File "importlib\__init__.py", line 126, in import_module
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\models\auto\processing_auto.py", line 28, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\processing_utils.py", line 46, in <module>
  File "transformers\utils\import_utils.py", line 1539, in direct_transformers_import
  File "<frozen importlib._bootstrap_external>", line 879, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1016, in get_code
  File "<frozen importlib._bootstrap_external>", line 1073, in get_data
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\yuany\\AppData\\Local\\Temp\\_MEI264762\\transformers\\__init__.py'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "mainwindow.py", line 10, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "task_thread.py", line 3, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "image_deduplicate.py", line 15, in <module>
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "transformers\utils\import_utils.py", line 1501, in __getattr__
  File "transformers\utils\import_utils.py", line 1500, in __getattr__
  File "transformers\utils\import_utils.py", line 1512, in _get_module
RuntimeError: Failed to import transformers.models.auto.processing_auto because of the following error (look up to see its traceback):
[Errno 2] No such file or directory: 'C:\\Users\\yuany\\AppData\\Local\\Temp\\_MEI264762\\transformers\\__init__.py'

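This error shows that the transformers source files were not bundled. In the traceback, direct_transformers_import tries to read transformers\__init__.py from the temporary extraction directory, but PyInstaller normally stores pure-Python modules as compiled bytecode inside the executable, so the .py file does not exist on disk. Adding the following code to the spec file solves the problem: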

from PyInstaller.utils.hooks import copy_metadata, collect_data_files 

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])

Now run pyinstaller mainwindow.spec to package the program into an exe again.

Problem 3: PyTorch and Tokenizers cannot be found

Starting the generated exe again produces the following error:
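
To confirm that a rebuilt program really contains the file named in the traceback, a startup check along these lines could help. It is a sketch under stated assumptions: bundle_root and is_bundled are hypothetical helpers, and the only PyInstaller-specific piece is sys._MEIPASS, the attribute the bootloader sets to the extraction directory (the _MEI... folder seen in the error path).

```python
import os
import sys


def bundle_root():
    """Directory holding bundled data files; the current directory when not frozen."""
    return getattr(sys, "_MEIPASS", os.getcwd())


def is_bundled(*parts):
    """Return True if the given relative path exists as a file inside the bundle."""
    return os.path.isfile(os.path.join(bundle_root(), *parts))


if __name__ == "__main__":
    # The file that the FileNotFoundError above complained about.
    print("transformers/__init__.py bundled:", is_bundled("transformers", "__init__.py"))
```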

None of PyTorch,TensorFlow >= 2.0,or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.

ImportError:
CLIPTokenizerFast requires the Tokenizers library but it was not found in your environment. You can install it with:
pip install tokenizers

This means the pytorch and tokenizers libraries cannot be found. Adding the following code to the spec file solves the problem:

from PyInstaller.utils.hooks import copy_metadata, collect_data_files

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
datas += copy_metadata('tokenizers')
datas += copy_metadata('torch') 
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('tokenizers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('torch', include_py_files=True, includes=['**/*.py'])
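
One plausible reading of why copy_metadata is part of this fix is that a library can be importable and still be reported as "not found" when the check also looks up its installed metadata. The sketch below is a simplified approximation of that kind of check, not the actual transformers code; package_available is a hypothetical name.

```python
import importlib.util
from importlib import metadata


def package_available(name):
    """Treat a package as available only if it is importable AND has metadata."""
    # Step 1: can the import system locate the module at all?
    if importlib.util.find_spec(name) is None:
        return False
    # Step 2: is distribution metadata (the .dist-info folder) present?
    try:
        metadata.version(name)
    except metadata.PackageNotFoundError:
        return False
    return True


if __name__ == "__main__":
    for name in ("torch", "tokenizers"):
        print(name, "available:", package_available(name))
```

Under such a check, a module bundled without its .dist-info folder counts as missing, which is consistent with the metadata lines for tokenizers and torch added above.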

Summary

The final spec file is shown below. Remember to pass datas to Analysis (datas=datas):

# -*- mode: python ; coding: utf-8 -*-
from PyInstaller.utils.hooks import copy_metadata, collect_data_files

datas = []
# Problem 1: metadata that transformers looks up at import time
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
# Problem 3: metadata for tokenizers and torch
datas += copy_metadata('tokenizers')
datas += copy_metadata('torch')
# Problems 2 and 3: bundle the .py source files as data files
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('tokenizers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('torch', include_py_files=True, includes=['**/*.py'])


block_cipher = None


a = Analysis(
    ['mainwindow.py'],
    pathex=[],
    binaries=[],
    datas=datas,
    hiddenimports=[],
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    win_no_prefer_redirects=False,
    win_private_assemblies=False,
    cipher=block_cipher,
    noarchive=False,
)
pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.zipfiles,
    a.datas,
    [],
    name='mainwindow',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    upx_exclude=[],
    runtime_tmpdir=None,
    console=False,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)