Packaging a PyTorch and transformers program with PyInstaller

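Notes on the problems encountered, and how they were solved, when using PyInstaller to package a program that depends on the pytorch and transformers libraries.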

Environment and versions

Operating system: Windows 11

Python: 3.10.12

pyinstaller: 5.13.0

torch: 2.2.2

transformers: 4.40.1

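Packaging process and problems

Packaging command: pyinstaller -w -F mainwindow.py

Problem 1: transformers cannot find the metadata of the Python packages it depends on

After packaging finishes, a dist folder is created in the directory that contains mainwindow.py. It holds the generated executable, here mainwindow.exe. Double-clicking that program produces the following error: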

Traceback (most recent call last):
  File "transformers\utils\versions.py", line 102, in require_version
  File "importlib\metadata\__init__.py", line 996, in version
  File "importlib\metadata\__init__.py", line 969, in distribution
  File "importlib\metadata\__init__.py", line 548, in from_name
importlib.metadata.PackageNotFoundError: No package metadata was found for tqdm

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mainwindow.py", line 10, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "task_thread.py", line 3, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "image_deduplicate.py", line 15, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\__init__.py", line 26, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\dependency_versions_check.py", line 57, in <module>
  File "transformers\utils\versions.py", line 117, in require_version_core
  File "transformers\utils\versions.py", line 104, in require_version
importlib.metadata.PackageNotFoundError: No package metadata was found for The 'tqdm>=4.27' distribution was not found and is required by this application. 
Try: `pip install transformers -U` or `pip install -e '.[dev]'` if you're working with git main

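The error message shows that the metadata for the tqdm package cannot be found, presumably because it was not bundled during packaging. The PyInstaller usage documentation lists an option that addresses this: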

--copy-metadata PACKAGENAME
                        Copy metadata for the specified package. This option can be used multiple times.

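After adding --copy-metadata tqdm to the packaging command, the same error appeared for other packages. It took several rounds of rebuilding to cover all of them: tqdm, regex, requests, packaging, filelock, numpy, huggingface-hub, safetensors, and pyyaml. The final packaging command is: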

pyinstaller -w -F --copy-metadata tqdm --copy-metadata regex --copy-metadata requests --copy-metadata packaging --copy-metadata filelock --copy-metadata numpy --copy-metadata huggingface-hub --copy-metadata safetensors --copy-metadata pyyaml mainwindow.py
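
The trial-and-error loop described above can be shortened by checking all the names in one pass. The sketch below is a minimal illustration and not part of the original workflow: find_missing_metadata is a hypothetical helper that performs the same importlib.metadata.version lookup that fails in the traceback. Run in the development environment, it confirms that the names passed to --copy-metadata are valid distribution names. Called early in a console build of the frozen program, before transformers is imported, it should list every missing entry at once.

```python
from importlib import metadata

# Distribution names reported in this article; other versions of
# transformers may check a different set.
REQUIRED = ["tqdm", "regex", "requests", "packaging", "filelock",
            "numpy", "huggingface-hub", "safetensors", "pyyaml"]


def find_missing_metadata(names):
    """Return the names for which no package metadata can be found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing metadata:", find_missing_metadata(REQUIRED))
```

An empty list means every lookup succeeded in the environment where the script ran.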

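Adding one --copy-metadata argument per package on the command line is tedious, though. When the packaging command runs, PyInstaller first generates a spec file: a Python file, built from the arguments passed in, that describes the steps needed to turn the .py file into an exe. Each --copy-metadata argument adds corresponding code to the generated spec file, as follows: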

from PyInstaller.utils.hooks import copy_metadata

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')

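So you can run pyinstaller -w -F mainwindow.py once to generate the spec file and then edit that file. To package from the spec file, run pyinstaller mainwindow.spec with no other arguments, because the operations those arguments stand for are already encoded in the spec file. Note that running pyinstaller against mainwindow.py again regenerates the spec file and overwrites manual edits.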
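
Because the spec file is ordinary Python, the repeated copy_metadata lines can also be written as a loop. The sketch below shows one possible way to do that. collect_metadata is a hypothetical helper, and the copy function is passed in as a parameter only so that the helper can be exercised without PyInstaller installed. In a real spec file it would be called with copy_metadata from PyInstaller.utils.hooks, as the trailing comment shows.

```python
# Distribution names taken from this article.
METADATA_PACKAGES = ["tqdm", "regex", "requests", "packaging", "filelock",
                     "numpy", "huggingface-hub", "safetensors", "pyyaml"]


def collect_metadata(packages, copy_fn):
    """Concatenate the (source, dest) entries that copy_fn returns per package."""
    datas = []
    for name in packages:
        datas += copy_fn(name)
    return datas


# In mainwindow.spec (requires PyInstaller):
#   from PyInstaller.utils.hooks import copy_metadata
#   datas = collect_metadata(METADATA_PACKAGES, copy_metadata)
```

Keeping the names in a single list means a newly reported package is a one-word change.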

Problem 2: transformers/__init__.py is missing

With the previous problem fixed, running the packaged exe produces the following error:

Traceback (most recent call last):
  File "transformers\utils\import_utils.py", line 1510, in _get_module
  File "importlib\__init__.py", line 126, in import_module
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\models\auto\processing_auto.py", line 28, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\processing_utils.py", line 46, in <module>
  File "transformers\utils\import_utils.py", line 1539, in direct_transformers_import
  File "<frozen importlib._bootstrap_external>", line 879, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1016, in get_code
  File "<frozen importlib._bootstrap_external>", line 1073, in get_data
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\yuany\\AppData\\Local\\Temp\\_MEI264762\\transformers\\__init__.py'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "mainwindow.py", line 10, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "task_thread.py", line 3, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "image_deduplicate.py", line 15, in <module>
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "transformers\utils\import_utils.py", line 1501, in __getattr__
  File "transformers\utils\import_utils.py", line 1500, in __getattr__
  File "transformers\utils\import_utils.py", line 1512, in _get_module
RuntimeError: Failed to import transformers.models.auto.processing_auto because of the following error (look up to see its traceback):
[Errno 2] No such file or directory: 'C:\\Users\\yuany\\AppData\\Local\\Temp\\_MEI264762\\transformers\\__init__.py'

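This error shows that the source files of transformers were not bundled: the traceback ends in direct_transformers_import trying to read transformers\__init__.py from the temporary extraction directory. Adding the following code to the spec file fixes it: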

from PyInstaller.utils.hooks import copy_metadata, collect_data_files 

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])

Now repackage the program by running pyinstaller mainwindow.spec.
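
Judging from the traceback, the failure occurs because transformers executes its own __init__.py from a file path, while PyInstaller stores pure-Python modules as compiled bytecode inside its archive, so the source file is not present in the extraction directory. The sketch below reproduces this kind of source-path import using only the standard library. It is a simplified illustration with hypothetical names (load_package_from_source, demo_pkg), not the code used in transformers.

```python
import importlib.util
import os
import tempfile


def load_package_from_source(name, package_dir):
    """Import a package by executing the __init__.py found in package_dir."""
    location = os.path.join(package_dir, "__init__.py")
    spec = importlib.util.spec_from_file_location(
        name, location, submodule_search_locations=[package_dir]
    )
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # reads the .py source from disk
    return module


def demo():
    """Return (value loaded from source, error name when the source is absent)."""
    with tempfile.TemporaryDirectory() as with_source, \
            tempfile.TemporaryDirectory() as without_source:
        with open(os.path.join(with_source, "__init__.py"), "w") as f:
            f.write("VALUE = 42\n")
        loaded = load_package_from_source("demo_pkg", with_source)
        try:
            load_package_from_source("demo_pkg", without_source)
        except FileNotFoundError as exc:
            return loaded.VALUE, type(exc).__name__
    return loaded.VALUE, None


if __name__ == "__main__":
    print(demo())
```

The second call fails in the same way as the frozen program: the loader is asked for a source file that does not exist. Passing include_py_files=True to collect_data_files puts those .py files back on disk.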

Problem 3: PyTorch and Tokenizers cannot be found

Launching the newly built exe produces the following error:

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.

ImportError:
CLIPTokenizerFast requires the Tokenizers library but it was not found in your environment. You can install it with:
pip install tokenizers

This indicates that the pytorch and tokenizers libraries cannot be found. Adding the following code to the spec file fixes it:

from PyInstaller.utils.hooks import copy_metadata, collect_data_files

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
datas += copy_metadata('tokenizers')
datas += copy_metadata('torch') 
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('tokenizers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('torch', include_py_files=True, includes=['**/*.py'])
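
The error messages do not say why copying metadata helps here. A likely explanation, offered as an assumption and not verified against the transformers source, is that the availability check requires both an importable module and readable package metadata, so a bundled torch without metadata is treated as absent. The sketch below illustrates such a check. is_package_available is a hypothetical name used for illustration only.

```python
import importlib.util
from importlib import metadata


def is_package_available(name):
    """True only if the module is importable AND its metadata can be read."""
    if importlib.util.find_spec(name) is None:
        return False
    try:
        metadata.version(name)
    except metadata.PackageNotFoundError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("torch", "tokenizers", "colorsys"):
        print(candidate, is_package_available(candidate))
```

A standard-library module such as colorsys is importable but has no distribution metadata, so this check reports it as unavailable. Under the stated assumption, that mirrors a frozen program that contains torch but not its metadata.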

Summary

The final spec file is shown below. Remember to pass datas to Analysis (datas=datas); EXE then receives it as a.datas:

# -*- mode: python ; coding: utf-8 -*-
from PyInstaller.utils.hooks import copy_metadata, collect_data_files

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
datas += copy_metadata('tokenizers')
datas += copy_metadata('torch')
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('tokenizers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('torch', include_py_files=True, includes=['**/*.py'])


block_cipher = None


a = Analysis(
    ['mainwindow.py'],
    pathex=[],
    binaries=[],
    datas=datas,
    hiddenimports=[],
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    win_no_prefer_redirects=False,
    win_private_assemblies=False,
    cipher=block_cipher,
    noarchive=False,
)
pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.zipfiles,
    a.datas,
    [],
    name='mainwindow',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    upx_exclude=[],
    runtime_tmpdir=None,
    console=False,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)