pyinstaller打包pytorch和transformers程序

记录使用pyinstaller打包含有pytorch和transformers库的程序时遇到的问题和解决方法。

环境和版本信息

操作系统:Windows 11

Python:3.10.12

pyinstaller:5.13.0

torch:2.2.2

transformers:4.40.1

打包过程和问题

打包命令:pyinstaller -w -F mainwindow.py

问题1:transformers找不到相关python包的元数据metadata

打包完成后在mainwindow.py所在目录下会生成一个dist文件夹,里面是打包生成的exe文件,这里是mainwindow.exe,直接双击执行该程序,会出现如下错误:

text 复制代码
Traceback (most recent call last):
  File "transformers\utils\versions.py", line 102, in require_version
  File "importlib\metadata\__init__.py", line 996, in version
  File "importlib\metadata\__init__.py", line 969, in distribution
  File "importlib\metadata\__init__.py", line 548, in from_name
importlib.metadata.PackageNotFoundError: No package metadata was found for tqdm

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mainwindow.py", line 10, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "task_thread.py", line 3, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "image_deduplicate.py", line 15, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\__init__.py", line 26, in <module>
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\dependency_versions_check.py", line 57, in <module>
  File "transformers\utils\versions.py", line 117, in require_version_core
  File "transformers\utils\versions.py", line 104, in require_version
importlib.metadata.PackageNotFoundError: No package metadata was found for The 'tqdm>=4.27' distribution was not found and is required by this application. 
Try: `pip install transformers -U` or `pip install -e '.[dev]'` if you're working with git main

从错误信息可以看出错误原因是找不到包tqdm的元数据,应该是打包时没有把这些数据一起打包过去,看了下pyinstaller的使用说明,有个参数可以解决这个问题:

纯文本 复制代码
--copy-metadata PACKAGENAME
                        Copy metadata for the specified package. This option can be used multiple times.

在打包命令中加了该参数并指定tqdm包后,又出现了其他没有找到元数据的包,重复多次后,才将所有这些包的元数据都添加进去,这些包包括:tqdm、regex、requests、packaging、filelock、numpy、huggingface-hub、safetensors、pyyaml,最后的打包命令如下:

powershell 复制代码
pyinstaller -w -F --copy-metadata tqdm --copy-metadata regex --copy-metadata requests --copy-metadata packaging --copy-metadata filelock --copy-metadata numpy --copy-metadata huggingface-hub --copy-metadata safetensors --copy-metadata pyyaml mainwindow.py

但是像这样在打包命令中一个一个加参数比较麻烦,其实在这个过程中可以发现,执行打包命令时,会先生成一个spec文件,这个文件是打包时pyinstaller根据传递给它的参数生成的一个python文件,里面说明了把一个py文件打包成exe程序需要执行的操作。在打包命令中添加--copy-metadata会相应地在生成spec文件中添加如下代码:

python 复制代码
from PyInstaller.utils.hooks import copy_metadata

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')

所以可以先使用pyinstaller -w -F mainwindow.py先生成spec文件,再对这个文件进行修改。使用spec文件打包只需要执行命令pyinstaller mainwindow.spec即可,不需要添加其他参数,因为参数对应的操作已经编码在spec文件中了。

问题2:没有transformers/__init__.py文件

上述问题解决后,再执行打包后的exe程序,又会出现以下错误:

text 复制代码
Traceback (most recent call last):
  File "transformers\utils\import_utils.py", line 1510, in _get_module
  File "importlib\__init__.py", line 126, in import_module
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\models\auto\processing_auto.py", line 28, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "transformers\processing_utils.py", line 46, in <module>
  File "transformers\utils\import_utils.py", line 1539, in direct_transformers_import
  File "<frozen importlib._bootstrap_external>", line 879, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1016, in get_code
  File "<frozen importlib._bootstrap_external>", line 1073, in get_data
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\yuany\\AppData\\Local\\Temp\\_MEI264762\\transformers\\__init__.py'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "mainwindow.py", line 10, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "task_thread.py", line 3, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "image_deduplicate.py", line 15, in <module>
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "transformers\utils\import_utils.py", line 1501, in __getattr__
  File "transformers\utils\import_utils.py", line 1500, in __getattr__
  File "transformers\utils\import_utils.py", line 1512, in _get_module
RuntimeError: Failed to import transformers.models.auto.processing_auto because of the following error (look up to see its traceback):
[Errno 2] No such file or directory: 'C:\\Users\\yuany\\AppData\\Local\\Temp\\_MEI264762\\transformers\\__init__.py'

这个错误说明没有把transformers相关的文件打包进去,可以通过在spec文件中添加如下代码解决该问题:

python 复制代码
from PyInstaller.utils.hooks import copy_metadata, collect_data_files 

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])

现在可以通过执行命令pyinstaller mainwindow.spec再次打包成exe程序。

问题3:找不到PyTorch和Tokenizers

再次启动生成的exe程序,又会出现如下错误:

text 复制代码
None of PyTorch,TensorFlow >= 2.0,or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.

ImportError:
CLIPTokenizerFast requires the Tokenizers library but it was not found in your environment. You can install it with:
pip install tokenizers

说明找不到pytorch和tokenizers库,以通过在spec文件中添加如下代码解决该问题:

python 复制代码
from PyInstaller.utils.hooks import copy_metadata, collect_data_files

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
datas += copy_metadata('tokenizers')
datas += copy_metadata('torch') 
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('tokenizers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('torch', include_py_files=True, includes=['**/*.py'])

总结

最终的spec文件如下,记得需要把datas传入到AnalysisEXE中:

python 复制代码
# -*- mode: python ; coding: utf-8 -*-
from PyInstaller.utils.hooks import copy_metadata, collect_data_files

datas = []
datas += copy_metadata('tqdm')
datas += copy_metadata('regex')
datas += copy_metadata('requests')
datas += copy_metadata('packaging')
datas += copy_metadata('filelock')
datas += copy_metadata('numpy')
datas += copy_metadata('huggingface-hub')
datas += copy_metadata('safetensors')
datas += copy_metadata('pyyaml')
datas += copy_metadata('tokenizers')
datas += copy_metadata('torch')
datas += collect_data_files('transformers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('tokenizers', include_py_files=True, includes=['**/*.py'])
datas += collect_data_files('torch', include_py_files=True, includes=['**/*.py'])


block_cipher = None


a = Analysis(
    ['mainwindow.py'],
    pathex=[],
    binaries=[],
    datas=datas,
    hiddenimports=[],
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    win_no_prefer_redirects=False,
    win_private_assemblies=False,
    cipher=block_cipher,
    noarchive=False,
)
pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.zipfiles,
    a.datas,
    [],
    name='mainwindow',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    upx_exclude=[],
    runtime_tmpdir=None,
    console=False,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)
相关推荐
小陈工25 分钟前
Python Web开发入门(十七):Vue.js与Python后端集成——让前后端真正“握手言和“
开发语言·前端·javascript·数据库·vue.js·人工智能·python
A__tao5 小时前
Elasticsearch Mapping 一键生成 Java 实体类(支持嵌套 + 自动过滤注释)
java·python·elasticsearch
研究点啥好呢5 小时前
Github热门项目推荐 | 创建你的像素风格!
c++·python·node.js·github·开源软件
迷藏4945 小时前
**发散创新:基于Rust实现的开源合规权限管理框架设计与实践**在现代软件架构中,**权限控制(RBAC)** 已成为保障
java·开发语言·python·rust·开源
明日清晨5 小时前
python扫码登录dy
开发语言·python
bazhange6 小时前
python如何像matlab一样使用向量化替代for循环
开发语言·python·matlab
人工干智能6 小时前
科普:python中你写的模块找不到了——`ModuleNotFoundError`
服务器·python
unicrom_深圳市由你创科技6 小时前
做虚拟示波器这种实时波形显示的上位机,用什么语言?
c++·python·c#
小敬爱吃饭6 小时前
Ragflow Docker部署及问题解决方案(界面为Welcome to nginx,ragflow上传文件失败,Docker中的ragflow-cpu-1一直重启)
人工智能·python·nginx·docker·语言模型·容器·数据挖掘
宸津-代码粉碎机6 小时前
Spring Boot 4.0虚拟线程实战调优技巧,最大化发挥并发优势
java·人工智能·spring boot·后端·python