【python013】pyinstaller打包PDF提取脚本为exe工具

1.在日常工作和学习中,遇到类似问题处理场景,如pdf文件核心内容截取,这里将文件打包成exe可执行文件,实现功能简便使用。

2.欢迎点赞、关注、批评、指正,互三走起来,小手动起来!

3.欢迎点赞、关注、批评、指正,互三走起来,小手动起来!

文章目录

1.环境准备

  • 历史安装的环境打包失败,或环境包不兼容,或历史安装包太多等问题,新建环境可能会更快些。问题如4小节记录。

    bash 复制代码
    # 在 Anaconda Prompt 环境中创建虚拟环境
    conda create -n youli python==3.8.0
    
    # 激活新建的虚拟环境
    conda activate youli
    
    # 安装必要的Python环境包
    pip install fitz
    pip install pymupdf
    pip install wxpython
    pip install pyinstaller
    pip install frontend wxpython
    
    # 删除虚拟环境
    # conda create -n youli python==3.8.0
    
    # 查看当前存在哪些虚拟环境
    # conda env list 
    # conda info -e
  • 虚拟环境效果如下:

  • 环境包版本详情如下:

2.pyinstaller打包输出脚本工具

  • 命令行

    bash 复制代码
    pyinstaller -F -w ..\pdfextract.py --noconfirm --noconsole -p ..\Anaconda3\envs\python8\Lib\site-packages
  • 执行结果详情

    bash 复制代码
    (python8) C:\Users\Administrator>pyinstaller -F -w ..\pdfextract2.py --noconfirm --noconsole -p ..\Anaconda3\envs\python8\Lib\site-packages
    420 INFO: PyInstaller: 6.8.0, contrib hooks: 2024.7
    420 INFO: Python: 3.8.0 (conda)
    421 INFO: Platform: Windows-10-10.0.19041-SP0
    422 INFO: Python environment: ..\Anaconda3\envs\python8
    423 INFO: wrote C:\Users\Administrator\pdfextract2.spec
    429 DEPRECATION: Foreign Python environment's site-packages paths added to --paths/pathex:
    ['..\\Anaconda3\\envs\\python8\\Lib\\site-packages']
    This is ALWAYS the wrong thing to do. If your environment's site-packages is not in PyInstaller's module search path then you are running PyInstaller from a different environment to the one your packages are in. Run print(sys.prefix) without PyInstaller to get the environment you should be using then install and run PyInstaller from that environment instead of this one. This warning will become an error in PyInstaller 7.0.
    430 INFO: Module search paths (PYTHONPATH):
    ['..\\Anaconda3\\envs\\python8\\Scripts\\pyinstaller.exe',
     '..\\Anaconda3\\envs\\python8\\python38.zip',
     '..\\Anaconda3\\envs\\python8\\DLLs',
     '..\\Anaconda3\\envs\\python8\\lib',
     '..\\Anaconda3\\envs\\python8',
     '..\\Anaconda3\\envs\\python8\\lib\\site-packages',
     'E:\\PycharmSpace\\orclblobtest',
     '..\\Anaconda3\\envs\\python8\\Lib\\site-packages']
    733 INFO: checking Analysis
    733 INFO: Building Analysis because Analysis-00.toc is non existent
    733 INFO: Running Analysis Analysis-00.toc
    735 INFO: Target bytecode optimization level: 0
    735 INFO: Initializing module dependency graph...
    736 INFO: Caching module graph hooks...
    754 INFO: Analyzing base_library.zip ...
    1824 INFO: Loading module hook 'hook-heapq.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    1897 INFO: Loading module hook 'hook-encodings.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    3157 INFO: Loading module hook 'hook-pickle.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    3815 INFO: Caching module dependency graph...
    3965 INFO: Looking for Python shared library...
    3973 INFO: Using Python shared library: ..\Anaconda3\envs\python8\python38.dll
    3974 INFO: Analyzing E:\PycharmSpace\orclblobtest\pdfextract2.py
    5889 INFO: Loading module hook 'hook-PIL.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    5956 INFO: Loading module hook 'hook-PIL.Image.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    6952 INFO: Loading module hook 'hook-numpy.py' from '..\\Anaconda3\\envs\\python8\\Lib\\site-packages\\numpy\\_pyinstaller'...
    7029 WARNING: Conda distribution 'numpy', dependency of 'numpy', was not found. If you installed this distribution with pip then you may ignore this warning.
    7572 INFO: Loading module hook 'hook-multiprocessing.util.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    7681 INFO: Loading module hook 'hook-xml.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    8209 INFO: Loading module hook 'hook-difflib.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    8295 INFO: Loading module hook 'hook-platform.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    8669 INFO: Loading module hook 'hook-sysconfig.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    9621 INFO: Loading module hook 'hook-packaging.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    9745 INFO: Loading module hook 'hook-PIL.ImageFilter.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    10018 INFO: Loading module hook 'hook-pandas.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    11702 INFO: Loading module hook 'hook-pytz.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    12067 INFO: Loading module hook 'hook-pkg_resources.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    14388 INFO: Loading module hook 'hook-scipy.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    14539 INFO: Loading module hook 'hook-scipy.linalg.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    14909 INFO: Loading module hook 'hook-scipy.sparse.csgraph.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    15183 INFO: Loading module hook 'hook-scipy.special._ufuncs.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    15243 INFO: Loading module hook 'hook-scipy.special._ellip_harm_2.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    16644 INFO: Loading module hook 'hook-scipy.spatial.transform.rotation.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    17421 INFO: Loading module hook 'hook-scipy.stats._stats.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    18308 INFO: Loading module hook 'hook-pandas.io.formats.style.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    20782 INFO: Loading module hook 'hook-pandas.plotting.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    21009 INFO: Processing pre-safe import module hook six.moves from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks\\pre_safe_import_module\\hook-six.moves.py'.
    22334 INFO: Loading module hook 'hook-sqlite3.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    22879 INFO: Loading module hook 'hook-pandas.io.clipboard.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    23076 INFO: Loading module hook 'hook-xml.etree.cElementTree.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    23078 INFO: Loading module hook 'hook-lxml.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\_pyinstaller_hooks_contrib\\hooks\\stdhooks'...
    23627 INFO: Loading module hook 'hook-lxml.etree.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\_pyinstaller_hooks_contrib\\hooks\\stdhooks'...
    23633 INFO: Loading module hook 'hook-xml.dom.domreg.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    24205 INFO: Processing module hooks...
    24282 INFO: Loading module hook 'hook-lxml.isoschematron.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\_pyinstaller_hooks_contrib\\hooks\\stdhooks'...
    24301 WARNING: Hidden import "jinja2" not found!
    24515 INFO: Loading module hook 'hook-PIL.SpiderImagePlugin.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    25007 INFO: Loading module hook 'hook-lxml.objectify.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\_pyinstaller_hooks_contrib\\hooks\\stdhooks'...
    25027 INFO: Performing binary vs. data reclassification (624 entries)
    25172 INFO: Looking for ctypes DLLs
    25217 INFO: Analyzing run-time hooks ...
    25225 INFO: Including run-time hook '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks\\rthooks\\pyi_rth_pkgutil.py'
    25229 INFO: Processing pre-find module path hook _pyi_rth_utils from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks\\pre_find_module_path\\hook-_pyi_rth_utils.py'.
    25230 INFO: Loading module hook 'hook-_pyi_rth_utils.py' from '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks'...
    25231 INFO: Including run-time hook '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks\\rthooks\\pyi_rth_multiprocessing.py'
    25234 INFO: Including run-time hook '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks\\rthooks\\pyi_rth_pkgres.py'
    25237 INFO: Including run-time hook '..\\Anaconda3\\envs\\python8\\lib\\site-packages\\PyInstaller\\hooks\\rthooks\\pyi_rth_inspect.py'
    25287 INFO: Looking for dynamic libraries
    ..\Anaconda3\envs\python8\lib\site-packages\PyInstaller\building\build_main.py:205: UserWarning: The numpy.array_api submodule is still experimental. See NEP 47.
      __import__(package)
    26904 INFO: Extra DLL search directories (AddDllDirectory): ['..\\Anaconda3\\envs\\python8\\lib\\site-packages\\numpy\\.libs']
    26904 INFO: Extra DLL search directories (PATH): []
    30079 INFO: Warnings written to C:\Users\Administrator\build\pdfextract2\warn-pdfextract2.txt
    30264 INFO: Graph cross-reference written to C:\Users\Administrator\build\pdfextract2\xref-pdfextract2.html
    30336 INFO: checking PYZ
    30337 INFO: Building PYZ because PYZ-00.toc is non existent
    30337 INFO: Building PYZ (ZlibArchive) C:\Users\Administrator\build\pdfextract2\PYZ-00.pyz
    32361 INFO: Building PYZ (ZlibArchive) C:\Users\Administrator\build\pdfextract2\PYZ-00.pyz completed successfully.
    32416 INFO: checking PKG
    32416 INFO: Building PKG because PKG-00.toc is non existent
    32417 INFO: Building PKG (CArchive) pdfextract2.pkg
    59038 INFO: Building PKG (CArchive) pdfextract2.pkg completed successfully.
    59060 INFO: Bootloader ..\Anaconda3\envs\python8\lib\site-packages\PyInstaller\bootloader\Windows-64bit-intel\runw.exe
    59060 INFO: checking EXE
    59061 INFO: Building EXE because EXE-00.toc is non existent
    59061 INFO: Building EXE from EXE-00.toc
    59061 INFO: Copying bootloader EXE to C:\Users\Administrator\dist\pdfextract2.exe
    59067 INFO: Copying icon to EXE
    59072 INFO: Copying 0 resources to EXE
    59072 INFO: Embedding manifest in EXE
    59076 INFO: Appending PKG archive to EXE
    59148 INFO: Fixing EXE headers
    59664 INFO: Building EXE from EXE-00.toc completed successfully.

3.参数及生成文件释义

  • pyinstaller参数含义
  • 输出文件内容含义
    • Analysis:主要是分析py文件的依赖信息
    • PYZ:是一个.pyz的压缩包,包含程序运行需要的依赖
    • EXE:是根据上述两项内容而生成的
    • COLLECT:主要是输出信息dist文件夹:最终的exe文件存放位置
    • build文件夹:中间过程,创建好之后可以直接删除
      • 整体详情如下图所示:

4.打包错误示例及工具代码

  • 错误示例部分

  • 打包代码

    python 复制代码
    # -*- coding: utf-8 -*-
    
    import fitz
    import wx
    
    class PDFExtractor(wx.Frame):
        def __init__(self, parent, _title):
            wx.Frame.__init__(self, parent, id=wx.ID_ANY, title=_title, pos=wx.DefaultPosition,
                              size=wx.Size(500, 254), style=wx.DEFAULT_FRAME_STYLE | wx.TAB_TRAVERSAL )
    
            self.SetSizeHints(wx.DefaultSize, wx.DefaultSize)
            self.SetForegroundColour(wx.SystemSettings.GetColour(wx.SYS_COLOUR_WINDOW))
            self.SetBackgroundColour(wx.SystemSettings.GetColour(wx.SYS_COLOUR_ACTIVECAPTION))
    
            bSizer2 = wx.BoxSizer(wx.VERTICAL)
    
            self.m_filePicker2 = wx.FilePickerCtrl(self, wx.ID_ANY, wx.EmptyString, u"Select a file", u"*.*",
                                                   wx.DefaultPosition, wx.DefaultSize, wx.FLP_DEFAULT_STYLE)
            self.m_filePicker2.SetFont(wx.Font(9, 74, 90, 92, False, "微软雅黑"))
            self.m_filePicker2.SetForegroundColour(wx.SystemSettings.GetColour(wx.SYS_COLOUR_HIGHLIGHT))
            self.m_filePicker2.SetBackgroundColour(wx.SystemSettings.GetColour(wx.SYS_COLOUR_HIGHLIGHT))
    
            bSizer2.Add(self.m_filePicker2, 0, wx.ALL | wx.EXPAND, 5)
    
            self.m_staticText5 = wx.StaticText(self, wx.ID_ANY, u"Start Page:", wx.DefaultPosition, wx.DefaultSize, 0)
            self.m_staticText5.Wrap(-1)
            self.m_staticText5.SetFont(wx.Font(9, 74, 90, 92, True, "微软雅黑"))
            self.m_staticText5.SetForegroundColour(wx.SystemSettings.GetColour(wx.SYS_COLOUR_BTNTEXT))
    
            bSizer2.Add(self.m_staticText5, 0, wx.ALL, 5)
    
            self.m_textCtrl1 = wx.TextCtrl(self, wx.ID_ANY, wx.EmptyString, wx.DefaultPosition, wx.DefaultSize, 0)
            bSizer2.Add(self.m_textCtrl1, 0, wx.EXPAND, 5)
    
            self.m_staticText6 = wx.StaticText(self, wx.ID_ANY, u"End Page:", wx.DefaultPosition, wx.DefaultSize, 0)
            self.m_staticText6.Wrap(-1)
            self.m_staticText6.SetFont(wx.Font(9, 74, 90, 92, True, "微软雅黑"))
            self.m_staticText6.SetForegroundColour(wx.SystemSettings.GetColour(wx.SYS_COLOUR_BTNTEXT))
    
            bSizer2.Add(self.m_staticText6, 0, wx.ALL, 5)
    
            self.m_textCtrl2 = wx.TextCtrl(self, wx.ID_ANY, wx.EmptyString, wx.DefaultPosition, wx.DefaultSize, 0)
            bSizer2.Add(self.m_textCtrl2, 0, wx.EXPAND, 5)
    
            self.m_button18 = wx.Button(self, wx.ID_ANY, u"Extract", wx.DefaultPosition, wx.DefaultSize, wx.NO_BORDER)
            self.m_button18.SetFont(wx.Font(12, 74, 90, 92, False, "微软雅黑"))
            self.m_button18.SetForegroundColour(wx.SystemSettings.GetColour(wx.SYS_COLOUR_BTNTEXT))
            self.m_button18.SetBackgroundColour(wx.SystemSettings.GetColour(wx.SYS_COLOUR_BTNHIGHLIGHT))
    
    
            self.m_button18.Bind(wx.EVT_BUTTON, self.extract_pages)
    
            bSizer2.Add(self.m_button18, 0, wx.ALIGN_CENTER_HORIZONTAL | wx.SHAPED, 5)
    
            self.SetSizer(bSizer2)
            self.Layout()
            self.Centre(wx.BOTH)
        def __del__(self):
            pass
    
        def extract_pages(self, event):
            file_path = self.m_filePicker2.GetPath()
            start_page = int(self.m_textCtrl1.GetValue())
            end_page = int(self.m_textCtrl2.GetValue())
    
            doc = fitz.open(file_path)
            output_doc = fitz.open()
    
            for page_num in range(start_page - 1, end_page):
                output_doc.insert_pdf(doc, from_page=page_num, to_page=page_num)
    
            output_path = file_path.replace(".pdf", "_extracted.pdf")
            output_doc.save(output_path)
            output_doc.close()
            doc.close()
    
            wx.MessageBox("Extraction complete!", "Success", wx.OK | wx.ICON_INFORMATION)
    
    # app = wx.App()
    # PDFExtractor(None, title="PDF Extractor")
    # app.MainLoop()
    
    if __name__ == '__main__':
        app = wx.App()  # 运行wx.App()方法
        title = "PDF Extractor"
        frame = PDFExtractor( None , title )  # 调用Frame类,并且不指定父类,当前就成为父类
        frame.Show()  # 运行展示界面的方法Show()
        app.MainLoop()  # 进入程序wx.App()循环
相关推荐
_oP_i3 小时前
提取 PDF 文件中的文字以及图片中的文字
pdf
集成显卡9 小时前
图片压缩工具 | 图片生成PDF文档
图像处理·pdf
一路向北North9 小时前
PDF.js无法显示数字签名
开发语言·javascript·pdf
开开心心就好20 小时前
高效视频倍速播放插件推荐
python·学习·游戏·pdf·计算机外设·电脑·音视频
IT小农工1 天前
如何生成和制作PDF文件
pdf
北十南1 天前
VueScan Pro v9.8.45.08 一款图像扫描软件,中文绿色便携版
pdf·电脑
空谷有来人2 天前
推荐一款PDF压缩的工具
pdf·pdf压缩
开开心心_Every2 天前
免费且好用的PDF水印添加工具
android·javascript·windows·智能手机·pdf·c#·娱乐
aloha_7892 天前
论文中pdf图片文件太大怎么办
图像处理·pdf·论文笔记
开开心心就好2 天前
免费批量文件重命名软件
vue.js·人工智能·深度学习·typescript·pdf·excel·less