Installing the pyspider crawler in a PyCharm virtual environment on Ubuntu 16.04 (Python 3.12 interpreter)

Part 1: Installation steps

Step 1. In a system terminal, run the following commands to install PhantomJS, a component that pyspider depends on. The last command should print the installed version.

    $ wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-linux-x86_64.tar.bz2
    $ sudo tar -xvf phantomjs-2.1.1-linux-x86_64.tar.bz2 -C /usr/local/
    $ sudo ln -s /usr/local/phantomjs-2.1.1-linux-x86_64/bin/phantomjs /usr/local/bin/phantomjs
    $ phantomjs --version

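Step 2. In a system terminal, install the remaining system-level dependency. The libcurl4-openssl-dev package provides the curl-config tool and the header files needed to build pycurl.

    $ sudo apt update
    $ sudo apt install libcurl4-openssl-dev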

Step 3. On Python 3.12, the tornado version that pyspider depends on imports the backports.ssl_match_hostname module, which is not installed by default. Install it from the PyCharm terminal (inside the virtual environment):

    $ pip install backports.ssl_match_hostname

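Step 4 (optional). pyspider does not run on Python 3.12 without changes. The attachment [ubuntu16.04(python3.12解释器)下pyspider兼容性修复完后的.tar.gz] is an archive of the final, already patched package and can be extracted and used directly. It assumes that the unpatched release has been installed first with pip install --force-reinstall pyspider. After extracting the archive, continue with step 6. To patch the source yourself instead, skip this step and follow step 5.
Click to download the archive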

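Step 5. Fixing the compatibility problems. Python 3.12 has removed or changed several old modules and names, and pyspider has not been adapted to these changes, so its source has to be patched as follows. The file paths below are relative to the site-packages directory of the virtual environment (for example .venv/lib/python3.12/site-packages/).

5.1 Compatibility problem 1: async is a reserved keyword in Python 3.7 and later, so it can no longer be used as a parameter or variable name. Rename every async identifier in the source, for example to mark_async, at the following locations:

    pyspider/run.py: lines 231, 245 (two occurrences), 365
    pyspider/webui/app.py: line 95
    pyspider/fetcher/tornado_fetcher.py: lines 81, 89 (two occurrences), 95, 117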
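
The reason for the rename can be checked directly from Python. The sketch below uses only the standard library. The helper name compiles and the two sample signatures are made up for illustration and are not copied from pyspider.

```python
import keyword


def compiles(source: str) -> bool:
    """Return True if `source` parses as valid Python on this interpreter."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# A signature shaped like the one that fails to parse.
old_signature = "def fetcher(xmlrpc=False, async=True):\n    pass\n"
# The same signature after renaming the parameter.
new_signature = "def fetcher(xmlrpc=False, mark_async=True):\n    pass\n"

print(keyword.iskeyword("async"))  # is `async` reserved on this interpreter?
print(compiles(old_signature))     # does the old signature still parse?
print(compiles(new_signature))     # does the renamed signature parse?
```

Remember that a renamed parameter also has to be renamed at every call site that passes it by keyword, which is why several of the listed lines contain two occurrences.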

5.2 Compatibility problem 2: several old modules and names have been removed or moved, and pyspider has not been adapted to these changes. Patch the pyspider code by hand as follows.
a) Fix the UserDict and Mapping imports.
In pyspider/libs/counter.py, change the code at line 14 from:

    try:
        from UserDict import DictMixin
    except ImportError:
        from collections import Mapping as DictMixin

to:

    try:
        from UserDict import DictMixin  # Python 2 only
    except ImportError:
        from collections.abc import Mapping as DictMixin

The UserDict module exists only in Python 2, and Mapping can no longer be imported from collections because the abstract base classes live in collections.abc. Only the fallback import therefore needs to change, which keeps DictMixin bound to the same Mapping class that pyspider used on earlier Python 3 versions.

In pyspider/scheduler/task_queue.py, make the same change at line 12. The original and the patched code are identical to the ones shown above.
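
As a minimal sketch of what this import provides: a class that inherits from the Mapping abstract base class only has to define __getitem__, __iter__ and __len__, and it receives keys(), items(), get() and the in operator for free. TinyCounter below is a hypothetical stand-in written for this example, not pyspider's actual counter class.

```python
try:
    from UserDict import DictMixin  # Python 2 only
except ImportError:
    from collections.abc import Mapping as DictMixin  # Python 3


class TinyCounter(DictMixin):
    """Read-only mapping backed by a plain dict (illustrative only)."""

    def __init__(self, values):
        self._values = dict(values)

    def __getitem__(self, key):
        return self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


counter = TinyCounter({"success": 3, "failed": 1})
print(sorted(counter.keys()))   # keys() is inherited from Mapping
print("failed" in counter)      # __contains__ is inherited as well
print(counter.get("retry", 0))  # get() falls back to the default
```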

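b) Fix the missing imp module, which was removed in Python 3.12.
In pyspider/processor/project_module.py, change line 11 from:

    import imp

to:

    import importlib.util
    import types

Changing the import only gets past the import error. Any remaining call into imp in that file has to be replaced as well. For example, imp.new_module(name) becomes types.ModuleType(name).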
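
If the loader code creates module objects with imp.new_module, as older import hooks commonly do, the standard-library replacement is types.ModuleType. The sketch below shows the general pattern under that assumption. The names new_module and demo_project are hypothetical and only serve the example.

```python
import types


def new_module(name: str) -> types.ModuleType:
    """Stand-in for the removed imp.new_module(name)."""
    return types.ModuleType(name)


# Create an empty module object and execute source code inside its
# namespace, which is the general pattern a script loader follows.
mod = new_module("demo_project")
exec("ANSWER = 6 * 7", mod.__dict__)
print(mod.__name__, mod.ANSWER)
```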

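c) Fix the MutableMapping reference.
The tornado library fails because it refers to collections.MutableMapping. In tornado/httputil.py, change line 106 from:

    class HTTPHeaders(collections.MutableMapping):

to:

    class HTTPHeaders(collections.abc.MutableMapping):

If the file only contains import collections, also add import collections.abc next to the other imports so that the submodule is guaranteed to be loaded.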

5.3 Compatibility problem 3: fractions.gcd no longer exists. The fractions module itself is still part of the standard library, but its gcd function was removed in Python 3.9. math.gcd is the replacement.
a) In pyspider/libs/base_handler.py, add the following import on the blank line 12:

    import math

b) In the same file, change line 115 from:

    min_tick = fractions.gcd(min_tick, each.tick)

to:

    min_tick = math.gcd(min_tick, each.tick)
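
To illustrate what the patched line computes, here is a small sketch that folds math.gcd over a list of intervals, starting from 0 in the same way as the loop variable in the patched line. The function name smallest_tick and the sample intervals are assumptions made for this example.

```python
import math


def smallest_tick(ticks):
    """Greatest common divisor of all intervals; 0 for an empty list."""
    result = 0
    for tick in ticks:
        # math.gcd(0, x) returns x, so the first interval is taken as-is.
        result = math.gcd(result, tick)
    return result


print(smallest_tick([60, 90, 300]))  # prints 30
```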

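5.4 Compatibility problem 4: the installed Flask version is incompatible. pyspider was written against an old Flask release, and newer releases (Flask 2.3 and later) have removed the before_first_request decorator.
In pyspider/webui/debug.py, change the code at line 64 from:

    @app.before_first_request
    def enable_projects_import():
        sys.meta_path.append(ProjectFinder(app.config['projectdb']))

to:

    @app.before_request
    def enable_projects_import():
        # Register the finder only once; later requests return immediately.
        if getattr(app, '_projects_import_enabled', False):
            return
        app._projects_import_enabled = True
        sys.meta_path.append(ProjectFinder(app.config['projectdb']))

The guard uses its own attribute name on purpose. Flask already defines an attribute called _got_first_request on the application object, so a check such as hasattr(app, '_got_first_request') would always be true and the finder would never be registered.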
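
The run-once pattern behind this change can be tried without Flask. In the sketch below, FakeApp is a hypothetical stand-in that only stores hooks and calls them once per simulated request, and the flag name _projects_import_enabled is an assumption chosen so that it does not collide with anything the framework defines.

```python
class FakeApp:
    """Hypothetical stand-in for a web app: stores hooks, runs them per request."""

    def __init__(self):
        self.hooks = []

    def before_request(self, func):
        self.hooks.append(func)
        return func

    def handle_request(self):
        for hook in self.hooks:
            hook()


app = FakeApp()
calls = []


@app.before_request
def enable_projects_import():
    # Run the body on the first request only; later requests return early.
    if getattr(app, "_projects_import_enabled", False):
        return
    app._projects_import_enabled = True
    calls.append("registered")


for _ in range(3):
    app.handle_request()
print(calls)  # prints ['registered']
```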

5.5 Compatibility problem 5: pyspider's WebDAV module does not work in this environment.
a) ScriptProvider does not implement all abstract methods required by its base class DAVProvider. The installed WsgiDAV expects a method named get_resource_inst, while pyspider only defines getResourceInst.
In pyspider/webui/webdav.py, the class at line 165 looks like this:

    class ScriptProvider(DAVProvider):
        def __init__(self, app):
            super(ScriptProvider, self).__init__()
            self.app = app

        def __repr__(self):
            return "pyspiderScriptProvider"

        def getResourceInst(self, path, environ):
            path = os.path.normpath(path).replace('\\', '/')
            if path in ('/', '.', ''):
                path = '/'
                return RootCollection(path, environ, self.app)
            else:
                return ScriptResource(path, environ, self.app)

Keep the existing methods unchanged and append the missing method to the class:

        # Implementation of the missing abstract method. It delegates to the
        # existing method so that both names return the same resources.
        def get_resource_inst(self, path, environ):
            return self.getResourceInst(path, environ)

b) The WsgiDAV configuration format has changed: the domaincontroller option is deprecated and has been replaced by http_authenticator.domain_controller.
In pyspider/webui/webdav.py, change the code at line 207 from:

    config = DEFAULT_CONFIG.copy()
    config.update({
        'mount_path': '/dav',
        'provider_mapping': {
            '/': ScriptProvider(app)
        },
        'domaincontroller': NeedAuthController(app),
        'verbose': 1 if app.debug else 0,
        'dir_browser': {'davmount': False,
                        'enable': True,
                        'msmount': False,
                        'response_trailer': ''},
    })
    dav_app = WsgiDAVApp(config)

to:

    config = DEFAULT_CONFIG.copy()
    config.update({
        'mount_path': '/dav',
        'provider_mapping': {
            '/': ScriptProvider(app)
        },
        # Updated authentication settings
        'http_authenticator': {
            'domain_controller': NeedAuthController,  # moved under http_authenticator
            'accept_basic': True,
            'accept_digest': False,
            'default_to_digest': False,
        },
        'verbose': 1 if app.debug else 0,
        'dir_browser': {'davmount': False,
                        'enable': True,
                        'msmount': False,
                        'response_trailer': ''},
    })
    dav_app = WsgiDAVApp(config)

Note that the class itself is now passed instead of an instance, so WsgiDAV creates the controller object.

c) The interface of the WsgiDAV authentication controller has changed, which breaks pyspider's controller class.
In pyspider/webui/webdav.py, change the class at line 186 from:

    class NeedAuthController(object):
        def __init__(self, app):
            self.app = app

        def getDomainRealm(self, inputRelativeURL, environ):
            return 'need auth'

        def requireAuthentication(self, realmname, environ):
            return self.app.config.get('need_auth', False)

        def isRealmUser(self, realmname, username, environ):
            return username == self.app.config.get('webui_username')

        def getRealmUserPassword(self, realmname, username, environ):
            return self.app.config.get('webui_password')

        def authDomainUser(self, realmname, username, password, environ):
            return username == self.app.config.get('webui_username') \
                and password == self.app.config.get('webui_password')

to:

    class NeedAuthController(object):
        def __init__(self, app, config=None):
            self.app = app
            # Accept the extra config argument so that the class matches
            # the way WsgiDAV instantiates domain controllers.
            if config is not None:
                self.config = config
            else:
                # No config supplied: try to read it from app instead.
                self.config = app.config.get("http_authenticator", {})

        def getDomainRealm(self, inputRelativeURL, environ):
            return 'need auth'

        def requireAuthentication(self, realmname, environ):
            return self.app.config.get('need_auth', False)

        def isRealmUser(self, realmname, username, environ):
            return username == self.app.config.get('webui_username')

        def getRealmUserPassword(self, realmname, username, environ):
            return self.app.config.get('webui_password')

        def authDomainUser(self, realmname, username, password, environ):
            return username == self.app.config.get('webui_username') \
                and password == self.app.config.get('webui_password')

        # Methods expected by WsgiDAV; they forward to the original methods.
        def get_domain_realm(self, input_path, environ):
            return self.getDomainRealm(input_path, environ)

        def basic_auth_user(self, realm, user_name, password, environ):
            return self.authDomainUser(realm, user_name, password, environ)

        def supports_http_digest_auth(self):
            return False  # digest authentication is not supported

        def is_share_anonymous(self, share_path):
            """Return whether the given share path allows anonymous access."""
            # If authentication is not required, every share allows anonymous access.
            return not self.app.config.get('need_auth', False)

Because WsgiDAV creates the controller instance itself, verify that the app argument it passes in still exposes the need_auth, webui_username and webui_password settings through app.config.

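5.6 Compatibility problem 6: the installed Werkzeug version is incompatible with pyspider. Recent Werkzeug releases, including 2.3.0 and later, no longer export DispatcherMiddleware from werkzeug.wsgi. The class now lives in the separate werkzeug.middleware.dispatcher module, while pyspider still uses the old import path. This depends on the Werkzeug version, not on Python 3.12 itself.
In pyspider/webui/app.py, change line 64 from:

    from werkzeug.wsgi import DispatcherMiddleware

to:

    from werkzeug.middleware.dispatcher import DispatcherMiddleware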
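
When the same source has to run against both old and new library versions, a small fallback helper is one option. The sketch below is an assumption-laden illustration: import_first is a hypothetical helper, and it is demonstrated with standard-library modules so that it runs without Werkzeug installed.

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that has it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            continue
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{name} not found in any of {module_paths}")


# Mapping is looked up in collections first and in collections.abc second;
# on current Python 3 only the second module provides it.
Mapping = import_first("Mapping", ["collections", "collections.abc"])
print(Mapping)
```

Applied to this step, the call would look like import_first("DispatcherMiddleware", ["werkzeug.middleware.dispatcher", "werkzeug.wsgi"]). For a one-off patch of an installed package, simply editing the import line as described above is enough.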

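Step 6. Install pyspider from the PyCharm terminal. The patches in step 5 are applied to the installed package, so run this command before editing the files if pyspider is not installed yet.

    $ pip install pyspider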

Step 7. In the PyCharm terminal, verify that pyspider was installed successfully. The output shows the installed version, which is the latest pyspider release, 0.3.10.

    $ pyspider --version
    pyspider, version 0.3.10

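Step 8. In the PyCharm terminal, run pyspider to start the pyspider web console. Output like the following means that pyspider started successfully and that the web UI is listening on port 5000. Open http://localhost:5000/ in a browser to reach the pyspider web console and start crawling.
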
    $ pyspider
    phantomjs fetcher running on port 25555
    [I 250515 15:08:09 result_worker:49] result_worker starting...
    [I 250515 15:08:10 processor:211] processor starting...
    [I 250515 15:08:10 tornado_fetcher:638] fetcher starting...
    [I 250515 15:08:10 scheduler:647] scheduler starting...
    [I 250515 15:08:10 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
    [I 250515 15:08:10 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    [I 250515 15:08:10 app:76] webui running on 0.0.0.0:5000

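Step 9. If running pyspider fails with Error: Could not create web server listening on port 25555, port 25555 is already in use (in the example below, by a leftover phantomjs process). Free the port and repeat step 8.
Solution: use lsof -i :25555 to find the PID that holds the port, then release it with kill -9 <PID>.
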
    $ lsof -i :25555
    COMMAND     PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    phantomjs 16615 wanghu    7u  IPv4 278700      0t0  TCP *:25555 (LISTEN)
    
    $ kill -9 16615
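
The same check can be scripted before starting pyspider. The following is a minimal sketch using only the standard library. The function name port_in_use is hypothetical, and the two port numbers are the ones that appear in the output above.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


for port in (25555, 5000):
    print(port, "in use" if port_in_use(port) else "free")
```

This only reports whether the port is taken. Finding and stopping the owning process still requires lsof and kill as shown above.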

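Part 2: Problems encountered with pyspider and their solutions
1. Installing pyspider fails with ConfigurationError: Could not run curl-config. The system lacks the libcurl development package (curl-devel on RPM-based systems, libcurl4-openssl-dev on Ubuntu/Debian), which pycurl needs in order to compile.
Solution: install the libcurl4-openssl-dev package, which contains the curl-config tool and the header files required to build pycurl:

    $ sudo apt install libcurl4-openssl-dev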

Error details:

    Getting requirements to build wheel ... error
    error: subprocess-exited-with-error

    × Getting requirements to build wheel did not run successfully.
    │ exit code: 1
    ╰─> [33 lines of output]
        Traceback (most recent call last):
          File "<string>", line 230, in configure_unix
          File "/usr/local/lib/python3.12/subprocess.py", line 1026, in __init__
            self._execute_child(args, executable, preexec_fn, close_fds,
          File "/usr/local/lib/python3.12/subprocess.py", line 1950, in _execute_child
            raise child_exception_type(errno_num, err_msg, err_filename)
        FileNotFoundError: [Errno 2] No such file or directory: 'curl-config'

        During handling of the above exception, another exception occurred:

        Traceback (most recent call last):
          File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in <module>
            main()
          File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 373, in main
            json_out["return_val"] = hook(**hook_input["kwargs"])
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
          File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 143, in get_requires_for_build_wheel
            return hook(config_settings)
                   ^^^^^^^^^^^^^^^^^^^^^
          File "/tmp/pip-build-env-6by6cyoh/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 331, in get_requires_for_build_wheel
            return self._get_build_requires(config_settings, requirements=[])
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
          File "/tmp/pip-build-env-6by6cyoh/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 301, in _get_build_requires
            self.run_setup()
          File "/tmp/pip-build-env-6by6cyoh/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 512, in run_setup
            super().run_setup(setup_script=setup_script)
          File "/tmp/pip-build-env-6by6cyoh/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 317, in run_setup
            exec(code, locals())
          File "<string>", line 1016, in <module>
          File "<string>", line 676, in get_extension
          File "<string>", line 93, in __init__
          File "<string>", line 235, in configure_unix
        ConfigurationError: Could not run curl-config: [Errno 2] No such file or directory: 'curl-config'
        [end of output]

    note: This error originates from a subprocess, and is likely not a problem with pip.
    error: subprocess-exited-with-error

    × Getting requirements to build wheel did not run successfully.
    │ exit code: 1
    ╰─> See above for output.

2. After installation, running the pyspider command in the terminal fails with the error below, because async is a reserved keyword in Python 3.7 and later.
Solution: rename async to another identifier, for example mark_async, at the following locations:

    pyspider/run.py: lines 231, 245 (two occurrences), 365
    pyspider/webui/app.py: line 95
    pyspider/fetcher/tornado_fetcher.py: lines 81, 89 (two occurrences), 95, 117

Error details:

    $ pyspider
    Traceback (most recent call last):
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 5, in <module>
        from pyspider.run import main
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 231
        async=True, get_object=False, no_input=False):
        ^^^^^
    SyntaxError: invalid syntax

3. Running the pyspider command again fails with the errors below. Python 3.12 has removed or moved several old modules and names, and pyspider has not been adapted to these changes.
Solution: patch the code by hand as described in section 5.2 above:
a) Fix the UserDict and Mapping imports in pyspider/libs/counter.py (line 14) and pyspider/scheduler/task_queue.py (line 12).
b) Replace import imp in pyspider/processor/project_module.py (line 11).
c) Replace collections.MutableMapping with collections.abc.MutableMapping in tornado/httputil.py (line 106).

错误详情:

$ pyspider

W 250515 11:30:54 run:413\] phantomjs not found, continue running without it. \[I 250515 11:30:56 result_worker:49\] result_worker starting... Process Process-5: Traceback (most recent call last): File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/counter.py", line 14, in \ from UserDict import DictMixin ModuleNotFoundError: No module named 'UserDict' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap self.run() File "/usr/local/lib/python3.12/multiprocessing/process.py", line 108, in run self._target(\*self._args, \*\*self._kwargs) File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke return callback(\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func return f(get_current_context(), \*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 192, in scheduler Scheduler = load_cls(None, None, scheduler_cls) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls return utils.load_object(value) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object module = __import__(module_name, globals(), locals(), \[object_name\]) 
\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/scheduler/__init__.py", line 1, in \ from .scheduler import Scheduler, OneScheduler, ThreadBaseScheduler # NOQA \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/scheduler/scheduler.py", line 19, in \ from pyspider.libs import counter, utils File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/counter.py", line 16, in \ from collections import Mapping as DictMixin ImportError: cannot import name 'Mapping' from 'collections' (/usr/local/lib/python3.12/collections/__init__.py) Process Process-4: Traceback (most recent call last): File "/usr/local/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap self.run() File "/usr/local/lib/python3.12/multiprocessing/process.py", line 108, in run self._target(\*self._args, \*\*self._kwargs) File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke return callback(\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func return f(get_current_context(), \*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 236, in fetcher Fetcher = load_cls(None, None, fetcher_cls) 
\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls return utils.load_object(value) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object module = __import__(module_name, globals(), locals(), \[object_name\]) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/__init__.py", line 1, in \ from .tornado_fetcher import Fetcher File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/tornado_fetcher.py", line 21, in \ import tornado.httputil File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/httputil.py", line 106, in \ class HTTPHeaders(collections.MutableMapping): \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ AttributeError: module 'collections' has no attribute 'MutableMapping' Process Process-3: Traceback (most recent call last): File "/usr/local/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap self.run() File "/usr/local/lib/python3.12/multiprocessing/process.py", line 108, in run self._target(\*self._args, \*\*self._kwargs) File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke return callback(\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func return f(get_current_context(), 
\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 273, in processor Processor = load_cls(None, None, processor_cls) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls return utils.load_object(value) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object module = __import__(module_name, globals(), locals(), \[object_name\]) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/processor/__init__.py", line 1, in \ from .processor import ProcessorResult, Processor File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/processor/processor.py", line 20, in \ from .project_module import ProjectManager, ProjectFinder File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/processor/project_module.py", line 11, in \ import imp ModuleNotFoundError: No module named 'imp' Traceback (most recent call last): File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in \ sys.exit(main()) \^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main cli() File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", 
line 1442, in __call__ return self.main(\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main rv = self.invoke(ctx) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke rv = super().invoke(ctx) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke return ctx.invoke(self.callback, \*\*ctx.params) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke return callback(\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func return f(get_current_context(), \*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli ctx.invoke(all) File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke return callback(\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func return f(get_current_context(), \*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File 
"/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all ctx.invoke(webui, \*\*webui_config) File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke return callback(\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func return f(get_current_context(), \*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 333, in webui app = load_cls(None, None, webui_instance) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls return utils.load_object(value) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object module = __import__(module_name, globals(), locals(), \[object_name\]) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/__init__.py", line 8, in \ from . 
import app, index, debug, task, result, login File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 17, in \ from pyspider.fetcher import tornado_fetcher File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/__init__.py", line 1, in \ from .tornado_fetcher import Fetcher File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/tornado_fetcher.py", line 21, in \ import tornado.httputil File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/httputil.py", line 106, in \ class HTTPHeaders(collections.MutableMapping): \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ AttributeError: module 'collections' has no attribute 'MutableMapping' 4. 再次尝试运行 pyspider 命令,出现如下import backports.ssl_match_hostname ModuleNotFoundError: No module named 'backports'错误,是因为pyspider 依赖的 tornado 库在 Python 3.12 环境下需要 backports.ssl_match_hostname 模块,而 pyspider 尚未完全适配这些改动。 解决方案: backports.ssl_match_hostname 模块缺失问题,可通过pycharm的Terminal中执行如下命令解决 ``` $ pip install backports.ssl_match_hostname ``` 补充: Error: Could not create web server listening on port 25555 属于端口占用问题,将其他问题都解决了再处理此问题 错误详情: $ pyspider Error: Could not create web server listening on port 25555 Error: Could not create web server listening on port 25555 \[I 250515 12:54:49 processor:211\] processor starting... 
    Process Process-4:
    Traceback (most recent call last):
      File "/usr/local/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
        self.run()
      File "/usr/local/lib/python3.12/multiprocessing/process.py", line 108, in run
        self._target(*self._args, **self._kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 236, in fetcher
        Fetcher = load_cls(None, None, fetcher_cls)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
        return utils.load_object(value)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
        module = __import__(module_name, globals(), locals(), [object_name])
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
        from .tornado_fetcher import Fetcher
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/tornado_fetcher.py", line 31, in <module>
        from tornado.simple_httpclient import SimpleAsyncHTTPClient
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/simple_httpclient.py", line 8, in <module>
        from tornado.http1connection import HTTP1Connection, HTTP1ConnectionParameters
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/http1connection.py", line 30, in <module>
        from tornado import iostream
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/iostream.py", line 40, in <module>
        from tornado.netutil import ssl_wrap_socket, ssl_match_hostname, SSLCertificateError, _client_ssl_defaults, _server_ssl_defaults
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/netutil.py", line 56, in <module>
        import backports.ssl_match_hostname
    ModuleNotFoundError: No module named 'backports'
    Error: Could not create web server listening on port 25555
    Error: Could not create web server listening on port 25555
    Traceback (most recent call last):
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
        sys.exit(main())
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
        cli()
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
        return self.main(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
        rv = self.invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
        rv = super().invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
        ctx.invoke(all)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
        ctx.invoke(webui, **webui_config)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 333, in webui
        app = load_cls(None, None, webui_instance)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
        return utils.load_object(value)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
        module = __import__(module_name, globals(), locals(), [object_name])
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/__init__.py", line 8, in <module>
        from . import app, index, debug, task, result, login
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 17, in <module>
        from pyspider.fetcher import tornado_fetcher
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
        from .tornado_fetcher import Fetcher
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/fetcher/tornado_fetcher.py", line 31, in <module>
        from tornado.simple_httpclient import SimpleAsyncHTTPClient
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/simple_httpclient.py", line 8, in <module>
        from tornado.http1connection import HTTP1Connection, HTTP1ConnectionParameters
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/http1connection.py", line 30, in <module>
        from tornado import iostream
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/iostream.py", line 40, in <module>
        from tornado.netutil import ssl_wrap_socket, ssl_match_hostname, SSLCertificateError, _client_ssl_defaults, _server_ssl_defaults
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/tornado/netutil.py", line 56, in <module>
        import backports.ssl_match_hostname
    ModuleNotFoundError: No module named 'backports'

5. Running the pyspider command again fails with AttributeError: module 'fractions' has no attribute 'gcd'. The fractions module itself still exists in Python 3.12; what is missing is the fractions.gcd() function, which was deprecated in Python 3.5 and removed in Python 3.9 in favor of math.gcd(). pyspider has not been adapted to this change.

Solution:
a). In pyspider/libs/base_handler.py, add the following import on the blank line 12:
```
import math
```
b). In pyspider/libs/base_handler.py, change line 115 from
```
min_tick = fractions.gcd(min_tick, each.tick)
```
to:
```
min_tick = math.gcd(min_tick, each.tick)
```
Error details:

    $ pyspider
    Error: Could not create web server listening on port 25555
    Error: Could not create web server listening on port 25555
    Traceback (most recent call last):
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
        sys.exit(main())
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
        cli()
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
        return self.main(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
        rv = self.invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
        rv = super().invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
        ctx.invoke(all)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
        ctx.invoke(webui, **webui_config)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 333, in webui
        app = load_cls(None, None, webui_instance)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
        return utils.load_object(value)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
        module = __import__(module_name, globals(), locals(), [object_name])
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/__init__.py", line 8, in <module>
        from . import app, index, debug, task, result, login
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/debug.py", line 22, in <module>
        from pyspider.libs import utils, sample_handler, dataurl
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/sample_handler.py", line 9, in <module>
        class Handler(BaseHandler):
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/base_handler.py", line 115, in __new__
        min_tick = fractions.gcd(min_tick, each.tick)
    AttributeError: module 'fractions' has no attribute 'gcd'

6. Running the pyspider command again fails with a Flask compatibility error (AttributeError: 'QuitableFlask' object has no attribute 'before_first_request'). pyspider was written against an old Flask release, and newer releases (Flask 2.3+) removed the before_first_request decorator.

Solution: in pyspider/webui/debug.py, change the code at line 64 from
```
@app.before_first_request
def enable_projects_import():
    sys.meta_path.append(ProjectFinder(app.config['projectdb']))
```
to:
```
@app.before_request
def enable_projects_import():
    # Flask 2.3+ already defines app._got_first_request itself, so checking
    # that name with hasattr() would always succeed and the finder would never
    # be registered. Use a dedicated flag so the finder is added exactly once.
    if not getattr(app, '_projects_import_enabled', False):
        app._projects_import_enabled = True
        sys.meta_path.append(ProjectFinder(app.config['projectdb']))
```
Error details:

    $ pyspider
    Error: Could not create web server listening on port 25555
    Error: Could not create web server listening on port 25555
    [I 250515 13:42:09 result_worker:49] result_worker starting...
    Error: Could not create web server listening on port 25555
    [I 250515 13:42:09 processor:211] processor starting...
    [I 250515 13:42:09 tornado_fetcher:638] fetcher starting...
    Error: Could not create web server listening on port 25555
    [I 250515 13:42:09 scheduler:647] scheduler starting...
    [I 250515 13:42:10 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
    Error: Could not create web server listening on port 25555
    [I 250515 13:42:10 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    Traceback (most recent call last):
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
        sys.exit(main())
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
        cli()
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
        return self.main(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
        rv = self.invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
        rv = super().invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
        ctx.invoke(all)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
        ctx.invoke(webui, **webui_config)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 333, in webui
        app = load_cls(None, None, webui_instance)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 48, in load_cls
        return utils.load_object(value)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/libs/utils.py", line 369, in load_object
        module = __import__(module_name, globals(), locals(), [object_name])
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/__init__.py", line 8, in <module>
        from . import app, index, debug, task, result, login
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/debug.py", line 64, in <module>
        @app.before_first_request
    AttributeError: 'QuitableFlask' object has no attribute 'before_first_request'. Did you mean: '_got_first_request'?
    Error: Could not create web server listening on port 25555

7. Running the pyspider command again fails with TypeError: Can't instantiate abstract class ScriptProvider without an implementation for abstract method 'get_resource_inst'. The cause lies in pyspider's WebDAV module rather than in Python 3.12 itself: the installed WsgiDAV release uses snake_case method names, and its DAVProvider base class declares get_resource_inst as an abstract method, while pyspider's ScriptProvider only implements the older camelCase getResourceInst.

Solution: in pyspider/webui/webdav.py, change the code at line 165 from
```
class ScriptProvider(DAVProvider):
    def __init__(self, app):
        super(ScriptProvider, self).__init__()
        self.app = app

    def __repr__(self):
        return "pyspiderScriptProvider"

    def getResourceInst(self, path, environ):
        path = os.path.normpath(path).replace('\\', '/')
        if path in ('/', '.', ''):
            path = '/'
            return RootCollection(path, environ, self.app)
        else:
            return ScriptResource(path, environ, self.app)
```
to:
```
class ScriptProvider(DAVProvider):
    def __init__(self, app):
        super(ScriptProvider, self).__init__()
        self.app = app

    def __repr__(self):
        return "pyspiderScriptProvider"

    def getResourceInst(self, path, environ):
        path = os.path.normpath(path).replace('\\', '/')
        if path in ('/', '.', ''):
            path = '/'
            return RootCollection(path, environ, self.app)
        else:
            return ScriptResource(path, environ, self.app)

    # Implement the missing abstract method by delegating to the existing
    # camelCase method, so the root path and the argument order are handled
    # the same way in both.
    def get_resource_inst(self, path, environ):
        return self.getResourceInst(path, environ)
```
Error details:

    $ pyspider
    Error: Could not create web server listening on port 25555
    Error: Could not create web server listening on port 25555
    [I 250515 14:05:24 result_worker:49] result_worker starting...
    Error: Could not create web server listening on port 25555
    [I 250515 14:05:24 processor:211] processor starting...
    Error: Could not create web server listening on port 25555
    [I 250515 14:05:25 scheduler:647] scheduler starting...
    [I 250515 14:05:25 tornado_fetcher:638] fetcher starting...
    [I 250515 14:05:25 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
    [I 250515 14:05:25 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    Error: Could not create web server listening on port 25555
    [I 250515 14:05:25 app:84] webui exiting...
    Traceback (most recent call last):
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
        sys.exit(main())
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
    Error: Could not create web server listening on port 25555
        cli()
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
        return self.main(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
        rv = self.invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
        rv = super().invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
        ctx.invoke(all)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
        ctx.invoke(webui, **webui_config)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 384, in webui
        app.run(host=host, port=port)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 59, in run
        from .webdav import dav_app
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/webdav.py", line 207, in <module>
        '/': ScriptProvider(app)
    TypeError: Can't instantiate abstract class ScriptProvider without an implementation for abstract method 'get_resource_inst'

8. Running the pyspider command again fails with ValueError: Invalid configuration: - Deprecated option 'domaincontroller': use 'http_authenticator.domain_controller' instead. The WsgiDAV configuration format has changed: the domaincontroller option is deprecated and must be replaced by http_authenticator.domain_controller.

Solution: in pyspider/webui/webdav.py, change the code at line 207 from
```
config = DEFAULT_CONFIG.copy()
config.update({
    'mount_path': '/dav',
    'provider_mapping': {
        '/': ScriptProvider(app)
    },
    'domaincontroller': NeedAuthController(app),
    'verbose': 1 if app.debug else 0,
    'dir_browser': {'davmount': False,
                    'enable': True,
                    'msmount': False,
                    'response_trailer': ''},
})
dav_app = WsgiDAVApp(config)
```
to:
```
config = DEFAULT_CONFIG.copy()
config.update({
    'mount_path': '/dav',
    'provider_mapping': {
        '/': ScriptProvider(app)
    },
    # Updated authentication settings
    "http_authenticator": {
        # Moved under http_authenticator. The class itself is passed here,
        # not an instance; WsgiDAV instantiates it (see item 9).
        "domain_controller": NeedAuthController,
        "accept_basic": True,
        "accept_digest": False,
        "default_to_digest": False,
    },
    'verbose': 1 if app.debug else 0,
    'dir_browser': {'davmount': False,
                    'enable': True,
                    'msmount': False,
                    'response_trailer': ''},
})
dav_app = WsgiDAVApp(config)
```
Error details:

    $ pyspider
    Error: Could not create web server listening on port 25555
    Error: Could not create web server listening on port 25555
    [I 250515 14:14:01 result_worker:49] result_worker starting...
    Error: Could not create web server listening on port 25555
    [I 250515 14:14:01 processor:211] processor starting...
    [I 250515 14:14:01 scheduler:647] scheduler starting...
    Error: Could not create web server listening on port 25555
    [I 250515 14:14:01 tornado_fetcher:638] fetcher starting...
    [I 250515 14:14:01 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
    [I 250515 14:14:01 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    Error: Could not create web server listening on port 25555
    Error: Could not create web server listening on port 25555
    [I 250515 14:14:01 app:84] webui exiting...
    Traceback (most recent call last):
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
        sys.exit(main())
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
        cli()
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
        return self.main(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
        rv = self.invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
        rv = super().invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
        ctx.invoke(all)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
        ctx.invoke(webui, **webui_config)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 384, in webui
        app.run(host=host, port=port)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 59, in run
        from .webdav import dav_app
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/webdav.py", line 220, in <module>
        dav_app = WsgiDAVApp(config)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/wsgidav/wsgidav_app.py", line 155, in __init__
        _check_config(config)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/wsgidav/wsgidav_app.py", line 129, in _check_config
        raise ValueError("Invalid configuration:\n - " + "\n - ".join(errors))
    ValueError: Invalid configuration:
     - Deprecated option 'domaincontroller': use 'http_authenticator.domain_controller' instead.

9. Running the pyspider command again fails with "TypeError: NeedAuthController.__init__() takes 2 positional arguments but 3 were given". The WsgiDAV domain controller interface has changed: WsgiDAV now instantiates the controller itself, calling it with two arguments (wsgidav_app, config), and it expects snake_case method names.

Solution: in pyspider/webui/webdav.py, change the code at line 186 from
```
class NeedAuthController(object):
    def __init__(self, app):
        self.app = app

    def getDomainRealm(self, inputRelativeURL, environ):
        return 'need auth'

    def requireAuthentication(self, realmname, environ):
        return self.app.config.get('need_auth', False)

    def isRealmUser(self, realmname, username, environ):
        return username == self.app.config.get('webui_username')

    def getRealmUserPassword(self, realmname, username, environ):
        return self.app.config.get('webui_password')

    def authDomainUser(self, realmname, username, password, environ):
        return username == self.app.config.get('webui_username') \
            and password == self.app.config.get('webui_password')
```
to:
```
class NeedAuthController(object):
    def __init__(self, wsgidav_app, config=None):
        # WsgiDAV calls this as dc(wsgidav_app, config), so the first argument
        # is the WsgiDAVApp, not the Flask app. Keep using the module-level
        # Flask app of webdav.py for pyspider's own settings (need_auth,
        # webui_username, webui_password).
        self.wsgidav_app = wsgidav_app
        self.app = app
        # Accept the extra config argument to match WsgiDAV's way of
        # initializing the controller
        self.config = config if config is not None else {}

    def getDomainRealm(self, inputRelativeURL, environ):
        return 'need auth'

    def requireAuthentication(self, realmname, environ):
        return self.app.config.get('need_auth', False)

    def isRealmUser(self, realmname, username, environ):
        return username == self.app.config.get('webui_username')

    def getRealmUserPassword(self, realmname, username, environ):
        return self.app.config.get('webui_password')

    def authDomainUser(self, realmname, username, password, environ):
        return username == self.app.config.get('webui_username') \
            and password == self.app.config.get('webui_password')

    # Add the interface methods WsgiDAV expects, forwarding to the existing ones
    def get_domain_realm(self, input_path, environ):
        return self.getDomainRealm(input_path, environ)

    def basic_auth_user(self, realm, user_name, password, environ):
        return self.authDomainUser(realm, user_name, password, environ)

    def supports_http_digest_auth(self):
        return False  # digest authentication is not supported

    def is_share_anonymous(self, share_path):
        """Check whether the given share path allows anonymous access."""
        # If authentication is not required, every share allows anonymous access
        return not self.app.config.get('need_auth', False)
```
Error details:

    $ pyspider
    Error: Could not create web server listening on port 25555
    Error: Could not create web server listening on port 25555
    [I 250515 14:42:21 result_worker:49] result_worker starting...
    Error: Could not create web server listening on port 25555
    [I 250515 14:42:21 processor:211] processor starting...
    [I 250515 14:42:21 tornado_fetcher:638] fetcher starting...
    Error: Could not create web server listening on port 25555
    [I 250515 14:42:21 scheduler:647] scheduler starting...
    [I 250515 14:42:21 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
    [I 250515 14:42:21 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    Error: Could not create web server listening on port 25555
    [I 250515 14:42:21 app:84] webui exiting...
    Traceback (most recent call last):
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in <module>
        sys.exit(main())
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main
        cli()
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__
        return self.main(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main
        rv = self.invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke
        rv = super().invoke(ctx)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli
        ctx.invoke(all)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all
        ctx.invoke(webui, **webui_config)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke
        return callback(*args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 384, in webui
        app.run(host=host, port=port)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 59, in run
        from .webdav import dav_app
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/webdav.py", line 226, in <module>
        dav_app = WsgiDAVApp(config)
      File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/wsgidav/wsgidav_app.py", line 257, in __init__
        app = mw(self, self.application, config)
\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/wsgidav/http_authenticator.py", line 140, in __init__ dc = make_domain_controller(wsgidav_app, config) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/wsgidav/http_authenticator.py", line 111, in make_domain_controller dc = dc(wsgidav_app, config) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ TypeError: NeedAuthController.__init__() takes 2 positional arguments but 3 were given Error: Could not create web server listening on port 25555 10. 再次尝试运行 pyspider 命令,出现错误ImportError: cannot import name 'DispatcherMiddleware' from 'werkzeug.wsgi',原因是 Werkzeug 库版本与 pyspider 不兼容。从 Python 3.12 开始,Werkzeug v2.3.0 及以上版本已经移除了DispatcherMiddleware,将其移至独立的werkzeug.middleware.dispatcher模块中。而 pyspider 仍在使用旧的导入方式。 解决方案: 修改 pyspider/webui/app.py的python代码第64行 ``` from werkzeug.wsgi import DispatcherMiddleware ``` 改成: ``` from werkzeug.middleware.dispatcher import DispatcherMiddleware ``` 错误详情: $ pyspider Error: Could not create web server listening on port 25555 Error: Could not create web server listening on port 25555 Error: Could not create web server listening on port 25555 Error: Could not create web server listening on port 25555 \[I 250515 14:58:59 result_worker:49\] result_worker starting... Error: Could not create web server listening on port 25555 \[I 250515 14:58:59 processor:211\] processor starting... Error: Could not create web server listening on port 25555 \[I 250515 14:58:59 scheduler:647\] scheduler starting... \[I 250515 14:58:59 tornado_fetcher:638\] fetcher starting... 
\[I 250515 14:58:59 scheduler:782\] scheduler.xmlrpc listening on 127.0.0.1:23333 \[I 250515 14:58:59 scheduler:586\] in 5m: new:0,success:0,retry:0,failed:0 Error: Could not create web server listening on port 25555 Error: Could not create web server listening on port 25555 \[I 250515 14:59:00 app:84\] webui exiting... Traceback (most recent call last): File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/bin/pyspider", line 8, in \ sys.exit(main()) \^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 754, in main cli() File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1442, in __call__ return self.main(\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1363, in main rv = self.invoke(ctx) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1808, in invoke rv = super().invoke(ctx) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 1226, in invoke return ctx.invoke(self.callback, \*\*ctx.params) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke return callback(\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func return f(get_current_context(), \*args, \*\*kwargs) 
\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 165, in cli ctx.invoke(all) File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke return callback(\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func return f(get_current_context(), \*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 497, in all ctx.invoke(webui, \*\*webui_config) File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/core.py", line 794, in invoke return callback(\*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/click/decorators.py", line 34, in new_func return f(get_current_context(), \*args, \*\*kwargs) \^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^\^ File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/run.py", line 384, in webui app.run(host=host, port=port) File "/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/pyspider/webui/app.py", line 64, in run from werkzeug.wsgi import DispatcherMiddleware ImportError: cannot import name 'DispatcherMiddleware' from 'werkzeug.wsgi' (/home/wanghu/PycharmProjects/PythonProject/getHarmonyVideoDatas/.venv/lib/python3.12/site-packages/werkzeug/wsgi.py) 

    Error: Could not create web server listening on port 25555

11. Run the pyspider command again. It fails with Error: Could not create web server listening on port 25555 because port 25555 is already in use (in the output below, by a phantomjs process left over from an earlier run). The port has to be freed before pyspider is run again.

Solution: use lsof -i :25555 to find the PID that holds the port, free it with kill -9 <PID>, and run pyspider again. Output like the following means pyspider started successfully and the web UI is listening on port 5000. You can then open http://localhost:5000/ in a browser to reach the pyspider web console and start crawling.

```
$ lsof -i :25555
COMMAND     PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
phantomjs 16615 wanghu    7u  IPv4 278700      0t0  TCP *:25555 (LISTEN)
(.venv) wanghu@td-1:~/PycharmProjects/PythonProject/getHarmonyVideoDatas$ kill -9 16615
(.venv) wanghu@td-1:~/PycharmProjects/PythonProject/getHarmonyVideoDatas$ pyspider
phantomjs fetcher running on port 25555
[I 250515 15:08:09 result_worker:49] result_worker starting...
[I 250515 15:08:10 processor:211] processor starting...
[I 250515 15:08:10 tornado_fetcher:638] fetcher starting...
[I 250515 15:08:10 scheduler:647] scheduler starting...
[I 250515 15:08:10 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 250515 15:08:10 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
[I 250515 15:08:10 app:76] webui running on 0.0.0.0:5000
```

Error details:

    $ pyspider
    Error: Could not create web server listening on port 25555
    Error: Could not create web server listening on port 25555
    Error: Could not create web server listening on port 25555
    Error: Could not create web server listening on port 25555
    Error: Could not create web server listening on port 25555
    [I 250515 15:06:48 result_worker:49] result_worker starting...
    Error: Could not create web server listening on port 25555
    [I 250515 15:06:49 processor:211] processor starting...
    [I 250515 15:06:49 tornado_fetcher:638] fetcher starting...
    [I 250515 15:06:49 scheduler:647] scheduler starting...
    [I 250515 15:06:49 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
    Error: Could not create web server listening on port 25555
    [I 250515 15:06:49 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    Error: Could not create web server listening on port 25555
    [I 250515 15:06:49 app:76] webui running on 0.0.0.0:5000
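
The port check from step 11 can also be scripted before pyspider is started. The following is a minimal sketch and not part of pyspider: the helper name `port_in_use` is hypothetical, and it only uses Python's standard `socket` module to test whether something already accepts TCP connections on a port such as 25555 (the PhantomJS fetcher) or 5000 (the web UI).

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration; pyspider does not provide it.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 25555 and 5000 are the ports that appear in the pyspider logs.
    for port in (25555, 5000):
        state = "in use" if port_in_use(port) else "free"
        print(f"port {port}: {state}")
```

If 25555 is reported as in use before pyspider has been started, a stale phantomjs process is the likely cause, and it can be located with lsof -i :25555 as in step 11.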