Python's File-Management Library Path Explained (Detailed and Basic)

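1 What the Path library can do:

Path (from the standard-library pathlib module) is a commonly used Python file-handling library that treats file paths as objects. It supports the following operations:

Joining paths (example: test / Your_path / files)
Extracting parts of a path (name, suffix, full name, ...)
Navigating the directory hierarchy
Checking whether a file exists
Creating directories

... (a practical library that covers most everyday file operations)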
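
As a quick taste of the operations listed above, here is a minimal sketch. It uses PurePosixPath so that the output is the same on every operating system, and the path project/data/report.csv is made up purely for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical path, used only for illustration; nothing is read from disk.
p = PurePosixPath("project") / "data" / "report.csv"

print(p)         # project/data/report.csv
print(p.name)    # report.csv
print(p.stem)    # report
print(p.suffix)  # .csv
print(p.parent)  # project/data
```

Pure paths only manipulate the text of a path. Checks that touch the file system, such as exists() or mkdir(), need a concrete Path instead.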

2 Path versus the os library (optional)

(Readers who already know the os file-handling library may find this useful; otherwise feel free to skip it.)


from pathlib import Path
import os

def get_base_name_vs_os(input_file_path: str) -> None:
    '''
    Print the full name, the name without its suffix, and the suffix of a file,
    once with os.path and once with pathlib.
    '''
    fpath_os = input_file_path
    fpath_path = Path(input_file_path)

    # In a debugger, type(fpath_path) shows <class 'pathlib.WindowsPath'> on Windows
    # (PosixPath on Linux/macOS).
    print(
        f'os method: {os.path.basename(fpath_os)}, {os.path.basename(os.path.splitext(fpath_os)[0])}, {os.path.splitext(fpath_os)[-1]}'
    )
    print(
        f'Path method: {fpath_path.name}, {fpath_path.stem}, {fpath_path.suffix}'
    )


if __name__ == '__main__':
    get_base_name_vs_os(r'your_file_path_like_C:\window')

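Advantages:

Path wraps more functionality than os.path, so the piece of a path you want is usually one attribute away, where os.path often needs several nested calls.
Paths are objects, so you call methods and attributes on the path itself instead of passing strings to os functions, which is more cumbersome.

Disadvantages:

Although Path beats os in many respects, a Path object is not a str. It is hashable, so it works as a dict key or set member, but it has no string methods. When writing interfaces or calling code that expects a string path, convert it with str(path) or os.fspath(path).
Other compatibility issues.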
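
On the interoperability point, the sketch below shows the two usual ways of handing a path object to string-based code. The path /tmp/demo/hello_world.py is a made-up example and nothing is read from disk.

```python
import os
from pathlib import PurePosixPath

p = PurePosixPath("/tmp/demo/hello_world.py")  # hypothetical path

as_text = str(p)          # explicit conversion to str
as_fspath = os.fspath(p)  # the protocol that open() and os.path.* rely on

# String methods live on str, not on the path object, so convert first.
print(as_text.upper())      # /TMP/DEMO/HELLO_WORLD.PY

# os.path functions accept path objects directly.
print(os.path.basename(p))  # hello_world.py
```

Most of the standard library accepts path objects as they are; the explicit conversion mainly matters for code that checks isinstance(x, str) or calls string methods on its argument.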

3 Common Path operations:

3.1 Initializing a path

Creating a Path object is simple. As with any other class, Path(your_path) returns the corresponding path object.


def init_path():
    input_path = Path(r"your_input_path_test.py")  # initialize directly with Path
    print(input_path)
    return input_path

Besides a single argument, a path can also be built from several segments:

input_path = Path("C:/", "Windows", "your_dir")  # builds C:\Windows\your_dir on Windows; pass any number of segments and they are joined in order
input_path = input_path / "hello_world.py"
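
To try multi-segment construction on any operating system, the sketch below uses PureWindowsPath, which applies Windows path rules without touching the disk. The folder names are the same placeholders as above.

```python
from pathlib import PureWindowsPath

# Segments are joined in order; forward slashes are accepted as separators.
base = PureWindowsPath("C:/", "Windows", "your_dir")
full = base / "hello_world.py"

print(full)        # C:\Windows\your_dir\hello_world.py
print(full.parts)  # ('C:\\', 'Windows', 'your_dir', 'hello_world.py')
```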

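3.2 Getting parts of the file name (name with and without suffix, and the suffix)

Assume the path object is: input_path = Path("your_path")

Full file name (with suffix): input_path.name returns the file name including its suffix, as a str (not a Path object).
File name without suffix: input_path.stem. "Stem" is the stalk of a plant: picture the file name as a flower with the suffix as the blossom. Take the blossom away and the stem that remains is the name without its suffix.
Suffix: input_path.suffix returns the file suffix as a str, including the leading dot.
- Note that for a name with several suffixes, such as library.tar.gz, suffix returns only the last one (.gz). To get all of them as a list, use suffixes.
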

def get_path_name_stem_suffix(input_path:Path):
    name = input_path.name
    name_without_suffix = input_path.stem
    suffix = input_path.suffix
    print(f'\n u get it-> name:{name} ; stem: {name_without_suffix} ; suffix:{suffix}\n')
    return

Official examples:

name:

>>> PurePosixPath('my/library/setup.py').name
'setup.py'
>>> PureWindowsPath('//some/share/setup.py').name
'setup.py'
>>> PureWindowsPath('//some/share').name
''

stem:

>>> PurePosixPath('my/library.tar.gz').stem
'library.tar'
>>> PurePosixPath('my/library.tar').stem
'library'
>>> PurePosixPath('my/library').stem
'library'

suffix:

>>> PurePosixPath('my/library/setup.py').suffix
'.py'
>>> PurePosixPath('my/library.tar.gz').suffix
'.gz'
>>> PurePosixPath('my/library').suffix
''
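
For names with more than one suffix, the sketch below shows how suffix, suffixes and stem relate. Stripping every suffix is not a built-in attribute, so the last step is just one possible approach, and the variable names are illustrative.

```python
from pathlib import PurePosixPath

archive = PurePosixPath("my/library.tar.gz")

print(archive.suffix)    # .gz
print(archive.suffixes)  # ['.tar', '.gz']
print(archive.stem)      # library.tar

# One way to drop every suffix: join them and cut that many characters off the name.
full_ext = "".join(archive.suffixes)
bare_name = archive.name[: -len(full_ext)] if full_ext else archive.name
print(bare_name)         # library
```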

3.3 Checking path status

When working with a path, we often need to know whether it exists and whether it is a file or a folder.

Assume the path object is: input_path = Path("your_path")

Does it exist: returns bool (True or False)
```
exist = input_path.exists()
```
Is it a folder: returns bool (True or False)
```
whether_dir = input_path.is_dir()
```
Is it a file: returns bool (True or False)
```
whether_file = input_path.is_file()
```

Common status checks (all together):


def get_path_status(input_path:Path):

    exist = input_path.exists()

    whether_dir = input_path.is_dir()

    whether_file = input_path.is_file()

    print(f'\n file exist status:{exist}, dir :{whether_dir}, file:{whether_file}\n')
    return
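
To see the three checks return different values, here is a self-contained sketch that works inside a temporary directory, so it leaves nothing behind. The file names note.txt and not_there.txt are invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    file_path = folder / "note.txt"     # hypothetical file name
    missing = folder / "not_there.txt"  # never created

    file_path.write_text("hello", encoding="utf-8")

    # Each tuple is (exists, is_dir, is_file).
    checks = {
        "folder": (folder.exists(), folder.is_dir(), folder.is_file()),
        "file": (file_path.exists(), file_path.is_dir(), file_path.is_file()),
        "missing": (missing.exists(), missing.is_dir(), missing.is_file()),
    }

for label, (exists, is_dir, is_file) in checks.items():
    print(f"{label}: exists={exists}, is_dir={is_dir}, is_file={is_file}")
```

For a path that does not exist, is_dir() and is_file() both return False rather than raising an error.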

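3.4 Getting the current/parent folder

With os, walking up to parent folders is fairly tedious, whereas Path makes the folder hierarchy easy to navigate. Suppose the path is "C:\Windows\Learning\pycharm\hello_world.py". The folder that contains the file is C:\Windows\Learning\pycharm (.parent), and the folder one level above that is C:\Windows\Learning (.parents[1]). They can be obtained as follows:
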

def get_father_and_local_dir(input_path: Path):

    local_dir = input_path.parent
    all_father_dir = input_path.parents                 # a sequence of ancestors; wrap it in list() to see every level at once
    for idx, father_dir in enumerate(all_father_dir):   # enumerate yields (index, item), in that order
        output = f'(father_{idx}){father_dir}'
        print(output, end='')                           # iteration example; step through it in a debugger to see each value
    print('')                                           # cosmetic: end the line
    all_father_dirs = list(input_path.parents)          # list(Path.parents) makes the values easy to inspect
    father_1_dir = str(input_path.parents[1])           # the ancestor at index 1 (indices start at 0), converted to str

    print(f'local_dir:{local_dir}, father_1_dir:{father_1_dir} ,all_father_dir:{all_father_dirs} \n')

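PS:

To get the containing folder, .parent is usually enough; it returns the parent folder as a Path object.
To get the n-th ancestor folder, use .parents. It returns a sequence-like object rather than a list: you can index it and iterate over it, but wrap it in list() to see all the values at once.
- Indexing starts at 0, so input_path.parents[0] == input_path.parent == local_dir.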
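
The indexing can be checked on any operating system with PureWindowsPath and the example path from this section. This is a small sketch that performs no file-system access.

```python
from pathlib import PureWindowsPath

p = PureWindowsPath(r"C:\Windows\Learning\pycharm\hello_world.py")

print(p.parent)        # C:\Windows\Learning\pycharm
print(p.parents[0])    # C:\Windows\Learning\pycharm (same as .parent)
print(p.parents[1])    # C:\Windows\Learning
print(len(p.parents))  # 4

# enumerate yields (index, ancestor), walking from the nearest folder up to the drive root.
for idx, ancestor in enumerate(p.parents):
    print(idx, ancestor)
```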

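3.5 Joining paths

The initialization section above already showed one way to join paths: passing several segments to the constructor, followed by the / operator.

input_path = Path("C:/", "Windows", "your_dir")  # builds C:\Windows\your_dir on Windows; pass any number of segments and they are joined in order
input_path = input_path / "hello_world.py"

Paths can also be joined with a method call:
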

def join_path(input_path: Path, sub_paths=('hello', 'world')):
    # Path objects are immutable, so both names can start from the same object (no copy needed)
    example_1, example_2 = input_path, input_path
    for sub_path in sub_paths:
        example_1 = example_1 / sub_path          # the two ways of joining are equivalent; I prefer the overloaded "/" for brevity
        example_2 = example_2.joinpath(sub_path)
    print(f'{"*"*50}\n'+f'{example_1}; {example_2}; '+f'\n{"*"*50}\n')
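
Two details of joining are easy to miss, and the sketch below illustrates them with PurePosixPath. First, joinpath accepts several segments in one call. Second, joining an absolute segment discards everything before it. The /data folder is a placeholder.

```python
from pathlib import PurePosixPath

base = PurePosixPath("/data")  # hypothetical folder

a = base / "hello" / "world"
b = base.joinpath("hello", "world")  # several segments in one call

print(a == b)  # True
print(base)    # /data (joining returns a new object; base is unchanged)

# An absolute segment restarts the path from that segment.
reset = base / "/etc" / "passwd"
print(reset)   # /etc/passwd
```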

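3.6 Making sure a directory exists (creating a path)

def make_sure_dir_exist(input_path: Path):
    input_path.mkdir(parents=True, exist_ok=True)   # a near-boilerplate call: parents=True creates missing parent folders, exist_ok=True does nothing if the folder already exists

The mkdir method is all you need, usually with this configuration:

parents: whether missing parent folders should be created
exist_ok: whether an already existing folder is tolerated instead of raising an error

The official documentation describes the behavior as follows: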

Create a new directory at this given path. If mode is given, it is combined with the process' umask value to determine the file mode and access flags. If the path already exists, FileExistsError is raised.

If parents is true, any missing parents of this path are created as needed; they are created with the default permissions without taking mode into account (mimicking the POSIX mkdir -p command).

If parents is false (the default), a missing parent raises FileNotFoundError.

If exist_ok is false (the default), FileExistsError is raised if the target directory already exists.

If exist_ok is true, FileExistsError exceptions will be ignored (same behavior as the POSIX mkdir -p command), but only if the last path component is not an existing non-directory file.

Changed in version 3.5: The exist_ok parameter was added.
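
The error cases described above can be reproduced safely in a temporary directory. This is a minimal sketch; the nested folder names a/b/c are arbitrary.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "a" / "b" / "c"  # arbitrary nested folders

    # parents defaults to False, so a missing parent raises FileNotFoundError.
    try:
        target.mkdir()
    except FileNotFoundError:
        missing_parent_error = True
    else:
        missing_parent_error = False

    # The usual template: create parents as needed, tolerate an existing folder.
    target.mkdir(parents=True, exist_ok=True)
    target.mkdir(parents=True, exist_ok=True)  # second call is a no-op
    created = target.is_dir()

    # exist_ok defaults to False, so an existing folder raises FileExistsError.
    try:
        target.mkdir()
    except FileExistsError:
        exists_error = True
    else:
        exists_error = False

print(missing_parent_error, created, exists_error)  # True True True
```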

3.7 Computing the difference between two paths (advanced)

Interface overview:


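def cal_path_diff(path_1: Path, path_2: Path = None):
    '''
    Compute path_1 relative to path_2.
    :param path_1: the child path (the longer, more specific one)
    :param path_2: the ancestor path (the shorter one); defaults to the current working directory
    Only works when path_2 is an ancestor of path_1 (or equal to it); otherwise ValueError is raised.
    '''
    if path_2 is None:
        path_2 = Path.cwd()                 # resolve the default at call time, not at definition time
    diff = path_1.relative_to(path_2)
    print(str(diff), '\n')
    # The reverse direction, path_2.relative_to(path_1), raises ValueError whenever
    # path_2 is a strict ancestor of path_1: relative_to only goes from child to ancestor.
    return diff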

Official reference:


>>> p = PurePosixPath('/etc/passwd')
>>> p.relative_to('/')
PurePosixPath('etc/passwd')
>>> p.relative_to('/etc')
PurePosixPath('passwd')
>>> p.relative_to('/usr')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pathlib.py", line 694, in relative_to
    .format(str(self), str(formatted)))
ValueError: '/etc/passwd' is not in the subpath of '/usr' OR one path is relative and the other absolute
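
When the two paths may be unrelated, catching the ValueError keeps the caller simple. The helper below is a sketch, and safe_relative_to is a name invented for this example rather than part of pathlib. On Python 3.9 and later, PurePath.is_relative_to() offers a check without exceptions.

```python
from pathlib import PurePath, PurePosixPath
from typing import Optional


def safe_relative_to(child: PurePath, ancestor: PurePath) -> Optional[PurePath]:
    """Return child relative to ancestor, or None if ancestor does not contain child."""
    try:
        return child.relative_to(ancestor)
    except ValueError:
        return None


p = PurePosixPath("/etc/passwd")
print(safe_relative_to(p, PurePosixPath("/etc")))  # passwd
print(safe_relative_to(p, PurePosixPath("/usr")))  # None
```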

3.8 Finding files with a given suffix under a directory (slightly advanced)


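def find_files(input_path: Path, files_suffix: str = ".jpg"):
    '''
    Find files under a directory by wildcard pattern.
    :param input_path: the directory to search
    :param files_suffix: the suffix to match
    Try another suffix, no suffix at all, or another path, and step through it in a debugger.
    It combines well with list comprehensions and is quite handy, though I find os.walk better for traversing files.
    '''
    specimen_1 = input_path.glob(f'*{files_suffix}')        # non-recursive
    specimen_2 = input_path.glob(f'**/*{files_suffix}')     # recursive; same as input_path.rglob(f'*{files_suffix}')
    print(f'specimen_1:{list(specimen_1)} \n specimen_2:{list(specimen_2)}\n')

For traversing a large number of files, I would suggest os.walk instead (see section 2).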
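
To compare the three approaches side by side, the sketch below builds a tiny directory tree in a temporary folder and lists the .jpg files with glob, rglob and os.walk. The file names are invented for the example.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.jpg").touch()
    (root / "b.txt").touch()
    (root / "sub" / "c.jpg").touch()

    # Non-recursive: only files directly inside root.
    top_level = sorted(p.name for p in root.glob("*.jpg"))

    # Recursive: root and every subfolder.
    recursive = sorted(p.name for p in root.rglob("*.jpg"))

    # The os.walk equivalent of the recursive search.
    walked = sorted(
        name
        for _dirpath, _dirnames, filenames in os.walk(root)
        for name in filenames
        if name.endswith(".jpg")
    )

print(top_level)  # ['a.jpg']
print(recursive)  # ['a.jpg', 'c.jpg']
print(walked)     # ['a.jpg', 'c.jpg']
```

Both glob and rglob return lazy generators, which is why the results are consumed inside the with block, before the temporary directory is deleted.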

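4. Reference documentation

Official pathlib documentation

The operations above are the ones I use most; anything else can be looked up as the need arises. This basic walkthrough should be enough to get going. Bye~ Thanks for reading this far. I hope the article helps, and if you liked it, please give it a like.

5. Code

Main_test:
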

from pathlib import Path


def init_path():
    # input_path = Path(r"E:\Learning\test.py")
    input_path = Path("E:/", "Learning", "os_vs_path.py")
    print(input_path)
    return input_path

def get_path_name_stem_suffix(input_path:Path):
    name = input_path.name
    name_without_suffix = input_path.stem
    suffix = input_path.suffix
    print(f'u get it-> name:{name} ; stem: {name_without_suffix} ; suffix:{suffix}\n')
    return

def get_path_status(input_path:Path):

    exist = input_path.exists()

    whether_dir = input_path.is_dir()

    whether_file = input_path.is_file()

    print(f'file exist status:{exist}, dir :{whether_dir}, file:{whether_file}\n')
    return

def get_father_and_local_dir(input_path: Path):

    local_dir = input_path.parent
    all_father_dir = input_path.parents                 # a sequence of ancestors; wrap it in list() to see every level at once
    for idx, father_dir in enumerate(all_father_dir):   # enumerate yields (index, item), in that order
        output = f'(father_{idx}){father_dir}'
        print(output, end='')                           # iteration example; step through it in a debugger to see each value
    print('')                                           # cosmetic: end the line
    all_father_dirs = list(input_path.parents)          # list(Path.parents) makes the values easy to inspect
    father_1_dir = str(input_path.parents[1])           # the ancestor at index 1 (indices start at 0), converted to str

    print(f'local_dir:{local_dir}, father_1_dir:{father_1_dir} ,all_father_dir:{all_father_dirs} \n')

def join_path(input_path: Path, sub_paths=('hello', 'world')):
    # Path objects are immutable, so both names can start from the same object (no copy needed)
    example_1, example_2 = input_path, input_path
    for sub_path in sub_paths:
        example_1 = example_1 / sub_path          # the two ways of joining are equivalent; I prefer the overloaded "/" for brevity
        example_2 = example_2.joinpath(sub_path)
    print(f'{"*"*50}\n'+f'{example_1}; {example_2}; '+f'\n{"*"*50}\n')

def make_sure_dir_exist(input_path: Path):
    input_path.mkdir(parents=True, exist_ok=True)   # a near-boilerplate call: parents=True creates missing parent folders, exist_ok=True does nothing if the folder already exists

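def cal_path_diff(path_1: Path, path_2: Path = None):
    '''
    Compute path_1 relative to path_2.
    :param path_1: the child path (the longer, more specific one)
    :param path_2: the ancestor path (the shorter one); defaults to the current working directory
    Only works when path_2 is an ancestor of path_1 (or equal to it); otherwise ValueError is raised.
    '''
    if path_2 is None:
        path_2 = Path.cwd()                 # resolve the default at call time, not at definition time
    diff = path_1.relative_to(path_2)
    print(str(diff), '\n')
    # The reverse direction, path_2.relative_to(path_1), raises ValueError whenever
    # path_2 is a strict ancestor of path_1: relative_to only goes from child to ancestor.
    return diff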

def find_files(input_path: Path, files_suffix: str = ".jpg"):
    '''
    Find files under a directory by wildcard pattern.
    :param input_path: the directory to search
    :param files_suffix: the suffix to match
    Try another suffix, no suffix at all, or another path, and step through it in a debugger.
    '''
    specimen_1 = input_path.glob(f'*{files_suffix}')        # non-recursive
    specimen_2 = input_path.glob(f'**/*{files_suffix}')     # recursive; same as input_path.rglob(f'*{files_suffix}')
    print(f'specimen_1:{list(specimen_1)} \n specimen_2:{list(specimen_2)}\n')

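if __name__ == '__main__':
    t_path = init_path()
    get_path_name_stem_suffix(t_path)
    get_path_status(input_path=t_path)
    get_father_and_local_dir(t_path)
    join_path(t_path)
    find_files(t_path.parent)                   # glob searches inside a directory, so pass the folder that contains the file
    cal_path_diff(t_path, Path("C:\\Windows"))  # raises ValueError because t_path is not under C:\Windows; drop the second argument and run from an ancestor folder of t_path to see it succeed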

Os Vs Path:

from pathlib import Path
import os

def get_base_name_vs_os(input_file_path: str) -> None:
    '''
    Print the full name, the name without its suffix, and the suffix of a file,
    once with os.path and once with pathlib.
    '''
    fpath_os = input_file_path
    fpath_path = Path(input_file_path)

    # In a debugger, type(fpath_path) shows <class 'pathlib.WindowsPath'> on Windows
    # (PosixPath on Linux/macOS).
    print(
        f'os method: {os.path.basename(fpath_os)}, {os.path.basename(os.path.splitext(fpath_os)[0])}, {os.path.splitext(fpath_os)[-1]}'
    )
    print(
        f'Path method: {fpath_path.name}, {fpath_path.stem}, {fpath_path.suffix}'
    )

if __name__ == '__main__':

    get_base_name_vs_os(r'E:\Learning\os_vs_path.py')
