Background:
Earlier, I created one folder per exercise.
Each second-level folder contains 1-2 docx files.
In each of those folders, the docx whose name contains "5分钟" (5 minutes) is not needed.
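How can I batch-extract the docx files whose names do not contain "5分钟" from the 7 second-level folders, and then merge those 7 docx files into a single PDF?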
Code:
python
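'''
Read every second-level folder under the first-level folder and copy the docx files
they contain into the "整理" folder, which is itself a second-level folder of the same first-level folder.
Author: 阿夏 (AI对话大师)
Date: March 3, 2024
'''
import os
import shutil

print('----- 1. Copy the docx files without "5分钟" in their names from each second-level folder into the "整理" folder -----')
# First-level folder path
folder_path = r'D:\04三级操作题'
# Destination folder path
new_path = os.path.join(folder_path, '整理')
# Create the destination folder if it does not exist yet
os.makedirs(new_path, exist_ok=True)
# List every second-level folder in the first-level folder (this includes "整理")
subfolders = [f.path for f in os.scandir(folder_path) if f.is_dir()]
# Walk the second-level folders and copy their docx files to the destination
for subfolder in subfolders:
    # Skip the "整理" folder itself (compare this folder's name, not the whole list)
    if os.path.basename(subfolder) == '整理':
        continue
    docx_files = [f for f in os.listdir(subfolder) if f.endswith('.docx')]
    for file in docx_files:
        # Skip the docx files whose names contain "5分钟"
        if '5分钟' in file:
            continue
        source_file = os.path.join(subfolder, file)
        destination_file = os.path.join(new_path, file)
        shutil.copy2(source_file, destination_file)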
# Merge the 19 docx files into 1 PDF
In the PDF, each page holds one exercise set (because I had earlier formatted every docx so that its text fits on a single page).
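The post does not show how the merge itself was done, so the following is only a sketch of one way to script it, not the method used here. It assumes the third-party packages `docx2pdf` (which drives an installed copy of Microsoft Word) and `pypdf` (3.x or later); the function names `docx_in_order` and `merge_docx_to_pdf` are hypothetical.

```python
from pathlib import Path


def docx_in_order(folder):
    """Return the .docx files in `folder`, sorted by name.

    Word's temporary lock files (names starting with "~$") are skipped.
    """
    return sorted(
        p for p in Path(folder).glob("*.docx") if not p.name.startswith("~$")
    )


def merge_docx_to_pdf(folder, output_pdf):
    """Convert each .docx in `folder` to PDF, then append them into `output_pdf`."""
    # Third-party imports stay local so docx_in_order() works without them.
    from docx2pdf import convert  # assumption: Microsoft Word is installed
    from pypdf import PdfWriter   # assumption: pypdf 3.x or later

    writer = PdfWriter()
    for docx in docx_in_order(folder):
        pdf = docx.with_suffix(".pdf")
        convert(str(docx), str(pdf))  # writes one intermediate PDF per docx
        writer.append(str(pdf))       # appends every page of that PDF
    with open(output_pdf, "wb") as fh:
        writer.write(fh)
    writer.close()
```

A call such as `merge_docx_to_pdf(r'D:\04三级操作题\整理', r'D:\04三级操作题\合并.pdf')` would then produce the merged file; the output file name is made up for illustration. Pages follow file-name order, so the docx names decide the order of the exercise sets.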
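Remaining problem: after several docx files are merged into one PDF, the pages have no page numbers, so the printed sheets easily get out of order.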
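One possible way to address this, given here as an untested sketch rather than a solution from the post, is to stamp a footer such as "3 / 19" onto every page of the merged PDF. It assumes the third-party packages `pypdf` and `reportlab`, and pages that are unrotated with a media box starting at the origin; `page_label` and `add_page_numbers` are hypothetical names.

```python
import io


def page_label(index, total):
    """Footer text for the page at zero-based `index`, for example '3 / 19'."""
    return f"{index + 1} / {total}"


def add_page_numbers(src_pdf, dst_pdf):
    """Write a copy of `src_pdf` to `dst_pdf` with a centred footer on each page."""
    # Third-party imports stay local so page_label() works without them.
    from pypdf import PdfReader, PdfWriter
    from reportlab.pdfgen import canvas

    reader = PdfReader(src_pdf)
    writer = PdfWriter()
    total = len(reader.pages)
    for i, page in enumerate(reader.pages):
        width = float(page.mediabox.width)
        height = float(page.mediabox.height)
        # Draw the label on a blank, same-sized one-page PDF held in memory.
        buf = io.BytesIO()
        c = canvas.Canvas(buf, pagesize=(width, height))
        c.setFont("Helvetica", 10)
        c.drawCentredString(width / 2, 20, page_label(i, total))
        c.save()
        buf.seek(0)
        # Overlay the label page on top of the original page.
        page.merge_page(PdfReader(buf).pages[0])
        writer.add_page(page)
    with open(dst_pdf, "wb") as fh:
        writer.write(fh)
```

The label sits 20 points above the bottom edge; if the exercise text runs that low on the page, the offset would need adjusting.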