Python 3: File Operations

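- The basic workflow
- 1. Opening files: the open() function
  - Common file modes
- 2. Reading file contents
  - Reading the entire file
  - Reading line by line (ideal for large files)
  - Reading a fixed number of characters
- 3. Writing to a file
  - Writing strings
  - Writing multiple lines
  - Writing multiple lines with writelines()
- 4. Appending to a file
- 5. Moving the file pointer
- 6. Benefits of the with statement
- 7. Working with binary files
- 8. Practical examples
  - Example 1: Copying a file
  - Example 2: File content statistics
  - Example 3: A simple logger
  - Example 4: Processing CSV files
  - Example 5: Simple file encryption/decryption
- 9. Path tools: the os and pathlib modules
  - Using the os.path module
  - Using the modern pathlib module
- 10. Working with directories
  - Creating directories
  - Listing directory contents
  - Deleting files and directories
- 11. Moving, copying, and renaming files and directories
- 12. Temporary files and directories
- 13. A practical application: a simple backup script
- Summary: best practices for file operations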

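File operations are a fundamental programming skill, and Python provides simple yet powerful tools for them. With file operations you can read, write, and modify files, which lets a program store data persistently and share it.

The basic workflow

- Open the file
- Read or write data
- Close the file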
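
The three steps above can be sketched explicitly as follows. This is only a minimal illustration: the file name `demo.txt` and the scratch directory are assumptions made for the example, and real code would normally use the `with` statement covered later in this article instead of calling `close()` by hand.

```python
import os
import tempfile

# Hypothetical scratch location so the sketch does not touch real files.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Step 1: open the file (here for writing).
f = open(demo_path, "w", encoding="utf-8")
# Step 2: write data.
f.write("hello\n")
# Step 3: close the file so buffered data is flushed to disk.
f.close()

# The same three steps for reading.
f = open(demo_path, "r", encoding="utf-8")
text = f.read()
f.close()
```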

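1. Opening files: the open() function

file = open(filename, mode='r', encoding='utf-8')

Parameters:

- filename: path to the file
- mode: how to open the file (read, write, append, and so on)
- encoding: character encoding (such as 'utf-8'); pass it as a keyword argument, because the third positional parameter of open() is buffering, not encoding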

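Common file modes

| Mode | Description |
| --- | --- |
| `'r'` | Read-only (default) |
| `'w'` | Write (creates the file if needed and overwrites existing content) |
| `'a'` | Append (adds content at the end of the file) |
| `'x'` | Exclusive creation (fails if the file already exists) |
| `'b'` | Binary mode (combined with another mode, e.g. `'rb'`, `'wb'`) |
| `'t'` | Text mode (default) |
| `'+'` | Read and write (combined with another mode, e.g. `'r+'`, `'w+'`) |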
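
The difference between `'w'`, `'a'`, and `'x'` is easiest to see side by side. The sketch below is illustrative only; the file name `modes.txt` and the scratch directory are assumptions, not part of the original article.

```python
import os
import tempfile

# Hypothetical scratch file so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(path, "w", encoding="utf-8") as f:   # 'w' creates the file
    f.write("first\n")
with open(path, "w", encoding="utf-8") as f:   # 'w' again: old content is discarded
    f.write("second\n")
with open(path, "a", encoding="utf-8") as f:   # 'a' keeps existing content
    f.write("third\n")

try:
    open(path, "x", encoding="utf-8")          # 'x' refuses to open an existing file
    created = True
except FileExistsError:
    created = False

with open(path, "r", encoding="utf-8") as f:
    result = f.read()

print(repr(result))
print(created)
```

After this runs, the file holds only the second and third writes, and the `'x'` attempt fails because the file already exists.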

2. Reading file contents

Reading the entire file

# Method 1: read everything into a single string
with open('example.txt', 'r', encoding='utf-8') as file:
    content = file.read()
    print(content)

# Method 2: read all lines into a list
with open('example.txt', 'r', encoding='utf-8') as file:
    lines = file.readlines()
    for line in lines:
        print(line.rstrip('\n'))  # rstrip('\n') removes the trailing newline

Reading line by line (ideal for large files)

with open('example.txt', 'r', encoding='utf-8') as file:
    for line in file:  # the file object yields one line at a time
        print(line.rstrip('\n'))

Reading a fixed number of characters

with open('example.txt', 'r', encoding='utf-8') as file:
    # Read the first 10 characters
    chunk = file.read(10)
    print(chunk)

3. Writing to a file

Writing strings

with open('output.txt', 'w', encoding='utf-8') as file:
    file.write("This is the first line\n")
    file.write("This is the second line\n")

Writing multiple lines

lines = ["Line 1", "Line 2", "Line 3"]
with open('output.txt', 'w', encoding='utf-8') as file:
    for line in lines:
        file.write(line + '\n')

Writing multiple lines with writelines()

# writelines() does not add newlines, so each string must end with '\n'
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
with open('output.txt', 'w', encoding='utf-8') as file:
    file.writelines(lines)
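
`writelines()` accepts any iterable of strings, so the newline can be attached on the fly with a generator expression. A small sketch, assuming a hypothetical scratch file `names.txt`:

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "names.txt")
names = ["alpha", "beta", "gamma"]

with open(path, "w", encoding="utf-8") as f:
    # Append '\n' to each item while writing; no intermediate list is built.
    f.writelines(f"{name}\n" for name in names)

with open(path, "r", encoding="utf-8") as f:
    written = f.read()
```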

4. Appending to a file

with open('output.txt', 'a', encoding='utf-8') as file:
    file.write("This is an appended line\n")

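5. Moving the file pointer

with open('example.txt', 'r', encoding='utf-8') as file:
    # Read the first 5 characters
    print(file.read(5))

    # Get the current position (in text mode this is an opaque value, not a character count)
    position = file.tell()
    print(f"Current position: {position}")

    # Move back to the start of the file
    file.seek(0)
    print(f"Read after returning to the start: {file.read(5)}")

    # Return to the saved position; in text mode, seek() is only reliable
    # with 0 or a value previously returned by tell()
    file.seek(position)
    print(f"Read after returning to the saved position: {file.read(5)}")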
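
Arbitrary byte offsets are well defined only in binary mode, where `seek()` also accepts a second argument telling it what the offset is relative to. The sketch below assumes a hypothetical scratch file `pointer.bin` containing ten ASCII digits.

```python
import os
import tempfile

# Hypothetical scratch file containing the bytes 0123456789.
path = os.path.join(tempfile.mkdtemp(), "pointer.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")

with open(path, "rb") as f:
    f.seek(3)                  # absolute offset from the start of the file
    a = f.read(2)              # reads bytes 3 and 4
    f.seek(2, os.SEEK_CUR)     # skip 2 bytes forward from the current position
    b = f.read(1)              # reads byte 7
    f.seek(-3, os.SEEK_END)    # 3 bytes before the end of the file
    c = f.read()               # reads the last 3 bytes
    end = f.tell()             # position after reading to the end
```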

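6. Benefits of the with statement

Opening files with the with statement (a context manager) has several advantages:

- The file is closed automatically, even if an exception occurs
- The code is shorter and easier to read
- Resource leaks are avoided

# Not recommended: manual cleanup
file = open('example.txt', 'r', encoding='utf-8')
try:
    content = file.read()
    print(content)
finally:
    file.close()

# Recommended
with open('example.txt', 'r', encoding='utf-8') as file:
    content = file.read()
    print(content)
# The file is closed automatically here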
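
The claim that the file is closed even when an exception occurs can be checked directly through the file object's `closed` attribute. A minimal sketch, using a hypothetical scratch file and a simulated failure:

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("data\n")

try:
    with open(path, "r", encoding="utf-8") as f:
        # Simulate an error in the middle of processing.
        raise ValueError("simulated failure while processing")
except ValueError:
    pass

# The with block closed the file on the way out, despite the exception.
closed_after_error = f.closed
```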

7. Working with binary files

# Read a binary file
with open('image.jpg', 'rb') as file:
    binary_data = file.read()
    print(f"File size: {len(binary_data)} bytes")

# Write a binary file
with open('copy.jpg', 'wb') as file:
    file.write(binary_data)

8. Practical examples

Example 1: Copying a file

def copy_file(source, destination):
    try:
        with open(source, 'rb') as src_file, open(destination, 'wb') as dest_file:
            # Read 1 MB at a time so large files never sit fully in memory
            chunk_size = 1024 * 1024
            while True:
                chunk = src_file.read(chunk_size)
                if not chunk:
                    break
                dest_file.write(chunk)
        print(f"Copied {source} to {destination} successfully!")
        return True
    except Exception as e:
        print(f"Error while copying file: {e}")
        return False

# Copy a file with the function
copy_file('original.txt', 'backup.txt')

Example 2: File content statistics

from collections import Counter

def analyze_text_file(filename):
    try:
        with open(filename, 'r', encoding='utf-8') as file:
            content = file.read()

        # Basic statistics
        char_count = len(content)
        word_count = len(content.split())
        # splitlines() counts correctly whether or not the file ends with a newline
        line_count = len(content.splitlines())

        # Count how often each character occurs and take the 5 most common
        top_chars = Counter(content).most_common(5)

        # Print the results
        print(f"File: {filename}")
        print(f"Characters: {char_count}")
        print(f"Words: {word_count}")
        print(f"Lines: {line_count}")
        print("Most common characters:")
        for char, count in top_chars:
            if char == '\n':
                char_name = 'newline'
            elif char == ' ':
                char_name = 'space'
            elif char == '\t':
                char_name = 'tab'
            else:
                char_name = char
            print(f"  '{char_name}': {count} times")

    except Exception as e:
        print(f"Error while analyzing file: {e}")

# Analyze a text file
analyze_text_file('example.txt')

Example 3: A simple logger

import datetime

def log(message, log_file='app.log'):
    """Append a timestamped message to the log file."""
    timestamp = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    log_entry = f"[{timestamp}] {message}\n"

    with open(log_file, 'a', encoding='utf-8') as file:
        file.write(log_entry)

# Use the logger
log("Program started")
try:
    result = 10 / 0
except Exception as e:
    log(f"Error: {e}")
log("Program finished")

Example 4: Processing CSV files

import csv

# Read a CSV file
def read_csv(filename):
    data = []
    try:
        # newline='' lets the csv module handle line endings itself
        with open(filename, 'r', encoding='utf-8', newline='') as file:
            csv_reader = csv.reader(file)
            headers = next(csv_reader)  # first row is the header
            for row in csv_reader:
                data.append(row)
        print(f"Read {len(data)} rows of data")
        return headers, data
    except Exception as e:
        print(f"Error while reading CSV file: {e}")
        return None, None

# Write a CSV file
def write_csv(filename, headers, data):
    try:
        with open(filename, 'w', encoding='utf-8', newline='') as file:
            csv_writer = csv.writer(file)
            csv_writer.writerow(headers)  # write the header
            csv_writer.writerows(data)    # write the data rows
        print(f"Wrote {len(data)} rows of data")
        return True
    except Exception as e:
        print(f"Error while writing CSV file: {e}")
        return False

# Sample data
headers = ['name', 'age', 'city']
data = [
    ['张三', '25', 'Beijing'],
    ['李四', '30', 'Shanghai'],
    ['王五', '22', 'Guangzhou']
]

# Write the CSV file, then read it back
write_csv('people.csv', headers, data)
read_headers, read_data = read_csv('people.csv')
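
When rows are more naturally handled by column name than by position, the standard library's `csv.DictWriter` and `csv.DictReader` are an alternative to the list-based functions above. The following is a minimal sketch; the scratch file and the sample rows are assumptions made for illustration.

```python
import csv
import os
import tempfile

# Hypothetical scratch file and sample rows.
path = os.path.join(tempfile.mkdtemp(), "people.csv")
rows = [
    {"name": "Ann", "age": "25"},
    {"name": "Bob", "age": "30"},
]

with open(path, "w", encoding="utf-8", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "age"])
    writer.writeheader()    # header row comes from fieldnames
    writer.writerows(rows)  # each dict becomes one CSV row

with open(path, "r", encoding="utf-8", newline="") as f:
    # DictReader uses the first row as keys; all values come back as strings.
    loaded = list(csv.DictReader(f))
```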

Example 5: Simple file encryption/decryption

def xor_encrypt_decrypt(input_file, output_file, key):
    """
    Encrypt or decrypt a file with a simple XOR.
    Note: this is for demonstration only and is not suitable for real encryption needs.
    """
    try:
        with open(input_file, 'rb') as infile:
            data = infile.read()

        # Convert the key to bytes
        if isinstance(key, str):
            key = key.encode()

        # XOR each byte with the key (encryption and decryption are the same operation)
        encrypted_data = bytearray()
        for i, byte in enumerate(data):
            key_byte = key[i % len(key)]
            encrypted_data.append(byte ^ key_byte)

        with open(output_file, 'wb') as outfile:
            outfile.write(encrypted_data)

        print(f"Done: {input_file} -> {output_file}")
        return True
    except Exception as e:
        print(f"Error while processing file: {e}")
        return False

# Use XOR encryption/decryption
# Encrypt
xor_encrypt_decrypt('original.txt', 'encrypted.bin', 'secret_key')
# Decrypt (with the same key)
xor_encrypt_decrypt('encrypted.bin', 'decrypted.txt', 'secret_key')

9. Path tools: the os and pathlib modules

Using the os.path module

import os

# Get the current working directory
current_dir = os.getcwd()
print(f"Current directory: {current_dir}")

# Join path components
file_path = os.path.join(current_dir, 'data', 'example.txt')
print(f"File path: {file_path}")

# Check whether the path exists
if os.path.exists(file_path):
    print(f"{file_path} exists")
else:
    print(f"{file_path} does not exist")

# Distinguish files from directories
if os.path.isfile(file_path):
    print(f"{file_path} is a file")
elif os.path.isdir(file_path):
    print(f"{file_path} is a directory")

# Get file information
if os.path.exists(file_path):
    size = os.path.getsize(file_path)
    modified_time = os.path.getmtime(file_path)  # seconds since the epoch
    print(f"File size: {size} bytes")
    print(f"Modified time: {modified_time}")

# Split the path into directory and file name
dirname, filename = os.path.split(file_path)
print(f"Directory: {dirname}")
print(f"File name: {filename}")

# Split the file name and extension
name, ext = os.path.splitext(filename)
print(f"File name (without extension): {name}")
print(f"Extension: {ext}")

Using the modern pathlib module

from pathlib import Path

# Create a path object
current_dir = Path.cwd()
print(f"Current directory: {current_dir}")

# Build a path with the / operator
file_path = current_dir / 'data' / 'example.txt'
print(f"File path: {file_path}")

# Check whether the path exists
if file_path.exists():
    print(f"{file_path} exists")
else:
    print(f"{file_path} does not exist")

# Distinguish files from directories
if file_path.is_file():
    print(f"{file_path} is a file")
elif file_path.is_dir():
    print(f"{file_path} is a directory")

# Get file information (a single stat() call serves both values)
if file_path.exists():
    stats = file_path.stat()
    print(f"File size: {stats.st_size} bytes")
    print(f"Modified time: {stats.st_mtime}")

# Get path components
print(f"Parent directory: {file_path.parent}")
print(f"File name: {file_path.name}")
print(f"File name (without extension): {file_path.stem}")
print(f"Extension: {file_path.suffix}")

# Iterate over a directory
if current_dir.is_dir():
    print("Items in the directory:")
    for item in current_dir.iterdir():
        print(f"  {'directory' if item.is_dir() else 'file'}: {item.name}")
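
Beyond inspecting paths, `Path` objects can also read and write small files directly through `read_text()` and `write_text()`, which open and close the file internally. A minimal sketch, assuming a hypothetical scratch directory and file name:

```python
import tempfile
from pathlib import Path

# Hypothetical scratch directory for the demonstration.
base = Path(tempfile.mkdtemp())
note = base / "notes" / "todo.txt"

# Create the parent directory, then write and read the file in one call each.
note.parent.mkdir(parents=True, exist_ok=True)
note.write_text("buy milk\n", encoding="utf-8")
content = note.read_text(encoding="utf-8")

# with_suffix() only computes a new path; it does not rename anything on disk.
markdown_path = note.with_suffix(".md")
```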

10. Working with directories

Creating directories

import os
from pathlib import Path

# Using the os module
if not os.path.exists('new_folder'):
    os.mkdir('new_folder')  # create a single directory
    print("Directory created")

if not os.path.exists('path/to/nested/folder'):
    os.makedirs('path/to/nested/folder')  # create nested directories
    print("Nested directories created")

# Using the pathlib module
folder = Path('another_folder')
if not folder.exists():
    folder.mkdir()
    print("Another directory created")

nested_folder = Path('another/nested/folder')
if not nested_folder.exists():
    nested_folder.mkdir(parents=True)  # create nested directories
    print("Another set of nested directories created")
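
The check-then-create pattern above can be shortened with the `exist_ok=True` argument, which makes the call succeed silently when the directory is already there. A small sketch under the assumption of a hypothetical scratch directory:

```python
import os
import tempfile
from pathlib import Path

# Hypothetical scratch directory for the demonstration.
base = tempfile.mkdtemp()

# os.makedirs: calling it twice is safe with exist_ok=True.
target = os.path.join(base, "a", "b", "c")
os.makedirs(target, exist_ok=True)
os.makedirs(target, exist_ok=True)

# pathlib equivalent: parents=True creates intermediate directories.
nested = Path(base) / "x" / "y"
nested.mkdir(parents=True, exist_ok=True)
nested.mkdir(parents=True, exist_ok=True)
```

This form also avoids the small window between the existence check and the creation call, during which another process could create the same directory.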

Listing directory contents

import os
from pathlib import Path

# Using the os module
folder_path = 'data'
if os.path.exists(folder_path) and os.path.isdir(folder_path):
    print(f"Contents of {folder_path}:")

    # List files and directories
    items = os.listdir(folder_path)
    for item in items:
        item_path = os.path.join(folder_path, item)
        item_type = "directory" if os.path.isdir(item_path) else "file"
        print(f"  {item_type}: {item}")

    # List files only
    files = [f for f in os.listdir(folder_path) if os.path.isfile(os.path.join(folder_path, f))]
    print(f"Files in {folder_path}: {files}")

# Using the pathlib module
folder = Path('data')
if folder.exists() and folder.is_dir():
    print(f"Contents of {folder}:")

    # List everything
    for item in folder.iterdir():
        item_type = "directory" if item.is_dir() else "file"
        print(f"  {item_type}: {item.name}")

    # List files only
    files = [item.name for item in folder.iterdir() if item.is_file()]
    print(f"Files in {folder}: {files}")

    # Find specific files with a wildcard pattern
    txt_files = list(folder.glob('*.txt'))
    print(f"Text files in {folder}: {[f.name for f in txt_files]}")

    # Recursively find matching files in this directory and all subdirectories
    all_txt_files = list(folder.glob('**/*.txt'))
    print(f"Text files including subdirectories: {[f.name for f in all_txt_files]}")

Deleting files and directories

import os
import shutil
from pathlib import Path

# Delete a file
if os.path.exists('temp.txt'):
    os.remove('temp.txt')
    print("File deleted")

# Delete an empty directory
if os.path.exists('empty_folder') and os.path.isdir('empty_folder'):
    os.rmdir('empty_folder')  # only works on empty directories
    print("Empty directory deleted")

# Delete a non-empty directory (and everything in it)
if os.path.exists('non_empty_folder') and os.path.isdir('non_empty_folder'):
    shutil.rmtree('non_empty_folder')
    print("Non-empty directory and its contents deleted")

# Using pathlib (on Python 3.8+, unlink(missing_ok=True) makes the existence check unnecessary)
file_path = Path('another_temp.txt')
if file_path.exists():
    file_path.unlink()  # delete the file
    print("Another file deleted")

dir_path = Path('another_empty_folder')
if dir_path.exists() and dir_path.is_dir():
    dir_path.rmdir()  # only works on empty directories
    print("Another empty directory deleted")

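11. Moving, copying, and renaming files and directories

import os
import shutil
from pathlib import Path

# Rename a file or directory
if os.path.exists('old_name.txt'):
    os.rename('old_name.txt', 'new_name.txt')
    print("File renamed")

# Move a file or directory (the destination directory must already exist)
if os.path.exists('source_file.txt'):
    os.makedirs('destination', exist_ok=True)
    shutil.move('source_file.txt', 'destination/source_file.txt')
    print("File moved")

# Copy a file
if os.path.exists('original.txt'):
    shutil.copy('original.txt', 'copy.txt')  # copy content and permissions
    print("File copied")

    shutil.copy2('original.txt', 'copy2.txt')  # also preserve metadata such as timestamps
    print("File copied (with metadata)")

# Copy a directory
if os.path.exists('original_dir') and os.path.isdir('original_dir'):
    # Copies the whole tree; raises an error if 'copy_dir' already exists
    # (on Python 3.8+, dirs_exist_ok=True allows it)
    shutil.copytree('original_dir', 'copy_dir')
    print("Directory tree copied")

# Rename with pathlib
file_path = Path('old_path.txt')
if file_path.exists():
    file_path.rename('new_path.txt')
    print("File renamed with pathlib")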

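12. Temporary files and directories

import os
import shutil
import tempfile

# Create a temporary file in text mode (delete=False keeps it after the with block).
# A bytes literal such as b"..." may only contain ASCII characters, so non-ASCII
# content has to be written as text or encoded explicitly.
with tempfile.NamedTemporaryFile(mode='w', encoding='utf-8', delete=False) as temp_file:
    temp_name = temp_file.name
    temp_file.write("This is the content of the temporary file")
    print(f"Temporary file created at: {temp_name}")

# Use the temporary file
with open(temp_name, 'r', encoding='utf-8') as file:
    content = file.read()
    print(f"Temporary file content: {content}")

# Delete the temporary file
os.unlink(temp_name)
print("Temporary file deleted")

# Create a temporary directory
temp_dir = tempfile.mkdtemp()
print(f"Temporary directory created at: {temp_dir}")

# Create a file inside the temporary directory
temp_file_path = os.path.join(temp_dir, 'example.txt')
with open(temp_file_path, 'w', encoding='utf-8') as file:
    file.write("Content of a file in the temporary directory")

# Clean up the temporary directory
shutil.rmtree(temp_dir)
print("Temporary directory deleted")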
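
When the temporary directory is only needed inside one block of code, `tempfile.TemporaryDirectory` can take over the cleanup step shown above. A minimal sketch; the file name `example.txt` is an assumption for illustration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file created inside the temporary directory.
    file_path = os.path.join(tmp_dir, "example.txt")
    with open(file_path, "w", encoding="utf-8") as f:
        f.write("scratch data")
    existed_inside = os.path.exists(file_path)

# Leaving the with block removes the directory and everything in it.
existed_after = os.path.exists(tmp_dir)
```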

13. A practical application: a simple backup script

import os
import datetime
import zipfile

def backup_folder(source_dir, backup_dir=None):
    """
    Create a compressed backup of the given folder.

    Args:
        source_dir: the directory to back up
        backup_dir: where to store the backup file; defaults to the current directory

    Returns:
        backup_path: full path of the backup file, or None on failure
    """
    # The source must be an existing directory
    if not os.path.isdir(source_dir):
        print(f"Error: source directory '{source_dir}' does not exist")
        return None

    # Normalize the path so basename() works even with a trailing slash
    source_dir = os.path.abspath(source_dir)

    # Default to the current directory
    if backup_dir is None:
        backup_dir = os.getcwd()

    # Make sure the backup directory exists
    os.makedirs(backup_dir, exist_ok=True)

    try:
        # Build a timestamped backup file name
        source_name = os.path.basename(source_dir)
        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        backup_filename = f"{source_name}_backup_{timestamp}.zip"
        backup_path = os.path.join(backup_dir, backup_filename)

        # Create the ZIP file
        print(f"Creating backup: {backup_path}")
        parent_dir = os.path.dirname(source_dir)
        with zipfile.ZipFile(backup_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
            # Walk every file and subdirectory under the source directory
            for root, dirs, files in os.walk(source_dir):
                # Path relative to the parent of source_dir, so every archive
                # entry already starts with the source folder's name
                rel_path = os.path.relpath(root, parent_dir)

                # Add directory entries (this preserves empty directories)
                for dir_name in dirs:
                    dir_path = os.path.join(rel_path, dir_name)
                    zipf.write(os.path.join(root, dir_name), dir_path)

                # Add files
                for file_name in files:
                    file_path = os.path.join(rel_path, file_name)
                    zipf.write(os.path.join(root, file_name), file_path)

        print(f"Backup complete: {backup_path}")

        # Report the size of the backup file
        backup_size = os.path.getsize(backup_path)
        print(f"Backup file size: {backup_size / (1024*1024):.2f} MB")

        return backup_path

    except Exception as e:
        print(f"Error during backup: {e}")
        return None

# Use the backup function
if __name__ == "__main__":
    # Directory to back up
    source_directory = "important_data"

    # Directory that stores the backups (keep it outside the source directory)
    backup_directory = "backups"

    # Run the backup
    backup_file = backup_folder(source_directory, backup_directory)

    if backup_file:
        print(f"Backup file saved to: {backup_file}")
    else:
        print("Backup failed")

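Summary: best practices for file operations

- Always open files with the with statement so they are closed properly
- Specify the character encoding explicitly; utf-8 is the usual choice
- Handle likely exceptions with appropriate error handling
- For large files, process line by line or in chunks rather than loading the whole file at once
- Prefer the pathlib module (Python 3.4+) for path manipulation
- Use meaningful file names and paths that follow the application's structure
- Back up important files, especially before operations that may modify them
- Mind file permissions, especially across operating systems
- Verify the results of file operations to confirm that data was written or read correctly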
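
As one possible illustration of the error-handling advice, the sketch below catches the specific exceptions a read is likely to raise instead of a blanket `Exception`. The helper name `read_text_safely` and the scratch files are hypothetical, introduced only for this example.

```python
import os
import tempfile

def read_text_safely(path, encoding="utf-8"):
    """Return the file's text, or None if it cannot be read (hypothetical helper)."""
    try:
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    except FileNotFoundError:
        print(f"File not found: {path}")
    except PermissionError:
        print(f"Permission denied: {path}")
    except UnicodeDecodeError as e:
        print(f"Could not decode {path} as {encoding}: {e.reason}")
    return None

# Scratch files for the demonstration.
tmp_dir = tempfile.mkdtemp()
good_path = os.path.join(tmp_dir, "good.txt")
bad_path = os.path.join(tmp_dir, "bad.txt")

with open(good_path, "w", encoding="utf-8") as f:
    f.write("hello\n")
with open(bad_path, "wb") as f:
    f.write(b"\xff\xfe\xfa")  # bytes that are not valid UTF-8

good = read_text_safely(good_path)
bad = read_text_safely(bad_path)
missing = read_text_safely(os.path.join(tmp_dir, "missing.txt"))
```

Handling each case separately lets the caller see why a read failed, which a single generic handler tends to hide.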