ADVANCE Day32 - Tech Stack

@浙大疏锦行

📘 Day 32 Hands-On Assignment: A Programmer's Survival Guide - Reading the Official Documentation

1. Assignment Overview

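Core goal:

Break the habit of reaching for a search engine only after something goes wrong, and build the skill of consulting documentation proactively.

Instead of typing code mechanically, we simulate the real workflow of "meet a new library -> read the docs -> write the code" to train your ability to teach yourself.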

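Topics covered:

Introspection: use dir(), help(), and __doc__ to read documentation directly from code.
Reading documentation: understand a function's signature, its parameters and their types, and its return value.
Hands-on practice: learn to use a standard-library module you have never seen before, purely by reading its documentation.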

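An analogy:

Search engines / ChatGPT: like asking for directions. Someone tells you how to get there, but you only know that one route.
Official documentation: like a map. It takes more effort to read at first, but you learn the whole terrain and can go wherever you want.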

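Step 1: Offline Help - The Introspection Trio (dir, help, __doc__)

Scenario:

Sometimes the network is unreliable, or you simply don't want to switch to a browser. Python ships with a capable built-in manual.

dir(obj): What is inside this object? (Lists its attributes and methods.)
help(obj): How do I use it? (Prints the detailed documentation.)
obj.__doc__: A quick look at the summary (the raw docstring).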
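
As a small illustration of the trio, here is a minimal sketch that applies dir() and __doc__ to the built-in str type. The choice of str and str.upper is only an example, and the variable names are hypothetical. help() is left as a comment because it prints to the screen and returns None rather than giving you a value to work with.

```python
# dir() returns a plain list of names, so it can be filtered like any list.
# Names starting with an underscore are internal or "dunder" attributes.
public_names = [name for name in dir(str) if not name.startswith("_")]
print(public_names[:5])

# __doc__ is an ordinary string: slice it, search it, or take its first line.
doc = str.upper.__doc__
first_line = doc.strip().splitlines()[0]
print(first_line)

# help(str.upper) would print the same docstring in a formatted layout.
# It is meant for reading interactively, so it is not called here.
```

Filtering out the underscore-prefixed names is usually the quickest way to turn a long dir() listing into something readable.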

Tasks:

Import the math module.
Use dir() to see which tools it provides.
Use help() to read how math.log works (pay attention to the default value of base).
Following the documentation, compute $\log_2(16)$.


import math

# 1. See what is inside the math toolbox
print("--- Tools in the math module ---")
# Print only the first 10 names to avoid flooding the screen
print(dir(math)[:10])

# 2. See how the log function is used
print("\n--- Documentation for math.log ---")
help(math.log)

# 3. The key is to read the signature: log(x, [base=math.e])
# Without a second argument, the result is the natural logarithm ln(x)
# For log2, pass 2 as the second positional argument
# (math.log does not accept base as a keyword argument)

val = math.log(16, 2)
print(f"\nResult: log2(16) = {val}")

Output:

--- Tools in the math module ---
['__doc__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan']

--- Documentation for math.log ---
Help on built-in function log in module math:

log(...)
    log(x, [base=math.e])
    Return the logarithm of x to the given base.
    
    If the base not specified, returns the natural logarithm (base e) of x.


Result: log2(16) = 4.0
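
Two details are easy to miss when reading the math.log documentation, and the sketch below checks both. It is an illustration rather than part of the original assignment: the variable names are hypothetical, and math.log2 and math.log10 are standard-library functions that the original text does not mention.

```python
import math

# The base is a positional argument; naming it is rejected with TypeError.
keyword_error = None
try:
    math.log(16, base=2)
except TypeError as exc:
    keyword_error = str(exc)
    print(f"Keyword form rejected: {keyword_error}")

# Dedicated helpers exist for the two most common bases.
exact_log2 = math.log2(16)
exact_log10 = math.log10(1000)

# The general two-argument form goes through floating-point division,
# so its result may differ from the dedicated helper in the last digit.
general_log10 = math.log(1000, 10)

print(exact_log2, exact_log10, general_log10)
```

Scanning dir(math) for names that start with "log" is how you would find these helpers without leaving the interpreter.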

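Step 2: Documentation Drill - Reading a Function Signature

Scenario:

Now suppose we want to use the datetime module to parse a time string.

datetime.strptime is a classic function that is almost impossible to call correctly without reading the documentation.

You need to read the documentation for its format codes.

Official documentation excerpt (simulated):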

datetime.strptime(date_string, format)

%Y: Year with century (e.g. 2024)

%m: Month as a zero-padded decimal number (01, 02, ..., 12)

%d: Day of the month (01, 02, ..., 31)

%H: Hour (24-hour clock)

%M: Minute

Tasks:

Take the time string "2025-12-31 23:59".
Using the documentation excerpt above, build the correct format string.
Convert the string to a datetime object.


from datetime import datetime

date_str = "2025-12-31 23:59"

# ❌ Wrong (a common beginner mistake)
# dt = datetime.strptime(date_str)  # TypeError: the format argument is missing

# ✅ Correct (after reading the docs)
# 2025 -> %Y
# -    -> - (copied literally)
# 12   -> %m
# 31   -> %d
# 23   -> %H
# :    -> :
# 59   -> %M
fmt = "%Y-%m-%d %H:%M"

dt = datetime.strptime(date_str, fmt)
print(f"Parsed successfully: {dt}")
print(f"Type: {type(dt)}")
print(f"Year: {dt.year}")

Output:

Parsed successfully: 2025-12-31 23:59:00
Type: <class 'datetime.datetime'>
Year: 2025
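
To see why the format string has to match character for character, the following sketch shows two related behaviours. It is illustrative only: the slash-separated input string and the variable names are made up for this example. strftime uses the same format codes in the opposite direction, and strptime raises ValueError when the input does not match the format.

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M"
dt = datetime.strptime("2025-12-31 23:59", fmt)

# strftime is the reverse operation: datetime -> string, same format codes.
round_trip = dt.strftime(fmt)
print(round_trip)

# Literal characters must match too: slashes instead of hyphens fail.
mismatch_error = None
try:
    datetime.strptime("2025/12/31 23:59", fmt)
except ValueError as exc:
    mismatch_error = str(exc)
    print(f"Parsing failed: {mismatch_error}")
```

The error message quotes both the input and the format, which usually makes it easy to spot the character that does not match.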

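Step 3: Self-Study Challenge - Exploring New Territory

Scenario:

We want to solve a classic interview question: count how many times each word appears in a piece of text, and find the top 3.

You could write the loop and dictionary counting by hand, but Python's collections module has a purpose-built tool called Counter.

Assume you have never used it before, and complete the task by teaching yourself from the documentation (help / dir).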

Tasks:

Import it with from collections import Counter.
Use help(Counter) to find the constructor and the most_common method.
Count the word frequencies in text and print the 3 most frequent words.


from collections import Counter

text = "apple banana apple orange banana apple grape apple banana orange"
words = text.split()  # split into a list of words first

# 1. Self-study: construction
# The documentation shows that Counter(iterable) counts the elements automatically
counter = Counter(words)
print(f"Counter object: {counter}")

# 2. Self-study: how do we get the top K?
# dir(counter) reveals a method called most_common
# help(counter.most_common) shows how to use it
print("\n--- most_common docstring ---")
print(counter.most_common.__doc__)

# 3. Put it to use
top_3 = counter.most_common(3)
print(f"\nTop 3 most frequent words: {top_3}")

Output:

Counter object: Counter({'apple': 4, 'banana': 3, 'orange': 2, 'grape': 1})

--- most_common docstring ---
List the n most common elements and their counts from the most
        common to the least.  If n is None, then list all element counts.

        >>> Counter('abracadabra').most_common(3)
        [('a', 5), ('b', 2), ('r', 2)]

        

Top 3 most frequent words: [('apple', 4), ('banana', 3), ('orange', 2)]
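
For comparison, the hand-written approach mentioned in the scenario might look like the sketch below. This is one possible version, not the only way to write it. The names counts and top_3_manual are hypothetical, and sorting the dictionary items is simply the most direct way to get a top 3 without Counter.

```python
from collections import Counter

text = "apple banana apple orange banana apple grape apple banana orange"

# Manual counting: a plain dict plus dict.get with a default of 0.
counts = {}
for word in text.split():
    counts[word] = counts.get(word, 0) + 1

# Sort (word, count) pairs by count, highest first, and keep the first three.
top_3_manual = sorted(counts.items(), key=lambda item: item[1], reverse=True)[:3]
print(top_3_manual)

# Counter reaches the same answer in a single expression.
top_3_counter = Counter(text.split()).most_common(3)
print(top_3_counter)
```

Both versions give the same result on this text. The Counter version is shorter and leaves less room for mistakes in the counting loop.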

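🎓 Day 32 Summary: Mastering a Meta-Skill

What we practiced today was not any particular line of code but a meta-skill: learning how to learn.

RTFM (Read The Friendly Manual): don't be intimidated by dense English documentation. It is a love letter from the developers to you.
Accuracy: compared with second-hand code snippets found through search, the official documentation is the authoritative source, and help() describes the exact version installed on your machine.
Discoverability: with dir() you will often stumble on useful features you did not expect (such as Counter's most_common).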
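
One way to make the dir() exploration more targeted is to compare a class against one you already know. The sketch below is a suggestion rather than part of the original exercise, and the name extra is hypothetical. It lists the public names that Counter has and a plain dict does not.

```python
from collections import Counter

# Set difference: everything dir() reports for Counter but not for dict,
# with underscore-prefixed (internal) names filtered out.
extra = sorted(
    name
    for name in set(dir(Counter)) - set(dir(dict))
    if not name.startswith("_")
)
print(extra)
```

The exact list depends on your Python version, but most_common and elements should both appear in it, and each one can then be passed to help().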

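Next Level:

Now that you can read documentation, you are ready to take on the hard stuff.

When you face the enormous parameter lists of scikit-learn or matplotlib, remember today's method: help() first, then code.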
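
For long parameter lists, the standard-library inspect module can read a signature programmatically. The sketch below uses json.dumps as a stand-in, which is an assumption made for illustration: it is not part of the original lesson, and it was chosen because it needs no third-party install. The same calls should work on most functions written in Python, though some built-in functions implemented in C do not expose a signature.

```python
import inspect
import json

sig = inspect.signature(json.dumps)

# Print each parameter that has a default, one per line.
for name, param in sig.parameters.items():
    if param.default is not inspect.Parameter.empty:
        print(f"{name} = {param.default!r}")

# Keyword-only parameters must be passed by name, e.g. json.dumps(obj, indent=2).
keyword_only = [
    name
    for name, param in sig.parameters.items()
    if param.kind is inspect.Parameter.KEYWORD_ONLY
]
print(keyword_only)

# inspect.getdoc returns the cleaned-up docstring; its first line is the summary.
summary = inspect.getdoc(json.dumps).splitlines()[0]
print(summary)
```

Seeing the defaults one per line is often easier to scan than the single long signature line that help() prints.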