Comprehensive Python Cheatsheet

#Match Statement

Executes the first block with a matching pattern. Added in Python 3.10.

match <object/expression>:
    case <pattern> [if <condition>]:
        <code>
    ...

Patterns

<value_pattern> = 1/'abc'/True/None/math.pi          # Matches the literal or a dotted name.
<class_pattern> = <type>()                           # Matches any object of that type.
<wildcard_patt> = _                                  # Matches any object.
<capture_patt>  = <name>                             # Matches any object and binds it to name.
<or_pattern>    = <pattern> | <pattern> [| ...]      # Matches any of the patterns.
<as_pattern>    = <pattern> as <name>                # Binds the match to the name.
<sequence_patt> = [<pattern>, ...]                   # Matches sequence with matching items.
<mapping_patt>  = {<value_pattern>: <pattern>, ...}  # Matches dictionary with matching items.
<class_pattern> = <type>(<attr_name>=<patt>, ...)    # Matches object with matching attributes.
  • Sequence pattern can also be written as a tuple.
  • Use '*<name>' and '**<name>' in sequence/mapping patterns to bind remaining items.
  • Sequence pattern must match all items, while mapping pattern does not.
  • Patterns can be surrounded with parentheses to override precedence ('|' > 'as' > ',').
  • Built-in types allow a single positional pattern that is matched against the entire object.
  • All names that are bound in the matching case, as well as variables initialized in its block, are visible after the match statement.

Example

>>> from pathlib import Path
>>> match Path('/home/gto/python-cheatsheet/README.md'):
...     case Path(
...         parts=['/', 'home', user, *_],
...         stem=stem,
...         suffix=('.md' | '.txt') as suffix
...     ) if stem.lower() == 'readme':
...         print(f'{stem}{suffix} is a readme file that belongs to user {user}.')
README.md is a readme file that belongs to user gto.

#Logging

import logging
logging.basicConfig(filename=<path>, level='DEBUG')  # Configures the root logger (see Setup).
logging.debug/info/warning/error/critical(<str>)     # Logs to the root logger.
<Logger> = logging.getLogger(__name__)               # Logger named after the module.
<Logger>.<level>(<str>)                              # Logs to the logger.
<Logger>.exception(<str>)                            # Error() that appends caught exception.

Setup

logging.basicConfig(
    filename=None,                                   # Logs to console (stderr) by default.
    format='%(levelname)s:%(name)s:%(message)s',     # Add '%(asctime)s' for local datetime.
    level=logging.WARNING,                           # Drops messages with lower priority.
    handlers=[logging.StreamHandler(sys.stderr)]     # Uses FileHandler if filename is set.
)
<Formatter> = logging.Formatter('<format>')          # Creates a Formatter.
<Handler> = logging.FileHandler(<path>, mode='a')    # Creates a Handler. Also `encoding=None`.
<Handler>.setFormatter(<Formatter>)                  # Adds Formatter to the Handler.
<Handler>.setLevel(<int/str>)                        # Processes all messages by default.
<Logger>.addHandler(<Handler>)                       # Adds Handler to the Logger.
<Logger>.setLevel(<int/str>)                         # What is sent to its/ancestors' handlers.
<Logger>.propagate = <bool>                          # Cuts off ancestors' handlers if false.
  • Parent logger can be specified by naming the child logger '<parent>.<name>'.
  • If logger doesn't have a set level, it inherits it from the first ancestor that does.
  • Formatter also accepts: pathname, filename, funcName, lineno, thread and process.
  • A 'handlers.RotatingFileHandler' creates and deletes log files based on 'maxBytes' and 'backupCount' arguments.
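
The parent/child naming and level inheritance described above can be checked with a short sketch. The logger names 'app' and 'app.db' are hypothetical:

```python
import logging

parent = logging.getLogger('app')          # Hypothetical logger names.
child  = logging.getLogger('app.db')       # Child of 'app' because of the dotted name.

parent.setLevel(logging.INFO)
print(child.level)                         # 0 (NOTSET): the child has no level of its own.
print(child.getEffectiveLevel())           # 20 (INFO): inherited from 'app'.
print(child.parent is parent)              # True

child.setLevel('DEBUG')                    # An explicit level overrides the inherited one.
print(child.getEffectiveLevel())           # 10 (DEBUG)
```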

Creates a logger that writes all messages to a file and sends them to the root's handler that prints warnings or higher:

>>> logger = logging.getLogger('my_module')
>>> handler = logging.FileHandler('test.log', encoding='utf-8')
>>> handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s:%(name)s:%(message)s'))
>>> logger.addHandler(handler)
>>> logger.setLevel('DEBUG')
>>> logging.basicConfig()
>>> logging.root.handlers[0].setLevel('WARNING')
>>> logger.critical('Running out of disk space.')
CRITICAL:my_module:Running out of disk space.
>>> print(open('test.log').read())
2023-02-07 23:21:01,430 CRITICAL:my_module:Running out of disk space.

#Introspection

<list> = dir()                             # Names of local variables, functions, classes, etc.
<dict> = vars()                            # Dict of local variables, etc. Also locals().
<dict> = globals()                         # Dict of global vars, etc. (incl. '__builtins__').
<list> = dir(<object>)                     # Names of object's attributes (including methods).
<dict> = vars(<object>)                    # Dict of writable attributes. Also <obj>.__dict__.
<bool> = hasattr(<object>, '<attr_name>')  # Checks if getattr() raises an AttributeError.
value  = getattr(<object>, '<attr_name>')  # Default value can be passed as the third argument.
setattr(<object>, '<attr_name>', value)    # Only works on objects with '__dict__' attribute.
delattr(<object>, '<attr_name>')           # Same. Also `del <object>.<attr_name>`.
<Sig>  = inspect.signature(<function>)     # Function's Signature object.
<dict> = <Sig>.parameters                  # Dict of Parameter objects.
<memb> = <Param>.kind                      # Member of ParameterKind enum.
<obj>  = <Param>.default                   # Default value or Parameter.empty.
<type> = <Param>.annotation                # Type or Parameter.empty.
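
A small sketch of the signature-related calls above. The 'greet' function is hypothetical and exists only to have something to inspect:

```python
import inspect

def greet(name, greeting='Hello', *, shout: bool = False):  # Hypothetical function.
    text = f'{greeting}, {name}!'
    return text.upper() if shout else text

sig = inspect.signature(greet)
for name, param in sig.parameters.items():
    has_default = param.default is not inspect.Parameter.empty
    print(name, param.kind.name, param.default if has_default else '<no default>')

# Maps each parameter name to the name of its kind.
summary = {name: param.kind.name for name, param in sig.parameters.items()}
```

Parameters without a default or annotation report the sentinel 'Parameter.empty' rather than None, so None can still be used as a real default value.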

#Coroutines

  • Coroutines have a lot in common with threads, but unlike threads, they only give up control when they call another coroutine and they don't use as much memory.
  • Coroutine definition starts with 'async' and its call with 'await'.
  • 'asyncio.run(<coroutine>)' is the main entry point for asynchronous programs.

import asyncio as aio
<coro> = <async_func>(<args>)             # Creates a coroutine.
<obj>  = await <coroutine>                # Starts the coroutine and returns result.
<task> = aio.create_task(<coroutine>)     # Schedules coroutine for execution.
<obj>  = await <task>                     # Returns result. Also <task>.cancel().
<coro> = aio.gather(<coro/task>, ...)     # Schedules coroutines. Returns results when awaited.
<coro> = aio.wait(<tasks>, ...)           # `aio.ALL/FIRST_COMPLETED`. Returns (done, pending).
<iter> = aio.as_completed(<coros/tasks>)  # Iter of coros. All return next result when awaited.
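
A minimal sketch of 'gather()' and 'create_task()' in use. The 'fetch' coroutine is hypothetical and only simulates waiting for I/O with 'asyncio.sleep()':

```python
import asyncio

async def fetch(name, delay):              # Hypothetical coroutine that simulates I/O.
    await asyncio.sleep(delay)
    return f'{name} done'

async def main():
    # gather() runs the coroutines concurrently and keeps the order of its arguments.
    results = await asyncio.gather(fetch('a', 0.02), fetch('b', 0.01))
    # create_task() schedules the coroutine right away; awaiting the task returns its result.
    task = asyncio.create_task(fetch('c', 0.01))
    results.append(await task)
    return results

results = asyncio.run(main())
print(results)                             # ['a done', 'b done', 'c done']
```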

Runs a terminal game where you control an asterisk that must avoid numbers:

import asyncio, collections, curses, curses.textpad, enum, random, time

P = collections.namedtuple('P', 'x y')    # Position
D = enum.Enum('D', 'n e s w')             # Direction
W, H = 15, 7                              # Width, Height

def main(screen):
    curses.curs_set(0)                    # Makes cursor invisible.
    screen.nodelay(True)                  # Makes getch() non-blocking.
    asyncio.run(main_coroutine(screen))   # Starts running asyncio code.

async def main_coroutine(screen):
    moves = asyncio.Queue()
    state = {'*': P(0, 0), **{id_: P(W//2, H//2) for id_ in range(10)}}
    ai    = [random_controller(id_, moves) for id_ in range(10)]
    mvc   = [human_controller(screen, moves), model(moves, state), view(state, screen)]
    tasks = [asyncio.create_task(cor) for cor in ai + mvc]
    await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)

async def random_controller(id_, moves):
    while True:
        d = random.choice(list(D))
        moves.put_nowait((id_, d))
        await asyncio.sleep(random.triangular(0.01, 0.65))

async def human_controller(screen, moves):
    while True:
        key_mappings = {258: D.s, 259: D.n, 260: D.w, 261: D.e}
        if d := key_mappings.get(screen.getch()):
            moves.put_nowait(('*', d))
        await asyncio.sleep(0.005)

async def model(moves, state):
    while state['*'] not in (state[id_] for id_ in range(10)):
        id_, d = await moves.get()
        deltas = {D.n: P(0, -1), D.e: P(1, 0), D.s: P(0, 1), D.w: P(-1, 0)}
        state[id_] = P((state[id_].x + deltas[d].x) % W, (state[id_].y + deltas[d].y) % H)

async def view(state, screen):
    offset = P(curses.COLS//2 - W//2, curses.LINES//2 - H//2)
    while True:
        screen.erase()
        curses.textpad.rectangle(screen, offset.y-1, offset.x-1, offset.y+H, offset.x+W)
        for id_, p in state.items():
            screen.addstr(offset.y + (p.y - state['*'].y + H//2) % H,
                          offset.x + (p.x - state['*'].x + W//2) % W, str(id_))
        screen.refresh()
        await asyncio.sleep(0.005)

if __name__ == '__main__':
    curses.wrapper(main)

Libraries

#Progress Bar

# $ pip3 install tqdm
>>> import tqdm, time
>>> for el in tqdm.tqdm([1, 2, 3], desc='Processing'):
...     time.sleep(1)
Processing: 100%|████████████████████| 3/3 [00:03<00:00,  1.00s/it]

#Plot

# $ pip3 install matplotlib
import matplotlib.pyplot as plt

plt.plot/bar/scatter(x_data, y_data [, label=<str>])  # Or: plt.plot(y_data)
plt.legend()                                          # Adds a legend.
plt.savefig(<path>)                                   # Saves the figure.
plt.show()                                            # Displays the figure.
plt.clf()                                             # Clears the figure.

#Table

Prints a CSV spreadsheet to the console:

# $ pip3 install tabulate
import csv, tabulate
with open('test.csv', encoding='utf-8', newline='') as file:
    rows = list(csv.reader(file))
print(tabulate.tabulate(rows, headers='firstrow'))

#Curses

Runs a basic file explorer in the console:

# $ pip3 install windows-curses  (only needed on Windows)
import curses, os
from curses import A_REVERSE, KEY_DOWN, KEY_UP, KEY_LEFT, KEY_RIGHT, KEY_ENTER

def main(screen):
    ch, first, selected, paths = 0, 0, 0, os.listdir()
    while ch != ord('q'):
        height, width = screen.getmaxyx()
        screen.erase()
        for y, filename in enumerate(paths[first : first+height]):
            color = A_REVERSE if filename == paths[selected] else 0
            screen.addnstr(y, 0, filename, width-1, color)
        ch = screen.getch()
        selected += (ch == KEY_DOWN) - (ch == KEY_UP)
        selected = max(0, min(len(paths)-1, selected))
        first += (selected >= first + height) - (selected < first)
        if ch in [KEY_LEFT, KEY_RIGHT, KEY_ENTER, ord('\n'), ord('\r')]:
            new_dir = '..' if ch == KEY_LEFT else paths[selected]
            if os.path.isdir(new_dir):
                os.chdir(new_dir)
                first, selected, paths = 0, 0, os.listdir()

if __name__ == '__main__':
    curses.wrapper(main)

#PySimpleGUI

A weight converter GUI application:

# $ pip3 install PySimpleGUI
import PySimpleGUI as sg

text_box = sg.Input(default_text='100', enable_events=True, key='-VALUE-')
dropdown = sg.InputCombo(['g', 'kg', 't'], 'kg', readonly=True, enable_events=True, k='-UNIT-')
label    = sg.Text('100 kg is 220.462 lbs.', key='-OUTPUT-')
button   = sg.Button('Close')
window   = sg.Window('Weight Converter', [[text_box, dropdown], [label], [button]])

while True:
    event, values = window.read()
    if event in [sg.WIN_CLOSED, 'Close']:
        break
    try:
        value = float(values['-VALUE-'])
    except ValueError:
        continue
    unit = values['-UNIT-']
    factors = {'g': 0.001, 'kg': 1, 't': 1000}
    lbs = value * factors[unit] / 0.45359237
    window['-OUTPUT-'].update(value=f'{value} {unit} is {lbs:g} lbs.')
window.close()

#Scraping

# $ pip3 install requests beautifulsoup4
import requests, bs4, os

response   = requests.get('https://en.wikipedia.org/wiki/Python_(programming_language)')
document   = bs4.BeautifulSoup(response.text, 'html.parser')
table      = document.find('table', class_='infobox vevent')
python_url = table.find('th', text='Website').next_sibling.a['href']
logo_url   = table.find('img')['src']
logo       = requests.get(f'https:{logo_url}').content
filename   = os.path.basename(logo_url)
with open(filename, 'wb') as file:
    file.write(logo)
print(f'{python_url}, file://{os.path.abspath(filename)}')

Selenium

Library for scraping websites with dynamic content.

# $ pip3 install selenium
from selenium import webdriver

<Drv> = webdriver.Chrome/Firefox/Safari/Edge()         # Opens the browser. Also <Drv>.quit().
<Drv>.get('<url>')                                     # Also <Drv>.implicitly_wait(seconds).
<El> = <Drv/El>.find_element('css selector', '<css>')  # '<tag>#<id>.<class>[<attr>="<val>"]'.
<list> = <Drv/El>.find_elements('xpath', '<xpath>')    # '//<tag>[@<attr>="<val>"]'.
<str> = <El>.get_attribute/get_property(<str>)         # Also <El>.text/tag_name.
<El>.click/clear()                                     # Also <El>.send_keys(<str>).

XPath is also available in the browser's console via '$x(<xpath>)' and through the lxml library:

<xpath>     = //<element>[/ or // <element>]           # Child: /, Descendant: //, Parent: /..
<xpath>     = //<element>/following::<element>         # Following elements. Also preceding/parent/...
<element>   = <tag><conditions><index>                 # `<tag> = */a/...`, `<index> = [1/2/...]`.
<condition> = [<sub_cond> [and/or <sub_cond>]]         # For negation use `not(<sub_cond>)`.
<sub_cond>  = @<attr>="<val>"                          # `.="<val>"` matches complete text.
<sub_cond>  = contains(@<attr>, "<val>")               # Is <val> a substring of attr's value?
<sub_cond>  = [//]<element>                            # Has matching child? Descendant if //.
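
A few of these expressions can be tried without a browser. The sketch below uses the standard library's 'xml.etree.ElementTree', which supports only a subset of XPath (no 'contains()' and no 'following::' axis, for example) and needs relative paths that start with '.'. The document is a hypothetical one made up for this illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical document used only for this illustration.
root = ET.fromstring('''
<html>
  <body>
    <div id="menu"><a href="/home">Home</a><a href="/docs" class="ext">Docs</a></div>
    <div id="main"><p>Intro</p><a href="/more">More</a></div>
  </body>
</html>''')

all_links   = root.findall('.//a')                    # Descendants at any depth.
menu_links  = root.findall('.//div[@id="menu"]/a')    # Children of the div with id 'menu'.
ext_links   = root.findall('.//a[@class="ext"]')      # Attribute condition.
divs_with_p = root.findall('.//div[p]')               # Divs that have a <p> child.
second_link = root.find('.//div[@id="menu"]/a[2]')    # Index is 1-based.

print([a.get('href') for a in all_links])             # ['/home', '/docs', '/more']
print(divs_with_p[0].get('id'), second_link.text)     # main Docs
```

For the full XPath syntax shown above, a library such as lxml or the browser console is needed.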

#Web

Flask is a micro web framework/server. If you just want to open an HTML file in a web browser, use 'webbrowser.open(<path>)' instead.

# $ pip3 install flask
import flask
app = flask.Flask(__name__)
app.run(host=None, port=None, debug=None)
  • Starts the app at 'http://localhost:5000'. Use 'host="0.0.0.0"' to run externally.
  • Install a WSGI server like Waitress and an HTTP server such as Nginx for better security.
  • Debug mode restarts the app whenever the script changes and displays errors in the browser.

Static Request

@app.route('/img/<path:filename>')
def serve_file(filename):
    return flask.send_from_directory('dirname/', filename)

Dynamic Request

@app.route('/<sport>')
def serve_html(sport):
    return flask.render_template_string('<h1>{{title}}</h1>', title=sport)
  • Use 'render_template(filename, <kwargs>)' to render a file located in the templates dir.
  • To return an error code use 'abort(<int>)' and to redirect use 'redirect(<url>)'.
  • 'request.args[<str>]' returns a parameter from the query string (URL part after '?').
  • Use 'session[key] = value' to store session data like username, etc.

REST Request

@app.post('/<sport>/odds')
def serve_json(sport):
    team = flask.request.form['team']
    return {'team': team, 'odds': [2.09, 3.74, 3.68]}

Starts the app in its own thread and queries its REST API:

# $ pip3 install requests
>>> import threading, requests
>>> threading.Thread(target=app.run, daemon=True).start()
>>> url = 'http://localhost:5000/football/odds'
>>> request_data = {'team': 'arsenal f.c.'}
>>> response = requests.post(url, data=request_data)
>>> response.json()
{'team': 'arsenal f.c.', 'odds': [2.09, 3.74, 3.68]}

#Profiling

from time import perf_counter
start_time = perf_counter()
...
duration_in_seconds = perf_counter() - start_time

Timing a Snippet

>>> from timeit import timeit
>>> timeit('list(range(10000))', number=1000, globals=globals(), setup='pass')
0.19373
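
When comparing alternatives, 'timeit.repeat()' with callables can be more convenient than code strings. A minimal sketch, assuming two hypothetical functions 'build_list' and 'build_set'; the printed numbers depend on the machine, so none are shown:

```python
from timeit import repeat

def build_list():                          # Hypothetical functions under test.
    return list(range(10_000))

def build_set():
    return set(range(10_000))

# repeat() returns one total time per run; the minimum is the least noisy estimate.
for func in (build_list, build_set):
    times = repeat(func, number=100, repeat=5)
    print(f'{func.__name__}: {min(times) / 100 * 1e6:.1f} µs per call')
```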

Profiling by Line

$ pip3 install line_profiler
$ echo '@profile
def main():
    a = list(range(10000))
    b = set(range(10000))
main()' > test.py
$ kernprof -lv test.py
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           @profile
     2                                           def main():
     3         1        253.4    253.4     32.2      a = list(range(10000))
     4         1        534.1    534.1     67.8      b = set(range(10000))

Call and Flame Graphs

$ apt/brew install graphviz && pip3 install gprof2dot snakeviz  # Or download installer.
$ tail -n +2 test.py > test.tmp && mv test.tmp test.py          # Removes first line.
$ python3 -m cProfile -o test.prof test.py                      # Runs built-in profiler.
$ gprof2dot --format=pstats test.prof | dot -T png -o test.png  # Generates call graph.
$ xdg-open/open test.png                                        # Displays call graph.
$ snakeviz test.prof                                            # Displays flame graph.
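
The built-in profiler can also be driven from Python instead of the command line. A minimal sketch using 'cProfile' and 'pstats', with the same workload as the 'main()' function in test.py:

```python
import cProfile, io, pstats

def main():                                # Same workload as main() in test.py.
    a = list(range(10000))
    b = set(range(10000))

profiler = cProfile.Profile()
profiler.enable()                          # Starts collecting call statistics.
main()
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)   # Prints at most five entries.
print(stream.getvalue())
```

Calling 'stats.dump_stats(<path>)' would write a file that tools like gprof2dot and snakeviz can read.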

Sampling and Memory Profilers

┏━━━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━┓
┃ pip3 install │   Type   │   Target   │          How to run           │ Live ┃
┠──────────────┼──────────┼────────────┼───────────────────────────────┼──────┨
┃ pyinstrument │ Sampling │    CPU     │ pyinstrument test.py          │  ×   ┃
┃ py-spy       │ Sampling │    CPU     │ py-spy top -- python3 test.py │  ✓   ┃
┃ scalene      │ Sampling │ CPU+Memory │ scalene test.py               │  ×   ┃
┃ memray       │ Tracing  │   Memory   │ memray run --live test.py     │  ✓   ┃
┗━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━┛

#NumPy

Array manipulation mini-language. It can run up to one hundred times faster than the equivalent Python code. An even faster alternative that runs on a GPU is called CuPy.

# $ pip3 install numpy
import numpy as np
<array> = np.array(<list/list_of_lists/...>)            # Returns a 1d/2d/... NumPy array.
<array> = np.zeros/ones/empty(<shape>)                  # Also np.full(<shape>, <el>).
<array> = np.arange(from_inc, to_exc, ±step)            # Also np.linspace(start, stop, len).
<array> = np.random.randint(from_inc, to_exc, <shape>)  # Also np.random.random(<shape>).
<view>  = <array>.reshape(<shape>)                      # Also `<array>.shape = <shape>`.
<array> = <array>.flatten()                             # Also `<view> = <array>.ravel()`.
<view>  = <array>.transpose()                           # Or: <array>.T
<array> = np.copy/abs/sqrt/log/int64(<array>)           # Returns new array of the same shape.
<array> = <array>.sum/max/mean/argmax/all(axis)         # Passed dimension gets aggregated.
<array> = np.apply_along_axis(<func>, axis, <array>)    # Func can return a scalar or array.
<array> = np.concatenate(<list_of_arrays>, axis=0)      # Links arrays along first axis (rows).
<array> = np.vstack/column_stack(<list_of_arrays>)      # Treats 1d arrays as rows or columns.
<array> = np.tile/repeat(<array>, <int/list> [, axis])  # Tiles array or repeats its elements.
  • Shape is a tuple of dimension sizes. A 100x50 RGB image has shape (50, 100, 3).
  • Axis is an index of a dimension. Leftmost dimension has index 0. Summing the RGB image along axis 2 will return a greyscale image with shape (50, 100).
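
The relation between shape and axis can be checked with a short sketch. The array below is a hypothetical stand-in for the 100x50 RGB image mentioned above:

```python
import numpy as np

image = np.zeros((50, 100, 3), dtype=np.uint8)   # Hypothetical 100x50 RGB image.
image[:, :, 0] = 200                             # Fills the red channel.

print(image.shape)                               # (50, 100, 3)
print(image.sum(axis=2).shape)                   # (50, 100): channels are aggregated.
print(image.sum(axis=0).shape)                   # (100, 3): rows are aggregated.
channel_means = image.mean(axis=(0, 1))          # One mean per channel.
```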

Indexing

<el>       = <2d_array>[row_index, column_index]        # <3d_a>[table_i, row_i, column_i]
<1d_view>  = <2d_array>[row_index]                      # <3d_a>[table_i, row_i]
<1d_view>  = <2d_array>[:, column_index]                # <3d_a>[table_i, :, column_i]
<2d_view>  = <2d_array>[rows_slice, columns_slice]      # <3d_a>[table_i, rows_s, columns_s]
<2d_array> = <2d_array>[row_indexes]                    # <3d_a>[table_i/is, row_is]
<2d_array> = <2d_array>[:, column_indexes]              # <3d_a>[table_i/is, :, column_is]
<1d_array> = <2d_array>[row_indexes, column_indexes]    # <3d_a>[table_i/is, row_is, column_is]
<1d_array> = <2d_array>[row_indexes, column_index]      # <3d_a>[table_i/is, row_is, column_i]
<2d_bools> = <2d_array> > <el/1d/2d_array>              # 1d_array must have size of a row.
<1d/2d_a>  = <2d_array>[<2d/1d_bools>]                  # 1d_bools must have size of a column.
  • Indexes should not be tuples because Python converts 'obj[i, j]' to 'obj[(i, j)]'!
  • ':' returns a slice of all dimension's indexes. Omitted dimensions default to ':'.
  • Any value that is broadcastable to the indexed shape can be assigned to the selection.
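
A short sketch of the difference between views and copies, and of assigning to a selection. The array 'a' is a hypothetical example:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)            # Hypothetical 3x4 array with values 0..11.

row_view = a[1]                            # Indexing with an int or slice returns a view.
row_view[0] = -1                           # Writes through to 'a'.
picked = a[[0, 2]]                         # Indexing with a list returns a copy.
picked[0, 0] = 99                          # Does not change 'a'.

a[:, 3] = 0                                # Scalar is broadcast to the whole column.
a[a < 0] = 100                             # Boolean mask selects the elements to overwrite.
print(a.tolist())                          # [[0, 1, 2, 0], [100, 5, 6, 0], [8, 9, 10, 0]]
```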

Broadcasting

Set of rules by which NumPy functions operate on arrays of different sizes and/or dimensions.

left  = [[0.1], [0.6], [0.8]]                           # Shape: (3, 1)
right = [ 0.1 ,  0.6 ,  0.8 ]                           # Shape: (3,)

1. If array shapes differ in length, left-pad the shorter shape with ones:

left  = [[0.1], [0.6], [0.8]]                           # Shape: (3, 1)
right = [[0.1 ,  0.6 ,  0.8]]                           # Shape: (1, 3) <- !

2. If any dimensions differ in size, expand the ones that have size 1 by duplicating their elements:

left  = [[0.1,  0.1,  0.1],                             # Shape: (3, 3) <- !
         [0.6,  0.6,  0.6],
         [0.8,  0.8,  0.8]]

right = [[0.1,  0.6,  0.8],                             # Shape: (3, 3) <- !
         [0.1,  0.6,  0.8],
         [0.1,  0.6,  0.8]]

Example

For each point returns index of its nearest point ([0.1, 0.6, 0.8] => [1, 2, 1]):

>>> points = np.array([0.1, 0.6, 0.8])
 [ 0.1,  0.6,  0.8]
>>> wrapped_points = points.reshape(3, 1)
[[ 0.1],
 [ 0.6],
 [ 0.8]]
>>> distances = wrapped_points - points
[[ 0. , -0.5, -0.7],
 [ 0.5,  0. , -0.2],
 [ 0.7,  0.2,  0. ]]
>>> distances = np.abs(distances)
[[ 0. ,  0.5,  0.7],
 [ 0.5,  0. ,  0.2],
 [ 0.7,  0.2,  0. ]]
>>> distances[range(3), range(3)] = np.inf
[[ inf,  0.5,  0.7],
 [ 0.5,  inf,  0.2],
 [ 0.7,  0.2,  inf]]
>>> distances.argmin(1)
[1, 2, 1]

#Image

# $ pip3 install pillow
from PIL import Image
<Image> = Image.new('<mode>', (width, height))  # Also `color=<int/tuple/str>`.
<Image> = Image.open(<path>)                    # Identifies format based on file contents.
<Image> = <Image>.convert('<mode>')             # Converts image to the new mode.
<Image>.save(<path>)                            # Selects format based on the path extension.
<Image>.show()                                  # Opens image in the default preview app.
<int/tuple> = <Image>.getpixel((x, y))          # Returns pixel's value (its color).
<Image>.putpixel((x, y), <int/tuple>)           # Updates pixel's value.
<ImagingCore> = <Image>.getdata()               # Returns a flattened view of pixel values.
<Image>.putdata(<list/ImagingCore>)             # Updates pixels with a copy of the sequence.
<Image>.paste(<Image>, (x, y))                  # Draws passed image at specified location.
<Image> = <Image>.filter(<Filter>)              # `<Filter> = ImageFilter.<name>([<args>])`
<Image> = <Enhance>.enhance(<float>)            # `<Enhance> = ImageEnhance.<name>(<Image>)`
<array> = np.array(<Image>)                     # Creates a 2d/3d NumPy array from the image.
<Image> = Image.fromarray(np.uint8(<array>))    # Use `<array>.clip(0, 255)` to clip values.

Modes

  • 'L' - 8-bit pixels, greyscale.
  • 'RGB' - 3x8-bit pixels, true color.
  • 'RGBA' - 4x8-bit pixels, true color with transparency mask.
  • 'HSV' - 3x8-bit pixels, Hue, Saturation, Value color space.

Examples

Creates a PNG image of a rainbow gradient:

WIDTH, HEIGHT = 100, 100
n_pixels = WIDTH * HEIGHT
hues = (255 * i/n_pixels for i in range(n_pixels))
img = Image.new('HSV', (WIDTH, HEIGHT))
img.putdata([(int(h), 255, 255) for h in hues])
img.convert('RGB').save('test.png')

Adds noise to the PNG image and displays it:

from random import randint
add_noise = lambda value: max(0, min(255, value + randint(-20, 20)))
img = Image.open('test.png').convert('HSV')
img.putdata([(add_noise(h), s, v) for h, s, v in img.getdata()])
img.show()

Image Draw

from PIL import ImageDraw
<ImageDraw> = ImageDraw.Draw(<Image>)           # Object for adding 2D graphics to the image.
<ImageDraw>.point((x, y))                       # Draws a point. Truncates floats into ints.
<ImageDraw>.line((x1, y1, x2, y2 [, ...]))      # To get anti-aliasing use Image's resize().
<ImageDraw>.arc((x1, y1, x2, y2), deg1, deg2)   # Always draws in clockwise direction.
<ImageDraw>.rectangle((x1, y1, x2, y2))         # To rotate use Image's rotate() and paste().
<ImageDraw>.polygon((x1, y1, x2, y2, ...))      # Last point gets connected to the first.
<ImageDraw>.ellipse((x1, y1, x2, y2))           # To rotate use Image's rotate() and paste().
<ImageDraw>.text((x, y), <str>, font=<Font>)    # `<Font> = ImageFont.truetype(<path>, size)`
  • Use 'fill=<color>' to set the primary color.
  • Use 'width=<int>' to set the width of lines or contours.
  • Use 'outline=<color>' to set the color of the contours.
  • Color can be an int, tuple, '#rrggbb[aa]' string or a color name.

#Animation

Creates a GIF of a bouncing ball:

# $ pip3 install imageio
from PIL import Image, ImageDraw
import imageio

WIDTH, HEIGHT, R = 126, 126, 10
frames = []
for velocity in range(1, 16):
    y = sum(range(velocity))
    frame = Image.new('L', (WIDTH, HEIGHT))
    draw = ImageDraw.Draw(frame)
    draw.ellipse((WIDTH/2-R, y, WIDTH/2+R, y+R*2), fill='white')
    frames.append(frame)
frames += reversed(frames[1:-1])
imageio.mimsave('test.gif', frames, duration=0.03)

#Audio

import wave
<Wave>  = wave.open('<path>', 'rb')   # Opens the WAV file.
<int>   = <Wave>.getframerate()       # Returns number of frames per second.
<int>   = <Wave>.getnchannels()       # Returns number of samples per frame.
<int>   = <Wave>.getsampwidth()       # Returns number of bytes per sample.
<tuple> = <Wave>.getparams()          # Returns namedtuple of all parameters.
<bytes> = <Wave>.readframes(nframes)  # Returns next n frames. All if -1.
<Wave> = wave.open('<path>', 'wb')    # Creates/truncates a file for writing.
<Wave>.setframerate(<int>)            # Pass 44100 for CD, 48000 for video.
<Wave>.setnchannels(<int>)            # Pass 1 for mono, 2 for stereo.
<Wave>.setsampwidth(<int>)            # Pass 2 for CD, 3 for hi-res sound.
<Wave>.setparams(<tuple>)             # Sets all parameters.
<Wave>.writeframes(<bytes>)           # Appends frames to the file.
  • Bytes object contains a sequence of frames, each consisting of one or more samples.
  • In a stereo signal, the first sample of a frame belongs to the left channel.
  • Each sample consists of one or more bytes that, when converted to an integer, indicate the displacement of a speaker membrane at a given moment.
  • If sample width is one byte, then the integer should be encoded unsigned.
  • For all other sizes, the integer should be encoded signed with little-endian byte order.

Sample Values

┏━━━━━━━━━━━┯━━━━━━━━━━━┯━━━━━━┯━━━━━━━━━━━┓
┃ sampwidth │    min    │ zero │    max    ┃
┠───────────┼───────────┼──────┼───────────┨
┃     1     │         0 │  128 │       255 ┃
┃     2     │    -32768 │    0 │     32767 ┃
┃     3     │  -8388608 │    0 │   8388607 ┃
┗━━━━━━━━━━━┷━━━━━━━━━━━┷━━━━━━┷━━━━━━━━━━━┛
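
The encoding rules behind this table (unsigned for one-byte samples, signed little-endian otherwise) can be sketched with 'int.to_bytes()' and 'int.from_bytes()'. The helper names 'encode_sample' and 'decode_sample' are hypothetical:

```python
def encode_sample(value, sampwidth):
    """Encodes an integer sample the way WAV files store it."""
    return value.to_bytes(sampwidth, 'little', signed=(sampwidth != 1))

def decode_sample(bytes_obj):
    """Decodes a single sample; its width is the length of the bytes object."""
    return int.from_bytes(bytes_obj, 'little', signed=(len(bytes_obj) != 1))

print(encode_sample(128, 1))               # b'\x80' (zero level of 8-bit audio)
print(encode_sample(-32768, 2))            # b'\x00\x80' (lowest 16-bit value)
print(encode_sample(8388607, 3))           # b'\xff\xff\x7f' (highest 24-bit value)
print(decode_sample(b'\xff\x7f'))          # 32767
```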

Read Float Samples from WAV File

def read_wav_file(filename):
    def get_int(bytes_obj):
        an_int = int.from_bytes(bytes_obj, 'little', signed=(sampwidth != 1))
        return an_int - 128 * (sampwidth == 1)
    with wave.open(filename, 'rb') as file:
        sampwidth = file.getsampwidth()
        frames = file.readframes(-1)
    bytes_samples = (frames[i : i+sampwidth] for i in range(0, len(frames), sampwidth))
    return [get_int(b) / pow(2, sampwidth * 8 - 1) for b in bytes_samples]

Write Float Samples to WAV File

def write_to_wav_file(filename, float_samples, nchannels=1, sampwidth=2, framerate=44100):
    def get_bytes(a_float):
        a_float = max(-1, min(1 - 2e-16, a_float))
        a_float += sampwidth == 1
        a_float *= pow(2, sampwidth * 8 - 1)
        return int(a_float).to_bytes(sampwidth, 'little', signed=(sampwidth != 1))
    with wave.open(filename, 'wb') as file:
        file.setnchannels(nchannels)
        file.setsampwidth(sampwidth)
        file.setframerate(framerate)
        file.writeframes(b''.join(get_bytes(f) for f in float_samples))

Examples

Saves a 440 Hz sine wave to a mono WAV file:

from math import pi, sin
samples_f = (sin(i * 2 * pi * 440 / 44100) for i in range(100_000))
write_to_wav_file('test.wav', samples_f)

Adds noise to the mono WAV file:

from random import random
add_noise = lambda value: value + (random() - 0.5) * 0.03
samples_f = (add_noise(f) for f in read_wav_file('test.wav'))
write_to_wav_file('test.wav', samples_f)

Plays the WAV file:

# $ pip3 install simpleaudio
from simpleaudio import play_buffer
with wave.open('test.wav', 'rb') as file:
    p = file.getparams()
    frames = file.readframes(-1)
    play_buffer(frames, p.nchannels, p.sampwidth, p.framerate).wait_done()

Text to Speech

# $ pip3 install pyttsx3
import pyttsx3
engine = pyttsx3.init()
engine.say('Sally sells seashells by the seashore.')
engine.runAndWait()

[#](#Synthesizer "#synthesizer")Synthesizer

Plays Popcorn by Gershon Kingsley:

python
# $ pip3 install simpleaudio
import array, itertools as it, math, simpleaudio

F  = 44100
P1 = '71♩,69♪,,71♩,66♪,,62♩,66♪,,59♩,,,71♩,69♪,,71♩,66♪,,62♩,66♪,,59♩,,,'
P2 = '71♩,73♪,,74♩,73♪,,74♪,,71♪,,73♩,71♪,,73♪,,69♪,,71♩,69♪,,71♪,,67♪,,71♩,,,'
get_pause   = lambda seconds: it.repeat(0, int(seconds * F))
sin_f       = lambda i, hz: math.sin(i * 2 * math.pi * hz / F)
get_wave    = lambda hz, seconds: (sin_f(i, hz) for i in range(int(seconds * F)))
get_hz      = lambda note: 8.176 * 2 ** (int(note[:2]) / 12)
get_sec     = lambda note: 1/4 if '♩' in note else 1/8
get_samples = lambda note: get_wave(get_hz(note), get_sec(note)) if note else get_pause(1/8)
samples_f   = it.chain.from_iterable(get_samples(n) for n in (P1+P2).split(','))
samples_i   = array.array('h', (int(f * 30000) for f in samples_f))
simpleaudio.play_buffer(samples_i, 1, 2, F).wait_done()
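
The note tokens in P1 and P2 combine a two-digit note number with a duration symbol. A minimal sketch of how one token maps to a frequency and a length, using the same constants as `get_hz` and `get_sec` above; the function name `parse_note` is a hypothetical helper, not part of the cheatsheet.

```python
def parse_note(note):
    """Splits a token such as '71♩' into (frequency_hz, duration_s). Hypothetical helper."""
    number = int(note[:2])                      # First two characters hold the note number.
    hz = 8.176 * 2 ** (number / 12)             # Twelve steps up doubles the frequency.
    seconds = 1/4 if '♩' in note else 1/8       # Quarter note, otherwise eighth note.
    return hz, seconds

a4_hz, a4_sec = parse_note('69♪')               # Note 69 lands close to 440 Hz.
octave_ratio = parse_note('81♪')[0] / a4_hz     # Twelve steps higher, so roughly 2.
```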

[#](#Pygame "#pygame")Pygame

python
# $ pip3 install pygame
import pygame as pg

pg.init()
screen = pg.display.set_mode((500, 500))
rect = pg.Rect(240, 240, 20, 20)
while not pg.event.get(pg.QUIT):
    deltas = {pg.K_UP: (0, -20), pg.K_RIGHT: (20, 0), pg.K_DOWN: (0, 20), pg.K_LEFT: (-20, 0)}
    for event in pg.event.get(pg.KEYDOWN):
        dx, dy = deltas.get(event.key, (0, 0))
        rect = rect.move((dx, dy))
    screen.fill((0, 0, 0))
    pg.draw.rect(screen, (255, 255, 255), rect)
    pg.display.flip()

Rectangle

Object for storing rectangular coordinates.

python
<Rect> = pg.Rect(x, y, width, height)           # Floats get truncated into ints.
<int>  = <Rect>.x/y/centerx/centery/...         # Top, right, bottom, left. Allows assignments.
<tup.> = <Rect>.topleft/center/...              # Topright, bottomright, bottomleft. Same.
<Rect> = <Rect>.move((delta_x, delta_y))        # Use move_ip() to move in-place.
<bool> = <Rect>.collidepoint((x, y))            # Checks if rectangle contains the point.
<bool> = <Rect>.colliderect(<Rect>)             # Checks if the two rectangles overlap.
<int>  = <Rect>.collidelist(<list_of_Rect>)     # Returns index of first colliding Rect or -1.
<list> = <Rect>.collidelistall(<list_of_Rect>)  # Returns indexes of all colliding rectangles.

Surface

Object for representing images.

python
<Surf> = pg.display.set_mode((width, height))   # Opens new window and returns its surface.
<Surf> = pg.Surface((width, height))            # New RGB surface. RGBA if `flags=pg.SRCALPHA`.
<Surf> = pg.image.load(<path/file>)             # Loads the image. Format depends on source.
<Surf> = pg.surfarray.make_surface(<np_array>)  # Also `<np_arr> = surfarray.pixels3d(<Surf>)`.
<Surf> = <Surf>.subsurface(<Rect>)              # Creates a new surface from the cutout.
<Surf>.fill(color)                              # Tuple, Color('#rrggbb[aa]') or Color(<name>).
<Surf>.set_at((x, y), color)                    # Updates pixel. Also <Surf>.get_at((x, y)).
<Surf>.blit(<Surf>, (x, y))                     # Draws passed surface at specified location.
from pygame.transform import scale, ...
<Surf> = scale(<Surf>, (width, height))         # Returns scaled surface.
<Surf> = rotate(<Surf>, anticlock_degrees)      # Returns rotated and scaled surface.
<Surf> = flip(<Surf>, x_bool, y_bool)           # Returns flipped surface.
from pygame.draw import line, ...
line(<Surf>, color, (x1, y1), (x2, y2), width)  # Draws a line to the surface.
arc(<Surf>, color, <Rect>, from_rad, to_rad)    # Also ellipse(<Surf>, color, <Rect>, width=0).
rect(<Surf>, color, <Rect>, width=0)            # Also polygon(<Surf>, color, points, width=0).

Font

python
<Font> = pg.font.Font(<path/file>, size)        # Loads TTF file. Pass None for default font.
<Surf> = <Font>.render(text, antialias, color)  # Background color can be specified at the end.

Sound

python
<Sound> = pg.mixer.Sound(<path/file/bytes>)     # WAV file or bytes/array of signed shorts.
<Sound>.play/stop()                             # Also set_volume(<float>), fadeout(msec).

Basic Mario Brothers Example

python
import collections, dataclasses, enum, io, itertools as it, pygame as pg, urllib.request
from random import randint

P = collections.namedtuple('P', 'x y')          # Position
D = enum.Enum('D', 'n e s w')                   # Direction
W, H, MAX_S = 50, 50, P(5, 10)                  # Width, Height, Max speed

def main():
    def get_screen():
        pg.init()
        return pg.display.set_mode((W*16, H*16))
    def get_images():
        url = 'https://gto76.github.io/python-cheatsheet/web/mario_bros.png'
        img = pg.image.load(io.BytesIO(urllib.request.urlopen(url).read()))
        return [img.subsurface(get_rect(x, 0)) for x in range(img.get_width() // 16)]
    def get_mario():
        Mario = dataclasses.make_dataclass('Mario', 'rect spd facing_left frame_cycle'.split())
        return Mario(get_rect(1, 1), P(0, 0), False, it.cycle(range(3)))
    def get_tiles():
        border = [(x, y) for x in range(W) for y in range(H) if x in [0, W-1] or y in [0, H-1]]
        platforms = [(randint(1, W-2), randint(2, H-2)) for _ in range(W*H // 10)]
        return [get_rect(x, y) for x, y in border + platforms]
    def get_rect(x, y):
        return pg.Rect(x*16, y*16, 16, 16)
    run(get_screen(), get_images(), get_mario(), get_tiles())

def run(screen, images, mario, tiles):
    clock = pg.time.Clock()
    pressed = set()
    while not pg.event.get(pg.QUIT) and clock.tick(28):
        keys = {pg.K_UP: D.n, pg.K_RIGHT: D.e, pg.K_DOWN: D.s, pg.K_LEFT: D.w}
        pressed |= {keys.get(e.key) for e in pg.event.get(pg.KEYDOWN)}
        pressed -= {keys.get(e.key) for e in pg.event.get(pg.KEYUP)}
        update_speed(mario, tiles, pressed)
        update_position(mario, tiles)
        draw(screen, images, mario, tiles, pressed)

def update_speed(mario, tiles, pressed):
    x, y = mario.spd
    x += 2 * ((D.e in pressed) - (D.w in pressed))
    x += (x < 0) - (x > 0)
    y += 1 if D.s not in get_boundaries(mario.rect, tiles) else (D.n in pressed) * -10
    mario.spd = P(x=max(-MAX_S.x, min(MAX_S.x, x)), y=max(-MAX_S.y, min(MAX_S.y, y)))

def update_position(mario, tiles):
    x, y = mario.rect.topleft
    n_steps = max(abs(s) for s in mario.spd)
    for _ in range(n_steps):
        mario.spd = stop_on_collision(mario.spd, get_boundaries(mario.rect, tiles))
        mario.rect.topleft = x, y = x + (mario.spd.x / n_steps), y + (mario.spd.y / n_steps)

def get_boundaries(rect, tiles):
    deltas = {D.n: P(0, -1), D.e: P(1, 0), D.s: P(0, 1), D.w: P(-1, 0)}
    return {d for d, delta in deltas.items() if rect.move(delta).collidelist(tiles) != -1}

def stop_on_collision(spd, bounds):
    return P(x=0 if (D.w in bounds and spd.x < 0) or (D.e in bounds and spd.x > 0) else spd.x,
             y=0 if (D.n in bounds and spd.y < 0) or (D.s in bounds and spd.y > 0) else spd.y)

def draw(screen, images, mario, tiles, pressed):
    def get_marios_image_index():
        if D.s not in get_boundaries(mario.rect, tiles):
            return 4
        return next(mario.frame_cycle) if {D.w, D.e} & pressed else 6
    screen.fill((85, 168, 255))
    mario.facing_left = (D.w in pressed) if {D.w, D.e} & pressed else mario.facing_left
    screen.blit(images[get_marios_image_index() + mario.facing_left * 9], mario.rect)
    for t in tiles:
        screen.blit(images[18 if t.x in [0, (W-1)*16] or t.y in [0, (H-1)*16] else 19], t)
    pg.display.flip()

if __name__ == '__main__':
    main()
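
The collision rule in `stop_on_collision()` can be tried in isolation, since it depends only on the `P` and `D` definitions and not on Pygame. The sketch below copies those three definitions from the example above; the sample speeds and boundary sets are made up for illustration.

```python
import collections, enum

P = collections.namedtuple('P', 'x y')          # Position or speed.
D = enum.Enum('D', 'n e s w')                   # Direction.

def stop_on_collision(spd, bounds):
    # Zeroes a speed component only when it points into a blocked side.
    return P(x=0 if (D.w in bounds and spd.x < 0) or (D.e in bounds and spd.x > 0) else spd.x,
             y=0 if (D.n in bounds and spd.y < 0) or (D.s in bounds and spd.y > 0) else spd.y)

falling_right = P(x=3, y=5)                                    # Moving right and down.
landed   = stop_on_collision(falling_right, {D.s})             # Floor below: only y is zeroed.
cornered = stop_on_collision(falling_right, {D.s, D.e})        # Floor and wall: both are zeroed.
jumping  = stop_on_collision(P(x=0, y=-10), {D.s})             # Moving away from floor: unchanged.
```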

[#](#Pandas "#pandas")Pandas

python
# $ pip3 install pandas matplotlib
import pandas as pd, matplotlib.pyplot as plt

Series

Ordered dictionary with a name.

python
>>> pd.Series([1, 2], index=['x', 'y'], name='a')
x    1
y    2
Name: a, dtype: int64
<Sr> = pd.Series(<list>)                       # Assigns RangeIndex starting at 0.
<Sr> = pd.Series(<dict>)                       # Takes dictionary's keys for index.
<Sr> = pd.Series(<dict/Series>, index=<list>)  # Only keeps items with keys specified in index.
<el> = <Sr>.loc[key]                           # Or: <Sr>.iloc[index]
<Sr> = <Sr>.loc[keys]                          # Or: <Sr>.iloc[indexes]
<Sr> = <Sr>.loc[from_key : to_key_inclusive]   # Or: <Sr>.iloc[from_i : to_i_exclusive]
<el> = <Sr>[key/index]                         # Or: <Sr>.key
<Sr> = <Sr>[keys/indexes]                      # Or: <Sr>[<keys_slice/slice>]
<Sr> = <Sr>[bools]                             # Or: <Sr>.loc/iloc[bools]
<Sr> = <Sr> > <el/Sr>                          # Returns a Series of bools.
<Sr> = <Sr> + <el/Sr>                          # Items with non-matching keys get value NaN.
<Sr> = pd.concat(<coll_of_Sr>)                 # Concats multiple series into one long Series.
<Sr> = <Sr>.combine_first(<Sr>)                # Adds items that are not yet present.
<Sr>.update(<Sr>)                              # Updates items that are already present.
<Sr>.plot.line/area/bar/pie/hist()             # Generates a Matplotlib plot.
plt.show()                                     # Displays the plot. Also plt.savefig(<path>).
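
A short sketch of the key-alignment behaviour listed above, using two made-up Series whose keys only partly overlap. It assumes nothing beyond pandas itself.

```python
import pandas as pd

a = pd.Series([1, 2], index=['x', 'y'])
b = pd.Series([10, 20], index=['y', 'z'])

total  = a + b                     # Keys are aligned first: only 'y' exists in both.
filled = a.combine_first(b)        # Keeps a's items and adds 'z' from b.
sliced = filled.loc['x':'y']       # Label slices include the end key.
```

Only 'y' gets a numeric sum in `total`; 'x' and 'z' become NaN because each is missing from one operand.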

Series --- Aggregate, Transform, Map:

python
<el> = <Sr>.sum/max/mean/idxmax/all()          # Or: <Sr>.agg(lambda <Sr>: <el>)
<Sr> = <Sr>.rank/diff/cumsum/ffill/interpolate()  # Or: <Sr>.agg/transform(lambda <Sr>: <Sr>)
<Sr> = <Sr>.fillna(<el>)                       # Or: <Sr>.agg/transform/map(lambda <el>: <el>)
>>> sr = pd.Series([2, 3], index=['x', 'y'])
x    2
y    3
┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┓
┃               │    'sum'    │   ['sum']   │ {'s': 'sum'}  ┃
┠───────────────┼─────────────┼─────────────┼───────────────┨
┃ sr.apply(...) │      5      │    sum  5   │     s  5      ┃
┃ sr.agg(...)   │             │             │               ┃
┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┛

┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┓
┃               │    'rank'   │   ['rank']  │ {'r': 'rank'} ┃
┠───────────────┼─────────────┼─────────────┼───────────────┨
┃ sr.apply(...) │             │      rank   │               ┃
┃ sr.agg(...)   │     x  1    │   x     1   │    r  x  1    ┃
┃               │     y  2    │   y     2   │       y  2    ┃
┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┛
  • Keys/indexes/bools can't be tuples because 'obj[x, y]' is converted to 'obj[(x, y)]'!
  • Methods ffill(), interpolate(), fillna() and dropna() accept 'inplace=True'.
  • Last result has a hierarchical index. Use '<Sr>[key_1, key_2]' to get its values.
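
The way the argument type changes the shape of the result can be reproduced with `agg()` directly. A minimal sketch using the same two-item Series as in the tables above:

```python
import pandas as pd

sr = pd.Series([2, 3], index=['x', 'y'])

as_scalar = sr.agg('sum')            # A string naming a reduction returns a single element.
as_list   = sr.agg(['sum'])          # A list returns a Series keyed by function name.
as_dict   = sr.agg({'s': 'sum'})     # A dict returns a Series keyed by the dict's keys.
ranks     = sr.agg('rank')           # A transforming function returns a Series of the same length.
```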

DataFrame

Table with labeled rows and columns.

python
>>> pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
   x  y
a  1  2
b  3  4
<DF>    = pd.DataFrame(<list_of_rows>)         # Rows can be either lists, dicts or series.
<DF>    = pd.DataFrame(<dict_of_columns>)      # Columns can be either lists, dicts or series.
<el>    = <DF>.loc[row_key, column_key]        # Or: <DF>.iloc[row_index, column_index]
<Sr/DF> = <DF>.loc[row_key/s]                  # Or: <DF>.iloc[row_index/es]
<Sr/DF> = <DF>.loc[:, column_key/s]            # Or: <DF>.iloc[:, column_index/es]
<DF>    = <DF>.loc[row_bools, column_bools]    # Or: <DF>.iloc[row_bools, column_bools]
<Sr/DF> = <DF>[column_key/s]                   # Or: <DF>.column_key
<DF>    = <DF>[row_bools]                      # Keeps rows as specified by bools.
<DF>    = <DF>[<DF_of_bools>]                  # Assigns NaN to items that are False in bools.
<DF>    = <DF> > <el/Sr/DF>                    # Returns DF of bools. Sr is treated as a row.
<DF>    = <DF> + <el/Sr/DF>                    # Items with non-matching keys get value NaN.
<DF>    = <DF>.set_index(column_key)           # Replaces row keys with values from the column.
<DF>    = <DF>.reset_index(drop=False)         # Drops or moves row keys to column named index.
<DF>    = <DF>.sort_index(ascending=True)      # Sorts rows by row keys. Use `axis=1` for cols.
<DF>    = <DF>.sort_values(column_key/s)       # Sorts rows by passed column/s. Also `axis=1`.
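
A brief sketch of the selection forms listed above, applied to the same 2x2 table. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])

el     = df.loc['b', 'x']          # Single element selected by row and column key.
row    = df.loc['a']               # A row comes back as a Series keyed by column names.
big    = df[df.x > 1]              # Row bools keep only the matching rows.
masked = df[df > 2]                # A DF of bools leaves NaN where the condition is False.
by_x   = df.set_index('x')         # Values of column 'x' become the row keys.
```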

DataFrame --- Merge, Join, Concat:

python
>>> l = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
   x  y
a  1  2
b  3  4
>>> r = pd.DataFrame([[4, 5], [6, 7]], index=['b', 'c'], columns=['y', 'z'])
   y  z
b  4  5
c  6  7
┏━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                        │    'outer'    │   'inner'  │   'left'   │       Description        ┃
┠────────────────────────┼───────────────┼────────────┼────────────┼──────────────────────────┨
┃ l.merge(r, on='y',     │    x   y   z  │ x   y   z  │ x   y   z  │ Merges on column if 'on' ┃
┃            how=...)    │ 0  1   2   .  │ 3   4   5  │ 1   2   .  │ or 'left/right_on' are   ┃
┃                        │ 1  3   4   5  │            │ 3   4   5  │ set, else on shared cols.┃
┃                        │ 2  .   6   7  │            │            │ Uses 'inner' by default. ┃
┠────────────────────────┼───────────────┼────────────┼────────────┼──────────────────────────┨
┃ l.join(r, lsuffix='l', │    x yl yr  z │            │ x yl yr  z │ Merges on row keys.      ┃
┃           rsuffix='r', │ a  1  2  .  . │ x yl yr  z │ 1  2  .  . │ Uses 'left' by default.  ┃
┃           how=...)     │ b  3  4  4  5 │ 3  4  4  5 │ 3  4  4  5 │ If r is a Series, it is  ┃
┃                        │ c  .  .  6  7 │            │            │ treated as a column.     ┃
┠────────────────────────┼───────────────┼────────────┼────────────┼──────────────────────────┨
┃ pd.concat([l, r],      │    x   y   z  │     y      │            │ Adds rows at the bottom. ┃
┃           axis=0,      │ a  1   2   .  │     2      │            │ Uses 'outer' by default. ┃
┃           join=...)    │ b  3   4   .  │     4      │            │ A Series is treated as a ┃
┃                        │ b  .   4   5  │     4      │            │ column. To add a row use ┃
┃                        │ c  .   6   7  │     6      │            │ pd.concat([l, DF([sr])]).┃
┠────────────────────────┼───────────────┼────────────┼────────────┼──────────────────────────┨
┃ pd.concat([l, r],      │    x  y  y  z │            │            │ Adds columns at the      ┃
┃           axis=1,      │ a  1  2  .  . │ x  y  y  z │            │ right end. Uses 'outer'  ┃
┃           join=...)    │ b  3  4  4  5 │ 3  4  4  5 │            │ by default. A Series is  ┃
┃                        │ c  .  .  6  7 │            │            │ treated as a column.     ┃
┠────────────────────────┼───────────────┼────────────┼────────────┼──────────────────────────┨
┃ l.combine_first(r)     │    x   y   z  │            │            │ Adds missing rows and    ┃
┃                        │ a  1   2   .  │            │            │ columns. Also updates    ┃
┃                        │ b  3   4   5  │            │            │ items that contain NaN.  ┃
┃                        │ c  .   6   7  │            │            │ Argument r must be a DF. ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━┛
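
Several cells of the table can be reproduced with the same two frames. This is a sketch of four of the calls, not a full walk through every combination:

```python
import pandas as pd

l = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
r = pd.DataFrame([[4, 5], [6, 7]], index=['b', 'c'], columns=['y', 'z'])

inner   = l.merge(r, on='y')                      # Default 'inner': only y == 4 is in both.
outer   = l.merge(r, on='y', how='outer')         # Keeps every y value from either side.
joined  = l.join(r, lsuffix='l', rsuffix='r')     # Default 'left': matches on row keys.
stacked = pd.concat([l, r])                       # Appends rows, so key 'b' appears twice.
```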

DataFrame --- Aggregate, Transform, Map:

python
<Sr> = <DF>.sum/max/mean/idxmax/all()          # Or: <DF>.apply/agg(lambda <Sr>: <el>)
<DF> = <DF>.rank/diff/cumsum/ffill/interpolate()  # Or: <DF>.apply/agg/transform(lambda <Sr>: <Sr>)
<DF> = <DF>.fillna(<el>)                       # Or: <DF>.applymap(lambda <el>: <el>)

  • All operations operate on columns by default. Pass 'axis=1' to process the rows instead.

python
>>> df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
   x  y
a  1  2
b  3  4
┏━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┓
┃                   │    'sum'    │   ['sum']   │ {'x': 'sum'}  ┃
┠───────────────────┼─────────────┼─────────────┼───────────────┨
┃ df.apply(...)     │     x  4    │       x  y  │     x  4      ┃
┃ df.agg(...)       │     y  6    │  sum  4  6  │               ┃
┗━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┛

┏━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┓
┃                   │    'rank'   │   ['rank']  │ {'x': 'rank'} ┃
┠───────────────────┼─────────────┼─────────────┼───────────────┨
┃ df.apply(...)     │             │      x    y │               ┃
┃ df.agg(...)       │      x  y   │   rank rank │        x      ┃
┃ df.transform(...) │   a  1  1   │ a    1    1 │     a  1      ┃
┃                   │   b  2  2   │ b    2    2 │     b  2      ┃
┗━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┛
  • Use '<DF>[col_key_1, col_key_2][row_key]' to get the fifth result's values.

DataFrame --- Plot, Encode, Decode:

python
<DF>.plot.line/area/bar/hist/scatter/box()     # Also: `x=column_key, y=column_key/s`.
plt.show()                                     # Displays the plot. Also plt.savefig(<path>).
<DF> = pd.read_json/html('<str/path/url>')     # Run `$ pip3 install beautifulsoup4 lxml`.
<DF> = pd.read_csv('<path/url>')               # `header/index_col/dtype/parse_dates=<obj>`.
<DF> = pd.read_pickle/excel('<path/url>')      # Use `sheet_name=None` to get all Excel sheets.
<DF> = pd.read_sql('<table/query>', <conn.>)   # SQLite3/SQLAlchemy connection (see #SQLite).
<dict> = <DF>.to_dict(['d/l/s/...'])           # Returns columns as dicts, lists or series.
<str>  = <DF>.to_json/html/csv([<path>])       # Also to_markdown/latex([<path>]).
<DF>.to_pickle/excel(<path>)                   # Run `$ pip3 install "pandas[excel]" odfpy`.
<DF>.to_sql('<table_name>', <connection>)      # Also `if_exists='fail/replace/append'`.
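
Encoding and decoding can be tested without writing files, because `to_csv()` returns a string when no path is passed and `read_csv()` accepts a file-like object. A minimal round-trip sketch; the use of `io.StringIO` is an assumption made to keep the example self-contained.

```python
import io
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])

csv_text = df.to_csv()                                        # No path: returns the CSV as a str.
restored = pd.read_csv(io.StringIO(csv_text), index_col=0)    # First column becomes the row keys.
as_lists = df.to_dict('list')                                 # Each column as a plain list.
```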

GroupBy

Object that groups together rows of a dataframe based on the value of the passed column.

python
>>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 6]], list('abc'), list('xyz'))
>>> df.groupby('z').get_group(6)
   x  y  z
b  4  5  6
c  7  8  6
<GB> = <DF>.groupby(column_key/s)              # Splits DF into groups based on passed column.
<DF> = <GB>.apply(<func>)                      # Maps each group. Func can return DF, Sr or el.
<GB> = <GB>[column_key]                        # Single column GB. All operations return a Sr.
<Sr> = <GB>.size()                             # A Sr of group sizes. Same keys as get_group().

GroupBy --- Aggregate, Transform, Map:

python
<DF> = <GB>.sum/max/mean/idxmax/all()          # Or: <GB>.agg(lambda <Sr>: <el>)
<DF> = <GB>.rank/diff/cumsum/ffill()           # Or: <GB>.transform(lambda <Sr>: <Sr>)
<DF> = <GB>.fillna(<el>)                       # Or: <GB>.transform(lambda <Sr>: <Sr>)
>>> gb = df.groupby('z'); gb.apply(print)
   x  y  z
a  1  2  3
   x  y  z
b  4  5  6
c  7  8  6
┏━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┓
┃                   │    'sum'    │    'rank'   │   ['rank']  │ {'x': 'rank'} ┃
┠───────────────────┼─────────────┼─────────────┼─────────────┼───────────────┨
┃ gb.agg(...)       │      x   y  │             │      x    y │               ┃
┃                   │  z          │      x  y   │   rank rank │        x      ┃
┃                   │  3   1   2  │   a  1  1   │ a    1    1 │     a  1      ┃
┃                   │  6  11  13  │   b  1  1   │ b    1    1 │     b  1      ┃
┃                   │             │   c  2  2   │ c    2    2 │     c  2      ┃
┠───────────────────┼─────────────┼─────────────┼─────────────┼───────────────┨
┃ gb.transform(...) │      x   y  │      x  y   │             │               ┃
┃                   │  a   1   2  │   a  1  1   │             │               ┃
┃                   │  b  11  13  │   b  1  1   │             │               ┃
┃                   │  c  11  13  │   c  2  2   │             │               ┃
┗━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┛
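
The difference between aggregating and transforming a group can be seen on the same three-row frame. A minimal sketch covering the 'sum' column of the table:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 6]], list('abc'), list('xyz'))
gb = df.groupby('z')

sums      = gb.sum()                 # One row per group, keyed by the values of 'z'.
broadcast = gb.transform('sum')      # Group sums repeated for every original row.
sizes     = gb.size()                # Number of rows in each group.
```

`agg` and `sum` shrink the table to one row per group, while `transform` keeps the original row keys 'a', 'b' and 'c'.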

Rolling

Object for rolling window calculations.

python
<RSr/RDF/RGB> = <Sr/DF/GB>.rolling(win_size)   # Also: `min_periods=None, center=False`.
<RSr/RDF/RGB> = <RDF/RGB>[column_key/s]        # Or: <RDF/RGB>.column_key
<Sr/DF>       = <R>.mean/sum/max()             # Or: <R>.apply/agg(<agg_func/str>)
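
A small sketch of a rolling mean over a made-up five-item Series, showing what `min_periods` changes at the start of the window:

```python
import pandas as pd

sr = pd.Series([1, 2, 3, 4, 5])

means   = sr.rolling(3).mean()                   # First two results are NaN: window not full yet.
partial = sr.rolling(3, min_periods=1).mean()    # Averages whatever is available so far.
```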

[#](#Plotly "#plotly")Plotly

python
# $ pip3 install pandas plotly kaleido
import pandas as pd, plotly.express as ex
<Figure> = ex.line(<DF>, x=<col_name>, y=<col_name>)        # Or: ex.line(x=<list>, y=<list>)
<Figure>.update_layout(margin=dict(t=0, r=0, b=0, l=0), ...)  # `paper_bgcolor='rgb(0, 0, 0)'`.
<Figure>.write_html/json/image('<path>')                    # Also <Figure>.show().

Displays a line chart of total coronavirus deaths per million grouped by continent:

[Figure: line chart of Total Deaths per Million (y-axis, 0 to 2500) against Date (x-axis, Apr 2020 to Oct 2021), one line per continent: South America, North America, Europe, Asia, Africa, Oceania.]

python
covid = pd.read_csv('https://covid.ourworldindata.org/data/owid-covid-data.csv',
                    usecols=['iso_code', 'date', 'total_deaths', 'population'])
continents = pd.read_csv('https://gist.githubusercontent.com/stevewithington/20a69c0b6d2ff'
                         '846ea5d35e5fc47f26c/raw/country-and-continent-codes-list-csv.csv',
                         usecols=['Three_Letter_Country_Code', 'Continent_Name'])
df = pd.merge(covid, continents, left_on='iso_code', right_on='Three_Letter_Country_Code')
df = df.groupby(['Continent_Name', 'date']).sum().reset_index()
df['Total Deaths per Million'] = df.total_deaths * 1e6 / df.population
df = df[df.date > '2020-03-14']
df = df.rename({'date': 'Date', 'Continent_Name': 'Continent'}, axis='columns')
ex.line(df, x='Date', y='Total Deaths per Million', color='Continent').show()

Displays a multi-axis line chart of total coronavirus cases and changes in prices of Bitcoin, Dow Jones and gold:

[Figure: multi-axis line chart over Date (Apr 2020 to Oct 2021). First y-axis: Total Cases (0 to 250M). Second y-axis: % (0 to 600). Traces: Total Cases, Bitcoin, Dow Jones, Gold.]

python
import pandas as pd, plotly.graph_objects as go

def main():
    covid, bitcoin, gold, dow = scrape_data()
    display_data(wrangle_data(covid, bitcoin, gold, dow))

def scrape_data():
    def get_covid_cases():
        url = 'https://covid.ourworldindata.org/data/owid-covid-data.csv'
        df = pd.read_csv(url, usecols=['location', 'date', 'total_cases'])
        return df[df.location == 'World'].set_index('date').total_cases
    def get_ticker(symbol):
        url = (f'https://query1.finance.yahoo.com/v7/finance/download/{symbol}?'
               'period1=1579651200&period2=9999999999&interval=1d&events=history')
        df = pd.read_csv(url, usecols=['Date', 'Close'])
        return df.set_index('Date').Close
    out = get_covid_cases(), get_ticker('BTC-USD'), get_ticker('GC=F'), get_ticker('^DJI')
    return map(pd.Series.rename, out, ['Total Cases', 'Bitcoin', 'Gold', 'Dow Jones'])

def wrangle_data(covid, bitcoin, gold, dow):
    df = pd.concat([bitcoin, gold, dow], axis=1)  # Creates table by joining columns on dates.
    df = df.sort_index().interpolate()            # Sorts table by date and interpolates NaN-s.
    df = df.loc['2020-02-23':]                    # Discards rows before '2020-02-23'.
    df = (df / df.iloc[0]) * 100                  # Calculates percentages relative to day 1.
    df = df.join(covid)                           # Adds column with covid cases.
    return df.sort_values(df.index[-1], axis=1)   # Sorts columns by last day's value.

def display_data(df):
    figure = go.Figure()
    for col_name in reversed(df.columns):
        yaxis = 'y1' if col_name == 'Total Cases' else 'y2'
        trace = go.Scatter(x=df.index, y=df[col_name], name=col_name, yaxis=yaxis)
        figure.add_trace(trace)
    figure.update_layout(
        yaxis1=dict(title='Total Cases', rangemode='tozero'),
        yaxis2=dict(title='%', rangemode='tozero', overlaying='y', side='right'),
        legend=dict(x=1.08),
        width=944,
        height=423
    )
    figure.show()

if __name__ == '__main__':
    main()
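
The normalization step in `wrangle_data()` can be tried offline. The sketch below applies the same sort, interpolate and divide-by-first-row sequence to a tiny table; the prices and dates are made up for illustration and are not real market data.

```python
import pandas as pd

# Hypothetical closing prices with one missing Bitcoin value.
prices = pd.DataFrame({'Bitcoin': [100.0, None, 150.0], 'Gold': [50.0, 55.0, 60.0]},
                      index=['2020-02-23', '2020-02-24', '2020-02-25'])

prices  = prices.sort_index().interpolate()     # Fills the gap linearly between its neighbours.
percent = (prices / prices.iloc[0]) * 100       # Every column starts at 100 on the first day.
```

Dividing by the first row puts assets with very different price levels on one comparable percentage axis.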

[#](#Appendix "#appendix")Appendix

Cython

Library that compiles Python code into C.

python
# $ pip3 install cython
import pyximport; pyximport.install()
import <cython_script>
<cython_script>.main()

Definitions:

  • All 'cdef' definitions are optional, but they contribute to the speed-up.
  • Script needs to be saved with a 'pyx' extension.

cython
cdef <ctype> <var_name> = <el>
cdef <ctype>[n_elements] <var_name> = [<el>, <el>, ...]
cdef <ctype/void> <func_name>(<ctype> <arg_name>): ...
cdef class <class_name>:
    cdef public <ctype> <attr_name>
    def __init__(self, <ctype> <arg_name>):
        self.<attr_name> = <arg_name>
cdef enum <enum_name>: <member_name>, <member_name>, ...

Virtual Environments

System for installing libraries directly into project's directory.

shell
$ python3 -m venv <name>      # Creates virtual environment in current directory.
$ source <name>/bin/activate  # Activates venv. On Windows run `<name>\Scripts\activate`.
$ pip3 install <library>      # Installs the library into active environment.
$ python3 <path>              # Runs the script in active environment. Also `./<path>`.
$ deactivate                  # Deactivates the active virtual environment.
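
Whether a script is currently running inside an activated environment can be checked from Python itself. A minimal sketch, assuming an environment created with the standard `venv` module; the function name `in_virtual_environment` is a hypothetical helper.

```python
import sys

def in_virtual_environment():
    """Returns True when the interpreter runs from a venv (hypothetical helper)."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix

active = in_virtual_environment()
```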

Basic Script Template

python
#!/usr/bin/env python3
#
# Usage: .py
#

from sys import argv, exit
from collections import defaultdict, namedtuple
from dataclasses import make_dataclass
from enum import Enum
import functools as ft, itertools as it, operator as op, re


def main():
    pass


###
##  UTIL
#

def read_file(filename):
    with open(filename, encoding='utf-8') as file:
        return file.readlines()


if __name__ == '__main__':
    main()

March 17, 2024, Jure Šorn. Chinese translation by Yulk (yulike2017@outlook.com), March 20, 2024.
