MySQL Troubleshooting and Production Tuning

I. MySQL Troubleshooting Overview

1. Troubleshooting methodology

text
Troubleshooting workflow:
1. Confirm the symptom → 2. Collect information → 3. Locate the problem → 4. Analyze the root cause
→ 5. Apply a fix → 6. Verify the result → 7. Review and document → 8. Add preventive measures

Fault categories:
├── Connection faults
├── Performance faults
├── Data faults
├── Replication faults
├── Resource faults
└── Security faults

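2. Troubleshooting toolbox

bash
# 1. System-level tools
top          # CPU and memory usage
htop         # friendlier alternative to top
vmstat 1     # virtual memory statistics, sampled every second
iostat -x 1  # extended disk I/O statistics, sampled every second
netstat -an  # state of all network connections
ss -tlnp     # listening TCP sockets and the processes that own them
lsof         # open files

# 2. MySQL tools
mysqladmin   # administrative commands
mysqldump    # logical backup
mysqlbinlog  # binary log reader
pt-query-digest  # slow query log analysis (Percona Toolkit)
mysqltuner   # configuration tuning suggestions

# 3. Profiling tools
perf         # CPU profiling
strace       # system call tracing
tcpdump      # packet capture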
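3. Key log files

bash
# Common MySQL log locations (they vary by distribution and configuration)
/var/log/mysql/error.log      # error log
/var/log/mysql/slow.log       # slow query log
/var/log/mysql/mysql-bin.*    # binary logs
/var/lib/mysql/*.err          # error log (some versions and packages)

# Show the log-related settings, including the actual file paths
mysql> SHOW VARIABLES LIKE '%log%';

# Read the end of the error log
tail -100 /var/log/mysql/error.log

# Read the end of the slow query log
tail -100 /var/log/mysql/slow.log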

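II. Connection Faults

1. Too many connections

sql
-- Current and peak connection counts
SHOW STATUS LIKE 'Threads_connected';
SHOW STATUS LIKE 'Max_used_connections';

-- Configured connection limit
SHOW VARIABLES LIKE 'max_connections';

-- List all connections (FULL shows the complete statement text)
SHOW PROCESSLIST;
SHOW FULL PROCESSLIST;

-- Connections per client host.
-- The host column is "host:port", so strip the port before grouping.
SELECT SUBSTRING_INDEX(host, ':', 1) AS client_host, COUNT(*) AS connections
FROM information_schema.processlist
GROUP BY client_host ORDER BY connections DESC;

-- Connections per command type
SELECT command, COUNT(*) AS connections FROM information_schema.processlist
GROUP BY command;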
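
When the host list is already in a file or comes from another tool, the same per-client count can be produced on the shell side. The sketch below is illustrative only: `count_by_host` is a hypothetical helper name, the sample addresses are made up, and it assumes IPv4 addresses or hostnames in the `host:port` form that the processlist reports.

```bash
#!/usr/bin/env bash
# count_by_host (hypothetical helper): read "host:port" values, one per line,
# on stdin and print "<count> <host>" with the busiest client first.
count_by_host() {
    # Strip the trailing ":port" so connections from one client group together.
    sed -E 's/:[0-9]+$//' | sort | uniq -c | sort -k1,1nr -k2,2 | awk '{print $1, $2}'
}

# Made-up sample input. In practice the lines could come from something like:
#   mysql -N -e "SELECT host FROM information_schema.processlist"
printf '%s\n' 10.0.0.5:51234 10.0.0.5:51240 10.0.0.9:40022 localhost | count_by_host
```

A client that holds far more connections than the others is usually the first place to look for a leaking or oversized connection pool.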

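Handling a high connection count:

bash
# Raise the limit temporarily (lost on restart)
mysql> SET GLOBAL max_connections = 1000;

# Make it permanent
vim /etc/my.cnf
[mysqld]
max_connections = 1000

# Find connections idle for more than 60 seconds
mysql> SELECT * FROM information_schema.processlist
WHERE command = 'Sleep' AND time > 60;

# Generate KILL statements for connections idle for more than an hour.
# This only prints the statements; review them, then run them.
mysql> SELECT CONCAT('KILL ', id, ';') FROM information_schema.processlist
WHERE command = 'Sleep' AND time > 3600;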
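
Before raising `max_connections`, it helps to know how close the server is to the limit. The following is a minimal sketch, assuming the two numbers have already been read from `Threads_connected` and `max_connections`; the function name `conn_usage_level` and the 80% / 95% thresholds are illustrative assumptions, not MySQL defaults.

```bash
#!/usr/bin/env bash
# conn_usage_level (hypothetical helper):
#   $1 = Threads_connected, $2 = max_connections,
#   $3 = warning threshold in percent (assumed default 80),
#   $4 = critical threshold in percent (assumed default 95).
conn_usage_level() {
    local connected=$1 max=$2 warn=${3:-80} crit=${4:-95}
    awk -v c="$connected" -v m="$max" -v w="$warn" -v k="$crit" 'BEGIN {
        if (m <= 0) { print "invalid max_connections"; exit 1 }
        pct = c * 100 / m
        level = (pct >= k) ? "CRITICAL" : (pct >= w) ? "WARNING" : "OK"
        printf "%s %.1f%%\n", level, pct
    }'
}

conn_usage_level 850 1000   # prints: WARNING 85.0%
```

If the level stays high while most connections are in the Sleep state, trimming idle connections or the application pool size is usually a better fix than a larger limit.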

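2. Connection timeouts

sql
-- Show timeout settings
SHOW VARIABLES LIKE '%timeout%';

-- Common timeout parameters
-- connect_timeout: seconds the server waits for the connection handshake
-- wait_timeout: idle timeout for non-interactive connections
-- interactive_timeout: idle timeout for interactive connections

-- Adjust the timeouts (the new wait_timeout applies only to connections opened afterwards)
SET GLOBAL wait_timeout = 600;
SET GLOBAL connect_timeout = 30;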

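3. "Too many connections" error

sql
-- 1. Check the current connection count (shell command)
mysqladmin -u root -p status

-- 2. Check the limit
SHOW VARIABLES LIKE 'max_connections';

-- 3. Emergency access through the reserved connection (shell command).
-- MySQL accepts one connection beyond max_connections from an account with
-- SUPER (or CONNECTION_ADMIN in MySQL 8.0).
mysql -u root -p -h localhost --protocol=socket

-- 4. Kill idle connections in bulk.
-- INTO OUTFILE writes on the server host and only under the directory named by
-- secure_file_priv, so adjust the path if that variable is set. SOURCE reads the
-- file on the client side, so this works as-is only when the client runs on the server.
SELECT CONCAT('KILL ', id, ';') FROM information_schema.processlist
WHERE command = 'Sleep' AND time > 300 INTO OUTFILE '/tmp/kill.sql';
SOURCE /tmp/kill.sql;

-- 5. Raise the limit temporarily
SET GLOBAL max_connections = 2000;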

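4. Connection refused

bash
# Check which address and port mysqld is listening on
netstat -tlnp | grep 3306

# Check the bind_address setting
mysql> SHOW VARIABLES LIKE 'bind_address';

# Check which hosts the account may connect from
mysql> SELECT user, host FROM mysql.user WHERE user='appuser';

# Allow remote access. GRANT ... IDENTIFIED BY was removed in MySQL 8.0,
# so create the account first, then grant only what the application needs.
# 'app_db' is a placeholder schema name; prefer a narrower host than '%'.
mysql> CREATE USER 'appuser'@'%' IDENTIFIED BY 'password';
mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON app_db.* TO 'appuser'@'%';

# Check the firewall
iptables -L -n | grep 3306
firewall-cmd --list-all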

5. Connection diagnostics script

bash
#!/bin/bash
# connection_check.sh - connection diagnostics script
# Credentials are read from the [client] section of ~/.my.cnf, so the script
# does not prompt for a password on every call.

echo "=== MySQL connection diagnostics ==="
echo "Time: $(date)"
echo "=========================="

# 1. Is mysqld running?
echo "1. MySQL process:"
ps aux | grep mysqld | grep -v grep

# 2. Is the port listening?
echo -e "\n2. Listening port:"
netstat -tlnp | grep 3306

# 3. Configured connection limit
echo -e "\n3. Connection limit:"
mysql -e "SHOW VARIABLES LIKE 'max_connections';"

# 4. Current and peak connections
echo -e "\n4. Current connections:"
mysql -e "SHOW STATUS LIKE 'Threads_connected';"
mysql -e "SHOW STATUS LIKE 'Max_used_connections';"

# 5. Connections per client host (port stripped from host:port)
echo -e "\n5. Connections by client host:"
mysql -e "SELECT SUBSTRING_INDEX(host, ':', 1) AS client_host, COUNT(*) AS connections FROM information_schema.processlist GROUP BY client_host ORDER BY connections DESC;"

# 6. Idle connections
echo -e "\n6. Connections idle for more than 60 seconds:"
mysql -e "SELECT COUNT(*) FROM information_schema.processlist WHERE command='Sleep' AND time > 60;"

# 7. Error log
echo -e "\n7. Last 10 lines of the error log:"
tail -10 /var/log/mysql/error.log 2>/dev/null || tail -10 /var/lib/mysql/*.err 2>/dev/null

echo -e "\n=== Done ==="

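III. Performance Faults

1. High CPU usage

sql
-- 1. Statements currently running
SHOW PROCESSLIST;

-- 2. Slow query counter
SHOW STATUS LIKE 'Slow_queries';

-- 3. Top 10 statement digests by total execution time
SELECT * FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC LIMIT 10;

-- 4. Threads that are currently doing work
SELECT * FROM performance_schema.threads WHERE PROCESSLIST_STATE IS NOT NULL;

bash
# pgrep -x matches the exact process name (so mysqld_safe is excluded);
# -d, joins several PIDs with commas, which top, perf and strace all accept.

# CPU usage of the mysqld process
top -p "$(pgrep -x -d, mysqld)"

# Profile hot functions with perf
perf top -p "$(pgrep -x -d, mysqld)"

# Summarize system calls across all threads (-f); press Ctrl-C to print the summary.
# strace slows the traced process noticeably, so keep runs short in production.
strace -f -c -p "$(pgrep -x -d, mysqld)"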

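2. Slow query analysis

sql
-- 1. Enable the slow query log
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 2;   -- seconds; applies to connections opened afterwards
SET GLOBAL log_queries_not_using_indexes = ON;   -- can grow the log quickly on busy servers

-- 2. Check the slow query settings
SHOW VARIABLES LIKE 'slow%';
SHOW VARIABLES LIKE 'long_query_time';

-- 3. Location of the slow query log
-- Linux: /var/log/mysql/slow.log

-- 4. Summarize the log with pt-query-digest (shell command)
pt-query-digest /var/log/mysql/slow.log > slow_report.txt

-- 5. Spot statements that have been running for more than 5 seconds right now
SELECT * FROM information_schema.processlist WHERE command='Query' AND time > 5;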

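3. Lock wait analysis

sql
-- 1. InnoDB status, including the TRANSACTIONS section
SHOW ENGINE INNODB STATUS\G

-- 2. Locks and lock waits
-- MySQL 5.7 and earlier:
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;
-- MySQL 8.0 removed those two tables; use these instead:
SELECT * FROM performance_schema.data_locks;
SELECT * FROM performance_schema.data_lock_waits;

-- 3. Open transactions
SELECT * FROM information_schema.INNODB_TRX;

-- 4. Who is blocking whom (the *_pid columns are processlist IDs)
SELECT 
    waiting_trx_id,
    waiting_pid,
    waiting_query,
    blocking_trx_id,
    blocking_pid,
    blocking_query
FROM sys.innodb_lock_waits;

-- 5. Kill the blocking connection, using blocking_pid from step 4
KILL <blocking_pid>;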

4. I/O bottleneck analysis

bash
# 1. Disk I/O: 10 samples, one per second
iostat -x 1 10

# 2. MySQL I/O counters
mysql> SHOW STATUS LIKE 'Innodb_data_reads';
mysql> SHOW STATUS LIKE 'Innodb_data_writes';
mysql> SHOW STATUS LIKE 'Innodb_buffer_pool_reads';

# 3. Open file handles versus the limit
lsof -p "$(pgrep -x -d, mysqld)" | wc -l
mysql> SHOW VARIABLES LIKE 'open_files_limit';

# 4. Table cache: a fast-growing Opened_tables suggests table_open_cache is too small
mysql> SHOW STATUS LIKE 'Open_tables';
mysql> SHOW STATUS LIKE 'Opened_tables';

5. Memory usage analysis

sql
-- 1. Memory-related settings
SHOW VARIABLES LIKE '%buffer%';
SHOW VARIABLES LIKE '%cache%';

-- 2. InnoDB buffer pool page usage
SHOW STATUS LIKE 'Innodb_buffer_pool_pages%';

-- 3. Buffer pool hit ratio (%).
-- Innodb_buffer_pool_reads counts the requests that had to go to disk, so
-- hit ratio = (1 - reads / read_requests) * 100.
SELECT 
    (1 -
    (SELECT variable_value FROM performance_schema.global_status 
     WHERE variable_name='Innodb_buffer_pool_reads') / 
    (SELECT variable_value FROM performance_schema.global_status 
     WHERE variable_name='Innodb_buffer_pool_read_requests')) * 100 AS hit_ratio;

-- 4. Temporary table usage
SHOW STATUS LIKE 'Created_tmp%';
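
The hit ratio is easy to get backwards, so here is the arithmetic as a small shell sketch. `Innodb_buffer_pool_read_requests` counts all logical reads and `Innodb_buffer_pool_reads` counts the ones that missed the pool and went to disk, so the ratio is `(1 - reads / read_requests) * 100`. The function name `buffer_pool_hit_ratio` and the sample numbers are illustrative assumptions.

```bash
#!/usr/bin/env bash
# buffer_pool_hit_ratio (hypothetical helper):
#   $1 = Innodb_buffer_pool_read_requests (logical reads)
#   $2 = Innodb_buffer_pool_reads         (reads that went to disk)
buffer_pool_hit_ratio() {
    local read_requests=$1 disk_reads=$2
    awk -v req="$read_requests" -v miss="$disk_reads" 'BEGIN {
        # No reads yet (fresh server): the ratio is undefined.
        if (req <= 0) { print "n/a"; exit }
        printf "%.2f\n", (1 - miss / req) * 100
    }'
}

# Made-up sample: 1,000,000 logical reads, 2,500 of them served from disk.
buffer_pool_hit_ratio 1000000 2500   # prints: 99.75
```

Both counters are cumulative since server start, so a ratio computed from the difference between two samples reflects the current workload better than the lifetime value.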

bash
# Memory used by the mysqld process
ps aux | grep mysqld
pmap -x "$(pgrep -x mysqld)" | head -20

# System memory
free -h
vmstat 1

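IV. Data Faults

1. Repairing corrupted tables

sql
-- 1. Check a table
CHECK TABLE table_name;

-- 2. Repair a table (MyISAM, ARCHIVE and CSV only; InnoDB does not support REPAIR TABLE)
REPAIR TABLE table_name;

-- 3. Force a MyISAM repair from the table definition when the index file is unusable
REPAIR TABLE table_name USE_FRM;

-- 4. Check or repair with mysqlcheck (shell commands)
mysqlcheck -u root -p --check --all-databases
mysqlcheck -u root -p --repair --all-databases

-- 5. For InnoDB, inspect the error log and the engine status; a corrupted InnoDB table
--    is usually recovered by starting with innodb_force_recovery, dumping, and reloading
SHOW ENGINE INNODB STATUS\G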

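2. Data inconsistency

sql
-- 1. Compare source and replica data with pt-table-checksum (shell command, run against the source)
pt-table-checksum --host=master --user=root --ask-pass

-- 2. Repair differences with pt-table-sync (shell command).
-- --print shows the statements; replace it with --execute to apply them.
pt-table-sync --print --sync-to-master h=slave,u=root --ask-pass

-- 3. Extended consistency check of all tables (shell command)
mysqlcheck -u root -p --check --all-databases --extended

-- 4. Refresh statistics and rebuild a table
ANALYZE TABLE table_name;
OPTIMIZE TABLE table_name;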

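3. Deadlock analysis

sql
-- 1. Show the most recent deadlock
SHOW ENGINE INNODB STATUS\G
-- Look for the "LATEST DETECTED DEADLOCK" section in the output

-- 2. Log every deadlock to the error log
SET GLOBAL innodb_print_all_deadlocks = ON;

-- 3. Current lock waits.
-- MySQL 8.0 uses performance_schema.data_lock_waits. On 5.7, join
-- information_schema.innodb_lock_waits on requesting_trx_id / blocking_trx_id instead.
SELECT 
    r.trx_id AS waiting_trx_id,
    r.trx_mysql_thread_id AS waiting_thread,
    r.trx_query AS waiting_query,
    b.trx_id AS blocking_trx_id,
    b.trx_mysql_thread_id AS blocking_thread,
    b.trx_query AS blocking_query
FROM performance_schema.data_lock_waits w
JOIN information_schema.innodb_trx r ON w.REQUESTING_ENGINE_TRANSACTION_ID = r.trx_id
JOIN information_schema.innodb_trx b ON w.BLOCKING_ENGINE_TRANSACTION_ID = b.trx_id;

-- 4. Kill the blocking connection, using blocking_thread from step 3
KILL <blocking_thread_id>;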
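
The InnoDB status output is long, and the deadlock report is only one section of it. The sketch below pulls out just that section. It assumes the usual layout in which each section title sits between two lines of dashes; `extract_deadlock` is a hypothetical helper name, and the exact layout may differ between versions.

```bash
#!/usr/bin/env bash
# extract_deadlock (hypothetical helper): read SHOW ENGINE INNODB STATUS text
# on stdin and print only the body of the LATEST DETECTED DEADLOCK section.
extract_deadlock() {
    awk '
        # Section title found: start capturing, and remember that the next
        # dashed line is the underline of the title, not the end of the section.
        /^LATEST DETECTED DEADLOCK$/ { in_section = 1; skip_rule = 1; next }
        in_section && /^-+$/ {
            if (skip_rule) { skip_rule = 0; next }
            exit    # dashed line that opens the following section
        }
        in_section { print }
    '
}

# Possible usage (credentials assumed to come from ~/.my.cnf):
#   mysql -e "SHOW ENGINE INNODB STATUS\G" | extract_deadlock
```

If the server has not recorded a deadlock since it started, the section is absent and the function prints nothing.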

4. Running out of tablespace

sql
-- 1. Size per database
SELECT 
    table_schema AS 'Database',
    ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS 'Size (MB)'
FROM information_schema.tables
GROUP BY table_schema
ORDER BY SUM(data_length + index_length) DESC;

-- 2. Size per table
SELECT 
    table_name,
    ROUND(data_length / 1024 / 1024, 2) AS 'Data (MB)',
    ROUND(index_length / 1024 / 1024, 2) AS 'Index (MB)'
FROM information_schema.tables
WHERE table_schema = 'database_name'
ORDER BY data_length DESC;

-- 3. Defragment a table
OPTIMIZE TABLE table_name;

-- 4. Reclaim unused space by rebuilding the table (needs innodb_file_per_table
--    and enough free disk space for a temporary copy)
ALTER TABLE table_name ENGINE=InnoDB;

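5. Data recovery

bash
# 1. Restore from a logical backup
mysql -u root -p database_name < backup.sql

# 2. Replay a time window from the binary log
mysqlbinlog mysql-bin.000001 --start-datetime="2024-01-01 10:00:00" --stop-datetime="2024-01-01 11:00:00" | mysql -u root -p

# 3. Replay a position range from the binary log
mysqlbinlog mysql-bin.000001 --start-position=123 --stop-position=456 | mysql -u root -p

# 4. Recover a dropped table by extracting it from a mysqldump file.
# This relies on the LOCK TABLES / UNLOCK TABLES lines that mysqldump writes by default.
sed -n '/CREATE TABLE `table_name`/,/UNLOCK TABLES/p' backup.sql > table.sql
mysql -u root -p database_name < table.sql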

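V. Replication Faults

1. Replication stopped

sql
-- 1. Check replication status (SHOW REPLICA STATUS from MySQL 8.0.22)
SHOW SLAVE STATUS\G

-- Fields to watch:
-- Slave_IO_Running: Yes/No
-- Slave_SQL_Running: Yes/No
-- Last_IO_Error
-- Last_SQL_Error
-- Seconds_Behind_Master

-- 2. Common errors and fixes
-- SQL_SLAVE_SKIP_COUNTER works only with file/position replication, not with GTID.
-- Skipping an event leaves the replica out of sync with the source, so verify the data afterwards.

-- Error 1: 1062 - Duplicate entry
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;

-- Error 2: 1032 - Can't find record
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;

-- Error 3: 1452 - Cannot add or update a child row
-- The referenced parent row is missing on the replica: insert it by hand
-- (parent_table is a placeholder), then restart replication
INSERT INTO parent_table (id) VALUES (xxx);
START SLAVE;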

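2. Replication lag

sql
-- 1. Check the lag
SHOW SLAVE STATUS\G
-- Look at Seconds_Behind_Master

-- 2. Check relay log progress
SHOW SLAVE STATUS\G
-- Look at Relay_Log_File and Relay_Log_Pos

-- 3. Enable parallel apply (stop the SQL thread first so the change takes effect)
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = 4;
START SLAVE SQL_THREAD;

-- 4. Check the replication threads
SHOW PROCESSLIST;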

3. Replication diagnostics script

bash
#!/bin/bash
# replication_check.sh - replication diagnostics script
# Credentials are read from the [client] section of ~/.my.cnf.

echo "=== MySQL replication status check ==="
echo "Time: $(date)"
echo "=========================="

# Fetch the replication status once
SLAVE_STATUS=$(mysql -e "SHOW SLAVE STATUS\G")
if [ -z "$SLAVE_STATUS" ]; then
    echo "No replication status returned: this server is not a replica, or the query failed"
    exit 1
fi

# Thread state and lag
IO_RUNNING=$(echo "$SLAVE_STATUS" | grep "Slave_IO_Running:" | awk '{print $2}')
SQL_RUNNING=$(echo "$SLAVE_STATUS" | grep "Slave_SQL_Running:" | awk '{print $2}')
SECONDS_BEHIND=$(echo "$SLAVE_STATUS" | grep "Seconds_Behind_Master:" | awk '{print $2}')

echo "I/O thread: $IO_RUNNING"
echo "SQL thread: $SQL_RUNNING"
echo "Replication lag: $SECONDS_BEHIND s"

# Errors
LAST_IO_ERROR=$(echo "$SLAVE_STATUS" | grep "Last_IO_Error:" | cut -d':' -f2-)
LAST_SQL_ERROR=$(echo "$SLAVE_STATUS" | grep "Last_SQL_Error:" | cut -d':' -f2-)

if [ -n "$LAST_IO_ERROR" ] && [ "$LAST_IO_ERROR" != " " ]; then
    echo -e "\nI/O error: $LAST_IO_ERROR"
fi

if [ -n "$LAST_SQL_ERROR" ] && [ "$LAST_SQL_ERROR" != " " ]; then
    echo -e "\nSQL error: $LAST_SQL_ERROR"
fi

# Source host reachability (ICMP may be blocked even when the MySQL port is reachable)
MASTER_HOST=$(echo "$SLAVE_STATUS" | grep "Master_Host:" | awk '{print $2}')
echo -e "\nSource host: $MASTER_HOST"
if ping -c 2 "$MASTER_HOST" > /dev/null 2>&1; then
    echo "Source host is reachable"
else
    echo "Source host is not reachable"
fi

# Relay log
RELAY_LOG=$(echo "$SLAVE_STATUS" | grep "Relay_Log_File:" | awk '{print $2}')
echo "Current relay log: $RELAY_LOG"

echo -e "\n=== Check complete ==="
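
The field extraction in the script above can also be reduced to a single verdict that a monitoring job can act on. The following is a sketch under stated assumptions: `status_field` and `replica_health` are hypothetical helper names, `MAX_LAG` (default 60 seconds) is an assumed threshold, and the input is the `SHOW SLAVE STATUS\G` text with the pre-8.0.22 field names used in this article.

```bash
#!/usr/bin/env bash
# status_field (hypothetical helper): print the value of one field from
# SHOW SLAVE STATUS\G text on stdin. The match is on the exact "Name:" prefix,
# so Slave_SQL_Running does not pick up Slave_SQL_Running_State.
status_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)            # drop the left padding
            prefix = key ":"
            if (index(line, prefix) == 1) {
                value = substr(line, length(prefix) + 1)
                sub(/^[ \t]+/, "", value)
                print value
                exit
            }
        }'
}

# replica_health (hypothetical helper): print BROKEN, UNKNOWN, LAGGING or OK.
replica_health() {
    local status io sql lag
    status=$(cat)
    io=$(printf '%s\n' "$status" | status_field Slave_IO_Running)
    sql=$(printf '%s\n' "$status" | status_field Slave_SQL_Running)
    lag=$(printf '%s\n' "$status" | status_field Seconds_Behind_Master)
    if [ "$io" != "Yes" ] || [ "$sql" != "Yes" ]; then
        echo "BROKEN"
    elif [ -z "$lag" ] || [ "$lag" = "NULL" ]; then
        echo "UNKNOWN"
    elif [ "$lag" -gt "${MAX_LAG:-60}" ]; then
        echo "LAGGING"
    else
        echo "OK"
    fi
}

# Possible usage (credentials assumed to come from ~/.my.cnf):
#   mysql -e "SHOW SLAVE STATUS\G" | replica_health
```

Checking both threads before the lag matters because `Seconds_Behind_Master` is NULL when the SQL thread is stopped, which a numeric comparison alone would not catch.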

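VI. Production Tuning

1. Hardware

bash
# 1. CPU
# Prefer high clock speed and multiple cores

# 2. Memory
# On a dedicated database server, size the InnoDB buffer pool at 70-80% of physical RAM
# Rule of thumb: memory available to MySQL = total RAM - (OS memory + memory of other processes)

# 3. Disk
# Use SSDs in RAID 10
# Keep the data directory and the log directory on separate devices

# 4. Network
# Use 10 GbE
# Separate replication traffic from application traffic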

2. Operating system tuning

bash
# Kernel parameters for /etc/sysctl.conf
# Network
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_keepalive_time = 600
net.core.somaxconn = 65535

# File handles
fs.file-max = 6553500
fs.nr_open = 6553500

# Memory
vm.swappiness = 10
vm.dirty_ratio = 30
vm.dirty_background_ratio = 5

# Apply the settings (shell command, run after editing the file)
sysctl -p

# Per-user limits.
# Note: limits.conf applies to PAM login sessions; when mysqld is started by systemd,
# set LimitNOFILE= in the service unit instead.
cat >> /etc/security/limits.conf << EOF
mysql soft nofile 65535
mysql hard nofile 65535
mysql soft nproc 65535
mysql hard nproc 65535
EOF

3. MySQL parameter tuning

ini
# /etc/my.cnf - full tuning template (adjust the values to your hardware and workload)

[client]
port = 3306
socket = /var/lib/mysql/mysql.sock
default-character-set = utf8mb4

[mysql]
prompt="\\u@\\h [\\d]> "
default-character-set = utf8mb4
no-auto-rehash

[mysqld]
# Basic settings
user = mysql
port = 3306
socket = /var/lib/mysql/mysql.sock
pid-file = /var/run/mysqld/mysqld.pid
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp

# Character set
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
init-connect = 'SET NAMES utf8mb4'
skip-character-set-client-handshake

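# Connections
max_connections = 1000
max_connect_errors = 100
connect_timeout = 10
wait_timeout = 600
interactive_timeout = 28800
max_allowed_packet = 64M

# Threads
thread_cache_size = 256
thread_stack = 256K

# Buffers
# key_buffer_size is the MyISAM index cache; it can be much smaller on InnoDB-only servers.
# The other four are allocated per session, so large values multiply by the connection count.
key_buffer_size = 256M
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 4M
join_buffer_size = 2M

# Temporary tables
tmp_table_size = 64M
max_heap_table_size = 64M

# Table cache
table_open_cache = 2048
table_definition_cache = 2048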

# InnoDB core settings
default-storage-engine = InnoDB
innodb_buffer_pool_size = 8G                 # 70-80% of physical RAM on a dedicated server
innodb_buffer_pool_instances = 8
innodb_buffer_pool_chunk_size = 128M
innodb_log_file_size = 2G                    # MySQL 8.0.30+ uses innodb_redo_log_capacity instead
innodb_log_buffer_size = 16M
# 2 = flush the redo log to disk about once per second: faster, but an OS crash or power
# loss can lose roughly the last second of committed transactions. Use 1 for full durability.
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_file_per_table = 1
innodb_open_files = 2048
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_purge_threads = 4
innodb_page_cleaners = 4
innodb_adaptive_hash_index = ON
innodb_lock_wait_timeout = 50

# Transactions
transaction_isolation = READ-COMMITTED
autocommit = 1

# Logging
log_error = /var/log/mysql/error.log
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 2
log_queries_not_using_indexes = 1
min_examined_row_limit = 100

# Binary log
server-id = 1
log_bin = /var/lib/mysql/mysql-bin
binlog_format = ROW
binlog_row_image = full
expire_logs_days = 7                         # deprecated in MySQL 8.0; use binlog_expire_logs_seconds = 604800
max_binlog_size = 100M
sync_binlog = 1
binlog_cache_size = 32K

# GTID (enable if needed)
gtid_mode = OFF
enforce_gtid_consistency = OFF

# Replication
# From MySQL 8.0.26 the slave_parallel_* variables are named replica_parallel_*.
relay_log = /var/lib/mysql/mysql-relay-bin
relay_log_recovery = 1
slave_parallel_workers = 4
slave_parallel_type = LOGICAL_CLOCK

# Performance Schema
performance_schema = ON
performance_schema_consumer_events_statements_history_long = ON
performance_schema_consumer_events_statements_history = ON

# Security
local_infile = 0
skip_symbolic_links = yes
secure_file_priv = /var/lib/mysql-files

[mysqldump]
quick
quote-names
max_allowed_packet = 64M

[mysqld_safe]
log-error = /var/log/mysql/error.log
pid-file = /var/run/mysqld/mysqld.pid

4. InnoDB buffer pool tuning

sql
-- 1. Buffer pool size
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- 2. Buffer pool page usage
SHOW STATUS LIKE 'Innodb_buffer_pool_pages%';

-- 3. Buffer pool hit ratio (%): (1 - disk reads / logical read requests) * 100
SELECT 
    (1 -
    (SELECT variable_value FROM performance_schema.global_status 
     WHERE variable_name='Innodb_buffer_pool_reads') /
    (SELECT variable_value FROM performance_schema.global_status 
     WHERE variable_name='Innodb_buffer_pool_read_requests')) * 100 AS hit_ratio;

-- 4. Resize the buffer pool online (MySQL 5.7 and later)
SET GLOBAL innodb_buffer_pool_size = 8589934592;  -- 8GB

-- 5. innodb_buffer_pool_instances is not dynamic, so SET GLOBAL fails.
--    Set it in my.cnf and restart:
-- innodb_buffer_pool_instances = 8

5. Query optimizer settings

sql
-- 1. Show the optimizer switches
SHOW VARIABLES LIKE 'optimizer_switch';

-- 2. Enable optimizer features.
-- The available flags vary by version and an unknown flag makes the statement fail,
-- so compare the list with the output of step 1 first. SET GLOBAL affects new sessions only.
SET GLOBAL optimizer_switch = 'index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on';

-- 3. Refresh table statistics
ANALYZE TABLE table_name;

-- 4. Inspect index statistics
SELECT * FROM information_schema.statistics WHERE table_schema='database_name' AND table_name='table_name';

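6. Table design

sql
-- 1. Choose appropriate data types
-- Store numbers in INT rather than VARCHAR
-- ENUM can replace VARCHAR for a small, stable set of values (changing the set needs ALTER TABLE)
-- TIMESTAMP is smaller than DATETIME, but it only covers 1970 to 2038 and is time-zone converted

-- 2. Indexes
-- Index the columns used in WHERE, ORDER BY and GROUP BY
-- Avoid wrapping indexed columns in functions
-- Use covering indexes

-- 3. Partitioned table
-- Rows with a year beyond the last range are rejected unless a catch-all partition exists.
-- If a primary key is added, it must include order_date.
CREATE TABLE orders (
    id INT NOT NULL,
    order_date DATE NOT NULL,
    amount DECIMAL(10,2)
) PARTITION BY RANGE (YEAR(order_date)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);

-- 4. Fragmentation per table
SELECT 
    table_name,
    ROUND(data_free / 1024 / 1024, 2) AS 'Free (MB)'
FROM information_schema.tables
WHERE table_schema = 'database_name'
ORDER BY data_free DESC;

-- 5. Defragment
OPTIMIZE TABLE table_name;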

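7. SQL optimization examples

sql
-- 1. Avoid SELECT *
-- Not recommended
SELECT * FROM users WHERE id = 1;

-- Better
SELECT id, name, email FROM users WHERE id = 1;

-- 2. Consider EXISTS instead of IN when the subquery returns many rows.
-- MySQL 5.6 and later usually optimize both forms as a semijoin, so compare the plans with EXPLAIN.
-- Not recommended
SELECT * FROM users WHERE id IN (SELECT user_id FROM orders);

-- Better
SELECT * FROM users u WHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);

-- 3. Replace a correlated subquery with a JOIN
-- Not recommended
SELECT u.name, (SELECT COUNT(*) FROM orders WHERE user_id = u.id) AS order_count FROM users u;

-- Better
SELECT u.name, COUNT(o.id) AS order_count 
FROM users u 
LEFT JOIN orders o ON u.id = o.user_id 
GROUP BY u.id;

-- 4. Prefer UNION ALL to UNION when duplicates are impossible or acceptable
-- UNION removes duplicates (extra work), UNION ALL does not

-- 5. Batch writes
-- Not recommended
INSERT INTO users (name) VALUES ('user1');
INSERT INTO users (name) VALUES ('user2');

-- Better
INSERT INTO users (name) VALUES ('user1'), ('user2');

-- 6. Pagination
-- Not recommended: the server reads and discards the first 100000 rows
SELECT * FROM orders ORDER BY id LIMIT 100000, 20;

-- Better: keyset pagination. 100000 stands for the last id of the previous page;
-- it equals an offset of 100000 only if the ids are contiguous.
SELECT * FROM orders WHERE id > 100000 ORDER BY id LIMIT 20;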

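8. Index recommendations

sql
-- 1. Index usage since server start (COUNT_STAR is the number of I/O operations per index)
SELECT 
    object_name AS table_name,
    index_name,
    count_star AS usage_count
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE object_schema = 'database_name'
  AND index_name IS NOT NULL
ORDER BY usage_count;

-- 2. Indexes never used since server start.
-- Every index has a row in this table, so filter on the counter rather than on row existence.
SELECT 
    object_name AS table_name,
    index_name
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE object_schema = 'database_name'
  AND index_name IS NOT NULL
  AND index_name <> 'PRIMARY'
  AND count_star = 0;

-- 3. Find duplicate indexes with pt-duplicate-key-checker (shell command)
pt-duplicate-key-checker --host=localhost --user=root --ask-pass

-- 4. Index design guidelines
-- Put high-selectivity columns first
-- Composite indexes follow the leftmost-prefix rule
-- Avoid indexing columns that are updated very frequently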

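9. Backup strategy

bash
#!/bin/bash
# backup_optimize.sh - backup script
set -o pipefail   # a failed mysqldump must not be masked by a successful gzip

# Configuration
BACKUP_DIR="/backup/mysql"
DATE=$(date +%Y%m%d_%H%M%S)
MYSQL_USER="root"
MYSQL_PASS="password"   # better kept in ~/.my.cnf: -p on the command line is visible in ps

# 1. Compressed full backup
# (--master-data is named --source-data from MySQL 8.0.26)
mysqldump -u"$MYSQL_USER" -p"$MYSQL_PASS" --all-databases \
    --single-transaction \
    --quick \
    --master-data=2 \
    --routines \
    --triggers \
    --events | gzip > "$BACKUP_DIR/full_$DATE.sql.gz"

# 2. Per-database backup
# -N drops the header row; grep -x matches whole names only, so a schema such as
# "mysql_audit" is not skipped by accident
for db in $(mysql -u"$MYSQL_USER" -p"$MYSQL_PASS" -N -e "SHOW DATABASES;" | grep -Evx "information_schema|performance_schema|mysql|sys"); do
    mysqldump -u"$MYSQL_USER" -p"$MYSQL_PASS" --databases "$db" \
        --single-transaction \
        --quick | gzip > "$BACKUP_DIR/${db}_$DATE.sql.gz"
done

# 3. Hot physical backup with Percona XtraBackup
# (innobackupex was removed in XtraBackup 8.0; xtrabackup --backup works on 2.4 and 8.0)
xtrabackup --backup --user="$MYSQL_USER" --password="$MYSQL_PASS" \
    --stream=xbstream | gzip > "$BACKUP_DIR/xtrabackup_$DATE.xbstream.gz"

# 4. Binary log backup
# ${DATE}_ needs the braces: $DATE_ would be read as an unset variable named DATE_
mysqlbinlog --read-from-remote-server \
    --host=localhost \
    --user="$MYSQL_USER" \
    --password="$MYSQL_PASS" \
    --raw \
    --result-file="$BACKUP_DIR/binlog_${DATE}_" \
    mysql-bin.000001

# 5. Remove backups older than 7 days
find "$BACKUP_DIR" -type f -mtime +7 -delete

echo "Backup complete: $BACKUP_DIR/full_$DATE.sql.gz"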
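
The database filter in step 2 of the script deserves a closer look: excluding system schemas by substring match also drops user schemas whose names merely contain those words. A small sketch of whole-name matching follows; `user_databases` is a hypothetical helper name and the schema names in the example are made up.

```bash
#!/usr/bin/env bash
# user_databases (hypothetical helper): read schema names on stdin, one per
# line, and drop the header row and the system schemas. grep -x requires the
# whole line to match, so "mysql_audit" or "system_logs" are kept.
user_databases() {
    grep -Evx 'Database|information_schema|performance_schema|mysql|sys'
}

# Made-up SHOW DATABASES output:
printf '%s\n' Database information_schema mysql sys shop mysql_audit system_logs \
    | user_databases
# prints:
#   shop
#   mysql_audit
#   system_logs
```

With a plain `grep -Ev` and no `-x`, the last two names in the example would be silently left out of the per-database backups.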

10. Performance monitoring script

bash
#!/bin/bash
# mysql_performance_monitor.sh - performance monitoring script
# Credentials are read from the [client] section of ~/.my.cnf.

# Configuration
INTERVAL=60
LOG_FILE="/var/log/mysql/performance.log"
ALERT_EMAIL="admin@example.com"

# Read one global status counter
get_status() {
    mysql -N -e "SHOW GLOBAL STATUS LIKE '$1';" | awk '{print $2}'
}

# Previous samples of the cumulative counters
prev_queries=""
prev_slow=""

monitor_performance() {
    local timestamp=$(date '+%Y-%m-%d %H:%M:%S')

    # Collect the key metrics
    local queries=$(get_status Queries)
    local connections=$(get_status Threads_connected)
    local slow=$(get_status Slow_queries)
    local requests=$(get_status Innodb_buffer_pool_read_requests)
    local reads=$(get_status Innodb_buffer_pool_reads)

    # Skip this round if the server did not answer
    [ -z "$queries" ] && return

    # Queries and Slow_queries are cumulative, so QPS and new slow queries are deltas
    local qps=0 new_slow=0
    if [ -n "$prev_queries" ]; then
        qps=$(( (queries - prev_queries) / INTERVAL ))
        new_slow=$(( slow - prev_slow ))
    fi
    prev_queries=$queries
    prev_slow=$slow

    # Hit ratio = (1 - disk reads / logical read requests) * 100
    local buffer_hit=$(awk -v q="$requests" -v r="$reads" 'BEGIN { if (q > 0) printf "%.2f", (1 - r / q) * 100; else printf "n/a" }')

    # Write the log line
    echo "$timestamp | QPS: $qps | Connections: $connections | New slow: $new_slow | BufferHit: $buffer_hit%" >> "$LOG_FILE"

    # Alerts
    if [ "$connections" -gt 800 ]; then
        echo "Too many connections: $connections" | mail -s "MySQL Alert: High Connections" "$ALERT_EMAIL"
    fi

    if [ "$new_slow" -gt 100 ]; then
        echo "Too many slow queries in the last ${INTERVAL}s: $new_slow" | mail -s "MySQL Alert: High Slow Queries" "$ALERT_EMAIL"
    fi
}

# Main loop
while true; do
    monitor_performance
    sleep "$INTERVAL"
done
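
`Queries` and `Slow_queries` are cumulative counters, so a per-second rate has to come from the difference between two samples. The sketch below shows that calculation in isolation; `per_second_rate` is a hypothetical helper name and the sample values are made up.

```bash
#!/usr/bin/env bash
# per_second_rate (hypothetical helper):
#   $1 = previous counter value, $2 = current counter value,
#   $3 = seconds between the two samples.
per_second_rate() {
    local prev=$1 curr=$2 interval=$3
    awk -v p="$prev" -v c="$curr" -v t="$interval" 'BEGIN {
        # A smaller current value means the counter was reset (server restart).
        if (t <= 0 || c < p) { print "n/a"; exit }
        printf "%.1f\n", (c - p) / t
    }'
}

# Made-up samples taken 60 seconds apart:
per_second_rate 120000 126000 60   # prints: 100.0
```

Treating a counter that went backwards as "n/a" avoids logging a large negative rate right after a restart.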

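11. Automated tuning script

bash
#!/bin/bash
# mysql_tuner.sh - generates a script that appends tuning settings to my.cnf

# Configuration
MYSQL_USER="root"
MYSQL_PASS="password"
CONFIG_FILE="/etc/my.cnf"

# Read the current value of a server variable
get_config() {
    local var=$1
    mysql -u"$MYSQL_USER" -p"$MYSQL_PASS" -N -e "SHOW VARIABLES LIKE '$var';" | awk '{print $2}'
}

# Total system memory in MB (free -g rounds down and yields 0 on small hosts)
TOTAL_MEM=$(free -m | awk '/^Mem:/ {print $2}')
INNODB_BUFFER=$((TOTAL_MEM * 70 / 100))

# Generate the tuning script.
# The outer heredoc is unquoted on purpose: variables are expanded now, at generation time.
cat > /tmp/mysql_optimize.sh << EOF
#!/bin/bash
# Auto-generated tuning script

echo "Starting MySQL tuning..."

# Back up the current configuration
cp $CONFIG_FILE ${CONFIG_FILE}.backup

# Append the tuning settings.
# The [mysqld] header is required: without it the lines would fall under whatever
# section happens to be last in the file. Later values override earlier ones.
cat >> $CONFIG_FILE << EOC

# ========== Auto-tuning settings ==========
# Memory: ${TOTAL_MEM}MB
[mysqld]
innodb_buffer_pool_size = ${INNODB_BUFFER}M

# Connections
max_connections = 1000
thread_cache_size = 256

# Caches
key_buffer_size = 256M
tmp_table_size = 64M
max_heap_table_size = 64M

# InnoDB
innodb_log_file_size = 2G
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT

# Logging
slow_query_log = 1
long_query_time = 2

# ========== End of auto-tuning settings ==========
EOC

echo "Configuration updated; restart MySQL to apply it"
echo "systemctl restart mysqld"
EOF

chmod +x /tmp/mysql_optimize.sh
echo "Tuning script generated: /tmp/mysql_optimize.sh"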
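
The 70% rule used by the script can be computed at a finer grain than whole gigabytes. The following is a minimal sketch, assuming the input is the `MemTotal` value in kB from `/proc/meminfo`; `suggest_buffer_pool_mb` is a hypothetical helper name. The result is rounded down to a multiple of 128 MB, the default `innodb_buffer_pool_chunk_size`; MySQL itself adjusts the pool to a multiple of chunk size times the number of instances.

```bash
#!/usr/bin/env bash
# suggest_buffer_pool_mb (hypothetical helper):
#   $1 = total memory in kB, $2 = percentage to give to the pool (default 70).
# Prints a buffer pool size in MB, rounded down to a multiple of 128.
suggest_buffer_pool_mb() {
    local mem_total_kb=$1 percent=${2:-70}
    local mb=$(( mem_total_kb / 1024 * percent / 100 ))
    echo $(( mb / 128 * 128 ))
}

# A 16 GiB host: 16384 MB * 70% = 11468 MB, rounded down to 11392 MB.
suggest_buffer_pool_mb 16777216   # prints: 11392

# Possible usage on Linux:
#   suggest_buffer_pool_mb "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
```

The percentage only makes sense on a host dedicated to MySQL; on a shared host, subtract the memory of the other services first.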

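VII. Summary

Troubleshooting checklist

text
□ Connection checks
  □ Network connectivity
  □ Connection limits
  □ User privileges
  □ Firewall rules

□ Performance checks
  □ CPU usage
  □ Memory usage
  □ Disk I/O
  □ Slow query log

□ Replication checks
  □ Replication status
  □ Network connectivity
  □ Data consistency
  □ Replication lag

□ Data checks
  □ Table integrity
  □ Index validity
  □ Lock waits
  □ Deadlocks

Tuning best practices

  1. Hardware
    • Use SSD storage
    • Provide enough memory for the InnoDB buffer pool
    • Use multi-core CPUs

  2. Configuration
    • Review configuration parameters regularly
    • Adjust them based on monitoring data
    • Validate changes in a test environment first

  3. Application
    • Optimize SQL
    • Optimize indexes
    • Batch operations

  4. Operations
    • Back up regularly
    • Set up monitoring and alerting
    • Plan capacity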
