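I. Linux
Filter out the empty lines of a file
Show the current system time zone
Recursively set the permissions of all files under a directory to 644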
bash
grep -v "^$" app.log
timedatectl
find . -type f -exec chmod 644 {} \;
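grep -v "^$" inverts the match to drop empty lines; commonly used to clean logs and configuration files
timedatectl shows the system time zone and time; useful when troubleshooting clock synchronization across a cluster
find with -exec applies chmod to every regular file recursively; a common way to standardize permissions in bulk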
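For readers who want the filtering and permission steps from a script, here is a rough Python equivalent. It is only a sketch: the function names `non_empty_lines` and `chmod_files` and the throwaway demo directory are made up for illustration. It also shows that `^$` matches only truly empty lines, so whitespace-only lines are kept.

```python
import os
import stat
import tempfile

def non_empty_lines(path):
    """Return the lines of `path` that are not empty, like grep -v "^$"."""
    with open(path, encoding="utf-8") as f:
        # Only lines with no characters at all are dropped;
        # whitespace-only lines survive, as they do with the ^$ pattern.
        return [line.rstrip("\n") for line in f if line.rstrip("\n") != ""]

def chmod_files(root, mode=0o644):
    """Set `mode` on every regular file under `root`, like find -type f -exec chmod."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks so that only regular files are touched (-type f).
            if os.path.isfile(full) and not os.path.islink(full):
                os.chmod(full, mode)
                changed.append(full)
    return changed

# Small demo in a throwaway directory (hypothetical data).
demo_dir = tempfile.mkdtemp()
log_path = os.path.join(demo_dir, "app.log")
with open(log_path, "w", encoding="utf-8") as f:
    f.write("start\n\n  \nend\n")

kept = non_empty_lines(log_path)
changed = chmod_files(demo_dir)
mode_bits = stat.S_IMODE(os.stat(log_path).st_mode)
print(kept, oct(mode_bits))
```

If whitespace-only lines should also be dropped, comparing `line.strip()` against the empty string would be the natural change.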
II. SQL
196. Delete Duplicate Emails

sql
DELETE p1
FROM Person p1
JOIN Person p2
ON p1.email = p2.email
AND p1.id > p2.id;
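A self-join deletes the duplicates and keeps the row with the smallest id for each email
Classic template for deduplication and dirty-data removal
Frequently used in production when cleaning duplicate records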
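The keep-the-smallest-id rule can be tried locally with Python's built-in sqlite3 module. This is a sketch under an assumption: SQLite does not support the MySQL multi-table `DELETE p1 FROM ... JOIN` syntax, so the same rule is rewritten here as a correlated `EXISTS` subquery. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO Person (id, email) VALUES (?, ?)",
    [(1, "john@example.com"), (2, "bob@example.com"), (3, "john@example.com")],
)

# Delete every row for which another row with the same email
# and a smaller id exists (the SQLite stand-in for the self-join).
conn.execute(
    """
    DELETE FROM Person
    WHERE EXISTS (
        SELECT 1 FROM Person p2
        WHERE p2.email = Person.email AND p2.id < Person.id
    )
    """
)

remaining = conn.execute("SELECT id, email FROM Person ORDER BY id").fetchall()
conn.close()
print(remaining)  # [(1, 'john@example.com'), (2, 'bob@example.com')]
```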
197. Rising Temperature

sql
SELECT w1.id
FROM Weather w1
JOIN Weather w2
ON DATE_SUB(w1.recordDate, INTERVAL 1 DAY) = w2.recordDate
WHERE w1.temperature > w2.temperature;
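DATE_SUB subtracts one day from the date so that each row joins to the previous day's row
Standard pattern for period-over-period comparison between adjacent dates
A must-know question type for time-series data and metric comparisons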
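To see why the join is on the date rather than on the id, the query can be run against a table with a missing day. The sketch below uses Python's sqlite3 module, so SQLite's `date(x, '-1 day')` modifier stands in for MySQL's `DATE_SUB`; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Weather (id INTEGER, recordDate TEXT, temperature INTEGER)"
)
conn.executemany(
    "INSERT INTO Weather VALUES (?, ?, ?)",
    [
        (1, "2025-05-01", 20),
        (2, "2025-05-02", 25),  # warmer than 05-01, so it is returned
        (3, "2025-05-04", 30),  # 05-03 is missing, so there is nothing to compare
        (4, "2025-05-05", 28),  # cooler than 05-04
    ],
)

rows = conn.execute(
    """
    SELECT w1.id
    FROM Weather w1
    JOIN Weather w2
      ON date(w1.recordDate, '-1 day') = w2.recordDate
    WHERE w1.temperature > w2.temperature
    ORDER BY w1.id
    """
).fetchall()
rising_ids = [row[0] for row in rows]
conn.close()
print(rising_ids)  # [2]
```

Row 3 is hotter than row 2, but it is not returned because the two dates are not consecutive; a join on `w1.id = w2.id + 1` would wrongly include it.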
584. Find Customer Referee

sql
SELECT name
FROM Customer
WHERE referee_id != 2 OR referee_id IS NULL;
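NULL cannot be tested with !=: referee_id != 2 evaluates to unknown for NULL rows, so the IS NULL condition must be written separately
A common pitfall when combining NULL handling with a filter condition
Often used for audience selection and channel filtering in business queries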
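The effect of leaving out the `IS NULL` branch is easy to check with Python's sqlite3 module, which follows the same three-valued NULL logic for this comparison. The helper `names` and the sample rows are hypothetical, used only for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (id INTEGER, name TEXT, referee_id INTEGER)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [(1, "Alice", None), (2, "Bob", 2), (3, "Cindy", 3)],
)

def names(where_clause):
    """Return customer names matching a fixed, trusted WHERE clause."""
    sql = f"SELECT name FROM Customer WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

# NULL != 2 is unknown, so Alice is silently dropped here.
without_null_check = names("referee_id != 2")
# Adding IS NULL brings the customers with no referee back.
with_null_check = names("referee_id != 2 OR referee_id IS NULL")

print(without_null_check)  # ['Cindy']
print(with_null_check)     # ['Alice', 'Cindy']
```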
III. PySpark
python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, date_sub, to_date

spark = SparkSession.builder \
    .master("local[*]") \
    .appName("Day34") \
    .getOrCreate()

# 1. Rising Temperature: compare each day with the previous day
df = spark.createDataFrame([
    (1, "2025-05-01", 20),
    (2, "2025-05-02", 25)
], ["id", "recordDate", "temperature"])

# The chain is wrapped in parentheses so it can span several lines
(
    df.alias("w1").join(
        df.alias("w2"),
        date_sub(to_date(col("w1.recordDate")), 1) == to_date(col("w2.recordDate"))
    )
    .filter(col("w1.temperature") > col("w2.temperature"))
    .select("w1.id")
    .show()
)

# 2. Find Customer Referee: keep rows whose referee_id is not 2 or is NULL
cust = spark.createDataFrame([
    (1, "Alice", None),
    (2, "Bob", 2),
    (3, "Cindy", 3)
], ["id", "name", "referee_id"])

(
    cust.filter((col("referee_id") != 2) | col("referee_id").isNull())
    .select("name")
    .show()
)

spark.stop()
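Spark's date_sub and to_date handle the date offset, mirroring the MySQL date logic
A self-join on the same DataFrame implements the day-over-day comparison
.isNull() combined with an OR condition handles NULL values, since != alone drops those rows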
IV. Algorithms
21. Merge Two Sorted Lists
python
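class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def mergeTwoLists(l1, l2):
    # The dummy head avoids special-casing the first node of the result
    dummy = ListNode()
    cur = dummy
    while l1 and l2:
        # Attach the smaller of the two current nodes and advance that list
        if l1.val < l2.val:
            cur.next = l1
            l1 = l1.next
        else:
            cur.next = l2
            l2 = l2.next
        cur = cur.next
    # At most one list still has nodes; append the remainder as-is
    cur.next = l1 if l1 else l2
    return dummy.next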
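Dummy head node technique, the standard template for merging linked lists
Iterative two-pointer merge: O(m + n) time for lists of length m and n, O(1) extra space
A frequently asked introductory linked-list problem worth memorizing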
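A quick way to exercise the merge locally is to convert between Python lists and linked lists. The helpers `build_list` and `to_pylist` below are hypothetical names added for this sketch; `ListNode` and `mergeTwoLists` are repeated so the block runs on its own.

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def mergeTwoLists(l1, l2):
    dummy = ListNode()
    cur = dummy
    while l1 and l2:
        if l1.val < l2.val:
            cur.next = l1
            l1 = l1.next
        else:
            cur.next = l2
            l2 = l2.next
        cur = cur.next
    cur.next = l1 if l1 else l2
    return dummy.next

def build_list(values):
    """Build a linked list from a Python list and return its head (or None)."""
    dummy = ListNode()
    cur = dummy
    for v in values:
        cur.next = ListNode(v)
        cur = cur.next
    return dummy.next

def to_pylist(node):
    """Collect the values of a linked list into a Python list."""
    out = []
    while node:
        out.append(node.val)
        node = node.next
    return out

merged = to_pylist(mergeTwoLists(build_list([1, 2, 4]), build_list([1, 3, 4])))
print(merged)  # [1, 1, 2, 3, 4, 4]
```

Empty inputs need no special handling: when either argument is None the loop is skipped and the other list is returned unchanged.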