[Java Basics] Java Regular Expressions in Practice: Pattern/Matcher, Metacharacters, and Common Patterns for Validation and Extraction

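Regular expressions are a powerful tool for working with strings. By defining a pattern, you can match, search, and replace text efficiently, which makes them an essential text-processing skill.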

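⚡ Quick Reference

Regular expression: a descriptive language for defining string rules, used to match, search, and replace strings
Metacharacters: . (any character), \d (digit), \w (word character), \s (whitespace)
Quantifiers: * (zero or more), + (one or more), ? (zero or one), {n} (exactly n times)
Pattern and Matcher: Pattern compiles the regex; Matcher performs the matching
Groups: wrap a subexpression in () to capture it and refer to it later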

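📚 Learning Objectives

Understand what regular expressions are and where they are used
Learn the basic syntax and metacharacters
Learn how to use the Pattern and Matcher classes
Understand groups and capturing
Learn to write commonly used regular expressions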

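1. Overview of Regular Expressions

1.1 What Is a Regular Expression?

Regular expression (regex for short): a tool for matching strings. It uses a descriptive language to define a rule for strings, and any string that satisfies the rule is said to "match".

Use cases:

Data validation (email addresses, phone numbers, ID card numbers, and so on)
Text search and replacement
Data extraction
String splitting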

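2. Getting Started with Regular Expressions

2.1 What a Regular Expression Is

Definition: a regular expression (regex for short) is a powerful way to match strings. The idea is to describe a rule for strings in a descriptive language: a string that satisfies the rule "matches", and a string that does not is considered invalid.

Why regular expressions are needed

Scenario	Traditional approach	Regular expression
Validating an email address	Complex string checks	A single regex ✅
Extracting numbers	Looping over characters	Simple pattern matching ✅
Replacing text	Multiple string operations	One replacement call ✅
Splitting strings	Complex split logic	Flexible delimiters ✅

2.2 Use Cases for Regular Expressions

Data validation - validate email addresses, phone numbers, ID card numbers, and so on
Text search - find specific patterns in large amounts of text
Text replacement - replace all text that matches a rule in one pass
Data extraction - pull the needed information out of a string
String splitting - split strings according to complex rules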

2.3 Quick Start Example

import java.util.regex.Pattern;
import java.util.regex.Matcher;

/**
 * Regular expression quick start
 */
public class RegexDemo01 {
    public static void main(String[] args) {
        // Goal: check whether a string is a valid email address
        String email1 = "test@example.com";
        String email2 = "invalid-email";

        // Regular expression for an email address
        String regex = "^[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\\.[a-zA-Z0-9_-]+)+$";

        // Match against the regular expression
        System.out.println(email1 + " is a valid email: " + email1.matches(regex));
        System.out.println(email2 + " is a valid email: " + email2.matches(regex));

        // Output:
        // test@example.com is a valid email: true
        // invalid-email is a valid email: false
    }
}

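3. Regular Expression Syntax

3.1 Basic Character Matching

3.1.1 Ordinary Characters

Character	Description	Example
`a-z`	Lowercase letters match themselves	`abc` matches "abc"
`A-Z`	Uppercase letters match themselves	`ABC` matches "ABC"
`0-9`	Digits match themselves	`123` matches "123"
`中文`	Chinese characters match themselves	`你好` matches "你好"

3.1.2 Metacharacters

Metacharacter	Description	Example
`.`	Matches any single character (except line terminators)	`a.c` matches "abc", "a1c"
`\d`	Matches a digit, equivalent to `[0-9]`	`\d\d` matches "12"
`\D`	Matches a non-digit	`\D` matches "a", "!"
`\w`	Matches a letter, digit, or underscore	`\w+` matches "abc_123"
`\W`	Matches a non-word character	`\W` matches "!", "@"
`\s`	Matches a whitespace character (space, tab, and so on)	`\s+` matches a run of spaces
`\S`	Matches a non-whitespace character	`\S+` matches "hello"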
public class RegexMetaChar {
    public static void main(String[] args) {
        // \d matches digits
        System.out.println("123".matches("\\d+")); // true
        System.out.println("abc".matches("\\d+")); // false

        // \w matches word characters
        System.out.println("hello_123".matches("\\w+")); // true
        System.out.println("hello world".matches("\\w+")); // false (contains a space)

        // \s matches whitespace
        System.out.println("   ".matches("\\s+")); // true
        System.out.println("hello".matches("\\s+")); // false
    }
}

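3.2 Character Classes

Syntax	Description	Example
`[abc]`	Matches any one of a, b, or c	`[abc]` matches "a", "b", "c"
`[^abc]`	Matches any character other than a, b, or c	`[^abc]` matches "d", "1"
`[a-z]`	Matches any lowercase letter from a to z	`[a-z]` matches "a" through "z"
`[A-Z]`	Matches any uppercase letter from A to Z	`[A-Z]` matches "A" through "Z"
`[0-9]`	Matches any digit from 0 to 9	`[0-9]` matches "0" through "9"
`[a-zA-Z0-9]`	Matches a letter or digit	Matches "a", "Z", "5"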
public class RegexCharClass {
    public static void main(String[] args) {
        // [abc] matches a, b, or c
        System.out.println("a".matches("[abc]")); // true
        System.out.println("d".matches("[abc]")); // false

        // [^abc] matches any character other than a, b, or c
        System.out.println("d".matches("[^abc]")); // true
        System.out.println("a".matches("[^abc]")); // false

        // [a-z] matches lowercase letters
        System.out.println("hello".matches("[a-z]+")); // true
        System.out.println("Hello".matches("[a-z]+")); // false (contains an uppercase letter)

        // [a-zA-Z0-9] matches letters and digits
        System.out.println("abc123".matches("[a-zA-Z0-9]+")); // true
    }
}

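3.3 Quantifiers

Quantifier	Description	Example
`*`	Matches the preceding element zero or more times	`a*` matches "", "a", "aa", "aaa"
`+`	Matches the preceding element one or more times	`a+` matches "a", "aa", "aaa"
`?`	Matches the preceding element zero or one time	`a?` matches "", "a"
`{n}`	Matches the preceding element exactly n times	`a{3}` matches "aaa"
`{n,}`	Matches the preceding element at least n times	`a{2,}` matches "aa", "aaa", "aaaa"
`{n,m}`	Matches the preceding element between n and m times	`a{2,4}` matches "aa", "aaa", "aaaa"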
public class RegexQuantifier {
    public static void main(String[] args) {
        // * matches zero or more times
        System.out.println("".matches("a*")); // true
        System.out.println("aaa".matches("a*")); // true

        // + matches one or more times
        System.out.println("".matches("a+")); // false
        System.out.println("aaa".matches("a+")); // true

        // ? matches zero or one time
        System.out.println("".matches("a?")); // true
        System.out.println("a".matches("a?")); // true
        System.out.println("aa".matches("a?")); // false

        // {n} matches exactly n times
        System.out.println("aaa".matches("a{3}")); // true
        System.out.println("aa".matches("a{3}")); // false

        // {n,m} matches between n and m times
        System.out.println("aa".matches("a{2,4}")); // true
        System.out.println("aaa".matches("a{2,4}")); // true
        System.out.println("aaaaa".matches("a{2,4}")); // false
    }
}

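3.4 Position Matching

Symbol	Description	Example
`^`	Matches the start of the string	`^abc` matches strings that start with abc
`$`	Matches the end of the string	`abc$` matches strings that end with abc
`\b`	Matches a word boundary	`\bhello\b` matches hello as a standalone word
`\B`	Matches a non-word boundary	`\Bhello\B` matches hello in the middle of a word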

public class RegexPosition {
    public static void main(String[] args) {
        // ^ matches the start
        System.out.println("hello world".matches("^hello.*")); // true
        System.out.println("world hello".matches("^hello.*")); // false

        // $ matches the end
        System.out.println("hello world".matches(".*world$")); // true
        System.out.println("world hello".matches(".*world$")); // false

        // ^...$ matches the whole string
        // (String.matches() already requires a full match, so the anchors are redundant here)
        System.out.println("hello".matches("^hello$")); // true
        System.out.println("hello world".matches("^hello$")); // false
    }
}
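
The table above lists \b, but the example exercises only ^ and $. The following is a minimal sketch of word boundaries using find(); the class name WordBoundaryDemo and the helper countMatches are illustrative names that do not come from the original article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Hypothetical helper: counts how many times the regex is found in the text.
    public static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "cat concat cats cat";
        // Without boundaries, "cat" is also found inside "concat" and "cats".
        System.out.println(countMatches("cat", text));       // 4
        // \b requires a word boundary on both sides, so only the standalone "cat" counts.
        System.out.println(countMatches("\\bcat\\b", text)); // 2
    }
}
```

Position symbols such as \b are most useful with find(), because matches() always tests the entire input.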

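3.5 Logical Operators

Symbol	Description	Example
`|`	Alternation (OR)	`cat|dog` matches "cat" or "dog"
`()`	Group	`(abc)+` matches "abc", "abcabc"
`(?:)`	Non-capturing group	`(?:abc)+` matches without capturing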

public class RegexLogic {
    public static void main(String[] args) {
        // | alternation
        System.out.println("cat".matches("cat|dog")); // true
        System.out.println("dog".matches("cat|dog")); // true
        System.out.println("bird".matches("cat|dog")); // false

        // () grouping
        System.out.println("abcabc".matches("(abc)+")); // true
        System.out.println("abc".matches("(abc)+")); // true
    }
}

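3.6 Greedy and Lazy Matching

Pattern	Description	Example
`*`	Greedy (matches as much as possible)	`a.*b` matches the whole of "aXXXXb"
`*?`	Lazy (matches as little as possible)	`a.*?b` stops at the first b, e.g. "aXb"
`+?`	Lazy one or more	`a.+?b`
`??`	Lazy zero or one	`a.??b`
`{n,m}?`	Lazy n to m times	`a.{2,4}?b`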

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class RegexGreedy {
    public static void main(String[] args) {
        String text = "a123b456b";

        // Greedy match: extends to the last b
        Pattern p1 = Pattern.compile("a.*b");
        Matcher m1 = p1.matcher(text);
        if (m1.find()) {
            System.out.println("Greedy match: " + m1.group()); // a123b456b
        }

        // Lazy match: stops at the first b
        Pattern p2 = Pattern.compile("a.*?b");
        Matcher m2 = p2.matcher(text);
        if (m2.find()) {
            System.out.println("Lazy match: " + m2.group()); // a123b
        }
    }
}

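4. The Pattern and Matcher Classes

4.1 The Pattern Class

The Pattern class is the compiled representation of a regular expression.

4.1.1 Common Pattern Methods

Method	Description	Return type
`compile(String regex)`	Compiles a regular expression (static)	Pattern
`matcher(CharSequence input)`	Creates a matcher for the input	Matcher
`matches(String regex, CharSequence input)`	Static helper for a one-off full match	boolean
`split(CharSequence input)`	Splits the input around matches	String[]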

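import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternDemo {
    public static void main(String[] args) {
        // 1. Compile the regular expression
        Pattern pattern = Pattern.compile("\\d+");

        // 2. Create a matcher and look for the first run of digits
        Matcher matcher = pattern.matcher("abc123def456");
        if (matcher.find()) {
            System.out.println("First number: " + matcher.group()); // 123
        }

        // 3. Static helper for a one-off full match
        boolean result = Pattern.matches("\\d+", "123");
        System.out.println("All digits: " + result); // true

        // 4. Split a string
        Pattern p = Pattern.compile("\\s+");
        String[] words = p.split("hello   world  java");
        for (String word : words) {
            System.out.println(word); // hello, world, java (one per line)
        }
    }
}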

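4.2 The Matcher Class

The Matcher class performs match operations on an input string.

4.2.1 Common Matcher Methods

Method	Description	Return type
`matches()`	Whether the entire string matches	boolean
`find()`	Finds the next match	boolean
`group()`	Returns the matched substring	String
`group(int group)`	Returns the given capturing group	String
`start()`	Returns the start index of the match	int
`end()`	Returns the end index of the match (exclusive)	int
`replaceAll(String replacement)`	Replaces every match	String
`replaceFirst(String replacement)`	Replaces the first match	String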

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class MatcherDemo {
    public static void main(String[] args) {
        String text = "Java8, Java11, Java17";
        Pattern pattern = Pattern.compile("Java(\\d+)");
        Matcher matcher = pattern.matcher(text);

        // 1. find() iterates over all matches
        System.out.println("=== Find all matches ===");
        while (matcher.find()) {
            System.out.println("Found: " + matcher.group());
            System.out.println("Version: " + matcher.group(1));
            System.out.println("Position: " + matcher.start() + "-" + matcher.end());
        }

        // 2. matches() requires the whole input to match
        System.out.println("\n=== Full match ===");
        Matcher m2 = pattern.matcher("Java17");
        System.out.println("Java17 fully matches: " + m2.matches()); // true

        // 3. replaceAll() replaces every match
        System.out.println("\n=== Replace ===");
        matcher.reset(); // reset the matcher
        String result = matcher.replaceAll("JDK$1");
        System.out.println("After replacement: " + result); // JDK8, JDK11, JDK17

        // 4. replaceFirst() replaces only the first match
        matcher.reset();
        String result2 = matcher.replaceFirst("JDK$1");
        System.out.println("First replaced: " + result2); // JDK8, Java11, Java17
    }
}
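
The difference between matches() and find() is easy to overlook, so here is a minimal sketch that runs both against the same precompiled pattern. The class name MatchesVsFind and the helper names wholeMatch and partialMatch are illustrative assumptions, not part of the original article.

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // True only if the entire input consists of digits.
    public static boolean wholeMatch(String input) {
        return DIGITS.matcher(input).matches();
    }

    // True if some substring of the input consists of digits.
    public static boolean partialMatch(String input) {
        return DIGITS.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("123"));      // true
        System.out.println(wholeMatch("abc123"));   // false
        System.out.println(partialMatch("abc123")); // true
        System.out.println(partialMatch("abc"));    // false
    }
}
```

As a rule of thumb, matches() suits validation and find() suits search and extraction.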

4.3 Complete Example

import java.util.regex.Pattern;
import java.util.regex.Matcher;

/**
 * Combined Pattern and Matcher example
 */
public class RegexExample {
    public static void main(String[] args) {
        // Goal: extract every email address from a piece of text
        String text = "Contact us: admin@example.com or support@test.org, " +
                     "or send mail to info@company.com.cn";

        // Regular expression for an email address
        String regex = "[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\\.[a-zA-Z0-9_-]+)+";

        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(text);

        System.out.println("Email addresses found:");
        int count = 0;
        while (matcher.find()) {
            count++;
            System.out.println(count + ". " + matcher.group());
        }
    }
}

5. Applying Regular Expressions

5.1 Data Validation

5.1.1 Validating a Phone Number

public class ValidatePhone {
    public static void main(String[] args) {
        // Mainland China mobile number: starts with 1, second digit is 3-9, 11 digits in total
        String phoneRegex = "^1[3-9]\\d{9}$";

        String[] phones = {"13812345678", "12345678901", "1381234567"};

        for (String phone : phones) {
            boolean valid = phone.matches(phoneRegex);
            System.out.println(phone + " is valid: " + valid);
        }
    }
}

5.1.2 Validating an Email Address

public class ValidateEmail {
    public static void main(String[] args) {
        // Email format: username@domain.suffix
        String emailRegex = "^[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\\.[a-zA-Z0-9_-]+)+$";

        String[] emails = {
            "test@example.com",      // valid
            "user.name@test.co.uk",  // rejected: this pattern does not allow '.' before the @
            "invalid@",              // rejected: no domain
            "@invalid.com"           // rejected: no username
        };

        for (String email : emails) {
            boolean valid = email.matches(emailRegex);
            System.out.println(email + " is valid: " + valid);
        }
    }
}

5.1.3 Validating an ID Card Number

public class ValidateIDCard {
    public static void main(String[] args) {
        // 18-character ID card number: 17 digits followed by a digit or X
        // (checks the format only, not the birth date or the check digit)
        String idCardRegex = "^\\d{17}[\\dXx]$";

        String[] idCards = {
            "110101199001011234",
            "11010119900101123X",
            "1101011990010112"  // too short
        };

        for (String idCard : idCards) {
            boolean valid = idCard.matches(idCardRegex);
            System.out.println(idCard + " is valid: " + valid);
        }
    }
}

5.1.4 Validating a URL

public class ValidateURL {
    public static void main(String[] args) {
        // URL format: scheme://host:port/path
        String urlRegex = "^(https?|ftp)://[^\\s/$.?#].[^\\s]*$";

        String[] urls = {
            "https://www.example.com",
            "http://localhost:8080/path",
            "ftp://ftp.example.com",
            "invalid-url"
        };

        for (String url : urls) {
            boolean valid = url.matches(urlRegex);
            System.out.println(url + " is valid: " + valid);
        }
    }
}

5.2 Text Extraction

5.2.1 Extracting Numbers

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class ExtractNumbers {
    public static void main(String[] args) {
        String text = "Order no: 12345, amount: ¥299.99, quantity: 3";

        // An integer part with an optional decimal part
        Pattern pattern = Pattern.compile("\\d+(\\.\\d+)?");
        Matcher matcher = pattern.matcher(text);

        System.out.println("Numbers extracted:");
        while (matcher.find()) {
            System.out.println(matcher.group()); // 12345, 299.99, 3 (one per line)
        }
    }
}

5.2.2 Extracting Dates

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class ExtractDates {
    public static void main(String[] args) {
        String text = "Meeting: 2024-01-15, deadline: 2024/02/20";

        // Matches the YYYY-MM-DD or YYYY/MM/DD format
        Pattern pattern = Pattern.compile("\\d{4}[-/]\\d{2}[-/]\\d{2}");
        Matcher matcher = pattern.matcher(text);

        System.out.println("Dates extracted:");
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}

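5.3 Text Replacement

5.3.1 Filtering Sensitive Words

public class FilterSensitiveWords {
    public static void main(String[] args) {
        String text = "This text contains badword1 and BadWord2.";

        // List of sensitive words (assumed to contain no regex metacharacters)
        String[] sensitiveWords = {"badword1", "badword2"};

        // Build the regular expression: word1|word2
        String regex = String.join("|", sensitiveWords);

        // Replace with ***; (?i) ignores case and (?:...) scopes the alternation
        String result = text.replaceAll("(?i)(?:" + regex + ")", "***");
        System.out.println("Filtered: " + result); // This text contains *** and ***.
    }
}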

5.3.2 Formatting a Phone Number

public class FormatPhone {
    public static void main(String[] args) {
        String phone = "13812345678";

        // Format as 138-1234-5678
        String formatted = phone.replaceAll("(\\d{3})(\\d{4})(\\d{4})", "$1-$2-$3");
        System.out.println("Formatted: " + formatted);
    }
}

5.4 String Splitting

import java.util.Arrays;

public class SplitString {
    public static void main(String[] args) {
        // 1. Split on runs of whitespace
        String text1 = "hello   world  java";
        String[] words1 = text1.split("\\s+");
        System.out.println("Split on whitespace: " + Arrays.toString(words1)); // [hello, world, java]

        // 2. Split on several different delimiters
        String text2 = "apple,banana;orange|grape";
        String[] fruits = text2.split("[,;|]");
        System.out.println("Split on multiple delimiters: " + Arrays.toString(fruits)); // [apple, banana, orange, grape]

        // 3. Split on runs of digits
        String text3 = "part1123part2456part3";
        String[] parts = text3.split("\\d+");
        System.out.println("Split on digits: " + Arrays.toString(parts)); // [part, part, part]
    }
}

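6. Groups and Capturing

6.1 What a Group Is

A group wraps part of a regular expression in parentheses () to form a subexpression.

6.1.1 What Groups Are For

Extracting matched substrings
Backreferences
Applying a quantifier to a subexpression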

6.2 Capturing Groups

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class GroupCapture {
    public static void main(String[] args) {
        String text = "Zhang's phone is 13812345678, Li's phone is 13987654321";

        // Groups: (name) (phone)
        // \w+ is used for the name; a greedy \S+ can swallow far more than the name
        Pattern pattern = Pattern.compile("(\\w+)'s phone is (1[3-9]\\d{9})");
        Matcher matcher = pattern.matcher(text);

        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            System.out.println("Name: " + matcher.group(1));
            System.out.println("Phone: " + matcher.group(2));
            System.out.println("---");
        }
    }
}

6.3 Non-Capturing Groups

A non-capturing group (?:pattern) groups without capturing, so its content cannot be retrieved with group(n).

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class NonCapturingGroup {
    public static void main(String[] args) {
        String text = "Java8, Java11, Java17";

        // Non-capturing group (?:Java)
        Pattern pattern = Pattern.compile("(?:Java)(\\d+)");
        Matcher matcher = pattern.matcher(text);

        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            System.out.println("Version: " + matcher.group(1));
            // matcher.group(2) would throw IndexOutOfBoundsException: there is only one capturing group
        }
    }
}

6.4 Backreferences

A backreference refers back to a group captured earlier in the pattern.

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class Backreference {
    public static void main(String[] args) {
        // Find repeated words
        String text = "hello hello world world java";

        // \1 refers to the first group
        Pattern pattern = Pattern.compile("(\\w+)\\s+\\1");
        Matcher matcher = pattern.matcher(text);

        System.out.println("Repeated words found:");
        while (matcher.find()) {
            System.out.println(matcher.group()); // hello hello, then world world
        }
    }
}

6.5 Named Groups

Java 7+ supports named groups with the syntax (?<name>pattern).

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class NamedGroup {
    public static void main(String[] args) {
        String text = "2024-01-15";

        // Named groups
        Pattern pattern = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");
        Matcher matcher = pattern.matcher(text);

        if (matcher.find()) {
            System.out.println("Year: " + matcher.group("year"));
            System.out.println("Month: " + matcher.group("month"));
            System.out.println("Day: " + matcher.group("day"));
        }
    }
}
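
Named groups can also be referenced in a replacement string as ${name}. The following is a minimal sketch that rewrites dates from YYYY-MM-DD to DD/MM/YYYY; the class name NamedGroupReplace and the method toDayFirst are illustrative names, not part of the original article.

```java
import java.util.regex.Pattern;

public class NamedGroupReplace {
    private static final Pattern ISO_DATE =
            Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");

    // Rewrites every YYYY-MM-DD date in the text as DD/MM/YYYY.
    public static String toDayFirst(String text) {
        return ISO_DATE.matcher(text).replaceAll("${day}/${month}/${year}");
    }

    public static void main(String[] args) {
        System.out.println(toDayFirst("Due 2024-01-15")); // Due 15/01/2024
    }
}
```

Compared with $1, $2, $3, the named form keeps working if another group is later inserted into the pattern.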

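7. Commonly Used Regular Expressions

7.1 Common Validation Patterns

Type	Regex (as a Java string literal)	Description
Phone number	`^1[3-9]\\d{9}$`	Mainland China mobile number
Email	`^[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\\.[a-zA-Z0-9_-]+)+$`	Simple email format (no dots in the username)
ID card	`^\\d{17}[\\dXx]$`	18-character ID card number
QQ number	`^[1-9]\\d{4,10}$`	5 to 11 digits
WeChat ID	`^[a-zA-Z][a-zA-Z0-9_-]{5,19}$`	Starts with a letter, 6 to 20 characters
Password	`^(?=.*[a-z])(?=.*[A-Z])(?=.*\\d)[a-zA-Z\\d]{8,}$`	At least 8 characters, with uppercase and lowercase letters and a digit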
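URL	`^(https?|ftp)://[^\\s/$.?#].[^\\s]*$`	http, https, or ftp URL
IP address	`^((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)$`	IPv4 address
Chinese	`^[\\u4e00-\\u9fa5]+$`	Chinese characters only
Date	`^\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])$`	Date in YYYY-MM-DD format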
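
Patterns like these are worth spot-checking before reuse. Below is a small sketch that tests a password rule and an IPv4 rule of the kind listed in the table, with the regexes written out in the code. The class name CommonRegexCheck and its method names are illustrative assumptions.

```java
import java.util.regex.Pattern;

public class CommonRegexCheck {
    // At least 8 letters/digits, including one lowercase letter, one uppercase letter, and one digit.
    private static final Pattern PASSWORD =
            Pattern.compile("^(?=.*[a-z])(?=.*[A-Z])(?=.*\\d)[a-zA-Z\\d]{8,}$");

    // Four dot-separated numbers, each in the range 0-255.
    private static final Pattern IPV4 = Pattern.compile(
            "^((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)$");

    public static boolean isStrongPassword(String s) {
        return s != null && PASSWORD.matcher(s).matches();
    }

    public static boolean isIPv4(String s) {
        return s != null && IPV4.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isStrongPassword("Abcdef12")); // true
        System.out.println(isStrongPassword("abcdef12")); // false (no uppercase letter)
        System.out.println(isIPv4("192.168.1.1"));        // true
        System.out.println(isIPv4("256.1.1.1"));          // false (256 is out of range)
    }
}
```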

7.2 A Utility Class

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Regular expression utility class
 */
public class RegexUtils {

    // Validate a phone number
    public static boolean isPhone(String phone) {
        return phone != null && phone.matches("^1[3-9]\\d{9}$");
    }

    // Validate an email address
    public static boolean isEmail(String email) {
        String regex = "^[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\\.[a-zA-Z0-9_-]+)+$";
        return email != null && email.matches(regex);
    }

    // Validate an ID card number
    public static boolean isIDCard(String idCard) {
        return idCard != null && idCard.matches("^\\d{17}[\\dXx]$");
    }

    // Validate a URL
    public static boolean isURL(String url) {
        String regex = "^(https?|ftp)://[^\\s/$.?#].[^\\s]*$";
        return url != null && url.matches(regex);
    }

    // Extract all numbers
    public static List<String> extractNumbers(String text) {
        List<String> numbers = new ArrayList<>();
        Matcher matcher = Pattern.compile("\\d+(\\.\\d+)?").matcher(text);
        while (matcher.find()) {
            numbers.add(matcher.group());
        }
        return numbers;
    }

    // Mask a phone number (replace the middle 4 digits with ****)
    public static String maskPhone(String phone) {
        if (isPhone(phone)) {
            return phone.replaceAll("(\\d{3})\\d{4}(\\d{4})", "$1****$2");
        }
        return phone;
    }

    // Mask an ID card number (replace the middle 10 digits with asterisks)
    public static String maskIDCard(String idCard) {
        if (isIDCard(idCard)) {
            // The last group allows a trailing X so that IDs ending in X are masked too
            return idCard.replaceAll("(\\d{4})\\d{10}(\\d{3}[\\dXx])", "$1**********$2");
        }
        return idCard;
    }
}

7.3 Usage Example

public class RegexUtilsTest {
    public static void main(String[] args) {
        // Validation
        System.out.println("Phone valid: " + RegexUtils.isPhone("13812345678")); // true
        System.out.println("Email valid: " + RegexUtils.isEmail("test@example.com")); // true

        // Extract numbers
        String text = "Order 12345, amount 299.99 yuan";
        System.out.println("Numbers extracted: " + RegexUtils.extractNumbers(text)); // [12345, 299.99]

        // Masking
        System.out.println("Masked phone: " + RegexUtils.maskPhone("13812345678")); // 138****5678
        System.out.println("Masked ID card: " + RegexUtils.maskIDCard("110101199001011234")); // 1101**********1234
    }
}

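8. Summary

8.1 Regular Expression Basics

Concept	Description
Metacharacters	Special characters such as `.` `\d` `\w` `\s`
Quantifiers	`*` `+` `?` `{n,m}` control how many times an element repeats
Positions	`^` `$` `\b` match positions rather than characters
Groups	`()` capturing group, `(?:)` non-capturing group

8.2 Pattern and Matcher

Class	Role	Common methods
Pattern	Compiles a regular expression	`compile()` `matcher()` `split()`
Matcher	Performs match operations	`find()` `group()` `replaceAll()`

8.3 Use Cases

✅ Data validation - validate email addresses, phone numbers, ID card numbers, and so on
✅ Text extraction - extract numbers, dates, email addresses, and similar information
✅ Text replacement - sensitive word filtering, formatting, and so on
✅ String splitting - split strings according to complex rules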

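8.4 Best Practices

✅ Recommended:

Precompile regular expressions with Pattern.compile()
Use comments and groups to keep complex regexes readable
Wrap commonly used validations in a utility class
Watch out for escaping (a regex backslash is written as \\ in a Java string literal)

❌ Avoid:

Compiling the same regular expression repeatedly inside a loop
Overly complex regexes that hurt performance and readability
Skipping tests for edge cases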
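
The escaping rule can be illustrated with split(). This is a minimal sketch: the class name EscapeDemo and its helper methods are illustrative, while Pattern.quote() is the standard library call for treating a string as a literal.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class EscapeDemo {
    // Splits on a literal dot; the regex \. is written "\\." in a Java string literal.
    public static String[] splitOnDot(String input) {
        return input.split("\\.");
    }

    // Splits on an arbitrary literal delimiter by quoting it first.
    public static String[] splitOnLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // An unescaped "." matches every character, so only empty strings remain
        // and split() discards them all.
        System.out.println("a.b.c".split(".").length);                     // 0
        System.out.println(Arrays.toString(splitOnDot("a.b.c")));          // [a, b, c]
        System.out.println(Arrays.toString(splitOnLiteral("1+2+3", "+"))); // [1, 2, 3]
    }
}
```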
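
The advice about precompiling can be sketched as follows, assuming a hypothetical PrecompiledValidator class that filters a list of candidate phone numbers. The pattern is compiled once as a constant and reused on every loop iteration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class PrecompiledValidator {
    // Compiled once when the class is loaded; Pattern instances are immutable and thread-safe.
    private static final Pattern PHONE = Pattern.compile("^1[3-9]\\d{9}$");

    // Returns only the inputs that look like mainland China mobile numbers.
    public static List<String> filterPhones(List<String> inputs) {
        List<String> valid = new ArrayList<>();
        for (String input : inputs) {
            // Reuses the compiled pattern; String.matches() would compile the regex on every call.
            if (input != null && PHONE.matcher(input).matches()) {
                valid.add(input);
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("13812345678", "12345678901", "1381234567");
        System.out.println(filterPhones(inputs)); // [13812345678]
    }
}
```

A Matcher, unlike a Pattern, is not thread-safe, so the sketch creates a new one per input rather than sharing it.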

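8.5 Performance Tips

Technique	Description
Precompile	Compile once with `Pattern.compile()` and reuse the Pattern
Non-capturing groups	Use `(?:)` when the group content is not needed
Lazy quantifiers	Use `*?`, `+?`, and so on where they fit the match you need
Avoid backtracking	Simplify the regex and reduce the number of alternatives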

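8.6 Common Errors

Error	Cause	Solution
PatternSyntaxException	Invalid regex syntax	Check the syntax of the regular expression
IndexOutOfBoundsException	Group index out of range	Check how many groups the pattern has
Greedy matching	The match covers too much text	Use a lazy quantifier such as `*?`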
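
The first two errors in the table can be guarded against in code. The following is a minimal sketch; the class name RegexErrorDemo and the helpers isValidRegex and safeGroup are illustrative names, not part of the original article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegexErrorDemo {
    // Returns true if the regex compiles, false if its syntax is invalid.
    public static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Returns the requested group of the first match, or null if there is no match
    // or the pattern has no group with that index.
    public static String safeGroup(String regex, String text, int group) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (group < 0 || !m.find() || group > m.groupCount()) {
            return null;
        }
        return m.group(group);
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("(abc"));                 // false (unclosed group)
        System.out.println(isValidRegex("(abc)+"));               // true
        System.out.println(safeGroup("Java(\\d+)", "Java17", 1)); // 17
        System.out.println(safeGroup("Java(\\d+)", "Java17", 2)); // null
    }
}
```

Checking groupCount() before calling group(n) avoids the IndexOutOfBoundsException instead of catching it.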

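8.7 Memory Aids

The three steps of regex:
1. Pattern.compile() compiles the regex
2. pattern.matcher() creates the matcher
3. matcher.find() searches for matches

Common metacharacters:
\d digit  \w word  \s whitespace
\D non-digit  \W non-word  \S non-whitespace

Quantifiers:
* zero or more  + one or more  ? zero or one
{n} exactly n  {n,} at least n  {n,m} n to m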

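8.8 Study Tips

Learn the basic syntax - metacharacters, quantifiers, position matching
Understand groups and capturing - capturing groups and backreferences
Practice on real tasks - data validation, text extraction, replacement
Use online tools - regex101.com and similar online testers
Read source code - see how frameworks use regular expressions

Regular expressions are a sharp tool for text processing, and mastering them can greatly improve development efficiency!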