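Regular Expressions
A regular expression (RegExp) is a small pattern-matching language that helps find patterns in data. A RegExp can be used to check whether a pattern occurs in a piece of text. In JavaScript, a pattern can be created in two ways: with the RegExp constructor, or with a literal written between two slashes and followed by optional flags.
Just as a string is declared with single quotes, double quotes, or backticks, a regular expression literal is declared with two slashes and optional flags. The flags can be g, i, m, s, u, or y.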
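RegExp parameters
The RegExp constructor takes two parameters: a required search pattern and an optional string of flags.
Pattern
A pattern can be literal text or any description of text that shares some structure. For instance, the word "spam" in an email could be the pattern we are interested in, or the format of a phone number could be what we want to find.
Flags
Flags are optional and determine how the search is carried out. Let us look at some of them:
- g: global flag, finds every match in the text instead of stopping at the first one
- i: case-insensitive flag, matches both lowercase and uppercase
- m: multiline flag, makes ^ and $ match at the start and end of each line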
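The effect of each flag is easiest to see side by side. The following is a minimal sketch; the `sample` string and the variable names are made up for illustration and are not part of the original lesson.

```js
// Hypothetical sample text, chosen only to show what each flag changes
const sample = 'Love is kind.\nlove is patient.'

// No flags: only the first, case-sensitive match is found
const noFlag = sample.match(/love/)

// g: every case-sensitive match
const globalOnly = sample.match(/love/g)

// gi: every match, ignoring case
const globalIgnoreCase = sample.match(/love/gi)

// gim: with m, ^ also matches at the start of each line
const lineStarts = sample.match(/^love/gim)

console.log(noFlag[0], noFlag.index) // love 14
console.log(globalOnly) // ['love']
console.log(globalIgnoreCase) // ['Love', 'love']
console.log(lineStarts) // ['Love', 'love']
```

Flags can be combined in any order, so 'gi' and 'ig' behave the same.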
Creating a pattern with the RegExp constructor
Declaring a regular expression without the global and case-insensitive flags.
js
// without flags
let pattern = 'love'
let regEx = new RegExp(pattern)
Declaring a regular expression with the global and case-insensitive flags.
js
let pattern = 'love'
let flag = 'gi'
let regEx = new RegExp(pattern, flag)
Declaring a regex pattern with the RegExp object, writing the pattern and the flags directly inside the constructor.
js
let regEx = new RegExp('love','gi')
Creating a pattern without the RegExp constructor
Declaring a regular expression with the global and case-insensitive flags.
js
let regEx = /love/gi
The regular expression above is the same as the one we created with the RegExp constructor.
js
let regEx = new RegExp('love', 'gi')
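The constructor form is mostly useful when the pattern is only known at runtime. One detail to watch is that the pattern is passed as a string, so a backslash has to be doubled. The sketch below assumes a hypothetical `word` input; the variable names are illustrative only.

```js
// The word to search for is only known at runtime (hypothetical input)
const word = 'love'
const dynamic = new RegExp(word, 'gi')

// In a string, the backslash itself must be escaped: '\\d+' is the pattern \d+
const digitsFromString = new RegExp('\\d+', 'g')
const digitsLiteral = /\d+/g

const found = 'Day 12 of 30'.match(digitsFromString)

console.log(dynamic.source, dynamic.flags) // love gi
console.log(digitsFromString.source === digitsLiteral.source) // true
console.log(found) // ['12', '30']
```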
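RegExp object methods
Let us look at some methods that work with RegExp. test() belongs to the RegExp object, while match(), search(), and replace() are string methods that accept a RegExp.
Testing for a match
test(): tests whether the string contains a match. Returns true or false.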
js
const str = 'I love JavaScript'
const pattern = /love/
const result = pattern.test(str)
console.log(result)
sh
true
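One thing worth knowing about test() is that a regex with the g flag keeps track of where its last match ended in its lastIndex property, so calling test() repeatedly on the same regex object can give alternating results. A small sketch of that behavior, with illustrative variable names:

```js
const sentence = 'I love JavaScript'

// Without g, test() always starts from the beginning of the string
const plain = /love/
const plainResults = [plain.test(sentence), plain.test(sentence)]

// With g, the regex remembers where the last match ended (lastIndex)
const stateful = /love/g
const first = stateful.test(sentence)
const indexAfterFirst = stateful.lastIndex
const second = stateful.test(sentence) // the search resumes at index 6 and fails

console.log(plainResults) // [true, true]
console.log(first, indexAfterFirst, second) // true 6 false
```

For a simple yes or no check, leaving out the g flag avoids this surprise.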
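Array containing all matches
match(): returns an array containing the matches, or null if no match is found.
Without the global flag, match() returns an array holding the first matched text and any capture groups, along with index, input, and groups properties. With the global flag, it returns an array of all matched substrings only.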
js
const str = 'I love JavaScript'
const pattern = /love/
const result = str.match(pattern)
console.log(result)
sh
["love", index: 2, input: "I love JavaScript", groups: undefined]
js
const str = 'I love JavaScript'
const pattern = /love/g
const result = str.match(pattern)
console.log(result)
sh
["love"]
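With the g flag, match() drops the capture groups and the index of each match. When those details are needed for every match, the standard string method matchAll() can be used instead. The following is a sketch under the assumption of a made-up `dates` string; the names are illustrative.

```js
// Hypothetical text containing two dates
const dates = 'Started 2019-12-06, revised 2020-01-12'
const datePattern = /(\d{4})-(\d{2})-(\d{2})/g

// match() with g returns only the matched substrings
const plainMatches = dates.match(datePattern)
console.log(plainMatches) // ['2019-12-06', '2020-01-12']

// matchAll() yields one full result per match, including groups and index
const details = [...dates.matchAll(datePattern)].map((m) => ({
  year: m[1],
  month: m[2],
  index: m.index
}))
console.log(details)
```

matchAll() requires the g flag on the regex and throws a TypeError without it.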
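search(): tests for a match in a string. Returns the index of the first match, or -1 if the search fails. The global flag has no effect on search().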
js
const str = 'I love JavaScript'
const pattern = /love/g
const result = str.search(pattern)
console.log(result)
sh
2
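Replacing a substring
replace(): searches a string for a match and replaces the matched substring with a replacement string. Without the global flag, only the first match is replaced.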
js
const txt = 'Python is the most beautiful language that a human being has ever created.\
I recommend python for a first programming language'
const matchReplaced = txt.replace(/Python|python/, 'JavaScript')
console.log(matchReplaced)
sh
JavaScript is the most beautiful language that a human being has ever created.I recommend python for a first programming language
js
const txt = 'Python is the most beautiful language that a human being has ever created.\
I recommend python for a first programming language'
const matchReplaced = txt.replace(/Python|python/g, 'JavaScript')
console.log(matchReplaced)
sh
JavaScript is the most beautiful language that a human being has ever created.I recommend JavaScript for a first programming language
js
const txt = 'Python is the most beautiful language that a human being has ever created.\
I recommend python for a first programming language'
const matchReplaced = txt.replace(/Python/gi, 'JavaScript')
console.log(matchReplaced)
sh
JavaScript is the most beautiful language that a human being has ever created.I recommend JavaScript for a first programming language
js
const txt = '%I a%m te%%a%%che%r% a%n%d %% I l%o%ve te%ach%ing.\
T%he%re i%s n%o%th%ing as m%ore r%ewarding a%s e%duc%at%i%ng a%n%d e%m%p%ow%er%ing \
p%e%o%ple.\
I fo%und te%a%ching m%ore i%n%t%er%%es%ting t%h%an any other %jobs.\
D%o%es thi%s m%ot%iv%a%te %y%o%u to b%e a t%e%a%cher.'
const matches = txt.replace(/%/g, '')
console.log(matches)
sh
I am teacher and I love teaching.There is nothing as more rewarding as educating and empowering people.I found teaching more interesting than any other jobs.Does this motivate you to be a teacher.
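The replacement does not have to be a fixed string. It can refer back to capture groups with $1, $2, and so on, or it can be a function that computes the replacement for each match. A short sketch, using made-up input strings:

```js
// Hypothetical text containing ISO-style dates
const log = 'Dates: 2019-12-06 and 2020-01-12'

// $1, $2, $3 refer to the first, second, and third capture groups
const reordered = log.replace(/(\d{4})-(\d{2})-(\d{2})/g, '$3/$2/$1')
console.log(reordered) // Dates: 06/12/2019 and 12/01/2020

// A function receives each match and returns its replacement
const capitalized = 'i love javascript'.replace(/\b[a-z]/g, (ch) => ch.toUpperCase())
console.log(capitalized) // I Love Javascript
```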
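- []: a set of characters
  - [a-c] means a or b or c
  - [a-z] means any lowercase letter from a to z
  - [A-Z] means any uppercase letter from A to Z
  - [0-3] means 0 or 1 or 2 or 3
  - [0-9] means any digit from 0 to 9
  - [A-Za-z0-9] means any character from a to z, A to Z, or 0 to 9
- \: escapes a special character or introduces a character class
  - \d matches any digit (0-9)
  - \D matches any character that is not a digit
- .: any character except the newline character (\n)
- ^: starts with
  - /^substring/, e.g. /^love/, matches a string that starts with "love"
  - /[^abc]/ means not a, not b, not c
- $: ends with
  - /substring$/, e.g. /love$/, matches a string that ends with "love"
- *: zero or more times
  - /[a]*/ means a is optional or can occur many times
- +: one or more times
  - /[a]+/ means a occurs at least once
- ?: zero or one time
  - /[a]?/ means a occurs zero times or once
- \b: word boundary, matches the beginning or the end of a word
- {3}: exactly 3 repetitions of the preceding item
- {3,}: at least 3 repetitions
- {3,8}: 3 to 8 repetitions
- |: either or
  - /apple|banana/ means either apple or banana
- (): capture and group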
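Let us illustrate the metacharacters above with examples.
Square brackets
Let us use square brackets to match both upper and lower case.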
js
const pattern = '[Aa]pple' // square brackets mean either A or a; match() converts the string to a RegExp
const txt = 'Apple and banana are fruits. An old cliche says an apple a day keeps the doctor way has been replaced by a banana a day keeps the doctor far far away. '
const matches = txt.match(pattern)
console.log(matches)
sh
["Apple", index: 0, input: "Apple and banana are fruits. An old cliche says an apple a day keeps the doctor way has been replaced by a banana a day keeps the doctor far far away.", groups: undefined]
js
const pattern = /[Aa]pple/g // square brackets mean either A or a
const txt = 'Apple and banana are fruits. An old cliche says an apple a day a doctor way has been replaced by a banana a day keeps the doctor far far away. '
const matches = txt.match(pattern)
console.log(matches)
sh
["Apple", "apple"]
To look for banana as well, we can write the pattern as follows:
js
const pattern = /[Aa]pple|[Bb]anana/g // square brackets mean either A or a, and either B or b
const txt = 'Apple and banana are fruits. An old cliche says an apple a day a doctor way has been replaced by a banana a day keeps the doctor far far away. Banana is easy to eat too.'
const matches = txt.match(pattern)
console.log(matches)
sh
["Apple", "banana", "apple", "banana", "Banana"]
Using square brackets and the or operator, we managed to extract Apple, apple, Banana, and banana.
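The same extraction can often be written with the i flag instead of listing both cases in square brackets. The sketch below compares the two on a shortened, made-up sentence and shows where they differ; it is an illustration rather than a recommendation of one form over the other.

```js
// Shortened sample sentence, for illustration only
const fruitText = 'Apple and banana are fruits. Banana is easy to eat too.'

const withBrackets = fruitText.match(/[Aa]pple|[Bb]anana/g)
const withFlag = fruitText.match(/apple|banana/gi)

console.log(withBrackets) // ['Apple', 'banana', 'Banana']
console.log(withFlag) // ['Apple', 'banana', 'Banana']

// The two are not equivalent in general: the i flag also accepts mixed case
const strict = 'APPLE'.match(/[Aa]pple/)
const relaxed = 'APPLE'.match(/apple/i)
console.log(strict) // null
console.log(relaxed[0]) // APPLE
```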
Escape character (\) in RegExp
js
const pattern = /\d/g // \d is a special sequence that means a digit
const txt = 'This regular expression example was made in January 12, 2020.'
const matches = txt.match(pattern)
console.log(matches) // ["1", "2", "2", "0", "2", "0"], this is not what we want
js
const pattern = /\d+/g // \d means a digit, + means one or more times
const txt = 'This regular expression example was made in January 12, 2020.'
const matches = txt.match(pattern)
console.log(matches) // ["12", "2020"], this is what we want
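The backslash works in two directions: it turns an ordinary letter into a character class such as \d or \D, and it turns a metacharacter such as $ or . into a literal character. A brief sketch with made-up inputs (the phone number and the price are hypothetical):

```js
// Hypothetical phone number, used only for illustration
const phone = '+1 (555) 010-9999'

// \D matches any character that is not a digit; remove them all
const digitsOnly = phone.replace(/\D/g, '')
console.log(digitsOnly) // 15550109999

// To match a literal metacharacter, escape it with a backslash
const price = 'Total: $4.50'
const amount = price.match(/\$\d+\.\d+/)[0]
console.log(amount) // $4.50
```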
One or more times (+)
js
const pattern = /\d+/g // \d means a digit, + means one or more times
const txt = 'This regular expression example was made in January 12, 2020.'
const matches = txt.match(pattern)
console.log(matches) // ["12", "2020"], each run of digits is matched as one item
Period (.)
js
const pattern = /[a]./g // square brackets mean a, and . means any character except a newline
const txt = 'Apple and banana are fruits'
const matches = txt.match(pattern)
console.log(matches) // ["an", "an", "an", "a ", "ar"]
js
const pattern = /[a].+/g // . means any character, + means one or more times
const txt = 'Apple and banana are fruits'
const matches = txt.match(pattern)
console.log(matches) // ['and banana are fruits']
Zero or more times (*)
The preceding item may be absent or may occur many times.
js
const pattern = /[a].*/g // . means any character, * means zero or more times
const txt = 'Apple and banana are fruits'
const matches = txt.match(pattern)
console.log(matches) // ['and banana are fruits']
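In the example above, * and + give the same result because there is always at least one character after the first a. The difference shows up when the repeated item can be missing entirely. A minimal sketch with a made-up list of words:

```js
// Hypothetical inputs chosen to separate * from +
const words = ['ac', 'abc', 'abbbc', 'adc']

// b* : the letter b may be absent or repeated any number of times
const star = words.filter((w) => /^ab*c$/.test(w))

// b+ : at least one b is required
const plus = words.filter((w) => /^ab+c$/.test(w))

console.log(star) // ['ac', 'abc', 'abbbc']
console.log(plus) // ['abc', 'abbbc']
```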
Zero or one time (?)
The preceding item may be absent or may occur once.
js
const txt = 'I am not sure if there is a convention how to write the word e-mail.\
Some people write it email others may write it as Email or E-mail.'
const pattern = /[Ee]-?mail/g // ? means the hyphen is optional
const matches = txt.match(pattern)
console.log(matches) // ["e-mail", "email", "Email", "E-mail"]
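Quantifiers in RegExp
We can use curly braces to specify the length of the substring we are looking for in a text. Let us see how to use RegExp quantifiers. Suppose we are interested in substrings that are exactly 4 characters long.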
js
const txt = 'This regular expression example was made in December 6, 2019.'
const pattern = /\b\w{4}\b/g // words of exactly four characters
const matches = txt.match(pattern)
console.log(matches) // ['This', 'made', '2019']
js
const txt = 'This regular expression example was made in December 6, 2019.'
const pattern = /\b[a-zA-Z]{4}\b/g // words of exactly four letters, digits excluded
const matches = txt.match(pattern)
console.log(matches) // ['This', 'made']
js
const txt = 'This regular expression example was made in December 6, 2019.'
const pattern = /\d{4}/g // digits, exactly four of them
const matches = txt.match(pattern)
console.log(matches) // ['2019']
js
const txt = 'This regular expression example was made in December 6, 2019.'
const pattern = /\d{1,4}/g // 1 to 4 digits
const matches = txt.match(pattern)
console.log(matches) // ['6', '2019']
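The open-ended form {n,} and a narrow range such as {2,3} work the same way. The sketch below reuses the sentence from the examples above; the variable names are illustrative.

```js
const sentenceText = 'This regular expression example was made in December 6, 2019.'

// {8,}: words made of eight or more letters
const longWords = sentenceText.match(/\b[a-zA-Z]{8,}\b/g)
console.log(longWords) // ['expression', 'December']

// {2,3}: words made of two or three letters
const shortWords = sentenceText.match(/\b[a-zA-Z]{2,3}\b/g)
console.log(shortWords) // ['was', 'in']
```

The \b on both sides matters here: without it, {2,3} would also match pieces of longer words.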
Caret ^
- Starts with
js
const txt = 'This regular expression example was made in December 6, 2019.'
const pattern = /^This/ // ^ means starts with
const matches = txt.match(pattern)
console.log(matches) // ['This']
- Negation
js
const txt = 'This regular expression example was made in December 6, 2019.'
const pattern = /[^A-Za-z,. ]+/g // ^ inside square brackets means negation: not A to Z, not a to z, no space, no comma, no period
const matches = txt.match(pattern)
console.log(matches) // ["6", "2019"]
Exact match
For an exact match, the pattern should begin with ^ and end with $, so that the whole string has to match.
js
let pattern = /^[A-Z][a-z]{3,12}$/;
let name = 'Asabeneh';
let result = pattern.test(name)
console.log(result) // true
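To see why both anchors matter, the same pattern can be tried with and without them on a few inputs. This is a sketch with made-up candidate strings; `unanchored` and `candidates` are names introduced here for illustration.

```js
const namePattern = /^[A-Z][a-z]{3,12}$/
const unanchored = /[A-Z][a-z]{3,12}/

// Hypothetical inputs: one valid name and three invalid ones
const candidates = ['Asabeneh', 'asabeneh', 'Asabeneh123', 'Abe']

// Each row is [input, anchored result, unanchored result]
const results = candidates.map((c) => [c, namePattern.test(c), unanchored.test(c)])
console.log(results)
```

Only 'Asabeneh123' is treated differently: the unanchored pattern finds a valid name inside it, while the anchored pattern rejects it because of the trailing digits.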