perl
#!/usr/bin/perl -w
use strict;
my $cigar = "18S20M30D20I50H";
my $read_span = 0;
while($cigar =~ s/([0-9]+)([MIDNSHP=X])//){
#$read_span += $1; # this line does not warn
if ($2 =~ /[MD]/) {$read_span += $1;} # this line warns, once for every time the inner match succeeds
}
print "read_span\t$read_span\n";
Running the script produces the following warnings:
Use of uninitialized value $1 in addition (+) at a.pl line 11.
Use of uninitialized value $1 in addition (+) at a.pl line 11.
read_span 0
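Why does a bare "$read_span += $1;" not warn,
while the version wrapped in an if condition, "if ($2 =~ /[MD]/) {$read_span += $1;}", does?
The reason lies in the magic variables $1, $2 and so on.
Every regular expression match produces a fresh set of capture results $1, $2 ... $n.
Even if the pattern contains no capture groups, these variables are set to undef
rather than keeping the $1, $2 of the previous match.
In my example, each iteration of the while loop performs two matches.
The first is: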
perl
$cigar =~ s/([0-9]+)([MIDNSHP=X])//
The second is:
perl
if ($2 =~ /[MD]/) {$read_span += $1;}
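Clearly, the $1 inside the if statement belongs to the match $2 =~ /[MD]/,
and that match has no capture groups, so $1 becomes undef.
Using $1 (now undef) in the addition then naturally triggers the warning.
There is one precondition, though: a fresh set of $1 ... $n is generated only when the match succeeds.
To see this, modify the code once more: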
perl
#!/usr/bin/perl -w
use strict;
my $cigar = "18S20M30D20I50H";
my $read_span = 0;
while($cigar =~ s/([0-9]+)([MIDNSHP=X])//){
#$read_span += $1;
if ($2 =~ /[MD]/) {$read_span += $1;}
print "$1\t$2\n";
}
print "read_span\t$read_span\n";
The output (including the warnings) is:
18 S
Use of uninitialized value $1 in addition (+) at c.pl line 11.
Use of uninitialized value $1 in concatenation (.) or string at c.pl line 12.
Use of uninitialized value $2 in concatenation (.) or string at c.pl line 12.
Use of uninitialized value $1 in addition (+) at c.pl line 11.
Use of uninitialized value $1 in concatenation (.) or string at c.pl line 12.
Use of uninitialized value $2 in concatenation (.) or string at c.pl line 12.
20 I
50 H
read_span 0
As the output shows,
if $2 =~ /[MD]/ fails, $1 and $2 still hold the captures of $cigar =~ s/([0-9]+)([MIDNSHP=X])//.
As soon as $2 =~ /[MD]/ succeeds, both $1 and $2 change.
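The same behaviour can be reproduced in isolation. The following is a minimal sketch, not part of the original script: the variable names $text and @seen and the pattern /[XY]/ are made up for illustration. It records $1 after the capturing match, after a failed second match, and after a successful second match that has no capture groups.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "20M";   # hypothetical single CIGAR element
my @seen;           # what $1 looked like at each step

if ($text =~ /([0-9]+)([MIDNSHP=X])/) {
    # Right after the capturing match, $1 holds the number.
    push @seen, defined $1 ? $1 : 'undef';

    # A failed match leaves the previous captures untouched.
    my $failed = ($2 =~ /[XY]/);
    push @seen, defined $1 ? $1 : 'undef';

    # A successful match without capture groups resets $1 to undef.
    my $matched = ($2 =~ /[MD]/);
    push @seen, defined $1 ? $1 : 'undef';
}

print join(",", @seen), "\n";
```

Checking with defined before using $1 avoids the "uninitialized value" warning, so the script only reports the state of $1 at each step.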
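The corresponding strategy is:
use $1 ... $n with caution, and preferably copy them into other variables immediately after the match.
For example: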
perl
#!/usr/bin/perl -w
use strict;
my $cigar = "18S20M30D20I50H";
my $read_span = 0;
while($cigar =~ s/([0-9]+)([MIDNSHP=X])//){
#$read_span += $1;
my $number = $1;
my $operation = $2;
if ($operation =~ /[MD]/) {$read_span += $number;}
}
print "read_span\t$read_span\n";
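A possible variation on the same idea, offered only as a sketch: iterate with the /g modifier instead of s///, so that $cigar is not consumed, copy the captures right away, and compare the operation with eq so that no second regex runs inside the loop at all. The names $length and $op are my own choices, not from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $cigar     = "18S20M30D20I50H";
my $read_span = 0;

# In scalar context, /g resumes after the previous match on each
# iteration, so the loop walks through the string without modifying it.
while ($cigar =~ /([0-9]+)([MIDNSHP=X])/g) {
    # Copy the captures before anything else can overwrite them.
    my ($length, $op) = ($1, $2);

    # Plain string comparison: no further regex match, so $1 and $2
    # could not be reset here even if they were used again.
    $read_span += $length if $op eq 'M' || $op eq 'D';
}

print "read_span\t$read_span\n";
```

With the example string, only 20M and 30D contribute, so the script should print a read_span of 50, and $cigar still holds the full string afterwards.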