Is there a good way to check for it other than using $_?
I would like to code it like this:
    if ($_ eq '')    # check whether the current line is blank (no characters at all)
    {
        $x = 0;
    }
My test.txt, the file to be parsed:
    constant fixup private GemAlarmFileName = <A "C:\\TMP\\ALARM.LOG">
        vid = 0
        name = ""
        units = ""

    constant fixup private GemConfigAlarms = <U1 0> /* my Comment */
        vid = 1
        name = "CONFIGALARMS"
        units = ""
        min = <U1 0>
        max = <U1 2>
        default = <U1 0>
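My code is below.
It works through each block line by line, which is why I need to reset $x = 0 at a blank line. I am not sure whether this is a proper solution or not.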
    sub ConstantParseAndPrint {
        if (/^$/)    # SOLUTION!
        {
            $x = 0;
        }
        if ($x == 0) {
            if (/^\s*(constant)\s*(fixup|\/\*fixup\*\/|)\s*(private|)\s*(\w+)\s+=\s+<([a-zA-Z0-9]+)\s+(["']?)([a-zA-Z0-9.:\\]+)\6>\s*(\/\*\s*(.*?)\s*\*\/|)(\r|\n|\s)/) {
                $name1 = $1;    # Constant
                $name2 = $2;    # Fixup
                $name3 = $3;    # Private
                $name4 = $4;
                $name5 = $5;
                $name6 = $7;
                $name7 = $8;
                # start print
                if ($name7 ne '') {
                    print DEST_XML_FILE "<!-- $name7-->\n";
                }
                print DEST_XML_FILE " <ECID";
                print DEST_XML_FILE " logicalName=\"$name4\"";
                print DEST_XML_FILE " valueType=\"$name5\"";
                print DEST_XML_FILE " value=\"$name6\"";
                $x = 1;
            }
        }
        elsif ($x == 1) {
            if (/\s*vid\s*=\s*(.*?)(\s|\n|\r)/) {
                $nID = $1;
                print DEST_XML_FILE " vid=\"$nID\"";
                $x = 2;
            }
        }
        elsif ($x == 2) {
            if (/\s*name\s*=\s*(.*?)(\s|\n|\r)/) {
                $nName = $1;
                print DEST_XML_FILE " name=$nName";
                $x = 3;
            }
        }
        elsif ($x == 3) {
            if (/\s*units\s*=\s*(.*?)(\s|\n|\r)/) {
                $nUnits = $1;
                print DEST_XML_FILE " units=$nUnits";
                $x = 4;
            }
        }
        elsif ($x == 4) {
            # \s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9]+)>\
            if (/\s*min\s*=\s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9]+)>(\s|\n|\r)/) {
                #$nMinName1 = $1;
                $nMinName2 = $2;    # Find the nMin Value
                #$nMinName3 = $3;
                #$nMinName4 = $4;
                print DEST_XML_FILE " min=\"$nMinName2\"";
                $x = 5;
            }
            else {
                print DEST_XML_FILE "></ECID>\n";
                $x = 0;    # There is no line 4 and line 5
            }
        }
        elsif ($x == 5) {
            if (/\s*max\s*=\s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9]+)>(\s|\n|\r)/) {
                #$nMaxName1 = $1;
                $nMaxName2 = $2;    # Find the nMax Value
                #$nMaxName3 = $3;
                #$nMaxName4 = $4;
                print DEST_XML_FILE " max=\"$nMaxName2\"";
                $x = 6;
            }
        }
        elsif ($x == 6) {
            if (/\s*default\s*=\s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9]+)>(\s|\n|\r)/) {
                #$nDefault1 = $1;
                $nDefault2 = $2;    # Find the default Value
                #$nDefault3 = $3;
                #$nDefault4 = $4;
                print DEST_XML_FILE " default=\"$nDefault2\">";
                print DEST_XML_FILE "</ECID>\n";
                $x = 0;
            }
        }
    }
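Solution

The problem isn't how to find a blank line. The problem isn't which regex to use. The fundamental problem is understanding how to analyze a problem and turn that analysis into code.
In this case, the problem is "How do I parse this format?"
I have written a parser for you. I have also taken the time to write a detailed description of the process I used to write it.
WARNING: The parser has not been carefully tested for all cases. It does not have enough error handling built in. For those features, you can request a rate card or write them yourself.
Here is the data sample you provided (I'm not sure which of your questions I pulled it from):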
    constant fixup GemEstabCommDelay = <U2 20>
        vid = 6
        name = "ESTABLISHCOMMUNICATIONSTIMEOUT"
        units = "s"
        min = <U2 0>
        max = <U2 1800>
        default = <U2 20>

    constant fixup private GemConstantFileName = <A "C:\\TMP\\CONST.LOG">
        vid = 4
        name = ""
        units = ""

    constant fixup private GemAlarmFileName = <A "C:\\TMP\\ALARM.LOG">
        vid = 0
        name = ""
        units = ""
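Before you can write a parser for a data file, you need a description of the file's structure. If you are using a standard format (say, XML), you can read the existing specification. If you are using a home-grown format, you get to write the description yourself.
So, based on the sample data, we can see that: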
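- The data is broken into blocks.
- Each block starts with the word constant in column 0.
- Each block ends with a blank line.
- A block consists of a start line and zero or more additional lines.
- The start line consists of the keyword constant followed by one or more whitespace-delimited words, an '=' sign, and a <>-quoted data value.
  - The last keyword appears to be the name of the constant. Call it constant_name.
  - The <>-quoted data appears to be a combined type/value specifier.
  - The earlier keywords appear to specify additional metadata about the constant. Let's call those options.
- The additional lines specify additional key/value pairs. Let's call them attributes. An attribute may have a single value, or it may have a type/value specifier.
  - One or more attributes may appear on a single line.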
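OK, so now we have a rough spec. What do we do with it?
How is the format structured? Consider the logical units of organization from largest to smallest. These will determine the structure and flow of our code.
- A FILE is made of BLOCKS.
- BLOCKS are made of LINES.
So our parser should break the file into blocks and then process the blocks.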
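As a side note on the "break the file into blocks" step, Perl can do the splitting itself when the input record separator `$/` is set to the empty string (paragraph mode). The sketch below is only an illustration of that idea, not the approach used in the parser in this answer; the helper name `read_blocks_paragraph_mode` and the shortened sample data are made up for the example. Note that paragraph mode only treats truly empty lines as separators, so a line holding nothing but spaces would not end a block.

```perl
use strict;
use warnings;

# Hypothetical helper: return a list of blocks, each an array ref of lines.
sub read_blocks_paragraph_mode {
    my $fh = shift;

    # Paragraph mode: each read returns everything up to the next run of
    # empty lines. "local" restores the old separator when the sub returns.
    local $/ = '';

    my @blocks;
    while ( my $paragraph = <$fh> ) {
        # Split the paragraph back into lines and drop empty ones.
        my @lines = grep { /\S/ } split /\n/, $paragraph;
        push @blocks, \@lines if @lines;
    }
    return @blocks;
}

# Shortened sample input: two blocks separated by a blank line.
my $sample = <<'END';
constant fixup private GemAlarmFileName = <A "C:\\TMP\\ALARM.LOG">
    vid = 0
    name = ""

constant fixup private GemConfigAlarms = <U1 0>
    vid = 1
END

# Read from an in-memory file handle so the example is self-contained.
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
my @blocks = read_blocks_paragraph_mode($in);
close $in;

printf "%d blocks; first block has %d lines\n",
    scalar @blocks, scalar @{ $blocks[0] };
```

The hand-written loop used later in this answer is more tolerant, because it also accepts whitespace-only lines as block terminators.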
Now we can rough out a parser in comments:
    # Parse a constant spec file.

    # Until file is done:
        # Read in a whole block
        # Parse the block and return key/value pairs for a hash.
        # Store a ref to the hash in a big hash of all blocks, keyed by constant_name.
    # Return ref to big hash with all block data
Now we start to fill in some code:
    # Parse a constant spec file.
    sub parse_constant_spec {
        my $fh = shift;

        my %spec;

        # Until file is done:
            # Read in a whole block
        while( my $block = read_block($fh) ) {

            # Parse the block and return key/value pairs for a hash.
            my %constant = parse_block( $block );

            # Store a ref to the hash in a big hash of all blocks, keyed by constant_name.
            $spec{ $constant{const_name} } = \%constant;

        }

        # Return ref to big hash with all block data
        return \%spec;
    }
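But it won't work. The parse_block and read_block subs have not been written yet. At this stage, that's OK. The point is to rough in the features in small, understandable chunks. Every once in a while, to keep things readable, you need to gloss over the details and drop them into a subroutine; otherwise you wind up with monstrous 1000-line subs that are impossible to debug.
Now we know we need to write a couple of subs to finish up, et voilà: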
    #!/usr/bin/perl
    use strict;
    use warnings;

    use Data::Dumper;

    my $fh = \*DATA;

    print Dumper parse_constant_spec( $fh );

    # Parse a constant spec file.
    # Pass in a handle to process.
    # As long as it acts like a file handle, it will work.
    sub parse_constant_spec {
        my $fh = shift;

        my %spec;

        # Until file is done:
            # Read in a whole block
        while( my $block = read_block($fh) ) {

            # Parse the block and return key/value pairs for a hash.
            my %constant = parse_block( $block );

            # Store a ref to the hash in a big hash of all blocks, keyed by constant_name.
            $spec{ $constant{const_name} } = \%constant;

        }

        # Return ref to big hash with all block data
        return \%spec;
    }

    # Read a constant definition block from a file handle.
    # Void return when there is no data left in the file.
    # Otherwise return an array ref containing the lines in the block.
    sub read_block {
        my $fh = shift;

        my @lines;
        my $block_started = 0;

        while( my $line = <$fh> ) {

            $block_started++ if $line =~ /^constant/;

            if( $block_started ) {

                last if $line =~ /^\s*$/;

                push @lines, $line;
            }
        }

        return \@lines if @lines;

        return;
    }

    sub parse_block {
        my $block = shift;
        my ($start_line, @attribs) = @$block;

        my %constant;

        # Break down first line:
        # First separate assignment from option list.
        my ($start_head, $start_tail) = split /=/, $start_line, 2;

        # work on option list
        my @options = split /\s+/, $start_head;

        # Recover constant_name from options:
        $constant{const_name} = pop @options;
        $constant{options} = \@options;

        # Now we parse the value/type specifier
        @constant{'type', 'value' } = parse_type_value_specifier( $start_tail );

        # Parse attribute lines.
        # Since we may have multiple attributes per line, get them all at once.
        chomp @attribs;
        my $attribs = join ' ', @attribs;

        # We have one long line of mixed key = "value", key = <TYPE VALUE>,
        # or key = bare_value (for example: vid = 6).
        @attribs = $attribs =~ /\s*(\w+\s+=\s+".*?"|\w+\s+=\s+<.*?>|\w+\s+=\s+\S+)\s*/g;

        for my $attrib ( @attribs ) {
            warn "$attrib\n";    # debug trace of each attribute found
            my ($name, $value) = split /\s*=\s*/, $attrib, 2;

            if( $value =~ /^"/ ) {
                # Quoted string: strip the surrounding quotes.
                $value =~ s/^"|"\s*$//g;
            }
            elsif( $value =~ /^</ ) {
                # Type/value specifier: store as [ TYPE, VALUE ].
                $value = [ parse_type_value_specifier( $value ) ];
            }
            # Anything else is a bare value and is stored unchanged.

            $constant{ $name } = $value;
        }

        return %constant;
    }

    sub parse_type_value_specifier {
        my $tvs = shift;

        my ($type, $value) = $tvs =~ /<(\w+)\s+(.*?)>/;

        return $type, $value;
    }

    __DATA__
    constant fixup GemEstabCommDelay = <U2 20>
        vid = 6
        name = "ESTABLISHCOMMUNICATIONSTIMEOUT"
        units = "s"
        min = <U2 0>
        max = <U2 1800>
        default = <U2 20>

    constant fixup private GemConstantFileName = <A "C:\\TMP\\CONST.LOG">
        vid = 4
        name = ""
        units = ""

    constant fixup private GemAlarmFileName = <A "C:\\TMP\\ALARM.LOG">
        vid = 0
        name = ""
        units = ""
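The above code is far from perfect. IMO, parse_block is too long and ought to be broken into smaller subs. Also, there isn't nearly enough validation and enforcement of well-formed input. The variable names and descriptions could be clearer, but I don't really understand the semantics of your data format. Better names would more closely match the semantics of the data format.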
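To give an idea of what a little more validation could look like, here is a minimal sketch of a check for the start line of a block. It is an assumption-laden illustration, not part of the parser above: the sub name `check_start_line`, the error wording, and the exact pattern (which assumes that options and the constant name never contain whitespace or '=') are all invented for this example.

```perl
use strict;
use warnings;

# Hypothetical check: die with a useful message when a block's first line
# does not look like "constant [options...] NAME = <TYPE VALUE>".
sub check_start_line {
    my ($line, $line_number) = @_;

    # Assumed shape: the keyword "constant", at least one more word,
    # an '=' sign, then a <TYPE VALUE> specifier.
    my ($head, $type, $value) =
        $line =~ /^(constant(?:\s+[^\s=]+)+)\s*=\s*<(\w+)\s+(.*?)>/
        or die "Malformed start line at line $line_number: $line\n";

    # Everything before the name is an option; the last word is the name.
    my @options = split ' ', $head;
    my $name    = pop @options;

    return ( $name, $type, $value, @options );
}

my ($name, $type, $value, @options) =
    check_start_line( 'constant fixup GemEstabCommDelay = <U2 20>', 1 );
print "$name is $type $value (options: @options)\n";

# A line that is not a start line is rejected with a message.
my $ok = eval { check_start_line( 'vid = 6', 2 ); 1 };
print "rejected: $@" unless $ok;
```

Dying with the line number lets the caller decide whether to abort or to wrap the call in eval and skip the bad block.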
Despite these issues, it does parse your format and produce a big, handy data structure that can be poured into whatever output format you want.
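As one example of such an output format, the sketch below turns a single parsed constant into an `<ECID>` element like the one the question's code prints. It assumes the hash has the shape described in this answer (const_name, type, value, plain attributes as strings, and type/value attributes as two-element array refs); the hash here is built by hand for illustration, and the sub names `ecid_xml` and `xml_escape` are hypothetical.

```perl
use strict;
use warnings;

# Escape the characters that are special inside an XML attribute value.
sub xml_escape {
    my $text = shift;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: render one parsed constant as an <ECID> element.
sub ecid_xml {
    my $constant = shift;

    my @pairs = (
        logicalName => $constant->{const_name},
        valueType   => $constant->{type},
        value       => $constant->{value},
    );

    # Optional attributes, in the order the question's code prints them.
    for my $key (qw( vid name units min max default )) {
        next unless exists $constant->{$key};
        my $value = $constant->{$key};

        # Type/value attributes are stored as [ TYPE, VALUE ]; keep the value.
        $value = $value->[1] if ref $value eq 'ARRAY';
        push @pairs, $key => $value;
    }

    my $xml = '<ECID';
    while ( my ($attr, $value) = splice @pairs, 0, 2 ) {
        $xml .= sprintf ' %s="%s"', $attr, xml_escape($value);
    }
    return "$xml></ECID>";
}

# Hand-built sample in the assumed shape of one parsed block.
my %constant = (
    const_name => 'GemEstabCommDelay',
    options    => [ 'constant', 'fixup' ],
    type       => 'U2',
    value      => '20',
    vid        => '6',
    name       => 'ESTABLISHCOMMUNICATIONSTIMEOUT',
    units      => 's',
    min        => [ 'U2', '0' ],
    max        => [ 'U2', '1800' ],
    default    => [ 'U2', '20' ],
);

print ecid_xml( \%constant ), "\n";
```

Because the output step only reads the hash, the same parsed data could just as easily be written as JSON, CSV, or anything else without touching the parser.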
If you are using this format in many places, I recommend putting the parsing code into a module. See perldoc perlmod for more info.
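For a rough idea of the shape such a module could take, here is a minimal sketch that uses the core Exporter module. The package name `ConstantSpec::Parser` is hypothetical, and only one small sub is shown; to keep the example runnable as a single file, the package is declared inline rather than in its own .pm file.

```perl
use strict;
use warnings;

# Hypothetical module name. In real use this block would live in its own
# file, ConstantSpec/Parser.pm, somewhere on @INC, and end with "1;".
{
    package ConstantSpec::Parser;

    use Exporter 'import';
    our @EXPORT_OK = qw( parse_type_value_specifier );

    # Split "<TYPE VALUE>" into its two parts.
    sub parse_type_value_specifier {
        my $tvs = shift;
        my ($type, $value) = $tvs =~ /<(\w+)\s+(.*?)>/;
        return ( $type, $value );
    }
}

# With a separate .pm file, the next line would instead be:
#   use ConstantSpec::Parser qw( parse_type_value_specifier );
ConstantSpec::Parser->import('parse_type_value_specifier');

my ($type, $value) = parse_type_value_specifier('<U2 1800>');
print "$type $value\n";
```

Listing subs in `@EXPORT_OK` rather than `@EXPORT` means callers have to ask for each name, which keeps the module from silently adding subs to their namespace.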
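Now, please stop using global variables and ignoring good advice. Please start reading the perldoc, read Learning Perl and Perl Best Practices, use strict, and use warnings. While I am handing out reading lists, go read Global Variables are Bad and then wander around the wiki to read and learn. I learned a great deal about writing software by reading c2.
If you have questions about how this code works, why it is laid out the way it is, or what other choices could have been made, speak up and ask. I am willing to help a willing student.
Your English is good, but it is clear you are not a native speaker. I may have used too many complicated sentences. If you need parts of this written in simpler sentences, I can try to help. I understand that working in a foreign language is very difficult.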