    $ tree
    .
    ├── a
    ├── b
    └── dir
        └── c

    1 directory, 3 files
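That is, two files a and b, together with a directory dir that contains another file, c.
I want to process all the files with awk (GNU Awk 4.1.1, to be exact), so I do this: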
    $ gawk '{print FILENAME; nextfile}' * */*
    a
    b
    awk: cmd. line:1: warning: command line argument `dir' is a directory: skipped
    dir/c
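All is fine, but the * also expands to the directory dir, and awk tries to process it.
So I wonder: is there any native way for awk to check whether a given element is a file and, if it is not, skip it? That is, without using system(), which I can make work like this: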
    $ gawk 'BEGINFILE{print FILENAME; if (system(" [ ! -d " FILENAME " ]")) {print FILENAME, "is a dir, skipping"; nextfile}} ENDFILE{print FILENAME, FNR}' * */*
    a
    a 10
    a.wk
    a.wk 3
    b
    b 10
    dir
    dir is a dir, skipping
    dir/c
    dir/c 10
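Note also that if (system(" [ ! -d " FILENAME " ]")) {print FILENAME, "is a dir, skipping"; nextfile} works counter-intuitively: one would expect it to return 1 when the test is true, but it returns the exit code of the command.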
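To make the exit-code behaviour concrete, here is a minimal sketch. The helper name dir_test_status is made up for this illustration, and the quoting assumes the path contains no single quote. It prints what system() hands back to awk for the shell test [ -d PATH ]:

```shell
# dir_test_status: hypothetical helper, not part of the original question.
# Prints the value awk's system() returns for the shell test `[ -d PATH ]`.
dir_test_status() {
    awk -v path="$1" 'BEGIN {
        # \047 is a single quote, used to quote the path for the shell.
        # Assumption: the path itself contains no single quote.
        print system("[ -d \047" path "\047 ]")
    }'
}

dir_test_status /          # prints 0: the test succeeded, / is a directory
dir_test_status /dev/null  # prints 1: the test failed, not a directory
```

Because awk treats 0 as false, if (system("[ -d ... ]")) is true only when the path is not a directory, which is why the command above needs the negated test [ ! -d ... ].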
I read in A.5 Extensions in gawk Not in POSIX awk:
- Directories on the command line produce a warning and are skipped (see Directories on the Command Line)
4.11 Directories on the Command Line
According to the POSIX standard, files named on the awk command line
must be text files; it is a fatal error if they are not. Most versions
of awk treat a directory on the command line as a fatal error.

By default, gawk produces a warning for a directory on the command
line, but otherwise ignores it. This makes it easier to use shell
wildcards with your awk program.

If either of the --posix or --traditional options is given, then gawk
reverts to treating a directory on the command line as a fatal error.

See Reading Directories, for a way to treat directories as usable
data from an awk program.
And indeed this is the case: the same command as before, run with --posix, fails:
    $ gawk --posix 'BEGINFILE{print FILENAME; if (system(" [ ! -d " FILENAME " ]")) {print FILENAME, NR}}' * */*
    gawk: cmd. line:1: fatal: cannot open file `dir' for reading (Is a directory)
I checked the section 16.7.6 Reading Directories linked above, where they talk about readdir:
The readdir extension adds an input parser for directories. The usage
is as follows:

    @load "readdir"
But I am not sure how to call it, or how to use it from the command line.
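For reference, a minimal sketch of one way to load the extension from the command line, assuming a gawk build (4.1 or later) that ships the readdir extension. The function name list_dir is made up for this illustration. With readdir loaded, a directory named on the command line is read as data, one record per entry with slash-separated fields, so this reads the directory rather than skipping it:

```shell
# list_dir: hypothetical helper, not from the gawk manual.
# Assumes gawk 4.1+ with the readdir extension installed.
list_dir() {
    # -l readdir has the same effect as @load "readdir" in the program text.
    # Each record describes one directory entry; with FS set to "/",
    # the second field is the entry name.
    gawk -l readdir -F/ '{ print $2 }' "$1"
}

list_dir dir   # prints the entry names in dir, including . and ..
```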
Solution
    $ ls -F tmp
    bar  dir/  foo

    $ cat tmp/foo
    line 1

    $ cat tmp/bar
    line 1
    line 2

    $ cat tmp/dir
    cat: tmp/dir: Is a directory

    $ cat tst.awk
    BEGIN {
        for (i=1; i<ARGC; i++) {
            if ( (getline line < ARGV[i]) <= 0 ) {
                print "Skipping:", ARGV[i], ERRNO
                delete ARGV[i]
            }
            close(ARGV[i])
        }
    }
    { print FILENAME, $0 }

    $ awk -f tst.awk tmp/*
    Skipping: tmp/dir Is a directory
    tmp/bar line 1
    tmp/bar line 2
    tmp/foo line 1

    $ awk --posix -f tst.awk tmp/*
    Skipping: tmp/dir
    tmp/bar line 1
    tmp/bar line 2
    tmp/foo line 1
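Per POSIX, getline returns -1 if/when it fails in an attempt to retrieve a record from a file (e.g. the file is unreadable, does not exist, or is a directory). You only need GNU awk if you care which of those failures it was, which it tells you through the value of ERRNO.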
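The getline status can also be inspected directly. The following is a sketch only: classify_args and probe are names invented for this illustration, and gawk is assumed, as in the rest of this thread. It prints each argument next to the result of a one-record read: 1 when a record was read, 0 for an empty file, -1 when the file cannot be read. Note that the `<= 0` test in tst.awk therefore skips empty files as well as directories.

```shell
# classify_args: hypothetical helper, not part of the answer above.
# Prints each argument followed by the status of a one-record getline probe:
#   1 = a record was read, 0 = empty file, -1 = cannot be read.
classify_args() {
    gawk '
    # Try to read one record from path and return the getline status.
    function probe(path,    line, rc) {
        rc = (getline line < path)
        close(path)   # harmless if the open failed
        return rc
    }
    BEGIN {
        for (i = 1; i < ARGC; i++)
            print ARGV[i], probe(ARGV[i])
    }' "$@"
}

classify_args tmp/*
```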