How do I skip directories in awk?

Suppose I have the following file and directory structure:

$ tree
.
├── a
├── b
└── dir
    └── c

1 directory, 3 files

That is, two files a and b sit next to a directory dir, which contains another file c.

I want to process all of the files with awk (GNU Awk 4.1.1, to be exact), so I run:

$ gawk '{print FILENAME; nextfile}' * */*
a
b
awk: cmd. line:1: warning: command line argument `dir' is a directory: skipped
dir/c

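This is all fine, but * also expands to the directory dir, and awk tries to process it.

So I wonder: is there any native way for awk to check whether a given argument is a file and, if it is not, skip it? That is, without using system().

I got it working by calling an external command with system() in BEGINFILE:

$ gawk 'BEGINFILE{print FILENAME; if (system(" [ ! -d " FILENAME " ]")) {print FILENAME,"is a dir,skipping"; nextfile}} ENDFILE{print FILENAME,FNR}' * */*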
a
a 10
a.wk
a.wk 3
b
b 10
dir
dir is a dir,skipping
dir/c
dir/c 10

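Also note that if (system(" [ ! -d " FILENAME " ]")) {print FILENAME,"is a dir,skipping"; nextfile} works counter-intuitively: one would expect system() to return 1 when the test is true, but it returns the command's exit status, which is 0 on success.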
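
To make the exit-status convention explicit, here is a minimal sketch that compares the result of system() against 0 instead of relying on its truthiness. The two paths are only illustrative choices: / is a directory on any POSIX system, and /dev/null is not.

```shell
#!/bin/sh
# system() hands back the exit status of the command: 0 means the test succeeded.
out=$(awk 'BEGIN {
    # [ -d PATH ] exits with 0 when PATH is a directory, non-zero otherwise.
    if (system("[ -d / ]") == 0)         print "/ is a directory"
    if (system("[ -d /dev/null ]") != 0) print "/dev/null is not a directory"
}')
printf '%s\n' "$out"
```

Writing the comparison out this way avoids the double negation in the one-liner above, where a true "[ ! -d ... ]" yields 0 and a false one yields a non-zero value that awk then treats as true.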

In A.5 Extensions in gawk Not in POSIX awk I read:

  • Directories on the command line produce a warning and are skipped (see Directories on the Command Line)

The linked page then says:

4.11 Directories on the Command Line

According to the POSIX standard, files named on the awk command line
must be text files; it is a fatal error if they are not. Most versions
of awk treat a directory on the command line as a fatal error.

By default, gawk produces a warning for a directory on the command
line, but otherwise ignores it. This makes it easier to use shell
wildcards with your awk program:

06003

If either of the --posix or --traditional options is given, then gawk
reverts to treating a directory on the command line as a fatal error.

See Reading Directories, for a way to treat directories as usable
data from an awk program.

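And this is indeed the case: the same command as before fails with --posix:

$ gawk --posix 'BEGINFILE{print FILENAME; if (system(" [ ! -d " FILENAME " ]")) {print FILENAME,"is a dir,skipping"; nextfile}} ENDFILE{print FILENAME,NR}' * */*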
gawk: cmd. line:1: fatal: cannot open file `dir' for reading (Is a directory)

I checked the section 16.7.6 Reading Directories linked above, which discusses readdir:

The readdir extension adds an input parser for directories. The usage
is as follows:

@load "readdir"

But I am not sure how to call it or how to use it from the command line.

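Solution

If you want to protect your script against someone mistakenly passing it a directory (or anything else that is not a readable text file), you can do this:

$ ls -F tmp
bar  dir/  foo

$ cat tmp/foo
line 1

$ cat tmp/bar
line 1
line 2

$ cat tmp/dir
cat: tmp/dir: Is a directory

$ cat tst.awk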
BEGIN {
    for (i=1;i<ARGC;i++) {
        if ( (getline line < ARGV[i]) <= 0 ) {
            print "Skipping:",ARGV[i],ERRNO
            delete ARGV[i]
        }
        close(ARGV[i])
    }
}
{ print FILENAME,$0 }

$ awk -f tst.awk tmp/*
Skipping: tmp/dir Is a directory
tmp/bar line 1
tmp/bar line 2
tmp/foo line 1

$ awk --posix -f tst.awk tmp/*
Skipping: tmp/dir
tmp/bar line 1
tmp/bar line 2
tmp/foo line 1

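Per POSIX, getline returns -1 if/when it fails while trying to retrieve a record from a file (e.g. the file is unreadable, does not exist, or is a directory). You only need GNU awk if you care which of those failures occurred, since it reports the reason in ERRNO.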
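
The same ARGV-pruning idea can also be combined with the shell test from the question. The following is only a sketch under stated assumptions: the helper is_dir() and the scratch files are hypothetical names made up for illustration, and the quoting assumes that no file name contains a single quote.

```shell
#!/bin/sh
# Sketch: prune directories from ARGV in BEGIN using the shell's "test -d".
# is_dir() and the scratch files below are illustrative, not from the original post.
tmp=$(mktemp -d) || exit 1
mkdir "$tmp/dir"
printf 'line 1\n' > "$tmp/foo"
printf 'line 1\nline 2\n' > "$tmp/bar"

out=$(cd "$tmp" && awk '
function is_dir(path) {
    # Wrap the path in single quotes for the shell.
    # Assumption: the name itself contains no single quote.
    return system("[ -d '\''" path "'\'' ]") == 0
}
BEGIN {
    for (i = 1; i < ARGC; i++)
        if (is_dir(ARGV[i])) {
            print "Skipping: " ARGV[i]
            delete ARGV[i]      # awk ignores deleted ARGV elements
        }
}
{ print FILENAME, $0 }' bar dir foo)

printf '%s\n' "$out"
rm -rf "$tmp"
```

Because the check runs in BEGIN rather than BEGINFILE, this variant does not depend on gawk-only patterns, at the cost of starting one shell per argument.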
