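A while ago I had to process a file of about 250,000 lines. I did it in shell, and it was unbearably slow. Even after I optimized away the constructs that start a new program through a pipe, such as echo 'aaa' | grep xxx, it still took 7 minutes.
So I picked up Perl again, which I had last used two years ago, and rewrote the script. It ran in 0.6 seconds!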
real 0m0.647s
user 0m0.560s
sys 0m0.032s
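Being a C++ enthusiast, I rewrote it once more in C++, only to find that it still took 0.5 to 0.6 seconds. Why was C++ hardly any faster than Perl?
I changed the compiler options and added -O3, but the speed barely changed: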
real 0m0.560s
user 0m0.236s
sys 0m0.276s
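I figured there had to be something wrong with my code.
Seeing that sys was 0.27s, far more than Perl's 0.03s, I guessed that the I/O part was badly written.
Then I noticed that my program had statements such as cout << "xxx" << endl; and suspected that this was the problem.
So I changed them to cout << "xxx\n";
The speed improved immediately: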
real 0m0.205s
user 0m0.160s
sys 0m0.024s
That's all. The Perl and C++ versions follow.
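The Perl version:

use strict;

my $PREV_TIME = "";
my $SUM = 0;
my $INIT = 1;
my ($line, $TIME, $TPS, $MSG);

while (<>) {
    # Remove the line break
    chomp;
    $line = $_;
    # Skip blank line
    next if $line eq "";
    # The time field is the first 8 characters; the TPS text starts at offset 10
    $TIME = substr($line, 0, 8);
    $TPS  = substr($line, 10, 99);
    # Handle TPS Log is Enabled/Disabled
    $MSG = substr($TPS, 0, 10);
    if ($MSG eq "TPS Log is") {
        print "$line\n";
        next;
    }
    # Keep only the number in front of the trailing " TPS"
    $TPS = substr($TPS, 0, rindex($TPS, "TPS") - 1);
    if ($INIT == 1) {
        $PREV_TIME = $TIME;
        $INIT = 0;
    }
    if ($PREV_TIME eq $TIME) {
        # Same time as the previous line: keep accumulating
        $SUM = $SUM + $TPS;
    } else {
        # Time changed: print the finished total and start a new one
        print "$PREV_TIME $SUM TPS\n";
        $SUM = $TPS;
    }
    $PREV_TIME = $TIME;
}
# Print the total for the last time value
print "$PREV_TIME $SUM TPS\n";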
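The C++ version:

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main() {
    string PREV_TIME;
    int SUM = 0;
    int INIT = 1;
    string line, TIME, TPS, MSG;
    int tps_number = 0;

    ifstream ifs("/var/tmp/sorted.tmp");
    // Test the stream itself; checking only eof() never ends if the open fails
    while (getline(ifs, line)) {
        // Skip blank lines and lines too short for substr(10, ...), which would throw
        if (line.size() < 10) continue;
        // The time field is the first 8 characters; the TPS text starts at offset 10
        TIME = line.substr(0, 8);
        TPS = line.substr(10, 99);
        // Handle TPS Log is Enabled/Disabled
        MSG = TPS.substr(0, 10);
        if (MSG == "TPS Log is") {
            cout << line << '\n';  // '\n' instead of endl, to avoid a flush per line
            continue;
        }
        // Keep only the number in front of the trailing " TPS"
        string::size_type index = TPS.rfind("TPS");
        tps_number = atoi(TPS.substr(0, index - 1).c_str());
        if (INIT == 1) {
            PREV_TIME = TIME;
            INIT = 0;
        }
        if (PREV_TIME == TIME) {
            // Same time as the previous line: keep accumulating
            SUM = SUM + tps_number;
        } else {
            // Time changed: print the finished total and start a new one
            cout << PREV_TIME << " " << SUM << " " << "TPS\n";
            SUM = tps_number;
        }
        PREV_TIME = TIME;
    }
    // Print the total for the last time value
    cout << PREV_TIME << " " << SUM << " " << "TPS\n";
    return 0;
}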