I mounted a file system with:
sudo mount -o 'rw,bg,hard,nointr,rsize=1048576,wsize=1048576,vers=4' toto:/test /test
and then ran:
dd if=/test/file of=/dev/null bs=1024k
I can read at 200-400 MB/s, but when I change the version to vers=3, remount, and rerun dd, I only get 90 MB/s. The file I am reading is held in memory on the NFS server. Both ends of the connection are Solaris with 10GbE NICs. I avoid any client-side caching by remounting between all tests. I used dtrace on the server to measure how fast data is being served over NFS. For both v3 and v4 I changed:
nfs4_bsize nfs3_bsize
from the default 32K to 1M (on v4 I maxed out at 150 MB/s with 32K).
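For reference, these are kernel variables in the nfs module; a sketch of setting them, assuming the standard Solaris /etc/system and mdb -kw mechanisms (values in bytes):

# persistent across reboots, in /etc/system:
set nfs:nfs3_bsize=1048576
set nfs:nfs4_bsize=1048576

# or patched live on a running kernel (the 0t prefix means decimal):
echo "nfs3_bsize/W 0t1048576" | mdb -kw
echo "nfs4_bsize/W 0t1048576" | mdb -kw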
I have tried tuning:
> nfs3_max_threads
> clnt_max_conns
> nfs3_async_clusters
to improve the v3 performance, but no go.
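For completeness, a sketch of how those would be raised via /etc/system (module prefixes are per the Solaris Tunable Parameters Reference Manual; the values shown are illustrative, not recommendations):

set nfs:nfs3_max_threads=32
set rpcmod:clnt_max_conns=8
set nfs:nfs3_async_clusters=4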
On v3, if I run four parallel dd's, the throughput goes down from 90 MB/s to 70-80 MB/s, which leads me to believe the problem is some shared resource. If so, then I'm wondering what that resource is and whether I can increase it.
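The parallel run was along these lines (a sketch; file1..file4 are assumed to be separate copies of the test file so the readers don't share one file):

#!/bin/sh
# start four concurrent sequential readers, then wait for all of them
for i in 1 2 3 4; do
    dd if=/test/file$i of=/dev/null bs=1024k &
done
wait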
#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option defaultargs

inline string ADDR = $$1;

dtrace:::BEGIN
{
    TITLE = 10;
    title = 0;
    printf("starting up ...\n");
    self->start = 0;
}

/* timestamp the first packet seen on each connection */
tcp:::send, tcp:::receive
/ self->start == 0 /
{
    walltime[args[1]->cs_cid] = timestamp;
    self->start = 1;
}

/* re-print the column header every TITLE rows */
tcp:::send, tcp:::receive
/ title == 0 && (ADDR == NULL || args[3]->tcps_raddr == ADDR) /
{
    printf("%4s %15s %6s %6s %6s %8s %8s %8s %8s %8s %8s %8s %8s %8s %8s\n",
        "cid", "ip", "usend", "urecd", "delta", "send", "recd", "ssz",
        "sscal", "rsz", "rscal", "congw", "conthr", "flags", "retran");
    title = TITLE;
}

tcp:::send
/ ADDR == NULL || args[3]->tcps_raddr == ADDR /
{
    nfs[args[1]->cs_cid] = 1;   /* this is an NFS thread */
    this->delta = timestamp - walltime[args[1]->cs_cid];
    walltime[args[1]->cs_cid] = timestamp;
    this->flags = "";
    this->flags = strjoin(((args[4]->tcp_flags & TH_FIN)  ? "FIN|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_SYN)  ? "SYN|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_RST)  ? "RST|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_PUSH) ? "PUSH|" : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_ACK)  ? "ACK|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_URG)  ? "URG|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_ECE)  ? "ECE|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_CWR)  ? "CWR|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags == 0)      ? "null " : ""), this->flags);
    printf("%5d %14s %6d %6d %6d %8d \\ %-8s %8d %6d %8d %8d %8d %12d %s %d \n",
        args[1]->cs_cid % 1000,
        args[3]->tcps_raddr,
        args[3]->tcps_snxt - args[3]->tcps_suna,
        args[3]->tcps_rnxt - args[3]->tcps_rack,
        this->delta / 1000,
        args[2]->ip_plength - args[4]->tcp_offset,
        "",
        args[3]->tcps_swnd,
        args[3]->tcps_snd_ws,
        args[3]->tcps_rwnd,
        args[3]->tcps_rcv_ws,
        args[3]->tcps_cwnd,
        args[3]->tcps_cwnd_ssthresh,
        this->flags,
        args[3]->tcps_retransmit);
    this->flags = 0;
    title--;
    this->delta = 0;
}

tcp:::receive
/ nfs[args[1]->cs_cid] && (ADDR == NULL || args[3]->tcps_raddr == ADDR) /
{
    this->delta = timestamp - walltime[args[1]->cs_cid];
    walltime[args[1]->cs_cid] = timestamp;
    this->flags = "";
    this->flags = strjoin(((args[4]->tcp_flags & TH_FIN)  ? "FIN|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_SYN)  ? "SYN|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_RST)  ? "RST|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_PUSH) ? "PUSH|" : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_ACK)  ? "ACK|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_URG)  ? "URG|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_ECE)  ? "ECE|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags & TH_CWR)  ? "CWR|"  : ""), this->flags);
    this->flags = strjoin(((args[4]->tcp_flags == 0)      ? "null " : ""), this->flags);
    printf("%5d %14s %6d %6d %6d %8s / %-8d %8d %6d %8d %8d %8d %12d %s %d \n",
        args[1]->cs_cid % 1000,
        args[3]->tcps_raddr,
        args[3]->tcps_snxt - args[3]->tcps_suna,
        args[3]->tcps_rnxt - args[3]->tcps_rack,
        this->delta / 1000,
        "",
        args[2]->ip_plength - args[4]->tcp_offset,
        args[3]->tcps_swnd,
        args[3]->tcps_snd_ws,
        args[3]->tcps_rwnd,
        args[3]->tcps_rcv_ws,
        args[3]->tcps_cwnd,
        args[3]->tcps_cwnd_ssthresh,
        this->flags,
        args[3]->tcps_retransmit);
    this->flags = 0;
    title--;
    this->delta = 0;
}
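The script takes an optional peer IP address as its first argument (defaultargs lets it be omitted to watch every connection). Assuming it's saved as tcp.d (the filename is my choice here, not part of the original):

chmod +x tcp.d
./tcp.d 192.168.100.186     # only trace traffic to/from this peer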
The output looks like this (not from this particular case):
 cid              ip  usend  urecd  delta     send     recd      ssz  sscal      rsz  rscal    congw   conthr     flags retran
 320 192.168.100.186    240      0    272      240 \           49232      0  1049800      5  1049800     2896 ACK|PUSH| 0
 320 192.168.100.186    240      0    196          / 68        49232      0  1049800      5  1049800     2896 ACK|PUSH| 0
 320 192.168.100.186      0      0  27445        0 \           49232      0  1049800      5  1049800     2896 ACK|      0
  24 192.168.100.177      0      0 255562          / 52        64060      0    64240      0    91980     2920 ACK|PUSH| 0
  24 192.168.100.177     52      0    301       52 \           64060      0    64240      0    91980     2920 ACK|PUSH| 0
Some notes on the headers:
usend - unacknowledged send bytes
urecd - unacknowledged received bytes
ssz   - send window
rsz   - receive window
congw - congestion window
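One back-of-the-envelope check these columns permit: TCP can never move more than one window of data per round trip, so throughput is capped at roughly min(window, congw) / RTT. As a sketch (the 0.7 ms RTT below is an assumed figure, not something measured here), a 64240-byte window like the one in the sample output tops out right around 90 MB/s:

# window / RTT = ceiling on throughput, in bytes per second
echo "64240 / 0.0007" | bc -l    # ~91.8e6 B/s, i.e. about 90 MB/s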
I plan to take snoop captures of the dd's over v3 and v4 and compare them. I've done this once already, but there was too much traffic and I used a disk file instead of a cached file, which made comparing the timings meaningless. I'll run more snoops with cached data and no other traffic between the boxes. TBD
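The captures would be collected with something like this (the nxge0 interface name is an assumption; -d selects the device and -o writes the raw capture to a file):

# one capture per protocol version, taken during the dd runs
snoop -d nxge0 -o /tmp/nfs3.snoop host toto
snoop -d nxge0 -o /tmp/nfs4.snoop host toto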
Also, the network folks say there is no traffic shaping or bandwidth limiting on the connections.
Solution
NFSv4 brings in client-side caching and, while not relevant in this case, parallel NFS (pNFS). The major change is that the protocol is now stateful.
http://www.netapp.com/us/communities/tech-ontap/nfsv4-0408.html
Judging from the performance documentation, I believe this is the recommended protocol when using NetApps. The technique is similar to Windows Vista opportunistic locking.
NFSv4 differs from previous versions of NFS by allowing a server to
delegate specific actions on a file to a client to enable more
aggressive client caching of data and to allow caching of the locking
state. A server cedes control of file updates and the locking state to
a client via a delegation. This reduces latency by allowing the client
to perform various operations and cache data locally. Two types of
delegations currently exist: read and write. The server has the
ability to call back a delegation from a client should there be
contention for a file. Once a client holds a delegation, it can
perform operations on files whose data has been cached locally to
avoid network latency and optimize I/O. The more aggressive caching
that results from delegations can be a big help in environments with
the following characteristics:
- Frequent opens and closes
- Frequent GETATTRs
- File locking
- Read-only sharing
- High latency
- Fast clients
- Heavily loaded server with many clients
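A hedged way to confirm that delegations are actually being granted in a setup like this: on a Solaris NFSv4 server, nfsstat -s reports per-operation counts, so watching the delegation-related counters (delegreturn, delegpurge) before and after a test run shows whether the server is handing them out:

# server-side NFS statistics; compare the v4 delegation counters
# across a test run
nfsstat -s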