solaris – NFS v3 vs. v4

I was wondering why NFS v4 would be so much faster than NFS v3, and whether there are any parameters on v3 that could be tweaked.

I mount a file system with:

  sudo mount -o 'rw,bg,hard,nointr,rsize=1048576,wsize=1048576,vers=4' toto:/test /test

and then run:

  dd if=/test/file of=/dev/null bs=1024k

I can read 200-400 MB/s, but when I change the version to vers=3, remount, and rerun the dd, I only get 90 MB/s. The file I'm reading is a file cached in memory on the NFS server. Both sides of the connection are Solaris and have 10GbE NICs. I avoid any client-side caching by remounting between all tests. I used dtrace on the server to measure how fast data was being served over NFS. For both v3 and v4 I changed:
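For reference, the whole comparison boils down to a loop like the following sketch (same host toto and mount options as above; the umount between runs is what defeats client-side caching):

  # v4 run
  sudo mount -o 'rw,bg,hard,nointr,rsize=1048576,wsize=1048576,vers=4' toto:/test /test
  dd if=/test/file of=/dev/null bs=1024k
  sudo umount /test

  # v3 run: identical apart from the vers option
  sudo mount -o 'rw,bg,hard,nointr,rsize=1048576,wsize=1048576,vers=3' toto:/test /test
  dd if=/test/file of=/dev/null bs=1024k
  sudo umount /test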

  nfs4_bsize
  nfs3_bsize

from the default 32K up to 1M (on v4 I maxed out at 150 MB/s with 32K).
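For anyone reproducing this: on Solaris these are kernel variables, so changing them means either a live patch with mdb (lost at reboot) or an /etc/system entry. A sketch, assuming the nfs module is loaded (0t prefix = decimal, 0x100000 = 1M):

  # live change via the kernel debugger
  echo 'nfs3_bsize/W 0t1048576' | sudo mdb -kw
  echo 'nfs4_bsize/W 0t1048576' | sudo mdb -kw

  # or persistently, then reboot
  echo 'set nfs:nfs3_bsize=0x100000' | sudo tee -a /etc/system
  echo 'set nfs:nfs4_bsize=0x100000' | sudo tee -a /etc/system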
I have tried tweaking

> nfs3_max_threads
> clnt_max_conns
> nfs3_async_clusters

to improve v3 performance, but no go.
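Those three live in the nfs and rpcmod kernel modules, so the tweaks would look something like the sketch below (the values are illustrative, not recommendations):

  echo 'set nfs:nfs3_max_threads=32' | sudo tee -a /etc/system
  echo 'set rpcmod:clnt_max_conns=8' | sudo tee -a /etc/system
  echo 'set nfs:nfs3_async_clusters=8' | sudo tee -a /etc/system
  # reboot for /etc/system changes to take effect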

On v3, if I run four parallel dd's, the throughput goes down from 90 MB/s to 70-80 MB/s, which leads me to believe the problem is some shared resource; if so, then I'm wondering what it is and whether that resource can be increased.
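The parallel run was essentially the following (four readers against the same remounted file system):

  for i in 1 2 3 4; do
      dd if=/test/file of=/dev/null bs=1024k &
  done
  wait    # aggregate throughput drops to 70-80 MB/s on v3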

The dtrace code used to look at the window sizes:

  #!/usr/sbin/dtrace -s
  #pragma D option quiet
  #pragma D option defaultargs

  inline string ADDR=$$1;

  dtrace:::BEGIN
  {
      TITLE = 10;
      title = 0;
      printf("starting up ...\n");
      self->start = 0;
  }

  tcp:::send, tcp:::receive
  / self->start == 0 /
  {
      walltime[args[1]->cs_cid] = timestamp;
      self->start = 1;
  }

  /* reprint the column header every TITLE rows */
  tcp:::send, tcp:::receive
  / title == 0 &&
    ( ADDR == NULL || args[3]->tcps_raddr == ADDR ) /
  {
      printf("%4s %15s %6s %6s %6s %8s %8s %8s %8s %8s %8s %8s %8s %8s %8s\n",
          "cid","ip","usend","urecd","delta","send","recd","ssz","sscal","rsz","rscal","congw","conthr","flags","retran");
      title = TITLE;
  }

  tcp:::send
  / ( ADDR == NULL || args[3]->tcps_raddr == ADDR ) /
  {
      nfs[args[1]->cs_cid] = 1;  /* this is an NFS thread */
      this->delta = timestamp - walltime[args[1]->cs_cid];
      walltime[args[1]->cs_cid] = timestamp;
      this->flags = "";
      this->flags = strjoin((( args[4]->tcp_flags & TH_FIN ) ? "FIN|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_SYN ) ? "SYN|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_RST ) ? "RST|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_PUSH ) ? "PUSH|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_ACK ) ? "ACK|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_URG ) ? "URG|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_ECE ) ? "ECE|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_CWR ) ? "CWR|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags == 0 ) ? "null " : ""), this->flags);
      printf("%5d %14s %6d %6d %6d %8d \ %-8s %8d %6d %8d %8d %8d %12d %s %d \n",
          args[1]->cs_cid%1000, args[3]->tcps_raddr,
          args[3]->tcps_snxt - args[3]->tcps_suna,
          args[3]->tcps_rnxt - args[3]->tcps_rack,
          this->delta/1000,
          args[2]->ip_plength - args[4]->tcp_offset, "",
          args[3]->tcps_swnd, args[3]->tcps_snd_ws,
          args[3]->tcps_rwnd, args[3]->tcps_rcv_ws,
          args[3]->tcps_cwnd, args[3]->tcps_cwnd_ssthresh,
          this->flags, args[3]->tcps_retransmit);
      this->flags = 0;
      title--;
      this->delta = 0;
  }

  /* mirror of the send probe; the received byte count is printed after the "/" */
  tcp:::receive
  / nfs[args[1]->cs_cid] && ( ADDR == NULL || args[3]->tcps_raddr == ADDR ) /
  {
      this->delta = timestamp - walltime[args[1]->cs_cid];
      walltime[args[1]->cs_cid] = timestamp;
      this->flags = "";
      this->flags = strjoin((( args[4]->tcp_flags & TH_FIN ) ? "FIN|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_SYN ) ? "SYN|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_RST ) ? "RST|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_PUSH ) ? "PUSH|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_ACK ) ? "ACK|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_URG ) ? "URG|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_ECE ) ? "ECE|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags & TH_CWR ) ? "CWR|" : ""), this->flags);
      this->flags = strjoin((( args[4]->tcp_flags == 0 ) ? "null " : ""), this->flags);
      printf("%5d %14s %6d %6d %6d %8s / %-8d %8d %6d %8d %8d %8d %12d %s %d \n",
          args[1]->cs_cid%1000, args[3]->tcps_raddr,
          args[3]->tcps_snxt - args[3]->tcps_suna,
          args[3]->tcps_rnxt - args[3]->tcps_rack,
          this->delta/1000,
          "", args[2]->ip_plength - args[4]->tcp_offset,
          args[3]->tcps_swnd, args[3]->tcps_snd_ws,
          args[3]->tcps_rwnd, args[3]->tcps_rcv_ws,
          args[3]->tcps_cwnd, args[3]->tcps_cwnd_ssthresh,
          this->flags, args[3]->tcps_retransmit);
      this->flags = 0;
      title--;
      this->delta = 0;
  }
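Assuming the script is saved under a name such as tcpwin.d (a hypothetical name), it takes an optional peer address as its only argument:

  chmod +x tcpwin.d
  sudo ./tcpwin.d                   # all TCP peers
  sudo ./tcpwin.d 192.168.100.186   # only traffic to/from the NFS server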

The output looks like this (not from this particular case):

  cid ip usend urecd delta send recd ssz sscal rsz rscal congw conthr flags retran
  320 192.168.100.186 240 0 272 240 \ 49232 0 1049800 5 1049800 2896 ACK|PUSH| 0
  320 192.168.100.186 240 0 196 / 68 49232 0 1049800 5 1049800 2896 ACK|PUSH| 0
  320 192.168.100.186 0 0 27445 0 \ 49232 0 1049800 5 1049800 2896 ACK| 0
  24 192.168.100.177 0 0 255562 / 52 64060 0 64240 0 91980 2920 ACK|PUSH| 0
  24 192.168.100.177 52 0 301 52 \ 64060 0 64240 0 91980 2920 ACK|PUSH| 0

Some of the headers:

  usend - unacknowledged send bytes
  urecd - unacknowledged received bytes
  ssz - send window
  rsz - receive window
  congw - congestion window

The plan is to take snoop captures of the dd over v3 and v4 and compare them. I have already done that, but there was too much traffic and I used a disk file rather than a cached file, which made comparing the timings meaningless. Other snoop runs with cached data and no other traffic between the boxes are still to be done. TBD
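The planned capture would be something along these lines (the interface name ixgbe0 is an assumption for a 10GbE NIC; snoop's -o writes a capture file for offline comparison):

  # on the client, capture only NFS-server traffic while the dd runs
  sudo snoop -o /tmp/nfs_v4.cap -d ixgbe0 host toto &
  SNOOP_PID=$!
  dd if=/test/file of=/dev/null bs=1024k
  sudo kill $SNOOP_PID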

Also, the network folks say there is no traffic shaping or bandwidth limiters on the connections.

Solution

NFS 4.1 (minor version 1) is designed to be a faster and more efficient protocol, and is recommended over previous versions, especially 4.0.

This includes client-side caching and, although not relevant in this scenario, parallel NFS (pNFS). The major change is that the protocol is now stateful.

http://www.netapp.com/us/communities/tech-ontap/nfsv4-0408.html

I think it is the recommended protocol when using NetApps, judging by their performance documentation. The technology is similar to Windows Vista-style opportunistic locking.

NFSv4 differs from previous versions of NFS by allowing a server to
delegate specific actions on a file to a client to enable more
aggressive client caching of data and to allow caching of the locking
state. A server cedes control of file updates and the locking state to
a client via a delegation. This reduces latency by allowing the client
to perform various operations and cache data locally. Two types of
delegations currently exist: read and write. The server has the
ability to call back a delegation from a client should there be
contention for a file. Once a client holds a delegation, it can
perform operations on files whose data has been cached locally to
avoid network latency and optimize I/O. The more aggressive caching
that results from delegations can be a big help in environments with
the following characteristics:

  • Frequent opens and closes
  • Frequent GETATTRs
  • File locking
  • Read-only sharing
  • High latency
  • Fast clients
  • Heavily loaded server with many clients
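On the Solaris side, whether the server actually hands out delegations is an nfs service property; a sketch of checking and enabling it (sharectl on Solaris 11; older releases used NFS_SERVER_DELEGATION in /etc/default/nfs):

  # show the current NFS properties, including server_delegation
  sharectl get nfs

  # make sure the server grants delegations
  sudo sharectl set -p server_delegation=on nfs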
