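We run a simple deployment script remotely with commands such as ssh deployer@10.170.4.11 sudo /root/run-chef-client.sh. Today it started hanging: sshd on 10.170.4.11 waits forever, even though sudo has already finished. We started sshd in debug mode and got two different kinds of logs. This is the normal log, when the session does not hang: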
debug1: Received SIGCHLD.
debug1: session_by_pid: pid 23187
debug1: session_exit_message: session 0 channel 0 pid 23187
debug1: session_exit_message: release channel 0
Received disconnect from 10.170.4.6: 11: disconnected by user
When it hangs, we get the following instead:
debug1: Received SIGCHLD.
debug1: session_by_pid: pid 24209
debug1: session_exit_message: session 0 channel 0 pid 24209
debug1: session_exit_message: release channel 0
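Our understanding is that the server process is waiting for some communication from the client and never receives it. It is hard to tell whether this is a client-side or a server-side problem.
We tried running sshd under strace, but that did not work, because in that case the SUID bit on sudo is ignored. So what else should we try in order to debug or prevent this?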
Solution
Using ssh -t on the client side (which forces PTY allocation) solved the problem:
debug1: Received SIGCHLD.
debug1: session_by_pid: pid 31701
debug1: session_exit_message: session 0 channel 0 pid 31701
debug1: session_exit_message: release channel 0
debug1: session_pty_cleanup: session 0 release /dev/pts/1
Received disconnect from 127.0.0.1: 11: disconnected by user
debug1: do_cleanup
debug1: PAM: cleanup
debug1: PAM: closing session
debug1: PAM: deleting credentials
With a PTY allocated, the sshd session is controlled by the pseudo-TTY rather than by the client.
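
For a deployment script, the forced-PTY workaround could be wrapped in a small helper, roughly as sketched below. The function name run_remote and the SSH_BIN override are hypothetical and not part of the original setup; SSH_BIN exists only so the resulting command line can be inspected (for example with SSH_BIN=echo) without contacting a host.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not the poster's actual script):
# run a remote command with forced PTY allocation.
# SSH_BIN is an assumed override for dry runs; it defaults to the real ssh.
run_remote() {
    # -t asks ssh to allocate a pseudo-terminal for the remote command.
    "${SSH_BIN:-ssh}" -t "$@"
}
```

It would be called as run_remote deployer@10.170.4.11 sudo /root/run-chef-client.sh. One caveat: when the local stdin is not a terminal (cron, a CI job), a single -t does not allocate a PTY; the ssh manual documents repeating the option (-tt) to force allocation even then, so an unattended deployment would probably need -tt.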