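I am using Python 2 subprocess with threading threads to take standard input, process it with binaries A, B, and C, and write the modified data to standard output.
This script (let's call it A_to_C.py) is very slow, and I'd like to learn how to fix it.
The general flow is as follows: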
    import subprocess
    import sys
    import threading

    # Stage A: reads from the script's stdin, writes to A's stdout pipe.
    A_process = subprocess.Popen(['A', '-'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    produce_A_thread = threading.Thread(target=produceA, args=(sys.stdin, A_process.stdin))

    # Stage B: needs stdin=PIPE as well, because B_process.stdin is written to below.
    B_process = subprocess.Popen(['B'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    convert_A_to_B_thread = threading.Thread(target=produceB, args=(A_process.stdout, B_process.stdin))

    # Stage C: stdout is inherited, so C writes straight to the script's stdout.
    C_process = subprocess.Popen(['C'], stdin=subprocess.PIPE)
    convert_B_to_C_thread = threading.Thread(target=produceC, args=(B_process.stdout, C_process.stdin))

    produce_A_thread.start()
    convert_A_to_B_thread.start()
    convert_B_to_C_thread.start()

    produce_A_thread.join()
    convert_A_to_B_thread.join()
    convert_B_to_C_thread.join()

    A_process.wait()
    B_process.wait()
    C_process.wait()
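The idea is that standard input goes into A_to_C.py:
1. The A binary processes a chunk of standard input and creates A-output via the function produceA.
2. The B binary processes a chunk of A's standard output and creates B-output via the function produceB.
3. The C binary processes a chunk of B's standard output via the function produceC and writes C-output to standard output.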
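To make the three steps above concrete, here is a minimal sketch of what a produceA/produceB/produceC-style function might look like. The name `pump`, the `chunk_size` parameter, and its default are hypothetical and not taken from A_to_C.py; the only assumption is that each function copies bytes from one file object to another in chunks.

```python
import io


def pump(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in chunks, then close dst (hypothetical helper)."""
    try:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # empty read means EOF on src
                break
            dst.write(chunk)
    finally:
        # Closing the write end is what lets the downstream binary see EOF;
        # without it, the next stage could keep waiting for more input.
        dst.close()


if __name__ == "__main__":
    # In-memory stand-ins for a process's stdout/stdin pipes.
    source = io.BytesIO(b"example input\n" * 3)
    sink = io.BytesIO()
    pump(source, sink, chunk_size=8)
    print(sink.closed)
```

In the real script such a function would be the `target` of each `threading.Thread`, with `src` and `dst` being the pipe ends shown in the flow above.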
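I profiled it with cProfile, and nearly all of the time in this script appears to be spent acquiring thread locks.
For instance, in a 417 s test job, 416 s (more than 99% of the total runtime) is spent acquiring thread locks: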
    $ python
    Python 2.6.6 (r266:84292, Nov 21 2013, 10:50:32)
    [GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pstats
    >>> p = pstats.Stats('1.profile')
    >>> p.sort_stats('cumulative').print_stats(10)
    Thu Jun 12 22:19:07 2014    1.profile

             1755 function calls (1752 primitive calls) in 417.203 cpu seconds

       Ordered by: cumulative time
       List reduced from 162 to 10 due to restriction <10>

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.020    0.020  417.203  417.203 A_to_C.py:90(<module>)
            1    0.000    0.000  417.123  417.123 A_to_C.py:809(main)
            6    0.000    0.000  416.424   69.404 /foo/python/2.7.3/lib/python2.7/threading.py:234(wait)
           32  416.424   13.013  416.424   13.013 {method 'acquire' of 'thread.lock' objects}
            3    0.000    0.000  416.422  138.807 /foo/python/2.7.3/lib/python2.7/threading.py:648(join)
            3    0.000    0.000    0.498    0.166 A_to_C.py:473(which)
           37    0.000    0.000    0.498    0.013 A_to_C.py:475(is_exe)
            3    0.496    0.165    0.496    0.165 {posix.access}
            6    0.000    0.000    0.194    0.032 /foo/python/2.7.3/lib/python2.7/subprocess.py:475(_eintr_retry_call)
            3    0.000    0.000    0.191    0.064 /foo/python/2.7.3/lib/python2.7/subprocess.py:1286(wait)
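One caveat when reading the table above: cProfile records only the thread it is enabled in, so work done inside worker threads does not appear, and the main thread's time shows up as waiting in join. The sketch below reproduces that pattern on a small scale; `busy_worker`, `main`, and `profile_main` are hypothetical names, and `time.sleep` merely stands in for the real work done by produceA, produceB, and produceC.

```python
import cProfile
import pstats
import threading
import time


def busy_worker(seconds):
    # Stand-in for the real per-stage work; runs in a worker thread.
    time.sleep(seconds)


def main():
    worker = threading.Thread(target=busy_worker, args=(0.2,))
    worker.start()
    worker.join()  # the main thread does nothing but wait here


def profile_main():
    profiler = cProfile.Profile()
    profiler.enable()   # profiles the calling (main) thread only
    main()
    profiler.disable()
    return pstats.Stats(profiler)


if __name__ == "__main__":
    stats = profile_main()
    stats.sort_stats('cumulative').print_stats(10)
    # Function names the profiler actually saw.
    names = set(func for (_path, _line, func) in stats.stats)
    print('busy_worker' in names)
```

If the same holds for A_to_C.py, the 416 s under `acquire` would be the main thread waiting for the three worker threads rather than lock contention, and profiling inside each thread's target function may say more about where the time actually goes.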
What am I doing wrong with my threading.Thread and/or subprocess.Popen arrangement that is causing this issue?