Question
Why does the following Python code using the concurrent.futures module hang forever?
import concurrent.futures

class A:
    def f(self):
        print("called")

class B(A):
    def f(self):
        executor = concurrent.futures.ProcessPoolExecutor(max_workers=2)
        executor.submit(super().f)

if __name__ == "__main__":
    B().f()
The call raises an invisible exception, [Errno 24] Too many open files (to see it, replace the line executor.submit(super().f) with print(executor.submit(super().f).exception())).
However, replacing ProcessPoolExecutor with ThreadPoolExecutor prints "called" as expected.
Why does the following Python code using the multiprocessing.pool module raise the exception AssertionError: daemonic processes are not allowed to have children?
import multiprocessing.pool

class A:
    def f(self):
        print("called")

class B(A):
    def f(self):
        pool = multiprocessing.pool.Pool(2)
        pool.apply(super().f)

if __name__ == "__main__":
    B().f()
However, replacing Pool with ThreadPool prints "called" as expected.
Environment: CPython 3.7, macOS 10.14.
Answer
Both concurrent.futures.ProcessPoolExecutor and multiprocessing.pool.Pool use multiprocessing.queues.Queue to pass the work function object from the caller to the worker process. Queue uses the pickle module to serialize and deserialize, but pickle fails to handle a bound method object correctly when the instance belongs to a child class:
import pickle

# inside B.f of the example above
f = super().f
print(f)
pf = pickle.loads(pickle.dumps(f))
print(pf)
Output:
<bound method A.f of <__main__.B object at 0x104b24da0>>
<bound method B.f of <__main__.B object at 0x104cfab38>>
A.f becomes B.f, which effectively turns the call into B.f invoking B.f again in the worker process, an infinite recursion. Each recursive call creates a fresh executor, which is why the first example exhausts file descriptors ([Errno 24] Too many open files); in the second example, the re-invoked B.f tries to create a new Pool inside a daemonic worker process, which raises the AssertionError.
pickle.dumps relies on the __reduce__ method of the bound method object. In my opinion its implementation does not consider this scenario: it ignores the real func object and records only the simple name (f), to be looked up again on the instance self obj (B()) at unpickling time, which yields B.f. Very likely a bug.
The good news is that, knowing where the issue is, we can fix it by implementing our own reduction function that recreates the bound method object from the original function (A.f) and the instance obj (B()):
import types
import copyreg
import multiprocessing

def my_reduce(obj):
    # Rebind the original function to the instance instead of
    # looking the name up again on the instance's class
    return (obj.__func__.__get__, (obj.__self__,))

# Register for plain pickle and for multiprocessing's ForkingPickler
copyreg.pickle(types.MethodType, my_reduce)
multiprocessing.reduction.register(types.MethodType, my_reduce)
We can do this because functions are descriptors: obj.__func__.__get__(obj.__self__) recreates the bound method.
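Putting it together, a standalone sketch (with illustrative A/B classes like the question's) showing that, once the custom reducer is registered, the bound method survives a pickle round trip as A.f instead of being re-resolved to B.f:

```python
import pickle
import types
import copyreg

class A:
    def f(self):
        return "A.f"

class B(A):
    def f(self):
        return "B.f"

def my_reduce(obj):
    # Recreate the bound method from the original function and instance
    return (obj.__func__.__get__, (obj.__self__,))

copyreg.pickle(types.MethodType, my_reduce)

b = B()
f = super(B, b).f  # the same object as super().f inside B.f
pf = pickle.loads(pickle.dumps(f))
print(pf)    # bound method A.f, not B.f
print(pf())  # -> "A.f"
```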
P.S. I have filed a bug report.