Problem description
While stress-testing some Clojure code at work, I noticed that it runs out of heap space when iterating over large data sets. I eventually traced the issue back to the combination of Clojure's doseq function and the implementation of lazy sequences.
This is the minimal code snippet that crashes Clojure by exhausting the available heap space:
(doseq [e (take 1000000000 (iterate inc 1))] (identity e))
The documentation for doseq clearly states that it doesn't retain the head of the lazy sequence, so I would expect the memory complexity of the above code to be close to O(1). Is there something I'm missing? What's the Clojure-idiomatic way of iterating over extremely large lazy sequences, if doseq isn't up to the job?
Recommended answer
When I run this sample, I see the memory usage hit 2.0 GB, so perhaps you are actually just running out of RAM.
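One way to check whether the heap limit is the real constraint is to ask the JVM what it has been given. The following is a minimal sketch, not part of the original answer; heap-stats is a hypothetical helper name, and it relies only on java.lang.Runtime. The maximum is usually controlled by the JVM's -Xmx option.

```clojure
;; Sketch only: heap-stats is a hypothetical helper, not from the original answer.
(defn heap-stats
  "Returns the JVM heap figures, in megabytes, as seen by the current process."
  []
  (let [rt (Runtime/getRuntime)
        mb (fn [n] (quot n (* 1024 1024)))]
    {:max-mb   (mb (.maxMemory rt))     ; upper bound the heap may grow to
     :total-mb (mb (.totalMemory rt))   ; heap currently reserved by the JVM
     :free-mb  (mb (.freeMemory rt))})) ; unused part of the reserved heap

(println (heap-stats))
```

If :max-mb is well below the memory usage observed while the sample runs, the crash may simply reflect a small heap rather than a retained sequence head.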
It sure does take a while to run:
user=> (time (doseq [e (take 1000000000 (iterate inc 1))] (identity e)))
"Elapsed time: 266396.221132 msecs"
From top:
23999 arthur 20 0 4001m 1.2g 5932 S 213 15.3 17:11.35 java
24017 arthur 20 0 3721m 740m 5548 S 88 9.3 13:49.95 java
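If the elements can be derived from a simple counter, as in the sample, one possible alternative is to skip the lazy sequence entirely and use loop/recur. This is only a sketch under that assumption and is not taken from the original answer; visit-each and its parameters n and f are hypothetical names.

```clojure
;; Sketch only: visit-each is a hypothetical helper, not from the original answer.
;; It calls f on 1, 2, ..., n without building any sequence.
(defn visit-each
  [n f]
  (loop [i 1]
    (when (<= i n)
      (f i)             ; side effect for the current element
      (recur (inc i)))))

;; Small-scale usage; the original sample used 1000000000 elements.
(visit-each 5 identity)
```

Because each step only holds the current counter, nothing from earlier iterations remains reachable. For real lazy sequences that cannot be reduced to a counter, doseq remains the usual tool.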