Lazy IO - String not garbage collected?

Problem description

I'm currently trying to read the contents of an XML file into a Map Int (Map Int String) and it works quite well (using HaXml). However, I'm not satisfied with the memory consumption of my program, and the problem seems to be the garbage collection.

Here's the code I'm using to read the XML file:

type TextFile = Map Int (Map Int String)

buildTextFile :: String -> IO TextFile
buildTextFile filename = do content <- readFile filename
                            let doc = xmlParse filename content
                                con = docContent (posInNewCxt filename Nothing) doc
                            return $ buildTF con

My guess is that content is held in memory even after the return, although it doesn't need to be (of course it could also be doc or con). I come to this conclusion because the memory consumption rises quickly with very large XML files, although the resulting TextFile is only a singleton map of a singleton map (using a special testing file, generally it's different, of course). So in the end, I have a Map of a Map Int String, with only one string in it, but the memory consumption is up to 19 MB.

Using strict application ($!) or using Data.Text instead of String in TextFile doesn't change anything.

So my question is: Is there some way to tell the compiler that the string content (or doc or con) isn't needed anymore and that it can be garbage collected?

And more generally: How can I find out where the problem really comes from without all the guessing?

Edit: As FUZxxl suggested I tried using deepseq and changed the second line of buildTextFile like so:

let doc = content `deepseq` xmlParse filename content

Unfortunately that didn't change anything really (or am I using it wrong?)...

Solution

Don't Guess What Is Consuming Memory, Find Out For Sure

The first step is to determine what types are consuming the most memory. You can see lots of examples of heap profiling here on SO or read the GHC manual.
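
A heap profile by type needs a profiling build (compile with -prof, run with +RTS -hy). As a coarser first check, the sketch below asks the runtime how many bytes are still live after a forced major GC, so you can compare the number before and after a call such as buildTextFile. It is not a heap profile and does not tell you which types hold the memory. The helper names liveBytesAfterGC and reportLive are made up for this example, and the statistics are only available when the program is started with +RTS -T.

```haskell
import Data.Word (Word64)
import GHC.Stats (GCDetails (..), RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Mem (performMajorGC)

-- | Live heap bytes right after a forced major GC, or Nothing when the
-- program was not started with +RTS -T (runtime statistics disabled).
liveBytesAfterGC :: IO (Maybe Word64)
liveBytesAfterGC = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then pure Nothing
    else do
      performMajorGC
      stats <- getRTSStats
      -- 'gc' holds the details of the most recent collection.
      pure (Just (gcdetails_live_bytes (gc stats)))

-- | Hypothetical helper: run an action and print the live bytes measured
-- before and after it. The result is returned unchanged.
reportLive :: String -> IO a -> IO a
reportLive label action = do
  before <- liveBytesAfterGC
  result <- action
  after <- liveBytesAfterGC
  putStrLn (label ++ ": live bytes before=" ++ show before ++ ", after=" ++ show after)
  pure result
```

Wrapping the call as reportLive "buildTextFile" (buildTextFile "test.xml") would show whether the live heap stays large after the function returns. If it does, a by-type heap profile is the next step to see what is being retained.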

Forcing Computation

If the problem is lazy evaluation (you're building an on-heap thunk that can compute the XML document type and leaving the string in heap too) then use rnf and seq:

buildTextFile :: String -> IO TextFile
buildTextFile filename = do content <- readFile filename
                            let doc = xmlParse filename content
                                con = docContent (posInNewCxt filename Nothing) doc
                                res = buildTF con
                            return $ rnf res `seq` res

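Or just use a bang pattern (!res = buildTF con). Either way, that should force the thunks and allow the GC to collect the String. Note that a bang pattern evaluates res only to weak head normal form, whereas rnf evaluates the whole structure.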
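
One caveat with return $ rnf res `seq` res: the seq itself sits inside a thunk, so nothing is forced until the caller first inspects the result. A variant that forces the map before the IO action returns is evaluate (force res), using force from Control.DeepSeq and evaluate from Control.Exception. The sketch below shows that pattern. Since HaXml may not be installed, parseTextFile is a made-up stand-in for the xmlParse / docContent / buildTF pipeline that reads lines of the form "outer inner text"; only the last two lines of buildTextFileStrict are the point.

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

type TextFile = Map Int (Map Int String)

-- Hypothetical stand-in for the HaXml pipeline (xmlParse, docContent, buildTF).
-- Each line "outer inner some text" becomes one entry; other lines are skipped.
parseTextFile :: String -> TextFile
parseTextFile = Map.fromListWith Map.union . concatMap entry . lines
  where
    entry l = case words l of
      (a : b : rest)
        | [(i, "")] <- reads a
        , [(j, "")] <- reads b ->
            [(i, Map.singleton j (unwords rest))]
      _ -> []

-- Read the file lazily, then fully evaluate the resulting map inside IO.
-- Once this returns, the map no longer refers to 'content' through
-- unevaluated thunks, so the GC is free to reclaim the input string.
buildTextFileStrict :: FilePath -> IO TextFile
buildTextFileStrict filename = do
  content <- readFile filename
  evaluate (force (parseTextFile content))
```

In the original function the same change would amount to replacing the last line with evaluate (force res), assuming the result type has an NFData instance (Map and String do). If memory stays high after that, the retention is somewhere else and a heap profile is the way to find it.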


09-18 06:51