How do I read a large CSV file with the Scala Stream class?


How do I read a large CSV file (> 1 GB) with a Scala Stream? Do you have a code example? Or would you use a different way to read a large CSV file without loading it into memory first?


Just use Source.fromFile(...).getLines as you already stated.


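That returns an Iterator, which is already lazy. (You'd use a Stream as a lazy collection where you wanted previously retrieved values to be memoized, so you can read them again.) For a single pass over a large file that memoization is unnecessary, and if you hold on to the head of the Stream it keeps every line read so far in memory.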
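As a rough illustration of the Iterator approach, here is a minimal sketch that counts the data rows of a CSV file one line at a time. It assumes Scala 2.13 or later (for scala.util.Using), a UTF-8 file, and a single header row; the object name CsvLineCount and the method countRows are made up for this example.

```scala
import scala.io.Source
import scala.util.Using

object CsvLineCount {
  // Hypothetical helper: counts data rows without holding the file in memory.
  // Assumes the first line is a header (an assumption, not stated in the question).
  def countRows(path: String): Long =
    Using.resource(Source.fromFile(path, "UTF-8")) { source =>
      source.getLines()
        .drop(1)                          // skip the assumed header row
        .foldLeft(0L)((n, _) => n + 1L)   // consume lazily, one line at a time
    }
}
```

Using.resource closes the file once the block finishes, so the iterator has to be fully consumed inside the block rather than returned from it.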

If you're getting memory problems, then the problem will lie in what you're doing after getLines. Any operation like toList, which forces a strict collection, will cause the problem.
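To make that concrete, the sketch below keeps the whole pipeline on the Iterator and reduces it to a single value, so nothing like toList ever materializes the file. It assumes Scala 2.13 or later, a header row, comma-separated fields with no quoted commas, and a numeric column; CsvColumnSum and sumColumn are hypothetical names chosen for this example.

```scala
import scala.io.Source
import scala.util.Using

object CsvColumnSum {
  // Hypothetical helper: sums the numeric column at index `col`.
  // Assumes plain comma-separated fields (no quoted commas) and a header row.
  def sumColumn(path: String, col: Int): Double =
    Using.resource(Source.fromFile(path, "UTF-8")) { source =>
      source.getLines()
        .drop(1)                                   // skip the assumed header row
        .map(_.split(",", -1))                     // lazy: applied line by line
        .filter(_.length > col)                    // ignore rows that are too short
        .map(fields => fields(col).trim.toDouble)  // still lazy
        .sum                                       // the only step that consumes the iterator
    }
}
```

Each map and filter here wraps the iterator without reading ahead; only the final sum pulls lines through, one at a time. Inserting a toList anywhere in the chain would load every row before continuing. For real-world CSV with quoted fields, a dedicated CSV parser would be a safer choice than split.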
