How to zip a file in HDFS without pulling it into the local file system
Question
Is it possible to zip a file in HDFS without pulling it onto the local system drive? I usually do this by running hadoop fs -get filename
and then zipping the file with the Linux zip command, but can I do this within HDFS itself?
Solution
You can create a MapReduce job that uses an identity mapper (its output is the same as its input) and no reducer, and configure the mapper output to be compressed. I would suggest using GZip or LZO instead of the Zip format, but only you know your requirements.
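As a rough illustration, a map-only job along those lines might look like the sketch below. It is not a definitive implementation. The class names CompressInHdfs and LineMapper are hypothetical, and the input and output paths are assumed to arrive as the two command-line arguments. It also assumes line-oriented text input and the default text input and output formats. One deviation from a strict identity mapper: with text input the key is the byte offset of each line, so the mapper here drops that key to keep the offsets out of the output.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class: a map-only job that rewrites text files
// as gzip-compressed files without leaving HDFS.
public class CompressInHdfs {

    // Passes every line through unchanged. The key (the line's byte
    // offset) is replaced by NullWritable so that only the line text
    // is written to the output.
    public static class LineMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            context.write(NullWritable.get(), line);
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: HDFS input path, args[1]: HDFS output directory
        // (the output directory must not exist yet).
        Job job = Job.getInstance(new Configuration(), "compress-in-hdfs");
        job.setJarByClass(CompressInHdfs.class);

        job.setMapperClass(LineMapper.class);
        job.setNumReduceTasks(0); // no reducer: mapper output is the job output

        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Compress the job output with gzip.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Packaged into a jar, this would typically be launched with hadoop jar followed by the jar name, the class name, and the two paths. The result is a directory of gzip-compressed part files (one per map task) rather than a single archive, and because the data is handled line by line, this approach suits text files, not arbitrary binary content.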