This article covers the question "I must include log4j, but it causes errors in the Apache Spark shell. How can I avoid the errors?" and its solution. It should be a useful reference for anyone running into the same problem.

Problem Description

Due to the complexity of the jars that I must include into the Spark code, I would like to ask for help figuring out a way to solve this issue without removing the log4j import.

The simple code is as follows:

    :cp symjar/log4j-1.2.17.jar
    import org.apache.spark.rdd._

    val hadoopConf = sc.hadoopConfiguration
    hadoopConf.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
    hadoopConf.set("fs.s3n.awsAccessKeyId", "AKEY")
    hadoopConf.set("fs.s3n.awsSecretAccessKey", "SKEY")

    val numOfProcessors = 2
    val filePath = "s3n://SOMEFILE.csv"
    var rdd = sc.textFile(filePath, numOfProcessors)
    def doStuff(rdd: RDD[String]): RDD[String] = { rdd }
    doStuff(rdd)

First, I am getting this error:

    error: error while loading StorageLevel, class file '/root/spark/lib/spark-assembly-1.3.0-hadoop1.0.4.jar(org/apache/spark/storage/StorageLevel.class)' has location not matching its contents: contains class StorageLevel
    error: error while loading Partitioner, class file '/root/spark/lib/spark-assembly-1.3.0-hadoop1.0.4.jar(org/apache/spark/Partitioner.class)' has location not matching its contents: contains class Partitioner
    error: error while loading BoundedDouble, class file '/root/spark/lib/spark-assembly-1.3.0-hadoop1.0.4.jar(org/apache/spark/partial/BoundedDouble.class)' has location not matching its contents: contains class BoundedDouble
    error: error while loading CompressionCodec, class file '/root/spark/lib/spark-assembly-1.3.0-hadoop1.0.4.jar(org/apache/hadoop/io/compress/CompressionCodec.class)' has location not matching its contents: contains class CompressionCodec

Then, I run this line again, and the error disappears:

    var rdd = sc.textFile(filePath, numOfProcessors)

However, the end result of the code is:

    error: type mismatch;
     found   : org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.RDD[String]
     required: org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.RDD[String]
                  doStuff(rdd)
                          ^

How can I avoid removing the log4j import and not get the mentioned errors? (This is important, since the jars that I have use log4j heavily and are in conflict with Spark-Shell.)

Recommended Answer

The answer is not to use just the :cp command, but also to include everything in .../spark/conf/spark-env.sh under the export SPARK_SUBMIT_CLASSPATH=".../the/path/to/a.jar" line.
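
As a rough sketch (the jar path below is hypothetical; substitute the actual location of the log4j jar and any other conflicting jars), the addition to spark-env.sh might look like this:

    # .../spark/conf/spark-env.sh  (sketch; the jar path below is hypothetical)
    # Put the conflicting jars on the classpath that spark-shell is launched with,
    # in addition to loading them inside the shell with :cp.
    export SPARK_SUBMIT_CLASSPATH="/path/to/symjar/log4j-1.2.17.jar"

Multiple jars should be separable with ':' as in any Java classpath, and the change takes effect the next time spark-shell is started, since spark-env.sh is sourced at launch.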

This concludes the article on "I must include log4j, but it causes errors in the Apache Spark shell. How can I avoid the errors?". We hope the recommended answer is helpful.
