Apache Spark: NullPointerException on a broadcast variable (YARN cluster mode)

Problem description

I have a simple Spark application in which I am trying to broadcast a String-type variable on a YARN cluster. But every time I try to access the broadcast variable's value, I get null inside the task. It would be really helpful if you could suggest what I am doing wrong here. My code is as follows:

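    import java.io.Serializable;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.VoidFunction;
    import org.apache.spark.broadcast.Broadcast;
    import org.apache.spark.sql.api.java.JavaSQLContext;
    import org.apache.spark.sql.api.java.JavaSchemaRDD;
    import org.apache.spark.sql.api.java.Row;

    public class TestApp implements Serializable {

        // Static field: assigned on the driver in main(), but still null on the executors.
        static Broadcast<String[]> mongoConnectionString;

        public static void main(String[] args) {
            String mongoBaseURL = args[0];
            SparkConf sparkConf = new SparkConf().setAppName(Constants.appName);
            JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);

            mongoConnectionString = javaSparkContext.broadcast(args);

            JavaSQLContext javaSQLContext = new JavaSQLContext(javaSparkContext);

            // hdfsBaseURL, Constants and SqlConstants are not shown in this snippet.
            JavaSchemaRDD javaSchemaRDD = javaSQLContext.jsonFile(hdfsBaseURL + Constants.hdfsInputDirectoryPath);

            if (javaSchemaRDD != null) {
                javaSchemaRDD.registerTempTable("LogAction");
                javaSchemaRDD.cache();
                JavaSchemaRDD pageSchemaRDD = javaSQLContext.sql(SqlConstants.getLogActionPage);
                pageSchemaRDD.foreach(new Test());
            }
        }

        private static class Test implements VoidFunction<Row> {
            private static final long serialVersionUID = 1L;

            public void call(Row t) throws Exception {
                // logger is not shown in this snippet.
                // On the cluster, mongoConnectionString is null here, so value() fails.
                logger.info("mongoConnectionString " + mongoConnectionString.value());
            }
        }
    }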

Thanks and regards,
Sam

Recommended answer

This is because your broadcast variable is declared at class level, as a static field. Static fields are not serialized along with the function object that is shipped to the workers, so when the class is initialized on a worker node it does not see the value you assigned in the main method. It only sees null, since the field has not been initialized to anything in that JVM. The solution I found was to pass the broadcast variable to the function object when creating it, so that it travels with the task. The same applies to Accumulators.
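The mechanism can be reproduced without a cluster. The following is a minimal plain-Java sketch, not Spark code: it uses standard Java serialization as a stand-in for shipping a task to an executor, and it clears the static field to simulate a fresh worker JVM. The class names `BroadcastFieldDemo`, `StaticTask` and `InjectedTask` and the connection string are hypothetical and only serve the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class BroadcastFieldDemo {

    // Stand-in for the static Broadcast field in the question.
    // Static fields are not part of an object's serialized form.
    static String staticConnection;

    // Mirrors the question: the task reads a static field of the outer class.
    static class StaticTask implements Serializable {
        private static final long serialVersionUID = 1L;

        String read() {
            return staticConnection;
        }
    }

    // Mirrors the answer: the value is passed in through the constructor
    // and stored in an instance field, which is serialized with the task.
    static class InjectedTask implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String connection;

        InjectedTask(String connection) {
            this.connection = connection;
        }

        String read() {
            return connection;
        }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // "Driver" side: assign the static field, then serialize both tasks.
        staticConnection = "mongodb://example-host";
        byte[] staticTask = serialize(new StaticTask());
        byte[] injectedTask = serialize(new InjectedTask(staticConnection));

        // Simulate a fresh executor JVM, where main() never ran.
        staticConnection = null;

        // "Executor" side: deserialize and run the tasks.
        System.out.println(((StaticTask) deserialize(staticTask)).read());     // prints "null"
        System.out.println(((InjectedTask) deserialize(injectedTask)).read()); // prints "mongodb://example-host"
    }
}
```

Applied to the question's code, the same idea would mean giving `Test` a constructor that takes the `Broadcast<String[]>` and stores it in an instance field, then calling `pageSchemaRDD.foreach(new Test(mongoConnectionString))`. This also suggests why the problem may not show up in local mode, where the driver and the tasks share a single JVM and the static field is already set.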


07-04 17:49