NullPointerException on broadcast variables (YARN cluster mode)

New Contributor

Hi All

I have a simple Spark application in which I am trying to broadcast a String[] variable on a YARN cluster.
But every time I try to access the broadcast variable's value inside a task, I get null. It would be really helpful if you could suggest what I am doing wrong here.
My code is as follows:

public class TestApp implements Serializable {
    static Broadcast<String[]> mongoConnectionString;

    public static void main(String[] args) {
        String mongoBaseURL = args[0];
        SparkConf sparkConf = new SparkConf().setAppName(Constants.appName);
        JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);

        mongoConnectionString = javaSparkContext.broadcast(args);

        JavaSQLContext javaSQLContext = new JavaSQLContext(javaSparkContext);

        JavaSchemaRDD javaSchemaRDD = javaSQLContext.jsonFile(hdfsBaseURL + Constants.hdfsInputDirectoryPath);

        if (javaSchemaRDD != null) {
            javaSchemaRDD.registerTempTable("LogAction");
            javaSchemaRDD.cache();
            pageSchemaRDD = javaSQLContext.sql(SqlConstants.getLogActionPage);
            pageSchemaRDD.foreach(new Test());
        }
    }

    private static class Test implements VoidFunction<Row> {
        private static final long serialVersionUID = 1L;

        public void call(Row t) throws Exception {
            logger.info("mongoConnectionString " + mongoConnectionString.value());
        }
    }
}

Thanks and Regards
Samriddha

3 REPLIES

Master Collaborator

What is null? I don't see you using a broadcast variable in a closure here. You just put one in a static member, which isn't going to work.

New Contributor

I am using the broadcast variable within the class Test, which implements VoidFunction (this is a closure, right?). There I am trying to print its value, and the broadcast variable comes back as null. Please suggest a fix. Please note that when I run the program locally, it works fine.

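Master Collaborator (accepted solution)

Yes, but it's a static member of a class. When the class is loaded on the remote worker, that static field is null again, because main() never runs there and static fields are not serialized along with the function. It works locally only because the driver and the tasks share one JVM. Make the Broadcast a member of the new function you are defining.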
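
To see why the static field comes back null, here is a minimal sketch in plain Java with no Spark dependency. It only imitates what happens to a function object shipped to an executor: the object is serialized, and a fresh JVM is simulated by resetting the static field before deserializing. The class names, the `simulateRemoteCall` helper, and the `mongodb://example-host` string are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class StaticVsInstanceDemo {

    // Hypothetical stand-in for a function object that gets shipped to a worker.
    static class Task implements Serializable {
        private static final long serialVersionUID = 1L;

        // Static field: belongs to the class, so Java serialization does not write it.
        static String staticValue;

        // Instance field: part of the object, so Java serialization does write it.
        private final String instanceValue;

        Task(String instanceValue) {
            this.instanceValue = instanceValue;
        }

        String describe() {
            return "static=" + staticValue + ", instance=" + instanceValue;
        }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Sets both fields on the "driver", ships the object, then reads it back on a
    // simulated "executor" where the static field was never assigned.
    static String simulateRemoteCall(String value) throws IOException, ClassNotFoundException {
        Task.staticValue = value;
        byte[] shipped = serialize(new Task(value));

        // Assumption: this reset imitates a fresh JVM in which main() never ran.
        Task.staticValue = null;

        Task onExecutor = (Task) deserialize(shipped);
        return onExecutor.describe();
    }

    public static void main(String[] args) throws Exception {
        // Prints: static=null, instance=mongodb://example-host
        System.out.println(simulateRemoteCall("mongodb://example-host"));
    }
}
```

Applied to the code in the question, the same idea would mean giving Test an instance field of type Broadcast<String[]>, setting it in a constructor, and calling pageSchemaRDD.foreach(new Test(mongoConnectionString)) so the broadcast handle travels with the function instead of living in a static field.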