Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Nullpointer Exception on broadcast variables (YARN Cluster mode)

Visitor

Hi all,

I have a simple Spark application in which I am trying to broadcast a String variable on a YARN cluster.
However, every time I access the value of the broadcast variable inside a task, I get null. It would be really helpful if you could suggest what I am doing wrong here.
My code is as follows:

public class TestApp implements Serializable {
    // Static field, assigned only inside main().
    static Broadcast<String[]> mongoConnectionString;

    public static void main(String[] args) {
        String mongoBaseURL = args[0];
        SparkConf sparkConf = new SparkConf().setAppName(Constants.appName);
        JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);

        // Broadcast the full argument array.
        mongoConnectionString = javaSparkContext.broadcast(args);

        JavaSQLContext javaSQLContext = new JavaSQLContext(javaSparkContext);

        // hdfsBaseURL and logger are not defined in the posted snippet.
        JavaSchemaRDD javaSchemaRDD = javaSQLContext.jsonFile(hdfsBaseURL + Constants.hdfsInputDirectoryPath);

        if (javaSchemaRDD != null) {
            javaSchemaRDD.registerTempTable("LogAction");
            javaSchemaRDD.cache();
            JavaSchemaRDD pageSchemaRDD = javaSQLContext.sql(SqlConstants.getLogActionPage);
            pageSchemaRDD.foreach(new Test());
        }
    }

    private static class Test implements VoidFunction<Row> {
        private static final long serialVersionUID = 1L;

        public void call(Row t) throws Exception {
            // Reads the static field of TestApp from inside the task.
            logger.info("mongoConnectionString " + mongoConnectionString.value());
        }
    }
}

Thanks and Regards
Samriddha

1 ACCEPTED SOLUTION

Master Collaborator

Yes, but it's a static member of a class. When the class is loaded on the remote worker, that field is null again, because it was only assigned in main() on the driver. Make the Broadcast a member of the new function you are defining.


3 REPLIES

Master Collaborator

What is null? I don't see you using a broadcast variable in a closure here. You just put one in a static member, which isn't going to work.

Visitor

I am using the broadcast variable within the class Test, which implements VoidFunction (this is a closure, right?). There I am trying to print its value, and the Broadcast variable comes back as null. Please suggest a fix. Please note that when I run the program locally, it works fine.

Master Collaborator

Yes, but it's a static member of a class. When the class is loaded on the remote worker, that field is null again, because it was only assigned in main() on the driver. Make the Broadcast a member of the new function you are defining.
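
To illustrate the point about static versus instance members, here is a minimal plain-Java sketch. It does not use Spark: it only simulates shipping a function object to another JVM with standard Java serialization, which never writes static fields. The names StaticVsInstanceDemo, Task, shipToExecutor and the connection string are hypothetical stand-ins, and a String takes the place of the Broadcast handle. It is meant as a rough model of the behaviour described above, not as a description of Spark's internals.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class StaticVsInstanceDemo {

    // Stand-in for the static Broadcast field in TestApp (hypothetical).
    public static String staticValue;

    // Stand-in for the function passed to foreach(); it carries its own copy.
    public static class Task implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String instanceValue;

        public Task(String instanceValue) {
            this.instanceValue = instanceValue;
        }

        public String readStatic() {
            return staticValue;
        }

        public String readInstance() {
            return instanceValue;
        }
    }

    // Serialize and deserialize the task, as a rough model of sending it to a worker.
    public static Task shipToExecutor(Task task) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(task);
        }
        // Simulate a fresh worker JVM in which main() never ran,
        // so the static field was never assigned there.
        staticValue = null;
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Task) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        staticValue = "mongodb://example-host";
        Task remote = shipToExecutor(new Task("mongodb://example-host"));
        System.out.println("static field:   " + remote.readStatic());   // null
        System.out.println("instance field: " + remote.readInstance()); // mongodb://example-host
    }
}
```

Applied to the code in the question, the same idea would be to give Test a constructor that takes the Broadcast<String[]>, store it in an instance field, and call pageSchemaRDD.foreach(new Test(mongoConnectionString)). It also suggests why the local run worked: in local mode the driver and the tasks share one JVM, so the static field assigned in main() is still visible to the tasks.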