Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Non-local session path expected to be non-null trying to write batch on Hive

Greetings,

I'm currently using HDP 2.5 to build a working dataflow stack on a 6-node cluster (2 masters + 4 slaves, no Kerberos) with Kafka (2.10_0.10), Storm (1.0.1) and Hive2 (1.2.1). Storm (Nimbus is on master-2) can communicate with Kafka (the broker is on master-1), but it seemingly can't operate on Hive (on master-1), even though it manages to connect to the Hive metastore. The Storm topology uses the Trident APIs; here's the code:

//HiveBolt
val mapper: DelimitedRecordHiveMapper =
  new DelimitedRecordHiveMapper()
    .withColumnFields(new Fields(colNames))
    .withTimeAsPartitionField("YYYY/MM/DD")
val hiveOptions: HiveOptions =
  new HiveOptions(metastore, dbName, tblName, mapper)
    .withTxnsPerBatch(10)
    .withBatchSize(1000)
    .withIdleTimeout(10)

//KafkaSpout
val spoutConf = new TridentKafkaConfig(zkHosts_2, "air_traffic")
spoutConf.scheme = new SchemeAsMultiScheme(new StringScheme())
val kafkaSpout = new OpaqueTridentKafkaSpout(spoutConf)

//Topology
val topology: TridentTopology = new TridentTopology
val factory: StateFactory = new HiveStateFactory().withOptions(hiveOptions)
val stream: trident.Stream = topology.newStream("jsonEmitter", kafkaSpout)
                              .each(new Fields("str"), new ParseJSON , new Fields(colNames))

stream.partitionPersist(factory, new Fields(colNames), new HiveUpdater(), new Fields()).parallelismHint(8)
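
A side note on the partition format passed to the mapper above. Assuming withTimeAsPartitionField hands the pattern to java.text.SimpleDateFormat (an assumption worth checking against the storm-hive source for your version), "YYYY" means week year and "DD" means day of the year, which would explain the partition value 2017/07/189 in the log further down. This is probably unrelated to the NullPointerException itself. The class and method names below are illustrative only; the sketch just compares the two patterns on a fixed date:

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

final class PartitionPatternDemo {

    // Formats a fixed calendar date (midnight UTC) with the given SimpleDateFormat pattern.
    static String format(String pattern, int year, int month, int day) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        Calendar cal = new GregorianCalendar(utc, Locale.US);
        cal.clear();
        cal.set(year, month - 1, day); // Calendar months are zero-based
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.US);
        fmt.setTimeZone(utc);
        return fmt.format(cal.getTime());
    }

    public static void main(String[] args) {
        // Week year / month / day-of-year: July 8 is day 189 of 2017.
        System.out.println(format("YYYY/MM/DD", 2017, 7, 8)); // 2017/07/189
        // Calendar year / month / day-of-month, which is likely what was intended.
        System.out.println(format("yyyy/MM/dd", 2017, 7, 8)); // 2017/07/08
    }
}
```

If daily partitions such as 2017/07/08 are the goal, "yyyy/MM/dd" is probably the pattern to pass. Note that the week year can also differ from the calendar year for dates around New Year.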

I originally thought it was a permissions problem on either HDFS or the local filesystem, but the issue persists even after setting 777 permissions on /tmp/hive.

In core-site.xml:

  • hadoop.proxyuser.hcat.groups
  • hadoop.proxyuser.hcat.hosts
  • hadoop.proxyuser.hdfs.groups
  • hadoop.proxyuser.hdfs.hosts
  • hadoop.proxyuser.hive.groups
  • hadoop.proxyuser.hive.hosts
  • hadoop.proxyuser.root.groups
  • hadoop.proxyuser.root.hosts

are all set to '*'.

As far as I can see, Hive2 is correctly set up for transactions: the target table has transactional=true, is stored as ORC and is bucketed. In hive-site.xml:

  • hive.compactor.worker.threads = 1
  • hive.compactor.initiator.on = true
  • hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager

Here's the stacktrace:

2017-07-08 10:02:36.896 o.a.s.h.t.HiveState [INFO] Creating Writer to Hive end point : {metaStoreUri='thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083', database='data_stream', table='air_traffic_test', partitionVals=[2017/07/189] }
2017-07-08 10:02:36.911 h.metastore [INFO] Trying to connect to metastore with URI thrift://master-1.localdomain:9083
2017-07-08 10:02:36.912 h.metastore [INFO] Connected to metastore.
2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.923 STDIO [ERROR] FAILED: NullPointerException Non-local session path expected to be non-null
2017-07-08 10:02:36.923 o.a.h.h.q.Driver [ERROR] FAILED: NullPointerException Non-local session path expected to be non-null
java.lang.NullPointerException: Non-local session path expected to be non-null
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:229)
	at org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:590)
	at org.apache.hadoop.hive.ql.Context.<init>(Context.java:129)
	at org.apache.hadoop.hive.ql.Context.<init>(Context.java:116)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:382)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:303)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1067)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1129)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.runDDL(HiveEndPoint.java:404)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.createPartitionIfNotExists(HiveEndPoint.java:369)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:276)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:243)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnectionImpl(HiveEndPoint.java:180)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:157)
	at org.apache.storm.hive.common.HiveWriter$5.call(HiveWriter.java:238)
	at org.apache.storm.hive.common.HiveWriter$5.call(HiveWriter.java:235)
	at org.apache.storm.hive.common.HiveWriter$9.call(HiveWriter.java:366)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] </PERFLOG method=compile start=1499504556923 end=1499504556923 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.924 o.a.s.h.t.HiveState [WARN] hive streaming failed.
java.lang.NullPointerException

From what I can gather, the NullPointerException is thrown in the following method of SessionState:

public static Path getHDFSSessionPath(Configuration conf) {
	SessionState ss = SessionState.get();
	if (ss == null) {
		String sessionPathString = conf.get(HDFS_SESSION_PATH_KEY);
		Preconditions.checkNotNull(sessionPathString, "Conf non-local session path expected to be non-null");
		return new Path(sessionPathString);
	}
	Preconditions.checkNotNull(ss.hdfsSessionPath, "Non-local session path expected to be non-null");
	return ss.hdfsSessionPath;
}

Specifically, by:

Preconditions.checkNotNull(ss.hdfsSessionPath, "Non-local session path expected to be non-null");

So it seems to be an HDFS-related issue, but I can't understand why it's happening.
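
To make the two branches of that method easier to follow, here is a stripped-down model of the check. It is only a sketch: FakeSession is a hypothetical stand-in for Hive's SessionState, and java.util.Objects.requireNonNull replaces Guava's Preconditions.checkNotNull. It shows that the message in the log (the one without the "Conf" prefix) can only come from the second check, meaning a session object was found but its HDFS session path was null:

```java
import java.util.Objects;

final class SessionPathCheckDemo {

    /** Hypothetical stand-in for SessionState; only the field relevant here. */
    static final class FakeSession {
        String hdfsSessionPath; // null models a session whose HDFS path was never set
    }

    // Mirrors the branch structure of the quoted getHDFSSessionPath method.
    static String getHdfsSessionPath(FakeSession ss, String confValue) {
        if (ss == null) {
            // No session: fall back to the value stored in the configuration.
            return Objects.requireNonNull(confValue,
                    "Conf non-local session path expected to be non-null");
        }
        // Session present: its own path must already be set.
        return Objects.requireNonNull(ss.hdfsSessionPath,
                "Non-local session path expected to be non-null");
    }

    // Returns the NullPointerException message, or null if the lookup succeeds.
    static String failureMessage(FakeSession ss, String confValue) {
        try {
            getHdfsSessionPath(ss, confValue);
            return null;
        } catch (NullPointerException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(failureMessage(new FakeSession(), null)); // session exists, path unset
        System.out.println(failureMessage(null, null));              // no session, no conf value
    }
}
```

So the question becomes why a session exists without an HDFS session path, rather than whether HDFS itself is reachable.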

I hope someone can help me. I'll update the post with more information if needed.

1 ACCEPTED SOLUTION

Apparently it really was a jar dependency issue. I'm using HDP 2.5.5, so I'm not sure that what @Sindhu describes applies to me, but it turns out that the build tag appended to the Hortonworks jar versions was the only factor causing this issue.

As a matter of fact, the jar I submitted to Storm bundled the dependency versions without the "-157" suffix (with the correct major, minor and hotfix versions), which are the ones available on Maven Central. To get everything working I had to use the dependencies from http://nexus-private.hortonworks.com/nexus/content/groups/public/ , since I couldn't find them on http://repo.hortonworks.com/content/repositories/releases/
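
For anyone who needs to check which build of a class actually ends up on the topology's classpath, a small diagnostic along these lines may help. It is a sketch, not part of the original fix: the class name JarOriginDemo is made up, the Hive and Storm class names are taken from the stack trace above, and the implementation version is read from the jar manifest, so it may well be null.

```java
import java.net.URL;
import java.security.CodeSource;

final class JarOriginDemo {

    // Reports where a class was loaded from and the version recorded in its jar manifest.
    static String describe(String className) {
        try {
            // Load without running static initializers.
            Class<?> cls = Class.forName(className, false, JarOriginDemo.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            URL location = (src == null) ? null : src.getLocation();
            Package pkg = cls.getPackage();
            String version = (pkg == null) ? null : pkg.getImplementationVersion();
            return className + " <- "
                    + (location == null ? "(bootstrap/unknown)" : location.toString())
                    + ", version=" + version;
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " <- not on classpath";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.hive.hcatalog.streaming.HiveEndPoint",
            "org.apache.storm.hive.common.HiveWriter",
            "org.apache.hadoop.hive.ql.session.SessionState"
        };
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}
```

Run from inside the same fat jar (or with the same classpath) as the topology, the printed jar locations should show whether the vendor-tagged artifacts or the Maven Central ones were bundled.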

2 REPLIES

@Federico D'Ambrosio

The original error appears to be caused by a library mismatch: storm-hive is set to use Thrift 0.9.0 ( https://github.com/hortonworks/storm-release/blob/HDP-2.5.3.0-tag/external/storm-hive/pom.xml ), when I think it should use the thrift.version variable from the parent storm-release pom ( https://github.com/hortonworks/storm-release/blob/HDP-2.5.3.0-tag/pom.xml ).

This issue is addressed in HDP 2.5.5 and HDP 2.6.1.