Member since: 07-07-2017
Posts: 6
Kudos Received: 1
Solutions: 1

My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 2464 | 07-12-2017 07:46 AM |
08-20-2018 11:48 AM
Hi, yes, I upgraded from HDP 2.6.5
07-31-2018 03:15 PM
Hi @yves dormont, I believe that if you write /usr/hdp/2.6.0.3-8/hive/conf/conf.server/* in your classpath, the wildcard '*' actually includes only the files in that folder with a .jar extension. (Source) If you instead use /usr/hdp/2.6.0.3-8/hive/conf/conf.server, you include all of the config files (properties and XML; I'm not really sure which ones the preupgrade tool uses, presumably hivemetastore-site.xml) within the conf.server folder, where all the information needed to instantiate the HiveMetaStoreClient lives. Could you please change that part of your classpath and see if it changes anything?
07-26-2018 03:10 PM
Hi, I've just had quite a few issues myself with this preupgrade tool, mainly because the documentation listed the wrong classpath and I had to correct all of the wrong occurrences myself. The preupgrade tool needs the conf.server folder passed (/usr/hdp/current/hive-client/conf/conf.server), in addition to the other jars needed for execution. It looks like you haven't passed that argument in your classpath. Could you check whether passing all of the correct folders/files in your execution command works? My complete corrected command was the following:

$JAVA_HOME/bin/java -Djavax.security.auth.useSubjectCredsOnly=false \
-cp /usr/hdp/current/hive-server2-hive2/lib/derby-10.10.2.0.jar:\
/usr/hdp/current/hive-server2-hive2/lib/*:\
/usr/hdp/current/hadoop/*:\
/usr/hdp/current/hadoop/lib/*:\
/usr/hdp/current/hadoop-mapreduce-client/*:\
/usr/hdp/current/hadoop-mapreduce-client/lib/*:\
/usr/hdp/current/hadoop-hdfs/*:\
/usr/hdp/current/hadoop-hdfs/lib/*:\
/usr/hdp/current/hadoop/etc/hadoop/*:\
/tmp/hive-pre-upgrade-3.1.0.3.0.0.0-1634.jar:\
/usr/hdp/current/hive-client/conf/conf.server:\
/usr/hdp/current/hive-metastore/lib/hive-metastore.jar:\
/usr/hdp/current/hive-metastore/lib/libthrift-0.9.3.jar:\
/usr/hdp/current/hadoop-client/hadoop-common.jar:\
/usr/hdp/current/hive-client/lib/hive-common.jar:\
/usr/hdp/current/hive-client/lib/commons-cli-1.2.jar:\
/usr/hdp/current/hadoop-client/lib/* \
org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool -execute
07-12-2017 07:46 AM
Apparently it really was a jar dependency issue. Even though I'm using HDP 2.5.5, so I'm not sure that what @Sindhu describes applies to me, it turns out that the build-number tag appended to Hortonworks jar versions was the only factor causing this kind of issue. As a matter of fact, I was bundling into the jar submitted to Storm the dependencies without the "-157" suffix (while using the correct major, minor, and hotfix versions), which are the ones found on Maven Central. To get everything working I had to use the dependencies from http://nexus-private.hortonworks.com/nexus/content/groups/public/, since I couldn't find them on http://repo.hortonworks.com/content/repositories/releases/
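For illustration, a minimal sbt sketch of the change; the artifact coordinates and the exact HDP build version shown here are assumptions, so substitute the ones matching your stack:

```scala
// build.sbt (sketch): resolve jars from the Hortonworks repository and pin
// the HDP build suffix instead of the plain Maven Central version.
resolvers += "hortonworks" at "http://nexus-private.hortonworks.com/nexus/content/groups/public/"

libraryDependencies ++= Seq(
  // hypothetical coordinates: plain "1.0.1" pulls the Maven Central build,
  // while the "-157"-suffixed version is the HDP build of the same artifact
  "org.apache.storm" % "storm-hive" % "1.0.1.2.5.5.0-157" % "provided"
)
```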
07-08-2017 09:46 AM
1 Kudo
Greetings, I'm currently using HDP 2.5 to build a working dataflow stack on a 6-node cluster (2 masters + 4 slaves, no Kerberos) using Kafka (2.10_0.10), Storm (1.0.1) and Hive2 (1.2.1). Storm (Nimbus is on master-2) is able to communicate with Kafka (the broker is on master-1), but seemingly can't operate on Hive (on master-1), even though it manages to connect to its metastore. For the Storm topology I'm using the Trident APIs; here's what I have (colNames, metastore, dbName, tblName, zkHosts_2 and ParseJSON are defined elsewhere):

import org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper
import org.apache.storm.hive.common.HiveOptions
import org.apache.storm.hive.trident.{HiveStateFactory, HiveUpdater}
import org.apache.storm.kafka.StringScheme
import org.apache.storm.kafka.trident.{OpaqueTridentKafkaSpout, TridentKafkaConfig}
import org.apache.storm.spout.SchemeAsMultiScheme
import org.apache.storm.trident.{Stream, TridentTopology}
import org.apache.storm.trident.state.StateFactory
import org.apache.storm.tuple.Fields

// HiveBolt: map tuples to delimited records, partitioned by ingest time
// (note: in SimpleDateFormat "DD" is day-of-year and "dd" is day-of-month,
// which is why the log below shows the partition 2017/07/189)
val mapper: DelimitedRecordHiveMapper =
  new DelimitedRecordHiveMapper()
    .withColumnFields(new Fields(colNames))
    .withTimeAsPartitionField("YYYY/MM/DD")
val hiveOptions: HiveOptions =
  new HiveOptions(metastore, dbName, tblName, mapper)
    .withTxnsPerBatch(10)
    .withBatchSize(1000)
    .withIdleTimeout(10)

// KafkaSpout: opaque Trident spout reading strings from the "air_traffic" topic
val spoutConf = new TridentKafkaConfig(zkHosts_2, "air_traffic")
spoutConf.scheme = new SchemeAsMultiScheme(new StringScheme())
val kafkaSpout = new OpaqueTridentKafkaSpout(spoutConf)

// Topology: parse each JSON string into columns, then persist to Hive
val topology: TridentTopology = new TridentTopology
val factory: StateFactory = new HiveStateFactory().withOptions(hiveOptions)
val stream: Stream = topology.newStream("jsonEmitter", kafkaSpout)
  .each(new Fields("str"), new ParseJSON, new Fields(colNames))
stream.partitionPersist(factory, new Fields(colNames), new HiveUpdater(), new Fields()).parallelismHint(8)
I originally thought it was a permissions problem on either HDFS or the local filesystem, but even after setting 777 permissions on /tmp/hive the issue persists. In core-site.xml, the following properties are all set to '*':
- hadoop.proxyuser.hcat.groups
- hadoop.proxyuser.hcat.hosts
- hadoop.proxyuser.hdfs.groups
- hadoop.proxyuser.hdfs.hosts
- hadoop.proxyuser.hive.groups
- hadoop.proxyuser.hive.hosts
- hadoop.proxyuser.root.groups
- hadoop.proxyuser.root.hosts
Hive2, as far as I can see, is correctly configured for transactions: the target table has transactional=true, is stored as ORC, and is bucketed. In hive-site.xml:
- hive.compactor.worker.threads = 1
- hive.compactor.initiator.on = true
- hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
Here's the stacktrace:
2017-07-08 10:02:36.896 o.a.s.h.t.HiveState [INFO] Creating Writer to Hive end point : {metaStoreUri='thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083', database='data_stream', table='air_traffic_test', partitionVals=[2017/07/189] }
2017-07-08 10:02:36.911 h.metastore [INFO] Trying to connect to metastore with URI thrift://master-1.localdomain:9083
2017-07-08 10:02:36.912 h.metastore [INFO] Connected to metastore.
2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.923 STDIO [ERROR] FAILED: NullPointerException Non-local session path expected to be non-null
2017-07-08 10:02:36.923 o.a.h.h.q.Driver [ERROR] FAILED: NullPointerException Non-local session path expected to be non-null
java.lang.NullPointerException: Non-local session path expected to be non-null
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:229)
at org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:590)
at org.apache.hadoop.hive.ql.Context.<init>(Context.java:129)
at org.apache.hadoop.hive.ql.Context.<init>(Context.java:116)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:382)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:303)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1067)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1129)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.runDDL(HiveEndPoint.java:404)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.createPartitionIfNotExists(HiveEndPoint.java:369)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:276)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:243)
at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnectionImpl(HiveEndPoint.java:180)
at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:157)
at org.apache.storm.hive.common.HiveWriter$5.call(HiveWriter.java:238)
at org.apache.storm.hive.common.HiveWriter$5.call(HiveWriter.java:235)
at org.apache.storm.hive.common.HiveWriter$9.call(HiveWriter.java:366)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] </PERFLOG method=compile start=1499504556923 end=1499504556923 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.924 o.a.s.h.t.HiveState [WARN] hive streaming failed.
java.lang.NullPointerException

From what I can gather, the NullPointerException is thrown in the following method inside SessionState:

public static Path getHDFSSessionPath(Configuration conf) {
  SessionState ss = SessionState.get();
  if (ss == null) {
    String sessionPathString = conf.get(HDFS_SESSION_PATH_KEY);
    Preconditions.checkNotNull(sessionPathString, "Conf non-local session path expected to be non-null");
    return new Path(sessionPathString);
  }
  Preconditions.checkNotNull(ss.hdfsSessionPath, "Non-local session path expected to be non-null");
  return ss.hdfsSessionPath;
}

Specifically, it is thrown by:

Preconditions.checkNotNull(ss.hdfsSessionPath, "Non-local session path expected to be non-null");

Since the message in the trace matches this second check rather than the "Conf non-local" one, SessionState.get() must be returning a session whose hdfsSessionPath is null. So it seems to be an HDFS-related issue, but I can't understand why it's happening. I hope someone can help me; I'll update the post with more information if needed.
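For what it's worth, here is a minimal Scala sketch of the failing mechanism (the object name and the null stand-in are illustrative): Guava's Preconditions.checkNotNull throws a NullPointerException carrying the supplied message when the reference is null, which is exactly the exception in the trace above.

```scala
import com.google.common.base.Preconditions

object PreconditionsDemo extends App {
  // stands in for ss.hdfsSessionPath being null on an otherwise live SessionState
  val hdfsSessionPath: AnyRef = null
  // throws: java.lang.NullPointerException: Non-local session path expected to be non-null
  Preconditions.checkNotNull(hdfsSessionPath, "Non-local session path expected to be non-null")
}
```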
Labels:
- Apache Hadoop
- Apache Hive
- Apache Storm