Hive LOAD DATA query is failing with the following exception


22/05/25 01:43:01 INFO utils.HiveUtils: load data into hive table
Hive Session ID = fc864c41-2236-472d-ac57-96c6df6d8e48
22/05/25 01:43:01 INFO SessionState: Hive Session ID = fc864c41-2236-472d-ac57-96c6df6d8e48
22/05/25 01:43:01 INFO session.SessionState: Created HDFS directory: /tmp/hive/hive/fc864c41-2236-472d-ac57-96c6df6d8e48
22/05/25 01:43:01 INFO session.SessionState: Created local directory: /tmp/root/fc864c41-2236-472d-ac57-96c6df6d8e48
22/05/25 01:43:01 INFO session.SessionState: Created HDFS directory: /tmp/hive/hive/fc864c41-2236-472d-ac57-96c6df6d8e48/_tmp_space.db
22/05/25 01:43:01 INFO tez.TezSessionState: User of session id fc864c41-2236-472d-ac57-96c6df6d8e48 is hive
22/05/25 01:43:01 INFO tez.TezSessionState: Created new resources: null
22/05/25 01:43:01 INFO tez.DagUtils: Jar dir is null / directory doesn't exist. Choosing HIVE_INSTALL_DIR - /user/hive/.hiveJars
22/05/25 01:43:01 INFO tez.TezSessionState: Computed sha: c765910321290465015ccaaeae5dde480fd29e141a4c11594756ecfe145f4d5b for file: file:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/jars/hive-exec-3.1.3000.7.1.7.1000-141.jar of length: 43.58MB in 301 ms
22/05/25 01:43:01 INFO tez.DagUtils: Resource modification time: 1648793437245 for hdfs://sdl60070.labs.teradata.com:8020/user/hive/.hiveJars/hive-exec-3.1.3000.7.1.7.1000-141-c765910321290465015ccaaeae5dde480fd29e141a4c11594756ecfe145f4d5b.jar
22/05/25 01:43:01 INFO sqlstd.SQLStdHiveAccessController: Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=fc864c41-2236-472d-ac57-96c6df6d8e48, clientType=HIVECLI]
22/05/25 01:43:02 INFO metastore.HiveMetaStoreClient: HMS client filtering is enabled.
22/05/25 01:43:02 INFO metastore.HiveMetaStoreClient: Trying to connect to metastore with URI thrift://sdl60070.labs.teradata.com:9083
22/05/25 01:43:02 INFO metastore.HiveMetaStoreClient: Opened a connection to metastore, current connections: 3
22/05/25 01:43:02 INFO metastore.HiveMetaStoreClient: Connected to metastore.
22/05/25 01:43:02 INFO metastore.RetryingMetaStoreClient: RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hive (auth:SIMPLE) retries=1 delay=1 lifetime=0
22/05/25 01:43:02 INFO counters.Limits: Counter limits initialized with parameters: GROUP_NAME_MAX=256, MAX_GROUPS=3000, COUNTER_NAME_MAX=64, MAX_COUNTERS=10000
22/05/25 01:43:02 INFO counters.Limits: Counter limits initialized with parameters: GROUP_NAME_MAX=256, MAX_GROUPS=3000, COUNTER_NAME_MAX=64, MAX_COUNTERS=10000
22/05/25 01:43:02 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.9.1.7.1.7.1000-141, revision=2bb0a521cb559d31291f1ee5d2ea2176c843303a, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2022-03-24T17:23:26Z ]
22/05/25 01:43:02 INFO tez.TezSessionState: Opening new Tez Session (id: fc864c41-2236-472d-ac57-96c6df6d8e48, scratch dir: hdfs://sdl60070.labs.teradata.com:8020/tmp/hive/hive/_tez_session_dir/fc864c41-2236-472d-ac57-96c6df6d8e48)
22/05/25 01:43:02 INFO client.RMProxy: Connecting to ResourceManager at sdl60070.labs.teradata.com/10.27.110.8:8032
22/05/25 01:43:02 INFO client.TezClient: Session mode. Starting session.
22/05/25 01:43:02 INFO client.TezClientUtils: Using tez.lib.uris value from configuration: /user/tez/0.9.1.7.1.7.1000-141/tez.tar.gz
22/05/25 01:43:02 INFO client.TezClientUtils: Using tez.lib.uris.classpath value from configuration: null
22/05/25 01:43:02 INFO client.TezClient: Tez system stage directory hdfs://sdl60070.labs.teradata.com:8020/tmp/hive/hive/_tez_session_dir/fc864c41-2236-472d-ac57-96c6df6d8e48/.tez/application_1648800271973_1472 doesn't exist and is created
22/05/25 01:43:02 INFO impl.YarnClientImpl: Submitted application application_1648800271973_1472
22/05/25 01:43:02 INFO client.TezClient: The url to track the Tez Session: http://sdl60070.labs.teradata.com:8088/proxy/application_1648800271973_1472/
22/05/25 01:43:05 INFO ql.Driver: Compiling command(queryId=root_20220525014305_5670d206-fe95-46e0-9cc1-8218de717c96): Load data inpath '/user/hive/temp_014244000548_36' into table hivetest.EXPORT_case
22/05/25 01:43:06 INFO reflections.Reflections: Reflections took 104 ms to scan 1 urls, producing 35 keys and 689 values
22/05/25 01:43:06 INFO reflections.Reflections: Reflections took 54 ms to scan 1 urls, producing 35 keys and 689 values
22/05/25 01:43:06 INFO ql.Driver: Completed compiling command(queryId=root_20220525014305_5670d206-fe95-46e0-9cc1-8218de717c96); Time taken: 1.029 seconds
22/05/25 01:43:06 INFO metastore.HiveMetaStoreClient: Closed a connection to metastore, current connections: 2
22/05/25 01:43:06 INFO tool.ConnectorImportTool: com.teradata.connector.common.exception.ConnectorException: java.lang.ExceptionInInitializerError
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzerFactory.getInternal(SemanticAnalyzerFactory.java:62)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzerFactory.get(SemanticAnalyzerFactory.java:41)
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:208)
at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:194)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:614)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:672)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:493)
at com.teradata.connector.hive.utils.HiveUtils.loadDataintoHiveTable(HiveUtils.java:347)
at com.teradata.connector.hive.processor.HiveOutputProcessor.outputPostProcessor(HiveOutputProcessor.java:267)
at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:172)
at com.teradata.connector.common.tool.ConnectorImportTool.run(ConnectorImportTool.java:81)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at com.teradata.connector.common.tool.ConnectorImportTool.main(ConnectorImportTool.java:906)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.<clinit>(DDLSemanticAnalyzerFactory.java:79)
... 22 more

at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:210)
at com.teradata.connector.common.tool.ConnectorImportTool.run(ConnectorImportTool.java:81)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at com.teradata.connector.common.tool.ConnectorImportTool.main(ConnectorImportTool.java:906)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)

5 REPLIES

avatar

Hi @tallamohan ,

I see that the "Load data inpath" statement is failing with an NPE:

Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.<clinit>(DDLSemanticAnalyzerFactory.java:79)

...

at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:210)

Has this worked before in your cluster?

Is this a new integration of Teradata with Hive / CDP?

The NPE happens in the static initializer of DDLSemanticAnalyzerFactory (the <clinit> frame in the stack trace), which scans the "org.apache.hadoop.hive.ql.ddl" package for classes that extend BaseSemanticAnalyzer. Do you have custom analyzer classes under that package? Do they have the "@DDLType" annotation? See

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/database/desc/D...

as an example.
If a class in that package lacks the annotation, that would likely explain the NPE.

If not, please check the classpath of HS2 for old / custom jars, which may still have some classes under the "org.apache.hadoop.hive.ql.ddl" package.

 

Hope this helps,

Best regards

 Miklos Szurap

Customer Operations Engineer

avatar
Explorer

Hi Miklos,

 

Has this worked before in your cluster?

= = = = = = = = = = = = = = = = = = = =

Yes, it worked in CDP 7.1.7.

It started failing after I upgraded the cluster to CDP 7.1.7 SP1.

 

Is this a new integration of Teradata with Hive / CDP?

= = = = = = = = = = = = = = = = = = = = = = = = = = = =

No, it is an existing tool (TDCH).

 

Do you have custom analyzer classes under that package? Do they have the "@DDLType" annotation?

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = 

No

 

A sample MapReduce job (Main.java) is attached below.

This job worked fine in CDP 7.1.7 but fails in CDP 7.1.7 SP1.

 

Thanks,

Mohan

 

avatar

Have you reviewed the classpath of the HS2 and all the jars?

$JAVA_HOME/bin/jinfo <hs2_pid> | grep java.class.path

Do they have some classes under the "org.apache.hadoop.hive.ql.ddl" package?
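One way to answer that question is to walk the classpath and look inside each jar. The following is only a sketch: the function names are hypothetical, it assumes `unzip` is installed, and it assumes the classpath uses `:` as separator with Java-style `dir/*` wildcards.

```shell
# Sketch only: helper names are made up for illustration.

# Print one classpath entry per line; expand Java-style "dir/*" entries
# to the *.jar files in that directory.
expand_classpath() {
  printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
    case "$entry" in
      '') ;;                                  # skip empty entries
      */\*)                                   # wildcard entry such as /opt/lib/*
        for jar in "${entry%\*}"*.jar; do
          if [ -e "$jar" ]; then printf '%s\n' "$jar"; fi
        done ;;
      *) printf '%s\n' "$entry" ;;
    esac
  done
}

# Print every jar on the classpath that has entries under the
# org/apache/hadoop/hive/ql/ddl/ package directory.
jars_with_ddl_classes() {
  expand_classpath "$1" | while IFS= read -r jar; do
    case "$jar" in *.jar) ;; *) continue ;; esac
    [ -f "$jar" ] || continue
    if unzip -l "$jar" 2>/dev/null | grep -q 'org/apache/hadoop/hive/ql/ddl/'; then
      printf '%s\n' "$jar"
    fi
  done
}

# Usage (not executed here); either classpath can be checked:
#   jars_with_ddl_classes "$HADOOP_CLASSPATH"
#   jars_with_ddl_classes "$("$JAVA_HOME/bin/jinfo" <hs2_pid> | grep java.class.path | sed 's/^[^=]*= *//')"
```

The hive-exec jar is expected to show up, since that package ships with it. Any other jar in the output, or a second hive-exec version, would be worth a closer look.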


The attached code does not work on my cluster (it is missing some Tez-related configs). What configuration does it require?

avatar
Explorer

Set these environment variables and try submitting the job:

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

export HIVE_HOME=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive

export HCAT_HOME=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive-hcatalog/share/hcatalog

export HADOOP_HOME=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hadoop

export SQOOP_HOME=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/sqoop

export ATLAS_HOME=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/atlas/hook/hive/atlas-hive-plugin-impl

 

export LIB_JARS=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/avro/avro-1.8.2.7.1.7.1000-141.jar,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/avro/avro-mapred-1.8.2.7.1.7.1000-141-hadoop2.jar,$HIVE_HOME/conf,$HIVE_HOME/lib/antlr-runtime-3.5.2.jar,$HIVE_HOME/lib/commons-dbcp-1.4.jar,$HIVE_HOME/lib/commons-pool-1.5.4.jar,$HIVE_HOME/lib/datanucleus-core-4.1.17.jar,$HIVE_HOME/lib/datanucleus-rdbms-4.1.19.jar,$HIVE_HOME/lib/hive-cli-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/hive-exec-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/hive-jdbc-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/hive-service-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/hive-metastore-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/libfb303-0.9.3.jar,$HIVE_HOME/lib/libthrift-0.9.3-1.jar,$HCAT_HOME/hive-hcatalog-core-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/parquet-hadoop-bundle.jar,$HIVE_HOME/lib/hive-llap-tez-3.1.3000.7.1.7.1000-141.jar,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/tez/*,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/tez/bin/*,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/jars/calcite-core-1.19.0.7.1.7.1000-141.jar,${ATLAS_HOME}/atlas-common-2.1.0.7.1.7.1000-141.jar,${ATLAS_HOME}/atlas-intg-2.1.0.7.1.7.1000-141.jar,${ATLAS_HOME}/atlas-notification-2.1.0.7.1.7.1000-141.jar,${ATLAS_HOME}/commons-configuration-1.10.jar,${ATLAS_HOME}/hive-bridge-2.1.0.7.1.7.1000-141.jar,${ATLAS_HOME}/atlas-client-common-2.1.0.7.1.7.1000-141.jar,$HIVE_HOME/lib/reflections-0.9.10.jar,$HIVE_HOME/lib/javassist-3.19.0-GA.jar,$HIVE_HOME/lib/cron-utils-9.1.3.jar,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/hive-beeline-3.1.3000.7.1.7.1000-141.jar,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/jline-2.12.jar,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/super-csv-2.2.0.jar

 

export HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/avro/avro-1.8.2.7.1.7.1000-141.jar:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/avro/avro-mapred-1.8.2.7.1.7.1000-141-hadoop2.jar:$HIVE_HOME/conf:$HIVE_HOME/lib/antlr-runtime-3.5.2.jar:$HIVE_HOME/lib/commons-dbcp-1.4.jar:$HIVE_HOME/lib/commons-pool-1.5.4.jar:$HIVE_HOME/lib/datanucleus-core-4.1.17.jar:$HIVE_HOME/lib/datanucleus-rdbms-4.1.19.jar:$HIVE_HOME/lib/hive-cli-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/hive-exec-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/hive-jdbc-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/hive-service-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/hive-metastore-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/libfb303-0.9.3.jar:$HIVE_HOME/lib/libthrift-0.9.3-1.jar:$HCAT_HOME/hive-hcatalog-core-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/parquet-hadoop-bundle.jar:$HIVE_HOME/lib/hive-llap-tez-3.1.3000.7.1.7.1000-141.jar:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/tez/*:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/tez/bin/*:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/jars/calcite-core-1.19.0.7.1.7.1000-141.jar:${ATLAS_HOME}/atlas-common-2.1.0.7.1.7.1000-141.jar:${ATLAS_HOME}/atlas-intg-2.1.0.7.1.7.1000-141.jar:${ATLAS_HOME}/atlas-notification-2.1.0.7.1.7.1000-141.jar:${ATLAS_HOME}/commons-configuration-1.10.jar:${ATLAS_HOME}/hive-bridge-2.1.0.7.1.7.1000-141.jar:${ATLAS_HOME}/atlas-client-common-2.1.0.7.1.7.1000-141.jar:$HIVE_HOME/lib/reflections-0.9.10.jar:$HIVE_HOME/lib/javassist-3.19.0-GA.jar:$HIVE_HOME/lib/cron-utils-9.1.3.jar:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/hive-beeline-3.1.3000.7.1.7.1000-141.jar:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/jline-2.12.jar:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/super-csv-2.2.0.jar
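The two lists above appear to hold the same entries, with LIB_JARS separated by commas and HADOOP_CLASSPATH by colons. Assuming that is intended, HADOOP_CLASSPATH could be derived from LIB_JARS instead of being maintained by hand. A minimal sketch (the function name is hypothetical):

```shell
# Sketch only: turn a comma-separated jar list into a colon-separated classpath.
# Assumes no entry contains a comma or a colon of its own.
libjars_to_classpath() {
  printf '%s' "$1" | tr ',' ':'
}

# Usage (not executed here), after LIB_JARS has been exported:
#   export HADOOP_CLASSPATH=$(libjars_to_classpath "$LIB_JARS")
```

This keeps the two variables in sync when a jar version changes after an upgrade.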

avatar

Hi @tallamohan 

The direct usage of the Hive classes (CliSessionState, SessionState, Driver) in the provided code falls under the "Hive CLI" or "Hcat CLI" access, which is not supported in CDP:

https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/upgrade/topics/hive-unsupported.html

Please open a case on MyCloudera Support Portal to get that clarified.

The recommended approach would be to use beeline and access the Hive service through HiveServer2.
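As a rough illustration of the beeline route, the statement from the failing job could be sent through HiveServer2 as below. This is a sketch under assumptions: the function names, the JDBC URL, and the user in the usage comment are placeholders, and the URL has no Kerberos or SSL parameters, which a secured cluster would need.

```shell
# Sketch only: helper names are made up for illustration.

# Build the LOAD DATA statement. $1 = HDFS source path, $2 = target table.
load_data_stmt() {
  printf "LOAD DATA INPATH '%s' INTO TABLE %s" "$1" "$2"
}

# Run the statement through HiveServer2 with beeline.
# $1 = JDBC URL, $2 = user name, $3 = HDFS source path, $4 = target table.
run_load_via_beeline() {
  beeline -u "$1" -n "$2" -e "$(load_data_stmt "$3" "$4")"
}

# Usage (not executed here; host and port are placeholders):
#   run_load_via_beeline "jdbc:hive2://<hs2_host>:10000/default" hive \
#     /user/hive/temp_014244000548_36 hivetest.EXPORT_case
```

With this approach the statement is compiled inside HiveServer2, so the client no longer needs the Hive jars on its own classpath.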

 

Best regards

 Miklos