Created 05-25-2022 01:56 AM
22/05/25 01:43:01 INFO utils.HiveUtils: load data into hive table
Hive Session ID = fc864c41-2236-472d-ac57-96c6df6d8e48
22/05/25 01:43:01 INFO SessionState: Hive Session ID = fc864c41-2236-472d-ac57-96c6df6d8e48
22/05/25 01:43:01 INFO session.SessionState: Created HDFS directory: /tmp/hive/hive/fc864c41-2236-472d-ac57-96c6df6d8e48
22/05/25 01:43:01 INFO session.SessionState: Created local directory: /tmp/root/fc864c41-2236-472d-ac57-96c6df6d8e48
22/05/25 01:43:01 INFO session.SessionState: Created HDFS directory: /tmp/hive/hive/fc864c41-2236-472d-ac57-96c6df6d8e48/_tmp_space.db
22/05/25 01:43:01 INFO tez.TezSessionState: User of session id fc864c41-2236-472d-ac57-96c6df6d8e48 is hive
22/05/25 01:43:01 INFO tez.TezSessionState: Created new resources: null
22/05/25 01:43:01 INFO tez.DagUtils: Jar dir is null / directory doesn't exist. Choosing HIVE_INSTALL_DIR - /user/hive/.hiveJars
22/05/25 01:43:01 INFO tez.TezSessionState: Computed sha: c765910321290465015ccaaeae5dde480fd29e141a4c11594756ecfe145f4d5b for file: file:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/jars/hive-exec-3.1.3000.7.1.7.1000-141.jar of length: 43.58MB in 301 ms
22/05/25 01:43:01 INFO tez.DagUtils: Resource modification time: 1648793437245 for hdfs://sdl60070.labs.teradata.com:8020/user/hive/.hiveJars/hive-exec-3.1.3000.7.1.7.1000-141-c765910321290465015ccaaeae5dde480fd29e141a4c11594756ecfe145f4d5b.jar
22/05/25 01:43:01 INFO sqlstd.SQLStdHiveAccessController: Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=fc864c41-2236-472d-ac57-96c6df6d8e48, clientType=HIVECLI]
22/05/25 01:43:02 INFO metastore.HiveMetaStoreClient: HMS client filtering is enabled.
22/05/25 01:43:02 INFO metastore.HiveMetaStoreClient: Trying to connect to metastore with URI thrift://sdl60070.labs.teradata.com:9083
22/05/25 01:43:02 INFO metastore.HiveMetaStoreClient: Opened a connection to metastore, current connections: 3
22/05/25 01:43:02 INFO metastore.HiveMetaStoreClient: Connected to metastore.
22/05/25 01:43:02 INFO metastore.RetryingMetaStoreClient: RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hive (auth:SIMPLE) retries=1 delay=1 lifetime=0
22/05/25 01:43:02 INFO counters.Limits: Counter limits initialized with parameters: GROUP_NAME_MAX=256, MAX_GROUPS=3000, COUNTER_NAME_MAX=64, MAX_COUNTERS=10000
22/05/25 01:43:02 INFO counters.Limits: Counter limits initialized with parameters: GROUP_NAME_MAX=256, MAX_GROUPS=3000, COUNTER_NAME_MAX=64, MAX_COUNTERS=10000
22/05/25 01:43:02 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.9.1.7.1.7.1000-141, revision=2bb0a521cb559d31291f1ee5d2ea2176c843303a, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2022-03-24T17:23:26Z ]
22/05/25 01:43:02 INFO tez.TezSessionState: Opening new Tez Session (id: fc864c41-2236-472d-ac57-96c6df6d8e48, scratch dir: hdfs://sdl60070.labs.teradata.com:8020/tmp/hive/hive/_tez_session_dir/fc864c41-2236-472d-ac57-96c6df6d8e48)
22/05/25 01:43:02 INFO client.RMProxy: Connecting to ResourceManager at sdl60070.labs.teradata.com/10.27.110.8:8032
22/05/25 01:43:02 INFO client.TezClient: Session mode. Starting session.
22/05/25 01:43:02 INFO client.TezClientUtils: Using tez.lib.uris value from configuration: /user/tez/0.9.1.7.1.7.1000-141/tez.tar.gz
22/05/25 01:43:02 INFO client.TezClientUtils: Using tez.lib.uris.classpath value from configuration: null
22/05/25 01:43:02 INFO client.TezClient: Tez system stage directory hdfs://sdl60070.labs.teradata.com:8020/tmp/hive/hive/_tez_session_dir/fc864c41-2236-472d-ac57-96c6df6d8e48/.tez/application_1648800271973_1472 doesn't exist and is created
22/05/25 01:43:02 INFO impl.YarnClientImpl: Submitted application application_1648800271973_1472
22/05/25 01:43:02 INFO client.TezClient: The url to track the Tez Session: http://sdl60070.labs.teradata.com:8088/proxy/application_1648800271973_1472/
22/05/25 01:43:05 INFO ql.Driver: Compiling command(queryId=root_20220525014305_5670d206-fe95-46e0-9cc1-8218de717c96): Load data inpath '/user/hive/temp_014244000548_36' into table hivetest.EXPORT_case
22/05/25 01:43:06 INFO reflections.Reflections: Reflections took 104 ms to scan 1 urls, producing 35 keys and 689 values
22/05/25 01:43:06 INFO reflections.Reflections: Reflections took 54 ms to scan 1 urls, producing 35 keys and 689 values
22/05/25 01:43:06 INFO ql.Driver: Completed compiling command(queryId=root_20220525014305_5670d206-fe95-46e0-9cc1-8218de717c96); Time taken: 1.029 seconds
22/05/25 01:43:06 INFO metastore.HiveMetaStoreClient: Closed a connection to metastore, current connections: 2
22/05/25 01:43:06 INFO tool.ConnectorImportTool: com.teradata.connector.common.exception.ConnectorException: java.lang.ExceptionInInitializerError
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzerFactory.getInternal(SemanticAnalyzerFactory.java:62)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzerFactory.get(SemanticAnalyzerFactory.java:41)
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:208)
at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:194)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:614)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:672)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:493)
at com.teradata.connector.hive.utils.HiveUtils.loadDataintoHiveTable(HiveUtils.java:347)
at com.teradata.connector.hive.processor.HiveOutputProcessor.outputPostProcessor(HiveOutputProcessor.java:267)
at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:172)
at com.teradata.connector.common.tool.ConnectorImportTool.run(ConnectorImportTool.java:81)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at com.teradata.connector.common.tool.ConnectorImportTool.main(ConnectorImportTool.java:906)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.<clinit>(DDLSemanticAnalyzerFactory.java:79)
... 22 more
at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:210)
at com.teradata.connector.common.tool.ConnectorImportTool.run(ConnectorImportTool.java:81)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at com.teradata.connector.common.tool.ConnectorImportTool.main(ConnectorImportTool.java:906)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
Created 05-25-2022 02:59 AM
Hi @tallamohan ,
I see that the "Load data inpath" statement is failing with an NPE:
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.<clinit>(DDLSemanticAnalyzerFactory.java:79)
...
at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:210)
Has this worked before in your cluster?
Is this a new integration of Teradata with Hive / CDP?
The NPE happens at the phase where DDLSemanticAnalyzerFactory scans the "org.apache.hadoop.hive.ql.ddl" package for subclasses of BaseSemanticAnalyzer. Do you have custom analyzer classes under that package? If so, do they carry the "@DDLType" annotation? See Hive's built-in analyzers in that package for an example.
If that annotation is missing on such a class, the factory's static initializer fails with an NPE.
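For illustration, a minimal sketch of what an annotated analyzer in that package looks like. This is indicative only: the exact annotation element ("type" vs. "types") and the parser token differ between Hive versions, and the class name here is made up:

// Illustrative only: analyzers are discovered under this package.
package org.apache.hadoop.hive.ql.ddl;

import org.apache.hadoop.hive.ql.QueryState;
import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType;
import org.apache.hadoop.hive.ql.parse.ASTNode;
import org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer;
import org.apache.hadoop.hive.ql.parse.HiveParser;
import org.apache.hadoop.hive.ql.parse.SemanticException;

// Without @DDLType, the factory's static initializer cannot map this class
// to a parser token, which surfaces as the NullPointerException in <clinit>.
@DDLType(types = HiveParser.TOK_SHOWDATABASES)
public class MyCustomAnalyzer extends BaseSemanticAnalyzer {

  public MyCustomAnalyzer(QueryState queryState) throws SemanticException {
    super(queryState);
  }

  @Override
  public void analyzeInternal(ASTNode ast) throws SemanticException {
    // custom analysis logic would go here
  }
}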
If not, please check the HS2 classpath for old or custom jars that may still contain classes under the "org.apache.hadoop.hive.ql.ddl" package.
Hope this helps,
Best regards
Miklos Szurap
Customer Operations Engineer
Created on 05-25-2022 03:13 AM - edited 05-25-2022 04:33 AM
Hi Miklos,
Has this worked before in your cluster?
= = = = = = = = = = = = = = = = = = = =
Yes, it worked in CDP 7.1.7.
When I upgraded my cluster to CDP 7.1.7 SP1, it failed.
Is this a new integration of Teradata with Hive / CDP?
= = = = = = = = = = = = = = = = = = = = = = = = = = = =
No, this is an existing tool (TDCH).
Do you have custom analyzer classes under that package? Do they have the "@DDLType" annotation?
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
No
Attached below is a sample MapReduce job (Main.java).
This job worked fine in CDP 7.1.7 but fails in CDP 7.1.7 SP1.
Thanks,
Mohan
Created 05-30-2022 05:44 AM
Have you reviewed the classpath of HS2 and all the jars on it?
$JAVA_HOME/bin/jinfo <hs2_pid> | grep java.class.path
Do any of them contain classes under the "org.apache.hadoop.hive.ql.ddl" package?
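One hypothetical way to do that check: take each jar from the classpath output above and list any entries under org/apache/hadoop/hive/ql/ddl/, for example with a small helper like this (the class name is made up; pass the jar paths as command-line arguments):

import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: prints every entry under org/apache/hadoop/hive/ql/ddl/
// found in the jars passed on the command line.
public class DdlClassScanner {
  public static void main(String[] args) throws Exception {
    for (String path : args) {
      try (JarFile jar = new JarFile(path)) {
        jar.stream()
           .map(JarEntry::getName)
           .filter(name -> name.startsWith("org/apache/hadoop/hive/ql/ddl/"))
           .forEach(name -> System.out.println(path + " -> " + name));
      }
    }
  }
}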
The attached code does not work on my cluster (it is missing some Tez-related configs). What configuration does it require?
Created 05-30-2022 07:05 AM
Set these environment variables and try submitting the job:
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
export HIVE_HOME=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive
export HCAT_HOME=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive-hcatalog/share/hcatalog
export HADOOP_HOME=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hadoop
export SQOOP_HOME=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/sqoop
export ATLAS_HOME=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/atlas/hook/hive/atlas-hive-plugin-impl
export LIB_JARS=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/avro/avro-1.8.2.7.1.7.1000-141.jar,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/avro/avro-mapred-1.8.2.7.1.7.1000-141-hadoop2.jar,$HIVE_HOME/conf,$HIVE_HOME/lib/antlr-runtime-3.5.2.jar,$HIVE_HOME/lib/commons-dbcp-1.4.jar,$HIVE_HOME/lib/commons-pool-1.5.4.jar,$HIVE_HOME/lib/datanucleus-core-4.1.17.jar,$HIVE_HOME/lib/datanucleus-rdbms-4.1.19.jar,$HIVE_HOME/lib/hive-cli-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/hive-exec-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/hive-jdbc-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/hive-service-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/hive-metastore-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/libfb303-0.9.3.jar,$HIVE_HOME/lib/libthrift-0.9.3-1.jar,$HCAT_HOME/hive-hcatalog-core-3.1.3000.7.1.7.1000-141.jar,$HIVE_HOME/lib/parquet-hadoop-bundle.jar,$HIVE_HOME/lib/hive-llap-tez-3.1.3000.7.1.7.1000-141.jar,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/tez/*,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/tez/bin/*,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/jars/calcite-core-1.19.0.7.1.7.1000-141.jar,${ATLAS_HOME}/atlas-common-2.1.0.7.1.7.1000-141.jar,${ATLAS_HOME}/atlas-intg-2.1.0.7.1.7.1000-141.jar,${ATLAS_HOME}/atlas-notification-2.1.0.7.1.7.1000-141.jar,${ATLAS_HOME}/commons-configuration-1.10.jar,${ATLAS_HOME}/hive-bridge-2.1.0.7.1.7.1000-141.jar,${ATLAS_HOME}/atlas-client-common-2.1.0.7.1.7.1000-141.jar,$HIVE_HOME/lib/reflections-0.9.10.jar,$HIVE_HOME/lib/javassist-3.19.0-GA.jar,$HIVE_HOME/lib/cron-utils-9.1.3.jar,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/hive-beeline-3.1.3000.7.1.7.1000-141.jar,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/jline-2.12.jar,/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/super-csv-2.2.0.jar
export HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/avro/avro-1.8.2.7.1.7.1000-141.jar:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/avro/avro-mapred-1.8.2.7.1.7.1000-141-hadoop2.jar:$HIVE_HOME/conf:$HIVE_HOME/lib/antlr-runtime-3.5.2.jar:$HIVE_HOME/lib/commons-dbcp-1.4.jar:$HIVE_HOME/lib/commons-pool-1.5.4.jar:$HIVE_HOME/lib/datanucleus-core-4.1.17.jar:$HIVE_HOME/lib/datanucleus-rdbms-4.1.19.jar:$HIVE_HOME/lib/hive-cli-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/hive-exec-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/hive-jdbc-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/hive-service-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/hive-metastore-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/libfb303-0.9.3.jar:$HIVE_HOME/lib/libthrift-0.9.3-1.jar:$HCAT_HOME/hive-hcatalog-core-3.1.3000.7.1.7.1000-141.jar:$HIVE_HOME/lib/parquet-hadoop-bundle.jar:$HIVE_HOME/lib/hive-llap-tez-3.1.3000.7.1.7.1000-141.jar:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/tez/*:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/tez/bin/*:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/jars/calcite-core-1.19.0.7.1.7.1000-141.jar:${ATLAS_HOME}/atlas-common-2.1.0.7.1.7.1000-141.jar:${ATLAS_HOME}/atlas-intg-2.1.0.7.1.7.1000-141.jar:${ATLAS_HOME}/atlas-notification-2.1.0.7.1.7.1000-141.jar:${ATLAS_HOME}/commons-configuration-1.10.jar:${ATLAS_HOME}/hive-bridge-2.1.0.7.1.7.1000-141.jar:${ATLAS_HOME}/atlas-client-common-2.1.0.7.1.7.1000-141.jar:$HIVE_HOME/lib/reflections-0.9.10.jar:$HIVE_HOME/lib/javassist-3.19.0-GA.jar:$HIVE_HOME/lib/cron-utils-9.1.3.jar:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/hive-beeline-3.1.3000.7.1.7.1000-141.jar:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/jline-2.12.jar:/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/lib/hive/lib/super-csv-2.2.0.jar
Created 06-08-2022 01:10 AM
Hi @tallamohan
The direct usage of Hive classes (CliSessionState, SessionState, Driver) in the provided code falls under "Hive CLI" or "HCat CLI" access, which is not supported in CDP:
https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/upgrade/topics/hive-unsupported.html
Please open a case on the MyCloudera Support Portal to get that clarified.
The recommended approach would be to use beeline and access the Hive service through HiveServer2.
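A minimal sketch of that approach over the HiveServer2 JDBC interface (the host, port and credentials below are placeholders; a Kerberized cluster needs a principal in the URL instead, and the hive-jdbc jar must be on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class LoadViaHiveServer2 {
  public static void main(String[] args) throws Exception {
    // Placeholder HS2 endpoint; adjust host, port and auth for your cluster.
    String url = "jdbc:hive2://hs2-host.example.com:10000/hivetest";
    try (Connection conn = DriverManager.getConnection(url, "hive", "");
         Statement stmt = conn.createStatement()) {
      // Same statement the TDCH job was issuing through the embedded Driver.
      stmt.execute("LOAD DATA INPATH '/user/hive/temp_014244000548_36' "
          + "INTO TABLE hivetest.EXPORT_case");
    }
  }
}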
Best regards
Miklos