Member since: 02-02-2017
Posts: 106
Kudos Received: 31
Solutions: 15
My Accepted Solutions
Title | Views | Posted
---|---|---
| 2077 | 09-28-2023 11:26 AM
| 1721 | 07-05-2023 03:34 PM
| 13072 | 09-22-2018 04:48 AM
| 3102 | 09-19-2018 09:15 PM
| 1888 | 09-17-2018 06:52 PM
01-11-2024 02:37 PM
@yusufu If there is nothing in /var/log/hive, can you check whether you have log files inside /tmp/hive or /tmp/hive/hive? That would help debug the issue further.
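For example, something like the below (plain shell; the directory names are just the locations mentioned above) would show whether any recent logs exist there:
ls -lt /tmp/hive /tmp/hive/hive 2>/dev/null | head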
12-07-2023 11:34 AM
@RS_11 If I understand your question correctly, you are referring to the behavior in older CDH versions (specifically 5.x), where executing an insert overwrite that selected 0 rows resulted in the creation of a 0-byte file named 000000_0, whereas in the newer Hive version 3.1.3 you observe that no 0-byte files are generated in such scenarios. Assuming my understanding is correct, may I ask why you want a 0-byte file to be generated? Having 0-byte or small files is considered inefficient. Note that there have been various changes related to 0-byte files in the past, as documented in HIVE-22941 and HIVE-21714.
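For reference, a statement like the one below (table names are hypothetical) selects 0 rows; this is the case where CDH 5.x wrote an empty 000000_0 file and Hive 3.1.3 writes no file at all:
insert overwrite table target_tbl select * from source_tbl where 1=0;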
11-20-2023 08:24 PM
@HadoopHero The answer would vary based on the query you are running. Assuming you have a simple "insert select */cols from table", it is likely a mapper-only job, and you may want to try tuning the below:
set tez.grouping.min-size=134217728; -- 128 MB min split
set tez.grouping.max-size=1073741824; -- 1 GB max split
Try setting min-size and max-size to the same value. I would not go below 128 MB.
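For example, to pin the split size at roughly 256 MB (an illustrative value, not a recommendation for every workload):
set tez.grouping.min-size=268435456; -- 256 MB
set tez.grouping.max-size=268435456; -- 256 MB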
09-29-2023 10:27 AM
PyHive is not maintained by Cloudera. Can you use Impyla to connect to HiveServer2? It works with HS2 as well. Ref: https://github.com/cloudera/impyla
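A minimal sketch of an Impyla connection (the hostname, port, user, and auth mechanism below are placeholders; for a Kerberized HS2 you would use auth_mechanism='GSSAPI' instead):
from impala.dbapi import connect  # Impyla speaks the HiveServer2 protocol

conn = connect(host='hs2-host.example.com', port=10000, auth_mechanism='PLAIN', user='hive')
cur = conn.cursor()
cur.execute('show databases')
print(cur.fetchall())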
09-28-2023 11:26 AM
1 Kudo
@jayes I had another look at the problem description, and I see the queries fall into two categories: (1) column names enclosed in backticks and (2) column names not enclosed in backticks. Below are the root causes for both categories; please find the solution at the end of this comment.

Case 1: These fail because you are running the SQL with the -e option and the column names are enclosed in backticks (`). The shell treats whatever is inside the backticks as a Linux command rather than as part of the SQL, leading to the compilation error.

Command: sudo -u hive beeline -u "jdbc:hive2://cdp1:2181,cdp2:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;;principal=hive/_HOST@hdpcluster.test" -n hive --showHeader=false --outputformat=tsv2 -e "SET hive.support.quoted.identifiers=column;use jayesh; create external table testkeyword1( `transform` string);";
Error: Error: Error while compiling statement: FAILED: ParseException line 1:45 cannot recognize input near ')' '<EOF>' '<EOF>' in column type (state=42000,code=40000)

Command: sudo -u hive beeline -u "jdbc:hive2://cdp1:2181,cdp2:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;;principal=hive/_HOST@hdpcluster.test" -n hive --showHeader=false --outputformat=tsv2 -e "set hive.support.sql11.reserved.keywords=false; set hive.support.quoted.identifiers=column;use jayesh; create external table testunderscore( `_name` string, `_id` string)";
Error:
-bash: _name: command not found
-bash: _id: command not found
Error: Error while compiling statement: FAILED: ParseException line 1:45 cannot recognize input near ',' 'string' ')' in column type (state=42000,code=40000)

Case 2: These fail because you are using the reserved keyword transform, which is not escaped at all.

Command: sudo -u hive beeline -u "jdbc:hive2://cdp1:2181,cdp2:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;;principal=hive/_HOST@hdpcluster.test" -n hive --showHeader=false --outputformat=tsv2 -e "set hive.support.sql11.reserved.keywords=false; set hive.support.quoted.identifiers=column;use jayesh; create external table testkeyword3( transform string)";
Error: Error: Error while compiling statement: FAILED: ParseException line 1:38 cannot recognize input near 'transform' 'string' ')' in column name or constraint (state=42000,code=40000)

Command: sudo -u hive beeline -u "jdbc:hive2://cdp1:2181,cdp2:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;;principal=hive/_HOST@hdpcluster.test" -n hive --showHeader=false --outputformat=tsv2 -e "SET hive.support.sql11.reserved.keywords=false;use jayesh; create external table testkeyowrd2( transform string)";
Error: Error while compiling statement: FAILED: ParseException line 1:38 cannot recognize input near 'transform' 'string' ')' in column name or constraint (state=42000,code=40000)

Solution: You can escape the backticks when passing SQL from the command line, eg:-
sudo -u hive beeline -u "jdbc:hive2://cdp1:2181,cdp2:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;;principal=hive/_HOST@hdpcluster.test" -n hive --showHeader=false --outputformat=tsv2 -e "SET hive.support.quoted.identifiers=column;use jayesh; create external table testkeyword1( \`transform\` string);";

You can also pass the SQL to the beeline command as a file, eg:-
transform.sql:
create external table testkeyword1( `transform` string);
Beeline command:
sudo -u hive beeline -u "jdbc:hive2://cdp1:2181,cdp2:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;;principal=hive/_HOST@hdpcluster.test" -n hive --showHeader=false --outputformat=tsv2 -f transform.sql
Let me know if this helps fix the issue.
09-26-2023 08:40 AM
Can you share your krb5.conf file from the local Spark instance? It is likely that you are missing krb5-related configuration. You may want to enable additional Kerberos debug logging before running the Spark job:
export HADOOP_CLIENT_OPTS="-Dsun.security.krb5.debug=true"
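For reference, a minimal krb5.conf looks roughly like the below (the realm and KDC hostname are placeholders; yours must match the cluster's realm):
[libdefaults]
    default_realm = EXAMPLE.COM
[realms]
    EXAMPLE.COM = {
        kdc = kdc.example.com
        admin_server = kdc.example.com
    }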
09-26-2023 08:28 AM
Both of the queries below should work by default; I have checked by running them against CDP 7.2.17:
create external table testkeyword1( `transform` string);
create external table testunderscore1( `_name` string, `_id` string);
What Apache Hive or CDP/CDH/HDP version are you using?
07-15-2023 12:03 PM
Are you sure you are trying to connect to Hive using Kerberos? It appears to me that you are trying to connect to Hive over Knox, which may have LDAP configured instead of Kerberos. Can you try the below, replacing your username and password? (Note that spark.read.jdbc also requires a table argument; "your_table" here is a placeholder.)
df = spark.read \
    .jdbc("jdbc:hive2://p-cdl-knox-prd-svc.hcscint.net:8443/default;transportMode=http;httpPath=gateway/default/hive;ssl=true;sslTrustStore=/Users/u530241/Downloads/gateway_prod_.jks;trustStorePassword=knoxprod;user=<username>;password=<password>",
          table="your_table")
07-14-2023 09:55 AM
1 Kudo
You seem to be using the wrong driver to connect to HiveServer2; the MySQL JDBC driver does not work for connecting to HS2. You need the Apache Hive JDBC driver or the Cloudera Hive JDBC driver to connect to Hive. You can download the latest Apache Hive JDBC driver from: https://repo1.maven.org/maven2/org/apache/hive/hive-jdbc/4.0.0-alpha-2/ File: hive-jdbc-4.0.0-alpha-2-standalone.jar Add that jar to your spark.jars.
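For example, either pass the jar at submit time or set it on the session (the jar path below is a placeholder for wherever you saved the download):
spark-submit --jars /path/to/hive-jdbc-4.0.0-alpha-2-standalone.jar your_job.py
or, in code:
from pyspark.sql import SparkSession
spark = SparkSession.builder.config("spark.jars", "/path/to/hive-jdbc-4.0.0-alpha-2-standalone.jar").getOrCreate()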
07-05-2023 03:34 PM
2 Kudos
@sevens You might want to use the latest version, 2.6.21. Also, unlike the Apache driver, there is no hard requirement to use UGI with the Cloudera driver. One option is to use a JAAS file like the below.
client.jaas:
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="PathToTheKeyTab"
  principal="cloudera@CLOUDERA"
  doNotPrompt=true;
};
Then set the java.security.auth.login.config system property to the location of the JAAS file, eg:
java -Djava.security.auth.login.config=/opt/clientconf/client.jaas
Doc ref: https://docs.cloudera.com/documentation/other/connectors/hive-jdbc/2-6-21/Cloudera-JDBC-Connector-for-Apache-Hive-Install-Guide.pdf