About mszurap

mszurap · ‎04-20-2023

Hi @DataEngAa ,

The stack trace shows that Spark SQL was listing the partitions ("HiveMetaStoreClient.listPartitionsWithAuthInfo") when the connection was lost. The attached snippet does not show timing information, but most likely the request simply hit the configured timeout. It is also likely that the table you are trying to manipulate has a lot of partitions (in the Hive metastore; the partition directory count on HDFS is a different question).

The timeout is defined both on the client (Spark) side and on the server (Hive metastore) side. To increase it and let the request run for longer:

1. Set "hive.metastore.client.socket.timeout=1800" in hive-site.xml service-wide for Hive AND in the Hive gateway safety valves.
2. If this is CDP, set it on the HiveOnTez service side too, so that HS2 picks up that value as well.
3. Start your Spark application with --conf spark.hadoop.hive.metastore.client.socket.timeout=1800

The above increases the timeout from the default 5 minutes, which is usually too low for big tables, to 30 minutes.

Best regards
Miklos
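The settings from the steps above could be rendered roughly as follows. This is only a sketch: the property name, the "spark.hadoop." prefix and the value 1800 come from the post, while the helper names and the application file "my_app.py" are hypothetical placeholders. In Cloudera Manager the XML fragment would normally be pasted into the safety valve rather than written to hive-site.xml by hand.

```python
import shlex
import xml.etree.ElementTree as ET

# Property name and value are taken from the post above.
TIMEOUT_KEY = "hive.metastore.client.socket.timeout"
TIMEOUT_SECONDS = 1800  # 30 minutes


def hive_site_property(name: str, value: object) -> str:
    """Render one <property> element in the hive-site.xml layout."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(prop, encoding="unicode")


def spark_submit_command(app: str, timeout_seconds: int = TIMEOUT_SECONDS) -> list[str]:
    """Build a spark-submit argument list that passes the timeout to Spark.

    'app' is a hypothetical application file, used only for illustration.
    """
    return [
        "spark-submit",
        "--conf", f"spark.hadoop.{TIMEOUT_KEY}={timeout_seconds}",
        app,
    ]


if __name__ == "__main__":
    print(hive_site_property(TIMEOUT_KEY, TIMEOUT_SECONDS))
    print(shlex.join(spark_submit_command("my_app.py")))
```

The client-side and server-side values are set separately, so both places need to be changed for the longer timeout to take effect.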

mszurap · ‎04-19-2023

Hi @JKarount ,

I see. We will take up the bug report internally, as it may affect other customers too; in any case, thanks for your diagnostics and troubleshooting. In my opinion, to achieve the same functionality in a form that will most likely pass through the ODBC driver as-is, please try Impala's NVL or NVL2 function (or even ZEROIFNULL if it's a numeric column): https://impala.apache.org/docs/build/html/topics/impala_conditional_functions.html#conditional_functions__nvl2

SELECT col1, col2 as col2_orig, NVL2(col2, col2, 0) as col2_1 FROM sandbox.jk_test

or

SELECT col1, col2 as col2_orig, NVL(col2, 0) as col2_1 FROM sandbox.jk_test

Hope that helps.

Best regards
Miklos

mszurap · ‎04-19-2023

Hi @JinTong , I assume you are referring to the links in the Cloudera Manager UI. I don't think using IP addresses instead of hostnames would be possible. In your own best interest, and for future-proof operations, please set up DNS properly in the cluster. Please see our documentation: https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/installation/topics/cdpdc-configure-network-names.html Thank you Miklos

mszurap · ‎04-19-2023

Hi @JKarount , thank you for the detailed description and reproduction steps. To be able to help you with this, kindly open a formal support case through our Cloudera Support portal. Please also reproduce this with the latest ODBC driver version (2.6.17 as of now) with driver TRACE logs enabled. Please see the ODBC driver install guide: https://downloads.cloudera.com/connectors/ClouderaImpala_ODBC_2.6.17.1026/Cloudera-ODBC-Connector-for-Impala-Install-Guide.pdf (on Windows, go to the ODBC data source configuration and enable it under Logging Options). The support team will be able to triage this further with our driver team. Thank you Miklos Szurap Customer Operations Engineer, Cloudera

mszurap · ‎04-19-2023

Glad to hear that. 🙂 The "hive.stats.autogather" setting should be unrelated to this; it only controls whether Hive gathers statistics at the end of INSERT statements, and disabling it just speeds up the insert query somewhat. If you see something weird or not working with it enabled, feel free to open a support case on our Support Portal. See this and all the other Hive configuration descriptions in the HiveConf.java source code. Cheers, Miklos

mszurap · ‎04-17-2023

Hello @yashbansal042 , It seems that you are using the "hive" action (Hive1 action), which uses the Hive CLI. The Hive CLI is deprecated and not supported in CDP: https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/hive-introduction/topics/hive-unsupported.html Please follow https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/starting-hive/topics/hive_use_variables.html to convert your Hive CLI scripts to beeline-compatible scripts, and then use the "hive2" action in Oozie. First, you might want to test whether a simple Hive query (like the "select 1" query) works fine through a hive2 action. Best regards Miklos

mszurap · ‎04-13-2023

This sounds like a Spring Boot specific issue. A quick search led me to https://stackoverflow.com/questions/42073194/unable-to-detect-database-type ; please check whether the suggestions in that discussion help.

mszurap · ‎04-03-2023

Hi @Vinylal , The "Templeton" web application is the Hive service's WebHCat. Please note that WebHCat is not supported in CDP: https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/hive-introduction/topics/hive-unsupported.html Best regards Miklos

mszurap · ‎03-29-2023

I assume you meant that, when using Hive, the NVL function returns 0 or 1 if a null is found, or otherwise the expression specified in the argument. With Impala and NVL2 you would still need the outer query to sum up all the 1 values that we have mapped from the column values (from their real value to 0 or to 1). It would just be a bit nicer, but no real change.

mszurap · ‎03-29-2023

Great, glad to hear that it was helpful. Actually, I was thinking about using the NVL function, but in Hive that does not offer a value for the "else" part, as Impala's NVL2 function does: https://impala.apache.org/docs/build/asf-site-html/topics/impala_conditional_functions.html#conditional_functions__nvl2 With that, the query would be much simpler (no need for CASE WHEN ... THEN ... ELSE ... END), just "NVL2(CustFirstName, 0, 1)". For example:

SELECT NVL2('ABC', 'Is Not Null', 'Is Null'); -- Returns 'Is Not Null'

Again, this is for Impala; unfortunately, Hive does not have this function.
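To make the NVL2 mapping concrete without an Impala cluster, here is a small sketch that emulates it locally. Everything in it is an assumption made for illustration: SQLite has no NVL2, so a Python function is registered under that name with the same "not null, then second argument, else third" behavior, and the "customers" table and its rows are hypothetical. It does not say anything about how Impala implements the function.

```python
import sqlite3


def nvl2(value, if_not_null, if_null):
    """Emulate NVL2: return the second argument when value is not NULL, else the third."""
    return if_not_null if value is not None else if_null


def count_null_first_names(first_names):
    """Count NULL names by mapping each row to 0 or 1 and summing in the outer query."""
    conn = sqlite3.connect(":memory:")
    # SQLite has no built-in NVL2, so register the Python emulation under that name.
    conn.create_function("NVL2", 3, nvl2)
    conn.execute("CREATE TABLE customers (CustFirstName TEXT)")  # hypothetical table
    conn.executemany(
        "INSERT INTO customers VALUES (?)", [(name,) for name in first_names]
    )
    (total,) = conn.execute(
        "SELECT SUM(NVL2(CustFirstName, 0, 1)) FROM customers"
    ).fetchone()
    conn.close()
    return total


if __name__ == "__main__":
    print(count_null_first_names(["Ann", None, "Bob", None]))
```

As in the discussion above, the SUM in the outer query is still needed; NVL2 only replaces the CASE WHEN expression that maps each value to 0 or 1.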

Member Since	‎11-04-2015 11:53 PM
Last Visited	‎01-07-2025 12:29 PM
Posts	260
Kudos received	44

Cloudera Community

Re: Hive fails to start with "Caused by: java.lang...

Re: The heap memory usage of NameNode is much high...

Re: Hue and Sqoop white spaces in query

Re: straight SELECT and SELECT via CTE produce dif...

Re: Best practices for partition tables in Impala ...

Re: Hive metastore lost connection while executing...

Re: straight SELECT and SELECT via CTE produce dif...

Re: change the link address for service web ui for...

Re: straight SELECT and SELECT via CTE produce dif...

Re: Oozie Hive Job status changes to KILLED but th...

Re: Oozie Hive Job status changes to KILLED but th...

Re: Not able to connect ImpalaDB from Springboot a...

Re: http://Hostname:50111/templeton/v1/status retu...

Re: Null check query

Re: Null check query