Member since 11-04-2015 | 260 Posts | 44 Kudos Received | 33 Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 2515 | 05-16-2024 03:10 AM |
 | 1532 | 01-17-2024 01:07 AM |
 | 1551 | 12-11-2023 02:10 AM |
 | 2290 | 10-11-2023 08:42 AM |
 | 1572 | 09-07-2023 01:08 AM |
04-20-2023
12:18 AM
Hi @DataEngAa ,

The stack trace shows that Spark SQL was listing the table's partitions ("HiveMetaStoreClient.listPartitionsWithAuthInfo") when the connection was lost. The attached snippet does not include timing information, but most likely the request simply hit the configured timeout. It is also likely that the table you are trying to manipulate has a large number of partitions in the Hive metastore (the number of partition directories on HDFS is a separate question).

The timeout is defined on both the client (Spark) side and the server (Hive metastore) side. To increase it so the request can run longer:

1. Set "hive.metastore.client.socket.timeout=1800" in hive-site.xml service-wide for Hive AND in the Hive gateway safety valves.
2. If this is CDP, set it on the HiveOnTez service side too, so that HS2 picks up that value as well.
3. Start your Spark application with --conf spark.hadoop.hive.metastore.client.socket.timeout=1800

The above increases the timeout from the default 5 minutes, which is usually too low for big tables, to 30 minutes.

Best regards
Miklos
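If you launch the job from a script, step 3 can be sketched roughly as below. This is only a minimal illustration: the helper name `spark_submit_args` and the script name `my_job.py` are made up for the example, and only the property name and the value 1800 come from the steps above.

```python
import shlex

# Property name taken from the steps above; all other names are hypothetical.
TIMEOUT_PROPERTY = "hive.metastore.client.socket.timeout"


def spark_submit_args(app, timeout_seconds=1800):
    """Build a spark-submit argument list that raises the metastore client timeout."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    # The "spark.hadoop." prefix asks Spark to pass the property on to the
    # Hadoop/Hive configuration used by its metastore client.
    conf = f"spark.hadoop.{TIMEOUT_PROPERTY}={timeout_seconds}"
    return ["spark-submit", "--conf", conf, app]


if __name__ == "__main__":
    # Print the command line instead of running it, so it can be reviewed first.
    print(shlex.join(spark_submit_args("my_job.py")))
```

This only covers the client side. The server-side settings from steps 1 and 2 still have to be applied in the cluster configuration.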
04-19-2023
08:03 AM
Hi @JKarount ,

I see. We will take up the bug report internally, as it may affect other customers too; in any case, thanks for your diagnostics and troubleshooting. To achieve the same functionality in a way that will most likely pass through the ODBC driver as-is, please try Impala's NVL or NVL2 function (or even ZEROIFNULL if it's a numeric column): https://impala.apache.org/docs/build/html/topics/impala_conditional_functions.html#conditional_functions__nvl2

SELECT col1, col2 as col2_orig, NVL2(col2, col2, 0) as col2_1 FROM sandbox.jk_test

or

SELECT col1, col2 as col2_orig, NVL(col2, 0) as col2_1 FROM sandbox.jk_test

Hope that helps.

Best regards
Miklos
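To see why the two queries above should return the same col2_1 values, here is a small plain-Python model of the three functions. It is not Impala code, just a sketch of the NULL-handling semantics as I understand them from the Impala documentation, with Python's None standing in for SQL NULL and made-up sample values.

```python
def nvl(value, default):
    """Model of NVL(a, b): return b when a is NULL, otherwise a."""
    return default if value is None else value


def nvl2(value, if_not_null, if_null):
    """Model of NVL2(a, b, c): return b when a is not NULL, otherwise c."""
    return if_null if value is None else if_not_null


def zeroifnull(value):
    """Model of ZEROIFNULL(a): return 0 when a is NULL, otherwise a."""
    return 0 if value is None else value


if __name__ == "__main__":
    # Hypothetical col2 values; None plays the role of NULL.
    col2_values = [5, None, 0, 12]
    for col2 in col2_values:
        # All three forms agree for a numeric column.
        assert nvl2(col2, col2, 0) == nvl(col2, 0) == zeroifnull(col2)
    print([nvl(v, 0) for v in col2_values])
```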
04-19-2023
01:28 AM
Hi @JinTong , I assume you are referring to links in the Cloudera Manager UI. I don't think using IP addresses instead of hostnames is possible. In your own best interest, and for future-proof operations, please set up DNS properly in the cluster. Please see our documentation: https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/installation/topics/cdpdc-configure-network-names.html Thank you Miklos
04-19-2023
01:06 AM
Hi @JKarount , thank you for the detailed description and reproduction steps. To be able to help you on this, kindly open a formal support case through our Cloudera Support portal. Please also reproduce this with the latest ODBC driver version (2.6.17 as of now) with driver TRACE logs enabled. Please see the ODBC driver install guide https://downloads.cloudera.com/connectors/ClouderaImpala_ODBC_2.6.17.1026/Cloudera-ODBC-Connector-for-Impala-Install-Guide.pdf (in Windows go to the ODBC datasource configuration, and enable it under Logging Options) The support team will be able to triage this further with our driver team. Thank you Miklos Szurap Customer Operations Engineer, Cloudera
04-19-2023
12:22 AM
Glad to hear that. 🙂 The "hive.stats.autogather" setting should be unrelated to this. It only controls whether Hive gathers statistics at the end of INSERT statements; disabling it just speeds up the insert query somewhat. If you see something weird or not working with it enabled, feel free to open a support case on our Support Portal. You can find the description of this and all other Hive configuration properties in the HiveConf.java source code. Cheers, Miklos
04-17-2023
02:41 AM
Hello @yashbansal042 , It seems that you are using the "hive" action (Hive1 action), which uses the Hive CLI. The Hive CLI is deprecated and not supported in CDP: https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/hive-introduction/topics/hive-unsupported.html Please follow https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/starting-hive/topics/hive_use_variables.html to convert your Hive CLI scripts to beeline compatible scripts, and then use the "hive2" action in Oozie. First, you might want to test whether a simple Hive query (like the "select 1" query) works fine through a hive2 action. Best regards Miklos
04-13-2023
12:05 AM
This sounds like a Spring Boot specific issue. A quick search led me to https://stackoverflow.com/questions/42073194/unable-to-detect-database-type ; please check whether the suggestions in that discussion help.
04-03-2023
02:42 AM
Hi @Vinylal , The "Templeton" web application is the Hive service's WebHCat. Please note that WebHCat is not supported in CDP: https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/hive-introduction/topics/hive-unsupported.html Best regards Miklos
03-29-2023
09:33 AM
1 Kudo
I assume you meant that, when using Hive, the NVL function returns 0 or 1 if a null is found, or otherwise the expression specified in the argument. With Impala and NVL2 you would still need the outer query to "sum" up all the 1 values that we have mapped from the column values (from their real value to 0 or 1). It would just be a bit nicer, but no real change.
03-29-2023
08:49 AM
1 Kudo
Great, glad to hear that it was helpful. Actually, I was thinking about using the NVL function, but in Hive it does not offer a value for the "else" part, unlike Impala's NVL2 function: https://impala.apache.org/docs/build/asf-site-html/topics/impala_conditional_functions.html#conditional_functions__nvl2 With that, the query would be much simpler (no need for CASE WHEN ... THEN ... ELSE ... END), just "NVL2(CustFirstName, 0, 1)". For example:

SELECT NVL2('ABC', 'Is Not Null', 'Is Null'); -- Returns 'Is Not Null'

Again, this is for Impala; Hive does not have this function, unfortunately.
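As a rough illustration of how "NVL2(CustFirstName, 0, 1)" combined with an outer sum counts the NULLs, here is a plain-Python model. It is not Impala code: None stands in for SQL NULL, and the sample names and the helper `count_nulls` are made up for the example.

```python
def nvl2(value, if_not_null, if_null):
    """Model of NVL2(a, b, c): return b when a is not NULL, otherwise c."""
    return if_null if value is None else if_not_null


def count_nulls(values):
    """Map each value to 0 (not NULL) or 1 (NULL), then sum the flags."""
    return sum(nvl2(v, 0, 1) for v in values)


if __name__ == "__main__":
    # Hypothetical CustFirstName column; None plays the role of NULL.
    cust_first_names = ["Ann", None, "Bob", None, None]
    print(count_nulls(cust_first_names))  # prints 3
```

The mapping to 0 or 1 happens per row, and the sum is the separate aggregation step, which corresponds to the outer query mentioned earlier in the thread.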