About mszurap

mszurap · ‎03-28-2023

Hi @Supernova, there are probably multiple solutions. As one example, you can use a subquery:

select concat(cast(sum(CustFirstNameNullCheck) as bigint), ' (nulls)') as CustFirstName,
       concat(cast(sum(CustLastNameNullCheck) as bigint), ' (nulls)') as CustLastName
from (select case when CustFirstName is null then 1 else 0 end CustFirstNameNullCheck,
             case when CustLastName is null then 1 else 0 end CustLastNameNullCheck
      from SchemaName.DbName
      where Date = '2023-03-15') a;

Hope this helps. Best regards, Miklos
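As one of the other possible solutions, the CASE expressions can probably be aggregated directly, without the subquery. The sketch below only illustrates that pattern: it uses Python's built-in sqlite3 module as a stand-in for Hive so that it runs anywhere, and the table name `customers`, the column `dt`, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for Hive so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("create table customers (CustFirstName text, CustLastName text, dt text)")
conn.executemany(
    "insert into customers values (?, ?, ?)",
    [
        ("Ann", None, "2023-03-15"),
        (None, None, "2023-03-15"),
        ("Bob", "Lee", "2023-03-15"),
        (None, "Ray", "2023-03-14"),  # excluded by the date predicate
    ],
)

# Aggregate the CASE expressions directly instead of wrapping them in a subquery.
row = conn.execute(
    """
    select sum(case when CustFirstName is null then 1 else 0 end),
           sum(case when CustLastName is null then 1 else 0 end)
    from customers
    where dt = ?
    """,
    ("2023-03-15",),
).fetchone()

first_name_nulls, last_name_nulls = row
print(f"{first_name_nulls} (nulls)", f"{last_name_nulls} (nulls)")
```

The " (nulls)" label is added in Python here only because the stand-in database is SQLite; in Hive the concat() call from the post does the same job.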

mszurap · ‎03-23-2023

Hi @ThomasCloudeara, The ImpalaJDBC library/dependency is not published to any public Maven repository for licensing reasons. After accepting the license terms, you need to download the .zip from our download site, extract it, and "install" the ImpalaJDBC42.jar into either your company-wide Maven repository or the local Maven repository on your machine. Follow the Maven guide: https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html
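The linked Maven guide uses the `install:install-file` goal for this. A minimal sketch that assembles such a command line is shown below; the groupId, artifactId, and version are hypothetical placeholders (the driver does not dictate them), so choose the coordinates that your own pom.xml will reference.

```python
import shlex


def install_file_command(jar_path, group_id, artifact_id, version):
    """Build the `mvn install:install-file` command line for a third-party JAR."""
    return [
        "mvn", "install:install-file",
        f"-Dfile={jar_path}",
        f"-DgroupId={group_id}",
        f"-DartifactId={artifact_id}",
        f"-Dversion={version}",
        "-Dpackaging=jar",
    ]


# Hypothetical coordinates, used only for illustration.
cmd = install_file_command(
    "ImpalaJDBC42.jar", "com.example.thirdparty", "ImpalaJDBC42", "0.0.0-local"
)
print(shlex.join(cmd))  # copy the printed line into a shell to run it
```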

mszurap · ‎03-21-2023

Hi @ThomasCloudeara, First, JDBC driver v2.6.4 is very old. Please use the latest one, which you can download from our website: https://www.cloudera.com/downloads/connectors/impala/jdbc/ Preferably use the "JDBC4.2" specification driver, and with it use the "com.cloudera.impala.jdbc.Driver" class instead. Second, please follow the guide https://docs.cloudera.com/documentation/other/connectors/impala-jdbc/2-6-30/Cloudera-JDBC-Connector-for-Apache-Impala-Install-Guide.pdf and verify the JDBC connection string:
- Is LDAP authentication really enabled (AuthMech=3 indicates that), or do you have only Kerberos authentication in your cluster (which needs AuthMech=1)?
- If LDAP authentication is really enabled, can you connect to Impala with impala-shell using the same username/password pair?
Thanks, Miklos
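To make the difference between the two AuthMech values concrete, here is a small sketch that assembles both kinds of connection string. The host name, user, password, and realm are hypothetical; the property names (UID, PWD, KrbRealm, KrbHostFQDN, KrbServiceName) and port 21050 are taken from my reading of the install guide, so please verify them against the guide for your driver version.

```python
def impala_jdbc_url(host, port=21050, **props):
    """Join host, port, and driver properties into a JDBC connection string."""
    suffix = "".join(f";{key}={value}" for key, value in props.items())
    return f"jdbc:impala://{host}:{port}{suffix}"


# LDAP (username/password) authentication: AuthMech=3. Credentials are placeholders.
ldap_url = impala_jdbc_url("impalad.example.com", AuthMech=3, UID="alice", PWD="secret")

# Kerberos authentication: AuthMech=1. Realm and host are placeholders.
kerberos_url = impala_jdbc_url(
    "impalad.example.com",
    AuthMech=1,
    KrbRealm="EXAMPLE.COM",
    KrbHostFQDN="impalad.example.com",
    KrbServiceName="impala",
)

print(ldap_url)
print(kerberos_url)
```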

mszurap · ‎03-06-2023

Hi @KaimingGu, the translation might have been incorrect, so I assume you're actually seeing "Client Fetch Wait Time" take most of the time, while the "Client Fetch Wait Time Percentage" is also close to 100%. This means that although the query was executed quickly, the client was too slow to fetch the results. This is usually a sign that the network is slow or that the client itself is slow, for example because it does some processing between the "result.next()" calls or is hitting GC pauses (if it's a Java application). See also the following blog post, which explains the different query profile metrics: https://www.ericlin.me/2020/02/impala-query-profile-explained-part-5-query-metrics/ Best regards Miklos Szurap Customer Operations Engineer, Cloudera
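One rough way to check the client side is to time the fetch loop and separate the time spent fetching from the time spent processing each row. The sketch below assumes a generic iterable as a stand-in for the result set (it is not a real Impala cursor), and `slow_process` is a hypothetical placeholder for whatever the application does between fetches.

```python
import time


def time_fetch_loop(rows, process, clock=time.perf_counter):
    """Split the elapsed time of a fetch loop into fetching and client-side processing."""
    fetch_seconds = 0.0
    process_seconds = 0.0
    iterator = iter(rows)
    while True:
        start = clock()
        try:
            row = next(iterator)  # plays the role of result.next() in JDBC
        except StopIteration:
            break
        fetched = clock()
        process(row)  # client-side work done between two fetches
        done = clock()
        fetch_seconds += fetched - start
        process_seconds += done - fetched
    return fetch_seconds, process_seconds


def slow_process(row):
    time.sleep(0.01)  # simulated per-row work


fetch_s, process_s = time_fetch_loop(range(20), slow_process)
print(f"fetch: {fetch_s:.3f}s, processing: {process_s:.3f}s")
```

If the processing share dominates, the server is mostly waiting for the client to ask for the next batch, which is the situation described above.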

mszurap · ‎01-19-2023

Hi @StuartM , I know it's not a direct answer, but this requirement sounds more like a good call for Kafka - which inherently supports the idea of "consumer offsets".

mszurap · ‎10-13-2022

Hi @MichaelPlet, yes, sure, 7.1.7 SP1 is definitely a stable, long-term support release with lots of bugfixes over 7.1.7. Also check the additional cumulative hotfixes which were released on top of 7.1.7 SP1: https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/runtime-release-notes/topics/chf-pvcb-sp1-overview.html#chf-pvcb-sp1-overview If you have security vulnerability questions, kindly raise them through a support case. Thank you Miklos Szurap Customer Operations Engineer, Cloudera

mszurap · ‎10-13-2022

These are pretty long GC pauses; I assume they are from the HMS logs. With long GC pauses, of course, every operation will suffer and be slow, and eventually the SMON's request will time out. Kindly review the HMS heap size and consider increasing it until you get stable performance (without such GC pauses).

mszurap · ‎10-13-2022

The Canary just tests whether the basic operations are working in Hive Metastore. If it shows "unhealthy", that does not necessarily mean that jobs are failing because the Hive Metastore is not functioning (it may just be slow, for example). It is, however, a warning sign that something is wrong. Please connect to HiveServer2 with beeline and verify what is working and what is failing, then check the HiveServer2 and Hive Metastore logs. If this is an urgent issue, you can file a support case (where you can share many more details).

mszurap · ‎10-13-2022

Hi @hanumanth, I assume this is a CDH 6 cluster. Do you have Sentry enabled as well? Does this always happen, or only at certain times? Have you tested in beeline how long it takes to drop an example database? Does it also fail with a timeout? I guess it takes more than 60 seconds (that's the service monitor's default timeout), and since the default timeout from HS2 to HMS is 5 minutes, it actually succeeds. Thanks, Miklos

mszurap · ‎10-05-2022

Hi @Jarinek, Yes, in CDH/CDP every service which depends on HDFS will inherit the HDFS "auth-to-local rules" configuration; in CM, under the HDFS configuration, see "Additional Rules to Map Kerberos Principals to Short Names". Kafka does not need HDFS, which is why it has its own separate configuration for this. See the documentation for how to set it: https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/security-kerberos-authentication/topics/cm-security-kerberos-authentication-auth-to-local-isolate.html Best regards Miklos
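For readers unfamiliar with what such a rule does, here is a deliberately simplified sketch of how a single rule of the form `RULE:[1:$1@$0](.*@EXAMPLE\.COM)s/@.*//` maps a principal to a short name. It is an illustration only, not Hadoop's or Kafka's actual implementation; the function name `apply_rule`, the realm, and the principals are hypothetical, and options such as the `g` and `/L` flags are left out.

```python
import re


def apply_rule(principal, num_components, fmt, match_regex, sed_pattern, sed_replacement):
    """Apply one simplified auth-to-local rule; return the short name or None."""
    name, _, realm = principal.partition("@")
    components = name.split("/")
    # The rule only applies to principals with the stated number of components.
    if len(components) != num_components:
        return None
    # Build the intermediate string: $0 is the realm, $1..$n are the components.
    formatted = fmt.replace("$0", realm)
    for index, component in enumerate(components, start=1):
        formatted = formatted.replace(f"${index}", component)
    # The rule only fires when the whole intermediate string matches the filter.
    if not re.fullmatch(match_regex, formatted):
        return None
    # Sed-style substitution, first occurrence only.
    return re.sub(sed_pattern, sed_replacement, formatted, count=1)


# Arguments corresponding to RULE:[1:$1@$0](.*@EXAMPLE\.COM)s/@.*//
rule = (1, "$1@$0", r".*@EXAMPLE\.COM", r"@.*", "")

print(apply_rule("alice@EXAMPLE.COM", *rule))
print(apply_rule("alice@OTHER.COM", *rule))
print(apply_rule("hdfs/host1.example.com@EXAMPLE.COM", *rule))
```

In this sketch only the first principal is mapped (to `alice`); the other two return None because the realm or the component count does not match, and a real rule set would then fall through to the next rule.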

Last Visited	‎12-10-2024 10:10 AM

Member Since	‎11-04-2015 11:53 PM
Posts	260
Kudos received	44

Cloudera Community

Re: Hive fails to start with "Caused by: java.lang...

Re: The heap memory usage of NameNode is much high...

Re: Hue and Sqoop white spaces in query

Re: straight SELECT and SELECT via CTE produce dif...

Re: Best practices for partition tables in Impala ...

Re: Null check query

Re: Not able to connect ImpalaDB from Springboot a...

Re: Not able to connect ImpalaDB from Springboot a...

Re: When Impala executes some queries, [Client Fetch Wait Time] makes up a very high share of the total time, making them too slow (Whe...

Re: Retrieving Impala data using SQL, only data re...

Re: Cloudera Runtime 7.1.7 SP1

Re: Hive Metastore getting alert due to Hive Metas...

Re: Hive Metastore getting alert due to Hive Metas...

Re: Hive Metastore getting alert due to Hive Metas...

Re: Spark-submit - mapping of principal