Member since: 11-04-2015
Posts: 143
Kudos Received: 17
Solutions: 14
My Accepted Solutions
Views | Posted
---|---
84 | 04-13-2022 02:36 AM
190 | 03-30-2022 04:30 AM
321 | 02-24-2022 06:13 AM
368 | 02-22-2022 04:57 AM
141 | 02-07-2022 11:14 AM
05-19-2022
06:42 AM
Hi! Sorry, but this seems to be an R-specific usage problem that I cannot help with. What you can do is enable DEBUG/TRACE level logging on the ODBC driver side (please check the ODBC driver documentation for how to do it); maybe there you can find further clues.
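As a rough sketch - the exact file and key names depend on the driver version, so treat these as assumptions to verify against the driver's Install Guide - the Cloudera/Simba-style ODBC drivers usually take logging settings like this in their driver configuration file:
# Hypothetical snippet for the driver config (e.g. cloudera.impalaodbc.ini) - verify the file and key names in your driver's documentation
[Driver]
# 0 = logging off ... 6 = most verbose (trace-like)
LogLevel=6
# Directory where the driver writes its log files
LogPath=/tmp/odbc_driver_logs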
05-18-2022
05:34 AM
Hi @roshanbi , the query itself seems incomplete to me: I do not see where the alias "a" is defined in the a.SUB_SERVICE_CODE_V=b.SUB_SERVICE_CODE_V part. It is also not clear which part is the database name, which is a table, and whether any complex types are involved. Can you run a select on "cbs_cubes.TB_JDV_CBS_NEW" (assuming that is a "database.table")? Can you run a simple update on it? Are you using the latest Cloudera Impala JDBC driver version? Is the affected table a Kudu-backed table? Thanks, Miklos
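For comparison, a rough sketch of an Impala UPDATE with a join - Impala only supports UPDATE on Kudu tables, and the column and second table names below are placeholders, so please verify the exact syntax against the Impala UPDATE documentation:
-- every alias (such as "b") must be introduced in the FROM clause before it is referenced
UPDATE cbs_cubes.TB_JDV_CBS_NEW
SET some_col = b.some_col
FROM cbs_cubes.TB_JDV_CBS_NEW JOIN other_db.other_table b
  ON cbs_cubes.TB_JDV_CBS_NEW.SUB_SERVICE_CODE_V = b.SUB_SERVICE_CODE_V;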
05-03-2022
04:08 AM
Thanks for checking. Is the connection successful using other clients, like impala-shell, beeline and other JDBC clients?
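For example - the host, port, certificate and truststore paths below are placeholders, adjust them to your cluster:
# impala-shell over TLS (the protocol/port combination may differ depending on your impala-shell version)
impala-shell --protocol=hs2 --ssl --ca_cert=/path/to/ca.pem -i coordinator-host.example.com:21050 -q "select 1"
# beeline against the Impala HS2 port
beeline -u "jdbc:hive2://coordinator-host.example.com:21050/default;ssl=true;sslTrustStore=/path/to/truststore.jks;trustStorePassword=changeit" -e "select 1"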
05-02-2022
08:36 AM
Hi @gfragkos, thanks for checking. Let's step back then. Is the Impala service TLS/SSL enabled at all? Can you verify that with the openssl tools? For example:
echo | openssl s_client -connect cdp-tdh-de3-master0.cdp-tdh.u5te-1stu.cloudera.site:21050 -CAfile /var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_cacerts.pem
04-28-2022
08:24 AM
Hello Gozde @gfragkos , Have you checked whether the connectivity works with the given sslTrustStore file from a Java-based client (for example beeline)? As I see, your application tries to use unixODBC to connect to a CDP / Impala service. However, from the shared connection details I see that the truststore is a Java keystore file (JKS), and since "nanodbc.cpp" is not a Java-based application, it probably cannot recognize that as a valid truststore file. Please try to use a "pem" format truststore file instead. Please also review the Impala ODBC Driver documentation: https://downloads.cloudera.com/connectors/impala_odbc_2.6.14.1016/Cloudera-ODBC-Connector-for-Impala-Install-Guide.pdf Thanks Miklos
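If you need to derive a PEM file from the existing JKS truststore, something along these lines should work - the alias and file names are assumptions, so list the actual aliases first:
# List the certificate aliases stored in the JKS truststore
keytool -list -keystore /path/to/truststore.jks
# Export one CA certificate in PEM (RFC) format - repeat per alias if needed
keytool -exportcert -rfc -alias my-ca-alias -keystore /path/to/truststore.jks -file ca-cert.pem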
04-27-2022
12:54 AM
Hi @jarededrake , that's a good track. The issue now seems to be that the cluster has Kerberos enabled, and that needs extra configuration. In the workflow editor, in the upper right corner of the Spark action you will find a cogwheel icon for advanced settings. There, on the Credentials tab, enable the "hcat" and "hbase" credentials to let the Spark client obtain delegation tokens for the Hive (Hive Metastore) and HBase services - in case the Spark application wants to use those services (Spark does not know this in advance, so it obtains those delegation tokens). You can also disable this behavior if you are sure that the Spark application will not connect to Hive (using Spark SQL) or HBase; just add the following to the Spark action option list:
--conf spark.security.credentials.hadoopfs.enabled=false
--conf spark.security.credentials.hbase.enabled=false
--conf spark.security.credentials.hive.enabled=false
However, it is easier to just enable these credentials on the settings page. For similar Kerberos-related issues in other actions, please see the following guide: https://gethue.com/hadoop-tutorial-oozie-workflow-credentials-with-a-hive-action-with-kerberos/
04-26-2022
05:09 AM
Hi @jarededrake , sorry for the delay, I was away for a couple of days. You should use your thin jar (application only - without the dependencies) from the target directory ("SparkTutorial-1.0-SNAPSHOT.jar"). The NoClassDefFoundError for SparkConf suggests that you have tried a Java action. It is highly recommended to use a Spark action in the Oozie workflow editor when running a Spark application, to make sure that the environment is set up properly for the application.
04-14-2022
09:16 AM
So is it "/tmp/kbr5cc_dffe" or "krb5cc_cldr"? Or where do you see the "KRB5CCNAME=/tmp/kbr5cc_dffe"? The "krb5cc_cldr" is used by all services (not sure about all of them, but all the ones I quickly verified had it) - we can say it is hardcoded. In any case it is "private" to the process itself: it holds the Kerberos ticket cache which only that process uses (and renews if needed).
04-14-2022
09:12 AM
I see. Have you verified that the built jar contains this package structure and these class names? Can you also show where the jar is uploaded and how it is referenced in the Oozie workflow? Thanks, Miklos
04-14-2022
07:42 AM
Hi, I'm doing well, thank you, I hope you're good too. That property usually points to a relative path which exists in the process directory: KRB5CCNAME='krb5cc_cldr'. If that's not the case, I would look into whether the root user's (or maybe the "cloudera-scm" user's) .bashrc file has overridden that KRB5CCNAME environment variable by any chance.
04-14-2022
01:45 AM
Hi @yagoaparecidoti , in general, the "supervisor.conf" in the process directory (actually the whole process directory) is prepared by the Cloudera Manager server before starting a process (the CM server sends the whole package of information, including config files, to the CM agent, which extracts it into a new process directory). The supervisor.conf file contains all the environment and command related information which the Supervisor daemon needs to start the process. There might be some default values taken from the cluster or from the service type. Do you have a specific question about it?
04-13-2022
02:36 AM
1 Kudo
Hi @Seaport , the "RegexSerDe" is in the contrib package, which is not officially supported, and as such you can use it in some parts of the platform, but the different components may not give you full support for it. I would recommend preprocessing the datafiles into a commonly consumable format (such as CSV) before ingesting them into the cluster. Alternatively, you can ingest the data into a table which has only a single (string) column, and then do the processing/validation/formatting/transformation while inserting it into a proper final table with the columns you need. During the insert you can still use "regex" or "substring" type functions / UDFs to extract the fields you need from the fixed-width records (from the single-column table). I hope this helps, Best regards, Miklos
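A minimal sketch of the single-column staging approach - the table names, column names and field positions below are made up for illustration:
-- staging table: one raw fixed-width line per row
CREATE TABLE staging_fixed_width (line STRING);
-- final table with proper columns
CREATE TABLE final_table (id STRING, name STRING, amount INT);
-- extract the fixed-width fields during the insert (positions/lengths are placeholders)
INSERT INTO final_table
SELECT substr(line, 1, 10),
       trim(substr(line, 11, 20)),
       CAST(trim(substr(line, 31, 8)) AS INT)
FROM staging_fixed_width;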
04-13-2022
02:03 AM
Hi @jarededrake , The "ClassNotFoundException: Class Hortonwork.SparkTutorial.Main not found" suggests that the Java program's main class package name has a typo in your workflow definition: "Hortonwork" should probably be "Hortonworks". Can you check that?
03-30-2022
04:30 AM
Hello @Jared , The "ClassNotFoundException" means the JVM responsible for running the code did not find one of the required Java classes which the code relies on. It's great that you have added those jars to your IntelliJ development environment, however that does not mean they will be available at runtime. One way would be to package all the dependencies into your jar, creating a so-called "fat jar"; however, that is not recommended, because your application would then not benefit from future bugfixes deployed in the cluster as it is upgraded/patched, and it would also carry the risk of the application failing after upgrades due to class conflicts. The best way is to set up the runtime environment to have the needed classes. Hue's Java editor actually creates a one-time Oozie workflow with a single Java action in it, however it does not give you much flexibility in customizing the parts of this workflow and the runtime environment, including which other jars need to be shipped with the code. Since your code relies on SparkConf, I assume it is actually a Spark-based application. It would be a better option to create an Oozie workflow (you can also start from Hue > Apps > Scheduler > change the Documents dropdown to Actions) with a Spark action. That sets up the whole classpath needed for running Spark apps, so you do not need to reference any Spark-related jars, just the jar with your custom code. Hope this helps. Best regards Miklos
03-28-2022
02:17 AM
Hello @Sayed016 , In general, the java.io.IOException: Filesystem closed message happens when the same or a different thread in the same JVM has called the "FileSystem.close()" method (see the JavaDoc) and something later tries to access the HDFS filesystem (in this case "EventLoggingListener.stop()" tries to access HDFS to flush the Spark event logs). FileSystem.close() should not be called by any custom code: there is a single shared instance of the FileSystem object in any given JVM, and closing it can cause failures for still-running frameworks like Spark. This suggests that the Spark application calls FileSystem.close() somewhere in its code. Please review the code and remove those calls. Hope that helps. Best regards, Miklos
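A minimal sketch of the pattern to look for in the application code (this is an illustration, not code taken from the affected job):
// problematic pattern: FileSystem.get() returns the JVM-wide cached instance, so close() affects every user of it
import org.apache.hadoop.fs.{FileSystem, Path}
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
val exists = fs.exists(new Path("/some/path"))
fs.close()  // remove this call - Spark (for example the event logging listener) still needs the shared instance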
03-25-2022
01:27 AM
Hi Rama, yes, you can configure that in the "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini". OPSAPS-41615 is still open; in the future you can ask about its status through any of your account team contacts. If you don't know who those contacts are, please ask/clarify that through the already open support case. Best regards, Miklos
03-24-2022
01:29 AM
1 Kudo
Hello @ram76 , You can configure Hue to use the XFF header:
[desktop]
use_x_forwarded_host=true
See the hue.ini reference: https://github.com/cloudera/hue/blob/master/desktop/conf.dist/hue.ini If not already done, besides using an external load balancer (like F5 - to let the end users remember only a single Hue login URL), please consider adding the "Hue Load Balancer" role in CM > Hue service (which sets up an Apache httpd) to serve the static content. See the following for more: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/hue_use_add_lb.html#hue_use_add_lb Hope this helps. Best regards, Miklos
03-22-2022
02:22 AM
Hello @mhsyed , thanks for reporting this. I see a similar description on our Partner page too: https://www.cloudera.com/downloads/partner/intel.html It seems the link is broken because the "Intel-bigdata" GitHub organization no longer has the "mkl-wrappers-parcel-repo" repository: https://github.com/orgs/Intel-bigdata/repositories I have asked our respective teams to get in touch with Intel to fix this. Unfortunately I cannot offer any workaround in the meantime; we ask for your patience. Best regards Miklos Szurap Customer Operations Engineer, Cloudera
03-10-2022
12:42 AM
Hi @M129 , the error message is not very descriptive. Can you please check in the HiveMetaStore logs what the complete error message - and the reason for the failure - is? Thanks Miklos
02-24-2022
06:13 AM
One more item to add, to have a complete picture: Spark SQL does not directly support Hive ACID tables. For that, in CDP you can use the Hive Warehouse Connector (HWC); please see: https://docs.cloudera.com/cdp-private-cloud-base/7.1.3/integrating-hive-and-bi/topics/hive_hivewarehouseconnector_for_handling_apache_spark_data.html
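A minimal usage sketch - it assumes the HWC jar and its configuration (HiveServer2 JDBC URL, metastore URI, etc.) are already set up for the Spark session, the database/table names are placeholders, and the API names should be verified against the HWC documentation for your CDP version:
// hypothetical sketch of reading a Hive ACID table through the Hive Warehouse Connector
import com.hortonworks.hwc.HiveWarehouseSession
val hive = HiveWarehouseSession.session(spark).build()
val df = hive.executeQuery("SELECT * FROM acid_db.acid_table LIMIT 10")
df.show()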
02-23-2022
01:01 AM
1 Kudo
Hi @Rajeshhadoop , The Spark DataSource API has the following syntax:
val jdbcDF = spark.read.format("jdbc").option("url", "jdbc:...")...load()
Please see: https://spark.apache.org/docs/2.4.0/sql-data-sources-jdbc.html The problem with using this approach against Hive or Impala is that, since it may run on multiple executors, it could overwhelm and essentially DDoS the Hive / Impala service. As the documentation states, this is not a supported way of connecting from Spark to Hive/Impala. However, you should still be able to connect to Hive and Impala through a simple JDBC connection using "java.sql.DriverManager" or "java.sql.Connection". That, in contrast, runs only on a single thread on the Spark driver side and creates a single connection to a HiveServer2 / Impala daemon instance. The throughput between the Spark driver and Hive/Impala is of course limited with this approach; please use it for simple queries or for submitting DDL/DML statements. Please see https://www.cloudera.com/downloads/connectors/hive/jdbc.html and https://www.cloudera.com/downloads/connectors/impala/jdbc.html for the JDBC drivers and for examples. Independently of the above, you can still access Hive tables' data through Spark SQL with
val df = spark.sql("select ... from ...")
which is the recommended way of accessing and manipulating Hive table data from Spark, as it is parallelized across the Spark executors. See the docs: https://spark.apache.org/docs/2.4.0/sql-data-sources-hive-tables.html I hope this clarifies it. Best regards Miklos
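A rough sketch of the driver-side JDBC approach - the URL, host, port and credentials are placeholders, so take the exact driver class and URL format from the Cloudera JDBC driver documentation linked above:
// runs on the Spark driver only: a single connection, suitable for small result sets and DDL/DML
import java.sql.DriverManager
val url = "jdbc:hive2://hs2-host.example.com:10000/default;ssl=true"  // placeholder URL
val conn = DriverManager.getConnection(url, "username", "password")
try {
  val stmt = conn.createStatement()
  val rs = stmt.executeQuery("SHOW TABLES")
  while (rs.next()) println(rs.getString(1))
} finally {
  conn.close()
}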
02-22-2022
04:57 AM
1 Kudo
Hi @Jmeks , please check again how the table was created and how those partitions were created ("describe formatted <table> partition <partspec>"), as the same still works for me even on HDP 3.1.0:
create table mdatetest (col1 string) partitioned by (`date` date) location '/tmp/md';
alter table mdatetest add partition (`date`="2022-02-22");
show partitions mdatetest;
+------------------+
| partition |
+------------------+
| date=2022-02-22 |
+------------------+
alter table mdatetest drop partition (`date`="2022-02-22");
02-21-2022
07:50 AM
Hi @Jmeks , Can you clarify which CDH/HDP/CDP version you have? What is the datatype of that "date" partitioning column? The mentioned syntax works for both string and date datatypes in CDH 6.x.
02-21-2022
02:38 AM
Hi @Jmeks , I assume the partitioning column for this table is literally named "date". Please note that "date" is a reserved word in Hive (see the docs for reference), so it is not recommended to use it in table names and column identifiers. If you really need to use it as an identifier, then enclose it in backticks. For example:
ALTER TABLE customer_transaction_extract DROP IF EXISTS PARTITION (`date`="2019-11-13");
Hope this helps. Best regards, Miklos
02-08-2022
01:00 AM
1 Kudo
Hi, "Do you have any general information about the types of tasks that the driver performs after a job completes?" - I do not have a comprehensive list of such tasks; this is just what we usually observe through cases and slowness reports. Of course, there may be completely different tasks that the driver performs - any custom Spark code which does not involve parallel execution / data processing may run only on the Spark driver side, for example connecting to an external system through JDBC, or doing some computation (not with RDDs or DataFrames).
02-07-2022
11:14 AM
1 Kudo
Hi @CRowlett , As I understand it, you observe that within a single Spark streaming application the individual Spark jobs run quickly, however there are some unaccounted-for delays between them. Whenever you see such symptoms, you need to check what the Spark driver is doing. The driver may not log every operation it performs, however increased (DEBUG level) logging may help you understand it. In general, after the Spark executors finish their jobs, you can expect the driver to additionally do the following:
- committing files after new inserts / save operations
- refreshing the HDFS file list / file details of the table into which data has been inserted/saved
- altering the table - Hive Metastore table partitions if the table is partitioned - or updating statistics
The first two of the above involve HDFS NameNode communication, the third involves HMS communication. To troubleshoot further, either enable DEBUG level logging or collect "jstacks" from the Spark driver. The jstack is less intrusive, as you do not need to modify any job configuration, and from it you will be able to capture what the driver was doing while it was in the "hung" state. Based on your description, it seems to me that "too many files" / "small files" are causing the delays, as the Spark driver has to refresh the file listing after inserts to make sure it still has the latest "catalog" information about the table - to be able to reliably continue working with it. For Spark Streaming / "hot data" you may want to consider saving your data into HBase or Kudu instead, as they are more suitable for such use cases. Best regards Miklos
02-03-2022
01:25 AM
Hello @grlzz , As mentioned, to test the database connectivity you can use 3rd party tools, like SQuirreL: http://squirrel-sql.sourceforge.net/ The following page describes how to connect to Postgres with the SQuirreL SQL Client: https://www.cdata.com/kb/tech/postgresql-jdbc-squirrel-sql.rst Please check the Postgres DB version and the corresponding documentation; this can be a good starting point for assembling the JDBC connection string: https://jdbc.postgresql.org/documentation/80/connect.html If the connection string works in SQuirreL, then use the same connection string in Sqoop too. Hope this helps. Best regards, Miklos
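For example - the host, port, database name and user below are placeholders:
# a typical PostgreSQL JDBC URL: jdbc:postgresql://<host>:<port>/<database>
# quick Sqoop-side test of the same connection string (-P prompts for the password)
sqoop list-tables --connect "jdbc:postgresql://dbhost.example.com:5432/mydb" --username myuser -P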
01-27-2022
08:51 AM
Hi @nagacg , Usually this happens (note the "intermittent" nature) when a BI tool like SAS connects to the Impala service (Impala coordinator daemons) through a load balancer. With the load balancer, different requests are routed to different Impala coordinator daemons, and likely one of the coordinator daemons is in bad health. In that case not all operations fail - just some of them, as you've described. It is sometimes not obvious (from the Cloudera Manager UI) that an impalad is unhealthy; you need to verify all of them by connecting to them directly, one by one, with another tool such as impala-shell (only the coordinators need to be checked this way). I would suggest involving Cloudera Support to assist you with this.
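For example, something like the following run against each coordinator in turn can reveal the unhealthy one - the host and port are placeholders, and depending on your impala-shell version and cluster security you may also need --ssl, -k or a different port/protocol:
# run a trivial query directly against one coordinator at a time
impala-shell -i coordinator1.example.com:21000 -q "select 1"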
01-27-2022
08:37 AM
Thanks for collecting and attaching the jstack. Yes, it confirms that Sqoop is trying to connect to the database, however the database does not respond (the PostgreSQL driver is reading from the socket):
Thread 7658: (state = IN_NATIVE)
- java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @BCI=0 (Interpreted frame)
...
- org.postgresql.core.VisibleBufferedInputStream.readMore(int) @BCI=86, line=143 (Interpreted frame)
...
- org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(org.postgresql.core.PGStream, java.lang.String, java.lang.String, java.util.Properties, org.postgresql.core.Logger) @BCI=10, line=376 (Interpreted frame)
- org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(org.postgresql.util.HostSpec[], java.lang.String, java.lang.String, java.util.Properties, org.postgresql.core.Logger) @BCI=675, line=173 (Interpreted frame)
...
- org.postgresql.Driver.makeConnection(java.lang.String, java.util.Properties) @BCI=18, line=393 (Interpreted frame)
- org.postgresql.Driver.connect(java.lang.String, java.util.Properties) @BCI=165, line=267 (Interpreted frame)
- java.sql.DriverManager.getConnection(java.lang.String, java.util.Properties, java.lang.Class) @BCI=171, line=664 (Interpreted frame)
- java.sql.DriverManager.getConnection(java.lang.String, java.lang.String, java.lang.String) @BCI=37, line=247 (Interpreted frame)
- org.apache.sqoop.manager.SqlManager.makeConnection() @BCI=182, line=888 (Interpreted frame)
- org.apache.sqoop.manager.GenericJdbcManager.getConnection() @BCI=10, line=59 (Interpreted frame)
...
- org.apache.sqoop.Sqoop.runSqoop(org.apache.sqoop.Sqoop, java.lang.String[]) @BCI=12, line=187 (Interpreted frame)
Please verify with some 3rd party tool that your connection string is correct - or ask your DBA what the correct JDBC connection string is.
01-27-2022
05:04 AM
Hi @grlzz , Have you verified the DB connection URL "jdbc:postgresql://[host]:[port]/[db]" - is it working outside of Sqoop (with any external JDBC based tool)? Have you used the same PostgreSQL driver which is supposed to be present under /var/lib/sqoop? Also, if the sqoop command is really "stuck", please check in another terminal window where it is stuck, using jstack:
1. Get the process id of the sqoop command:
ps -ef | grep sqoop
2. Collect the jstack output - as the same user the sqoop import is running as:
/usr/java/latest/bin/jstack $PID
This can help us understand what it is trying to do (for example, it is trying to connect to the database - but maybe the database is SSL enabled?)