Member since: 11-04-2015
Posts: 260
Kudos Received: 44
Solutions: 33
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2491 | 05-16-2024 03:10 AM |
| | 1531 | 01-17-2024 01:07 AM |
| | 1551 | 12-11-2023 02:10 AM |
| | 2289 | 10-11-2023 08:42 AM |
| | 1572 | 09-07-2023 01:08 AM |
05-16-2024
03:10 AM
2 Kudos
Hello @ldylag, The stack trace shows that the "org.postgresql.core.v3.ConnectionFactoryImpl" class depends on "com/ongres/scram/common/stringprep/StringPreparation.class", which is not available on the classpath. By default HMS searches for the JDBC driver under /usr/share/java. Please check whether the same JDBC driver is available on both HMS hosts. The best option is probably to use the latest driver from: https://jdbc.postgresql.org/download/ Best regards, Miklos
02-29-2024
01:24 AM
1 Kudo
Hi @dqsdqs , Please also see the following article: https://community.cloudera.com/t5/Customer/Troubleshooting-Kerberos-Related-Issues-Common-Errors-and/ta-p/76192 Most of the time, the "Server xxx not found in Kerberos database" message indicates that you need to include the server hostname in the "[domain_realm]" (host-to-realm mapping) section, so that the Kerberos client can go to the proper KDC. Cheers, Miklos
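For illustration, a minimal "[domain_realm]" section in krb5.conf might look like the sketch below. The hostnames and realm name are hypothetical; substitute your own environment's values:

```
[domain_realm]
    # A leading dot maps an entire DNS subdomain to the realm
    .example.com = EXAMPLE.COM
    example.com = EXAMPLE.COM
    # An explicit host entry overrides the subdomain rule for that host
    dbserver01.example.com = EXAMPLE.COM
```

Entries beginning with a dot cover every host under that domain, while an exact hostname entry takes precedence for that one host.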
01-18-2024
05:52 AM
As far as I know this is more of an empirical best practice. As mentioned, it cannot be calculated exactly, since there are variable factors (filename/path lengths, ACL counts, etc.) that change from environment to environment.
01-17-2024
01:07 AM
Hi @Meepoljd , File and block metadata consume the NameNode heap. Can you share how you did your calculation? Per our docs: https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/hdfs-overview/topics/hdfs-sizing-namenode-heap-memory.html the file count should be kept below 300 million files. The same page suggests that approximately 150 bytes are needed for each namespace object; I assume you based your calculation on that. The real NN heap consumption varies with path lengths, ACL counts, replication factors, snapshots, operational load, etc. For that reason, our other page https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/hdfs-overview/topics/hdfs-examples-namenode-heap-memory.html suggests allocating a larger heap: 1 GB of heap per 1 million blocks, which would be ~320 GB in your case. Hope this helps, Best regards, Miklos
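As a back-of-the-envelope sketch of the two estimates above (the 320 million object count is an assumption I infer from the "~320 GB" figure, not a number stated in the original question):

```python
# NameNode heap sizing sketch, using the figures quoted in the answer above.
# The object count below is an assumption inferred from the ~320 GB recommendation.
objects = 320_000_000

# Naive estimate: ~150 bytes of heap per namespace object, per the sizing docs.
naive_heap_gb = objects * 150 / 1024**3

# Practical rule of thumb from the examples page: 1 GB of heap per 1 million blocks.
recommended_heap_gb = objects / 1_000_000

print(f"naive estimate: {naive_heap_gb:.1f} GB")      # ~44.7 GB
print(f"rule of thumb:  {recommended_heap_gb:.0f} GB") # 320 GB
```

The gap between the two numbers is exactly why the rule of thumb exists: the 150-bytes-per-object figure ignores path lengths, ACLs, snapshots and operational headroom.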
01-13-2024
03:16 AM
Hi @mpla217 , Can you share exactly what query you are executing? Per the error message, the query is syntactically incorrect around the "TIMESTAMPADD" part. If the TIMESTAMPADD function is not used in your query, then the Hive JDBC driver may be optimizing (converting) the query to use that function instead of plus signs (just guessing). Per the output, you are using the Cloudera Hive JDBC driver. Make sure you are using the latest version https://www.cloudera.com/downloads/connectors/hive/jdbc and you may also try the "UseNativeQuery=1" connection URL property to disable the optimizations done by the driver (which are sometimes incorrect). Thanks, Miklos
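For illustration, the property is appended to the JDBC connection URL as a semicolon-separated option (the hostname and port here are hypothetical placeholders):

```
jdbc:hive2://hiveserver.example.com:10000;UseNativeQuery=1
```

With UseNativeQuery=1 the driver passes your SQL through to HiveServer2 as-is instead of rewriting it into its own dialect first.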
12-11-2023
02:10 AM
1 Kudo
Hi @Sokka Thank you for raising this. I see the same behavior in the "Sqoop 1" editor, though with some older CDH versions; I haven't tested with the most recent ones, so maybe it's already fixed. In any case, I would not advise using the "Sqoop 1" editor: it will quickly become insufficient, as it does not provide any advanced configuration options. Instead, please create an Oozie workflow (Scheduler / Workflow) with a Sqoop action, which gives you better control. As I see it, this problem should have been fixed in later versions, see https://issues.cloudera.org/browse/HUE-6717 Best regards, Miklos
11-08-2023
01:27 AM
Hi @HadoopHero , For Hive, if there is a single reduce task writing the output data, it will not break the output up into smaller files; that is expected and cannot be configured to behave differently. With DISTRIBUTE BY you should be able to get multiple reducers (if you have a column by which you can "split" your data reasonably into smaller subsets), see https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy Best regards, Miklos
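As a sketch of what that might look like (table and column names here are hypothetical, assuming a column that partitions the data into reasonably even subsets):

```sql
-- DISTRIBUTE BY routes all rows with the same key to the same reducer,
-- so distinct key values spread the output write across multiple reducers,
-- producing one output file per reducer.
INSERT OVERWRITE DIRECTORY '/tmp/sales_out'
SELECT *
FROM sales
DISTRIBUTE BY region;
```

Note that a heavily skewed column (one value holding most rows) would still funnel most of the data through a single reducer.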
11-08-2023
01:17 AM
1 Kudo
To add to the point of @ggangadharan, there are lots of good articles/posts on why the float and even the double datatype have these problems. Note that this is not specific to Hive/Hadoop or Java. https://stackoverflow.com/questions/3730019/why-not-use-double-or-float-to-represent-currency https://dzone.com/articles/never-use-float-and-double-for-monetary-calculatio https://www.red-gate.com/hub/product-learning/sql-prompt/the-dangers-of-using-float-or-real-datatypes Miklos
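The problem is easy to reproduce in any language with IEEE 754 binary floats; here is a minimal Python sketch contrasting float with a decimal type:

```python
from decimal import Decimal

# Binary floats cannot represent most base-10 fractions exactly:
print(0.1 + 0.2)              # 0.30000000000000004
print(0.1 + 0.2 == 0.3)       # False

# A decimal type does exact base-10 arithmetic, which is what money needs:
print(Decimal("0.10") + Decimal("0.20"))                     # 0.30
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))  # True
```

The same reasoning applies in Hive: use DECIMAL(precision, scale) rather than FLOAT or DOUBLE for monetary columns.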
10-30-2023
02:40 AM
Hi @cl99 , Yes, it seems there is a 50 MB limit on the maximum RPC message size in the CDH 6.3.2 version: https://github.com/apache/impala/blob/branch-3.2.0/be/src/kudu/rpc/transfer.cc#L39 This error is likely the result of the unsafe flag you have turned on. Best regards, Miklos
10-11-2023
08:42 AM
Hi @JKarount , To close the loop: this has been resolved in the latest Cloudera Impala ODBC driver, 2.7.0; see the Resolved Issues section in the Release Notes: https://docs.cloudera.com/documentation/other/connectors/impala-odbc/2-7-0/Release-Notes-Impala-ODBC.pdf "[IMP-946][02795738] The connector does not generate the last COALESCE parameter from the ELSE expression in the CASE statement." You can download it from: https://www.cloudera.com/downloads/connectors/impala/odbc/2-7-0.html I hope this will help you implement your use cases as expected. Best regards, Miklos Szurap Customer Operations Engineer, Cloudera