Member since
02-28-2022
171
Posts
14
Kudos Received
17
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1222 | 07-07-2025 06:35 AM |
| | 1291 | 06-17-2025 09:42 AM |
| | 3234 | 04-02-2025 07:43 AM |
| | 1017 | 10-18-2024 12:29 PM |
| | 16124 | 09-05-2024 09:06 AM |
05-30-2023
05:35 AM
Hi @Juanes, great! So, let's assume I have a 500 GB table that was created with 240 tablets; would that value be within the recommended range? One other point: I'm using the following calculation as an example:

DATA_SIZE = value taken from the graph "total_kudu_on_disk_size_across_kudu_replicas"
NUM_REPLICAS = RF * Total Tablets (values taken from the ksck command)
TABLET_SIZE = DATA_SIZE / NUM_REPLICAS

DATA_SIZE = 147 GB (converted to bytes: 157840048128)
NUM_REPLICAS = 3 * 240 = 720

| Name | RF | State | Total Tablets | Healthy | Recovering | Underreplicated | Not available |
|---|---|---|---|---|---|---|---|
| impala::DATABASE01.TABLE01 | 3 | HEALTHY | 240 | 240 | 0 | 0 | 0 |

TABLET_SIZE = 157840048128 / 720 = 219222289 bytes (about 0.2 GB)

The end result was about 0.2 GB; does that mean each tablet replica holds about 0.2 GB?
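A minimal sketch of the calculation above, using the numbers from this thread (the variable names are illustrative, not Kudu APIs):

```python
# Per-replica size estimate from the thread's numbers.
GIB = 1024 ** 3

data_size = 147 * GIB        # "total_kudu_on_disk_size_across_kudu_replicas", in bytes
rf = 3                       # replication factor
total_tablets = 240          # tablet count reported by the ksck command

num_replicas = rf * total_tablets        # 3 * 240 = 720 tablet replicas
replica_size = data_size / num_replicas  # bytes on disk per tablet replica

print(num_replicas)                      # 720
print(round(replica_size / GIB, 3))      # 0.204 (GiB per replica, roughly 209 MiB)
```

Note that because the graph already counts every replica, dividing by RF * Total Tablets yields the size of one replica on disk, not the pre-replication size of one logical tablet.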
05-29-2023
02:01 PM
Hi @ChethanYM, I read that documentation, but my doubt is about the tablet versus the table: looking at the graph in Cloudera and seeing tables above 50 GB, would that be outside the recommendation?
05-26-2023
01:58 PM
Hello Cloudera community, we checked the graph "total_kudu_on_disk_size_across_kudu_replicas" and there are tables with 500 GB. Given that, we need to know: what is the recommended size for a Kudu table?
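As a quick sanity check, the per-tablet share of a 500 GB table can be estimated by dividing the table size by its tablet count. Kudu's published scaling guidance is generally stated per tablet and per tablet server rather than per table, so the per-tablet figure is usually the more meaningful one to compare. A small sketch (the 240-tablet count below is an assumed example, not a value taken from this cluster):

```python
# Rough per-tablet size: table size divided by tablet count.
GIB = 1024 ** 3

table_size = 500 * GIB   # size reported by the graph for one table
num_tablets = 240        # assumed partition count, for illustration only

per_tablet = table_size / num_tablets
print(round(per_tablet / GIB, 2))   # 2.08 (GiB per tablet)
```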
Labels:
- Apache Kudu
- Cloudera Manager
05-26-2023
12:35 PM
Hello Cloudera community, we have a problem: we are not able to view the execution logs of jobs that were run in YARN. When we click "Logs" in the YARN ResourceManager UI, it shows the error: "Error getting logs at hostname:8041". How can we solve this problem? We are using CDH Express 5.16.2.
Labels:
- Apache YARN
- Cloudera Manager
- MapReduce
10-14-2022
07:20 AM
Hello Cloudera community, we are having a problem with Impala after enabling Kerberos on the CDP cluster. Only the Impala StateStore role starts up healthy; the other roles are in bad status. Checking the log of the Impala Catalog Server role, the following appears:

-------------
11:07:50.584 AM INFO cc:170 SASL message (Kerberos (internal)): GSSAPI client step 1
11:07:50.587 AM INFO cc:78 Couldn't open transport for hostname:24000 (No more data to read.)
11:07:50.587 AM INFO cc:94 Unable to connect to hostname:24000
11:07:50.587 AM INFO cc:274 statestore registration unsuccessful: Couldn't open transport for hostname:24000 (No more data to read.)
11:07:50.587 AM FATAL cc:87 Couldn't open transport for hostname:24000 (No more data to read.). Impalad exiting. Wrote minidump to /var/log/impala-minidumps/catalogd/7ae4848b-cd34-4d4c-96cfeaa3-bd4a584f.dmp
-------------

How can we solve this problem? It is urgent!
Labels:
- Apache Impala
- Cloudera Manager
10-14-2022
07:06 AM
I managed to solve it. The canary timeouts were changed:

- ZooKeeper Canary Connection Timeout = 30s
- ZooKeeper Canary Session Timeout = 1m
- ZooKeeper Canary Operation Timeout = 30s

With that, the error no longer appears and the status is 100% healthy.
10-14-2022
06:23 AM
Hello Cloudera community, we are having a problem with ZooKeeper on CDP after enabling Kerberos on the cluster. The ZooKeeper instances are healthy (status green), but the overall ZooKeeper status shows the message: "Canary test of client connection to ZooKeeper and execution of basic operations succeeded, though a session could not be established with one or more servers". How can we solve this problem?
Labels:
- Apache Zookeeper
- Cloudera Manager
09-30-2022
06:32 AM
Hello Cloudera community, the problem was solved by pointing Spark and Spark2 at the hive-site.xml file; after that, the Spark jobs submitted through Livy from the Jupyter notebook ran successfully.
09-29-2022
11:55 AM
Hello Cloudera community,
we are having problems using Livy to run a Spark job that reads Hive from a Jupyter notebook.
When we run a simple query, for example:
spark.sql("show databases").show()
it returns the error below:
"org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient"
Could you help us with this setup?
PS: we are using CDH 5.16.x
Labels:
- Apache Hadoop
- Apache Hive
- Apache Spark
09-21-2022
01:00 PM
Hello Cloudera community, we are having a problem accessing the Hive CLI and spark-shell as a certain user.

When executing the query "show databases" in the Hive CLI, it returns the error:
"FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset"

When executing the "show databases" query in spark-shell, it returns the errors:
"org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException;"
"WARN metastore.RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect. org.apache.thrift.transport.TTransportException"

When we use beeline and run the "show databases" query, it works without problems.

Could you help us with this problem? We are using Cloudera Manager 5.16.1 and CDH 5.16.1; the cluster has Kerberos enabled and Sentry manages the permissions on the cluster databases.
Labels: