Member since
02-27-2023
37
Posts
3
Kudos Received
4
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 9715 | 05-09-2023 03:20 AM |
| | 5179 | 05-09-2023 03:16 AM |
| | 3833 | 03-30-2023 10:41 PM |
| | 26084 | 03-30-2023 07:25 PM |
06-20-2025
12:27 PM
@csm09 As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post. Thanks.
09-18-2024
09:19 PM
1 Kudo
This solution worked for eliminating the error, but data is not being fetched from the table; an empty data frame is shown.
09-06-2024
10:30 AM
@BrianChan Which Postgres version is being used here? Postgres versions below 9.5 do not support the ON CONFLICT clause; it is only supported from Postgres 9.5 onwards. Refer to https://stackoverflow.com/questions/61774741/psycopg2-errors-syntaxerror-syntax-error-at-or-near-on/61775874#61775874 Also, we only support Postgres 10.x or higher from CDP 7.1.8.x onwards. Please check the support matrix: https://supportmatrix.cloudera.com/
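As a quick sanity check, the server version can be compared against 9.5 before relying on ON CONFLICT. A minimal shell sketch, assuming the version string is already in hand (on a real host it would come from `psql -tAc "SHOW server_version;"`; the value below is just an example):

```shell
# Example value; on a real host: PG_VERSION=$(psql -tAc "SHOW server_version;")
PG_VERSION="9.4.26"
MIN_VERSION="9.5"

# sort -V orders version strings numerically; if MIN_VERSION sorts first,
# the server is at least 9.5 and ON CONFLICT is available.
if [ "$(printf '%s\n' "$MIN_VERSION" "$PG_VERSION" | sort -V | head -n1)" = "$MIN_VERSION" ]; then
  echo "ON CONFLICT supported"
else
  echo "ON CONFLICT NOT supported (upgrade to 9.5+)"
fi
```

For the example value 9.4.26 this prints the "NOT supported" branch, matching the error in the linked thread.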
07-13-2024
05:26 PM
1 Kudo
Hello @BrianChan, apologies for the delayed response. Have you verified that /etc/cloudera-scm-server/db.properties has the correct entries for your database?
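For illustration, these are the kinds of entries to check. The key names below are the standard Cloudera Manager db.properties connection keys, but the values and the sample-file path are made up for this sketch; on a real host you would inspect /etc/cloudera-scm-server/db.properties directly.

```shell
# Build an illustrative sample file (values are placeholders, not real credentials)
cat > /tmp/db.properties.sample <<'EOF'
com.cloudera.cmf.db.type=postgresql
com.cloudera.cmf.db.host=localhost:5432
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm
com.cloudera.cmf.db.password=secret
EOF

# List the connection settings to verify (password line deliberately excluded)
grep -E '^com\.cloudera\.cmf\.db\.(type|host|name|user)=' /tmp/db.properties.sample
```

Each of the four listed keys should match the database you actually provisioned; a wrong host or name here is a common cause of cloudera-scm-server failing to start.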
05-15-2024
12:52 PM
@Aqdas Welcome to the Cloudera Community! To help you get the best possible solution, I have tagged our Kerberos experts @venkatsambath @james_jones @pajoshi who may be able to assist you further.
03-06-2024
02:58 AM
1 Kudo
Hello @BrianChan, in such issues we should check the health of the consumer offsets topic (__consumer_offsets) using the Kafka describe command, and check the min.insync.replicas setting of this topic in the describe output. It should be less than or equal to the topic's ISR count. For example, if the topic has replication factor 3, then min.insync.replicas should be 2 (or 1) to allow failover. If you found this response helpful, please take a moment to log in and click KUDOS 🙂 and "Accept as Solution" below this post. Thank you.
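A minimal sketch of that check, assuming the standard kafka-topics describe output format. The partition line and min.insync.replicas value below are illustrative, not from a real cluster; on a live cluster the line would come from `kafka-topics.sh --describe --topic __consumer_offsets --bootstrap-server <broker>:9092`.

```shell
# One partition line as printed by kafka-topics.sh --describe (example values)
LINE='Topic: __consumer_offsets Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2'
MIN_ISR=2   # from the topic's min.insync.replicas config (example value)

# Count the brokers in the ISR list at the end of the line
ISR_COUNT=$(echo "$LINE" | sed 's/.*Isr: //' | tr ',' '\n' | wc -l)

if [ "$ISR_COUNT" -ge "$MIN_ISR" ]; then
  echo "partition healthy: ISR count $ISR_COUNT >= min.insync.replicas $MIN_ISR"
else
  echo "partition at risk: ISR count $ISR_COUNT < min.insync.replicas $MIN_ISR"
fi
```

Any partition whose ISR count drops below min.insync.replicas will reject producer writes with acks=all, which is why this comparison matters for __consumer_offsets health.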
03-04-2024
02:14 AM
1 Kudo
@BrianChan
1. Cluster average utilization calculation: the cluster average utilization during HDFS rebalancing is typically calculated from the configured capacity of the cluster, i.e. the total storage capacity allocated to HDFS as defined in the cluster's configuration settings.
2. Individual utilization calculation: individual utilization during rebalancing is usually calculated from the sum of DFS used and remaining space on each datanode. This gives an accurate picture of how much storage each datanode is currently using and how much space is available for additional data.
3. Difference in file moving size: the difference between the initially reported file moving size and the actual size in the balancer log can occur for several reasons, including changes in data distribution across datanodes during rebalancing, optimizations performed by the balancer algorithm, or adjustments based on real-time cluster conditions and performance considerations.
4. Exceeding datanode balancing bandwidth: although the datanode balancing bandwidth is configured to limit the amount of data transferred between datanodes per second, actual bandwidth consumption can exceed this limit under certain circumstances, such as network congestion, variations in data transfer rates, or optimizations performed by the balancer algorithm.
Regards, Chethan YM
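The per-datanode utilization arithmetic described above can be sketched with example figures (the capacity and usage numbers below are made up, not real cluster values; on a real cluster they would come from `hdfs dfsadmin -report`):

```shell
# Example figures for one datanode (placeholders, not real values)
CAPACITY_GB=1000   # configured capacity of the datanode
DFS_USED_GB=430    # DFS used on that datanode

# Utilization = DFS used / configured capacity, expressed as a percentage.
# The balancer compares each datanode's figure against the cluster average.
awk -v used="$DFS_USED_GB" -v cap="$CAPACITY_GB" \
  'BEGIN { printf "utilization: %.1f%%\n", 100 * used / cap }'
```

With these example figures the datanode is at 43.0% utilization; the balancer moves blocks until each datanode is within the configured threshold of the cluster-wide average computed the same way over total configured capacity.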
02-29-2024
09:01 AM
Maybe so, but a huge flaw of shared communities like this is the proliferation of answers to the same questions that are variations on a theme. Brian Chan had the exact issue I was seeing. Putting the answer to the problem with the stated problem makes more sense than simply parroting his question. Linking the two posts is indeed better, but then one is forced to hop back and forth between posts for context.
05-12-2023
02:30 AM
Has anyone tried that before?
04-06-2023
02:15 PM
@BrianChan You will need to manually perform the checkpoint on the faulty node. If the standby NameNode is faulty for a long time, the generated edit log will accumulate. In that case, restarting HDFS or the active NameNode will take a long time and could even fail, because on restart the active NameNode has to read a large amount of unmerged edit log. Is your NameNode setup active/standby? For the steps below you could also use the CM UI to perform the tasks.
Solution 1 (quickest): I have had occasions when a simple rolling restart of the ZooKeepers would resolve this, but I see your checkpoint lag is > 2 days.
Solution 2:
1. Check which NameNode is most up to date by comparing the dates of the files in the directory:
$ ls -lrt /dfs/nn/current/
2. On the active NameNode with the latest edit logs, as the hdfs user:
$ hdfs dfsadmin -safemode enter
$ hdfs dfsadmin -saveNamespace
3. Check whether the timestamp of the latest generated fsimage is the current time. If yes, the merge executed correctly and is complete. Then leave safe mode:
$ hdfs dfsadmin -safemode leave
Before restarting HDFS or the active NameNode, perform a checkpoint manually to merge the metadata of the active NameNode. Then restart the standby; the newly generated files should be shipped and synced automatically. This could take a while (< 5 minutes), and your NameNodes should all be green.