Member since
04-14-2020
2174
Posts
4
Kudos Received
4
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 559 | 09-22-2020 12:19 AM
 | 1215 | 07-07-2020 04:56 AM
 | 745 | 05-15-2020 12:20 AM
 | 8871 | 05-14-2020 04:29 AM
03-23-2022
01:21 AM
1 Kudo
Oh, this is an issue from a long time ago. The root cause was that the charset on the new machines was not UTF-8. Just make sure the charset on all machines is UTF-8, and it works.
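To check whether a host is the odd one out, a quick sketch (the `localectl` line is RHEL/CentOS-specific and shown as an example only):

```shell
# Print the active charset settings; every host should report UTF-8
# in its LANG (and LC_ALL, if set).
locale | grep -E '^(LANG|LC_ALL)='

# On systemd-based distros such as RHEL 7 you could set it persistently with:
# localectl set-locale LANG=en_US.UTF-8
```

Run this on each machine and compare the output across the cluster.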
04-25-2021
11:07 PM
Hi @SimoneMasc , Thank you for reaching out to the Cloudera Community! Please review the documentation below on the basic requirements (OS/DB/Java/Network/Platform, etc.): https://docs.cloudera.com/cdp/latest/release-guide/topics/cdpdc-requirements-supported-versions.html Also see https://docs.cloudera.com/cdp-private-cloud/latest/data-migration/topics/cdp-data-migration-machine-learning-to-cdp.html which describes data migration.
12-04-2020
03:21 AM
Hello @Madhur Thanks a lot for the reply. I can confirm that the operating system is rhel7. The base url used was a configuration setting passed down but we have used it for other clusters without issues. I will nonetheless check with the client to make sure it is correct. Concerning the link to the bug report, the upgrade was done for Ambari 2.6.2.2 while the mentioned bug was fixed in version 2.6.0.0. Also, the scenarios presented in the bug are a bit different in our case. Thanks a lot for the help
11-03-2020
07:30 AM
Thanks. What I am experiencing is that the complete file, if it is 300GB, has to be assembled before upload to S3, which requires 300GB of either memory or disk. DistCp does not create a part file per block, and I have not witnessed any file splitting being done. Multipart uploads require you to get an upload ID, upload many part files with a numeric extension, and at the end ask S3 to put them back together; I do not see any of this being done. I admit I do not know much about all this, and it could be happening out of my sight.
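For context, a hedged sketch of how this is normally tuned: when DistCp writes through the S3A connector, S3A itself performs the multipart upload transparently (buffering parts to disk by default), so the whole file is never assembled in one place. The bucket name and paths below are placeholders:

```shell
# Sketch only: S3A splits each file into multipart uploads above the
# multipart threshold; parts are buffered (to disk by default) and
# uploaded as they fill, then S3 reassembles them server-side.
hadoop distcp \
  -Dfs.s3a.multipart.size=128M \
  -Dfs.s3a.multipart.threshold=128M \
  -Dfs.s3a.fast.upload.buffer=disk \
  hdfs:///data/bigfile \
  s3a://my-bucket/data/
```

If you see 300GB accumulating on local disk, it may be the per-part disk buffers rather than the whole file being staged; the S3A documentation covers the buffering modes in detail.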
09-30-2020
03:51 AM
@rblough Thank you for the continued support.

1) The detailed output showed that there are 603,723 blocks in total. Looking at the HDFS UI, the DataNodes report having 586,426 blocks each.
2) The command is being run as the hdfs user.
3) hdfs fsck / -openforwrite says that there are 506,549 blocks in total.

The discrepancy in block count still seems to be there. Below are the summaries of the different fsck outputs.

hdfs fsck / -files -blocks -locations -includeSnapshots

Status: HEALTHY
 Number of data-nodes: 3
 Number of racks: 1
 Total dirs: 64389
 Total symlinks: 0

Replicated Blocks:
 Total size: 330079817503 B (Total open files size: 235302 B)
 Total files: 625308 (Files currently being written: 129)
 Total blocks (validated): 603723 (avg. block size 546740 B) (Total open file blocks (not validated): 122)
 Minimally replicated blocks: 603723 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 3.0
 Missing blocks: 0
 Corrupt blocks: 0
 Missing replicas: 0 (0.0 %)
 Blocks queued for replication: 0

Erasure Coded Block Groups:
 Total size: 0 B
 Total files: 0
 Total block groups (validated): 0
 Minimally erasure-coded block groups: 0
 Over-erasure-coded block groups: 0
 Under-erasure-coded block groups: 0
 Unsatisfactory placement block groups: 0
 Average block group size: 0.0
 Missing block groups: 0
 Corrupt block groups: 0
 Missing internal blocks: 0
 Blocks queued for replication: 0

FSCK ended at Wed Sep 30 12:23:06 CEST 2020 in 23305 milliseconds

hdfs fsck / -openforwrite

Status: HEALTHY
 Number of data-nodes: 3
 Number of racks: 1
 Total dirs: 63922
 Total symlinks: 0

Replicated Blocks:
 Total size: 329765860325 B
 Total files: 528144
 Total blocks (validated): 506549 (avg. block size 651004 B)
 Minimally replicated blocks: 506427 (99.975914 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 2.9992774
 Missing blocks: 0
 Corrupt blocks: 0
 Missing replicas: 0 (0.0 %)
 Blocks queued for replication: 0

Erasure Coded Block Groups:
 Total size: 0 B
 Total files: 0
 Total block groups (validated): 0
 Minimally erasure-coded block groups: 0
 Over-erasure-coded block groups: 0
 Under-erasure-coded block groups: 0
 Unsatisfactory placement block groups: 0
 Average block group size: 0.0
 Missing block groups: 0
 Corrupt block groups: 0
 Missing internal blocks: 0
 Blocks queued for replication: 0

FSCK ended at Wed Sep 30 12:28:06 CEST 2020 in 11227 milliseconds
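A quick way to see where the gap comes from is to compare the block totals that each fsck variant reports side by side (this is a sketch and needs a running cluster; snapshot-only and open-for-write blocks are counted differently by each mode):

```shell
# Compare the "Total blocks (validated)" line across fsck modes.
# The default run skips snapshot copies; -includeSnapshots adds them;
# -openforwrite changes how in-flight files are counted.
hdfs fsck /                   | grep 'Total blocks'
hdfs fsck / -includeSnapshots | grep 'Total blocks'
hdfs fsck / -openforwrite     | grep 'Total blocks'
```

If the -includeSnapshots total is the outlier, the "extra" blocks are likely snapshot references rather than physical blocks on the DataNodes.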
09-22-2020
12:19 AM
Hello @Mondi , When you install the CDP Trial Version, it includes an embedded PostgreSQL database and is not suitable for a production environment. Please check this information for more details. Also, see how to end the trial or upgrade the trial version, and Managing Licenses.
08-24-2020
01:01 AM
Hello Madhur, Thanks for your response. However, we can see the same issue in Ambari 2.7.4, which we installed just last week to overcome this issue. Can you please help in this regard? Thanks, KK
08-19-2020
06:42 AM
Hi @rohit19, As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also give you the opportunity to share details specific to your environment that could help others provide a more accurate answer to your question.
07-08-2020
11:59 PM
@SeanU This level of detailed log scanning and alert functionality is not available. The existing service role logs, for which rules can be set, will not contain every application exception, since that detailed information lives in the application logs. You can check the available JobHistory Server and ResourceManager logs to see whether the information logged during application runtime serves your purpose.
07-07-2020
04:56 AM
2 Kudos
Hi @shrikant_bm , Whenever the Active NameNode server goes down, its associated daemon goes down with it. HA works the same way whether the Active NameNode daemon or the whole server goes down: ZKFC will not receive the heartbeat, the ZooKeeper session will expire, and the other NameNode is notified that a failover should be triggered. To answer your question: yes, in both of the cases you mentioned, HA should work.
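To watch the failover happen, you can poll the HA state from the command line (the service IDs nn1/nn2 below are examples; substitute the ones defined in your hdfs-site.xml):

```shell
# Show the HA state of every configured NameNode (Hadoop 3.x).
hdfs haadmin -getAllServiceState

# Or query one NameNode by its nameservice ID (nn1/nn2 are placeholders).
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```

After killing the active NameNode (daemon or host), the standby should report "active" within the ZooKeeper session timeout.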
06-19-2020
06:43 AM
@Udhav It appears your MySQL/MariaDB is not running after the restart. Try this:

service mariadb start
chkconfig mariadb on

Also, I would refrain from using "localhost" or "127.0.0.1"; this opens the door to networking and permission issues. Always use an FQDN, for example c3701.ambari.apache.org, mapped to the public IP address of each node via /etc/hosts. That FQDN should also be the hostname of each node. If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private-message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post. Thanks, Steven @ DFHZ
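A quick sanity check for the FQDN advice above (a sketch; the expected name is whatever you put in /etc/hosts):

```shell
# The node should resolve to a real FQDN, not localhost.
hostname -f

# And that name should map to the node's routable IP, not 127.0.0.1.
getent hosts "$(hostname -f)"
```

If `getent` returns a 127.x address for the FQDN, fix the /etc/hosts ordering before pointing Ambari or the database at it.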
06-05-2020
06:51 AM
Hi @galzoran , Kindly check whether you are hitting the known issue below. https://my.cloudera.com/knowledge/How-to-delete-growing-Cloudera-Manager-database-AUDITS-table?id=75896
05-25-2020
11:31 PM
Hello @sarm , what are the impacts of changing a service account password in a Kerberized cluster? Service accounts such as hdfs, hbase, and spark rely on keytabs rather than passwords. Their principals look like any other user principal, but they depend on valid keytabs being in place. If the passwords for these service accounts expire or change, you will need to regenerate their keytabs once the password is updated. You can regenerate keytabs in Ambari by going to the Kerberos screen and pressing the "Regenerate Keytabs" button; this also automatically distributes the keytabs where they are needed. Note that it is always best to restart the cluster when you do this. NOTE: for a smoother process, change the password for one service account, restart its service, observe whether there is any impact, and only then proceed to the other service accounts. To answer your question: changing the passwords of the service accounts will not affect running services, since the passwords are not used to start the services. Passwords are not required during service startup or during the lifetime of the process.
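After regenerating, you can verify a keytab actually works before restarting anything. The keytab path, principal, and realm below are examples only; substitute the ones from your cluster:

```shell
# List the principals and key versions stored in the keytab.
klist -kt /etc/security/keytabs/hdfs.headless.keytab

# Try to obtain a ticket with it; failure here means the keytab no longer
# matches the (changed) account and must be regenerated.
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-mycluster@EXAMPLE.COM
klist
```

A mismatch between the key version number (kvno) in the keytab and in the KDC is the usual symptom of a password change without keytab regeneration.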
05-15-2020
12:20 AM
1 Kudo
Hello @cyborg , Thank you for reaching out to the Community! There are two ways to place a node in maintenance mode.

1) Suppress alerts only: Select the host --> Actions for Selected > Begin Maintenance (Suppress Alerts/Decommission). The Begin Maintenance (Suppress Alerts/Decommission) dialog box opens, with the role instances running on the host displayed at the top. Deselect the Decommission Host(s) option and click Begin Maintenance. To exit maintenance: Select the host --> Actions for Selected > End Maintenance, deselect the Recommission Host(s) option, and click End Maintenance. This re-enables alerts for the host.

With this option, events are still logged; only the alerts those events would otherwise generate are suppressed. You can see a history of all the events recorded for entities while they were in maintenance mode. This is useful when you need to take actions in your cluster (make configuration changes and restart various elements) and do not want to see the alerts those actions would generate. For more details, refer to https://docs.cloudera.com/documentation/enterprise/5-14-x/topics/cm_mc_maint_mode.html#cmug_topic_14_1

2) Suppress alerts and decommission: Select the host --> Actions for Selected > Begin Maintenance (Suppress Alerts/Decommission), then select Decommission Host(s). If the selected host runs a DataNode role, you can specify whether to replicate under-replicated data blocks to other DataNodes to maintain the cluster's replication factor; if the host is not running a DataNode role, you will only see the Decommission Host(s) option. Click Begin Maintenance. The Host Decommission Command dialog box opens and displays the progress of the command. To exit maintenance: Select the host --> Actions for Selected > Recommission Host(s), choose whether to bring the hosts online and start all roles now or start the roles later, and click End Maintenance.

The second option lets you perform minor maintenance on cluster hosts, such as adding memory or changing network cards or cables, where a maintenance window is expected.

In your case: to take a single node down for a few hours, with no under-replicated blocks and a replication factor greater than 1, you can simply suppress alerts and follow the first path you described in the question.
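The suppress-alerts flow can also be scripted against the Cloudera Manager REST API. This is a sketch under assumptions: the host name, credentials, CM host, and API version are all placeholders (check `GET /api/version` on your CM for the right version string):

```shell
# Put a host into maintenance mode (alerts suppressed) via the CM API.
curl -u admin:admin -X POST \
  "http://cm-host.example.com:7180/api/v19/hosts/node1.example.com/commands/enterMaintenanceMode"

# And take it back out when the work is done.
curl -u admin:admin -X POST \
  "http://cm-host.example.com:7180/api/v19/hosts/node1.example.com/commands/exitMaintenanceMode"
```

This is handy when you are cycling through many hosts and clicking through the dialog for each one is impractical.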
05-14-2020
06:02 AM
Hi @Madhur Appreciate your assistance. I am using CM; where would this setting be in CM for making the change at the cluster level, and can you confirm these values have to be passed in seconds? Could you also provide steps or a document outlining how to change this while submitting Spark jobs?
05-14-2020
03:53 AM
Hello @rvillanueva , You can check how many threads a user is running with ps -L -u <username> | wc -l. If the user's open-files limit (ulimit -n, run as that user) is hit, the user cannot spawn any more threads. The most likely causes in this case are: the same user running other jobs with open files on the node where it tries to launch/spawn the container, or system threads being excluded from the count. Check which applications are running and what their current open files are. Also check the application log (application_XXX), if available, to see in which phase the exception is thrown and on which node the issue occurs.
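Putting those two checks together, a small sketch (run it as, or substitute, the user that owns the failing container):

```shell
# Count this user's threads and show the shell's open-files limit
# side by side; if threads approaches the relevant ulimit, new
# containers will fail to spawn on this node.
user=$(whoami)
threads=$(ps -L -u "$user" --no-headers 2>/dev/null | wc -l)
nofile=$(ulimit -n)
echo "user=$user threads=$threads open-files-limit=$nofile"
```

Note that `ulimit -n` reports the limit of the current shell; the limit that matters is the one inherited by the NodeManager and its containers, which may differ.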
05-13-2020
12:47 PM
Hi @Yong , you can try setting ACLs on HDFS. Below is the doc that gives more details. https://docs.cloudera.com/runtime/7.0.3/hdfs-acls/topics/hdfs-acls-features.html
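As a minimal sketch of what that looks like in practice (the path and user below are hypothetical, and `dfs.namenode.acls.enabled` must be true on the NameNode):

```shell
# Grant one extra user read/execute access to a directory without
# changing its POSIX owner/group/other bits.
hdfs dfs -setfacl -m user:yong:r-x /data/shared

# Inspect the resulting ACL entries.
hdfs dfs -getfacl /data/shared
```

The `-m` flag modifies (adds or updates) entries; `-x` removes a named entry, and `-b` strips all extended ACL entries back to the plain permission bits.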