Member since: 04-14-2020
Posts: 2182
Kudos Received: 4
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 559 | 09-22-2020 12:19 AM
 | 1215 | 07-07-2020 04:56 AM
 | 745 | 05-15-2020 12:20 AM
 | 8875 | 05-14-2020 04:29 AM
08-04-2021
01:56 AM
Hi @iamfromsky , Thank you for reaching out to our community! The error message you provided is logged when either a "broken pipe" or a "connection reset" occurs, which is most likely network-related. Please check whether the network is stable when you see these errors. Also refer to Jira HDFS-8814 for more details.
04-25-2021
11:07 PM
Hi @SimoneMasc , Thank you for reaching out to the Cloudera Community! Please review the documentation below on the basic requirements (OS/DB/Java/Network/Platform, etc.): https://docs.cloudera.com/cdp/latest/release-guide/topics/cdpdc-requirements-supported-versions.html Also see https://docs.cloudera.com/cdp-private-cloud/latest/data-migration/topics/cdp-data-migration-machine-learning-to-cdp.html which describes data migration.
12-02-2020
10:21 AM
Hi @CaptainJa , Thank you for reaching out to the community! It looks like you are hitting an Ambari UI bug (https://issues.apache.org/jira/browse/AMBARI-21854). Also, can you please confirm whether the operating system you are using is RHEL 7? I see the base URL used for the repositories is for CentOS 6. Ensure you update the repository to use Red Hat Satellite if the OS is RHEL 7.
11-02-2020
11:54 PM
Hi @regeamor, Thank you for reaching out to the community! The DistCp command submits a regular MapReduce job that performs a file-by-file copy. The block locations of each file are obtained from the NameNode during the MapReduce job. Each DistCp mapper is started, if possible, on the node where the first block of the file resides. When a file consists of multiple splits, the remaining splits are fetched from nearby nodes if they are not available on the same node. The fs.s3a.fast.upload option significantly accelerates data upload by writing the data in blocks, possibly in parallel. Please refer to How to improve performance for DistCp for more details.
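As a hedged illustration of the fs.s3a.fast.upload option described above (the bucket name, paths, and mapper count are placeholders, not values from this thread):

```shell
# Copy a directory from HDFS to S3 with DistCp, enabling block-based
# fast upload; -m caps the number of parallel map tasks (copy streams).
# Source/destination paths and the bucket name are examples only.
hadoop distcp \
  -D fs.s3a.fast.upload=true \
  -m 20 \
  hdfs:///data/logs \
  s3a://example-bucket/backup/logs
```

Tuning -m upward helps most when copying many files, since each mapper copies whole files independently.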
09-22-2020
07:49 AM
Hello @matagyula, Thank you for reaching out to the community! Please check whether the article https://community.cloudera.com/t5/Customer/DataNode-Block-Count-Threshold-alerts-are-displayed-in/ta-p/302261 helps with this issue.
09-22-2020
12:19 AM
Hello @Mondi , The CDP Trial version includes an embedded PostgreSQL database, which is not suitable for a production environment. Please check this information for more details. Also see how to end the trial or upgrade the trial version, and Managing Licenses.
08-20-2020
05:51 AM
Hi @K_K, Thank you for reaching out to the community! This is a known issue in Ambari 2.5.1. Please check https://issues.apache.org/jira/browse/AMBARI-21151 for more information.
08-19-2020
06:42 AM
Hi @rohit19, As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. A new thread will also let you include details specific to your environment, which helps others give a more accurate answer to your question.
07-07-2020
04:56 AM
2 Kudos
Hi @shrikant_bm , Whenever the active NameNode server goes down, its daemon goes down with it, so HA behaves the same way whether the active NameNode daemon or the whole server fails. The ZKFC stops receiving heartbeats, the ZooKeeper session expires, and the other NameNode is notified that a failover should be triggered. To answer your question: yes, in both cases you mentioned, HA should work.
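As a sketch, the HA state and failover behavior described above can be checked from the command line (the service IDs nn1/nn2 are examples; use the NameNode IDs defined in your hdfs-site.xml):

```shell
# Show which NameNode is currently active vs standby
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Trigger a graceful manual failover from nn1 to nn2 to verify HA works
hdfs haadmin -failover nn1 nn2
```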
07-06-2020
03:34 AM
Hi @shrikant_bm , Please find the answers inline. 1. When the active NameNode server is rebooted, will the standby NameNode become active? Is this expected, or did HA not work in our cluster? Yes, the standby NameNode will become active when the primary NameNode reboots, provided high availability is enabled and configured. 2. Is HA expected to work only between the active and standby NameNode daemons? Yes, HA works only between the active and standby NameNodes. It is handled by ZKFC (ZooKeeper Failover Controller).
07-06-2020
12:42 AM
Hi @shrikant_bm , Thank you for reaching out to the community! If NameNode high availability is enabled and configured on your cluster, automatic failover of the active NameNode should work. [1] gives the steps for configuring NameNode high availability using Ambari: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.5.0/managing-high-availability/content/amb_enable_namenode_high_availability.html [2] gives the steps for managing high availability of other components: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.5.0/managing-high-availability/content/amb_managing_high_availability_of_services.html
06-29-2020
02:49 AM
Hi @SeanU , Thank you for reaching out to the community! To better assist you with this post, could you tell us whether you are using Ambari or Cloudera Manager to manage your cluster? YARN application logs can be monitored using the YARN Web UI. For more details, please check [1] or [2], depending on the distribution you are using. There are many other tools available for monitoring Hadoop clusters, as mentioned in [3]. [1] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_dg_yarn_applications.html [2] https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/data-operating-system/content/monitoring_clusters_using_yarn_web_user_interface.html [3] https://community.cloudera.com/t5/Support-Questions/what-is-the-most-best-monitoring-tool-for-hadoop-clusters/td-p/176256
06-18-2020
04:44 AM
Hi @Udhav , Please check the solved post below to see if it helps with this issue. https://community.cloudera.com/t5/Support-Questions/unable-to-start-ambari-server/td-p/158488
06-05-2020
06:51 AM
Hi @galzoran , Kindly check whether you are hitting the known issue below. https://my.cloudera.com/knowledge/How-to-delete-growing-Cloudera-Manager-database-AUDITS-table?id=75896
05-25-2020
11:31 PM
Hello @sarm , What are the impacts of changing a service account password in a Kerberized cluster? Service accounts such as hdfs, hbase, and spark rely on keytabs rather than interactive passwords. Their principals look like any other user principal, but the services depend on having valid keytabs available. If the passwords for these accounts expire or change, you will need to regenerate their keytabs once the password is updated. You can regenerate the keytabs in Ambari by going to the Kerberos screen and pressing the "Regenerate Keytabs" button; this also automatically distributes the keytabs where they are needed. Note that it is always best to restart the cluster when you do this. To keep the process smooth, change the password for one service account at a time, restart its service, and confirm there is no impact before proceeding to the other accounts. To answer your question: changing the passwords of the service accounts will not affect the running services, since the passwords are not used to start a service. Passwords are not required during service startup or during the lifetime of the process.
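After regenerating keytabs, it is worth confirming they are valid before restarting services. A sketch using standard MIT Kerberos tools (the keytab path and principal are examples; adjust to your cluster layout and realm):

```shell
# List the principals and key version numbers stored in the keytab
klist -kt /etc/security/keytabs/hdfs.headless.keytab

# Authenticate using the keytab; a stale keytab (old key version
# left over from before the password change) will fail here
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-cluster@EXAMPLE.COM
klist
```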
05-20-2020
10:50 AM
Hi @sarm, Thank you for reaching out to the Cloudera Community! To better assist you with this issue, could you please provide the following additional information: 1) Is the cluster on which you plan to change the password Kerberized? 2) Do you use Cloudera Manager or Ambari to manage your cluster?
05-15-2020
12:20 AM
1 Kudo
Hello @cyborg , Thank you for reaching out to the community! There are two ways to place a node in maintenance mode. 1) Select the host --> Actions for Selected > Begin Maintenance (Suppress Alerts/Decommission). In the dialog box that opens (the role instances running on the host are displayed at the top), deselect the Decommission Host(s) option and click Begin Maintenance. To exit maintenance: select the host --> Actions for Selected > End Maintenance, deselect the Recommission Host(s) option, and click End Maintenance. This re-enables alerts for the host. This first option does not prevent events from being logged; it only suppresses the alerts those events would otherwise generate. You can still see a history of all the events recorded for an entity during the period it was in maintenance mode. This is useful when you need to take actions in your cluster (make configuration changes and restart various elements) and do not want to see the alerts those actions would generate. For more details, refer to https://docs.cloudera.com/documentation/enterprise/5-14-x/topics/cm_mc_maint_mode.html#cmug_topic_14_1
2) Select the host --> Actions for Selected > Begin Maintenance (Suppress Alerts/Decommission), then select Decommission Host(s). If the selected host runs a DataNode role, you can specify whether to replicate under-replicated data blocks to other DataNodes to maintain the cluster's replication factor; if the host is not running a DataNode role, you will only see the Decommission Host(s) option. Click Begin Maintenance; the Host Decommission Command dialog box opens and displays the progress of the command. To exit maintenance: select the host --> Actions for Selected > Recommission Host(s), choose whether to bring the hosts online and start all roles now or start roles later, and click End Maintenance. The second option suits minor maintenance on cluster hosts, such as adding memory or changing network cards or cables, where a maintenance window is expected. In your case (taking a single node down for a few hours, with no under-replicated blocks and a replication factor greater than 1), you can suppress alerts and follow the first path you described in the question.
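For scripted environments, maintenance mode can also be toggled through the Cloudera Manager REST API. A sketch, assuming CM is on port 7180; the API version, hostname, host ID, and credentials below are placeholders, not values from this thread:

```shell
# Put a host into maintenance mode (suppresses alerts for it)
curl -X POST -u admin:admin \
  "http://cm-host.example.com:7180/api/v19/hosts/my-host-id/commands/enterMaintenanceMode"

# Take it back out of maintenance mode
curl -X POST -u admin:admin \
  "http://cm-host.example.com:7180/api/v19/hosts/my-host-id/commands/exitMaintenanceMode"
```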
05-14-2020
04:29 AM
Hi @Amn_468 , Thank you for replying back. Kindly try increasing spark.rpc.askTimeout from the default 120 seconds to a higher value in Ambari UI -> Spark Configs -> spark2-defaults. The recommendation is to increase it to at least 480 seconds and restart the necessary services. Most likely the driver and executors are not receiving heartbeat responses within the configured timeout. If you do not want to make a cluster-level change, you can override this value at the job level, for example by adding --conf spark.rpc.askTimeout=600s to spark-submit when submitting the job.
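A job-level override as described above might look like this (the class and jar names are placeholders):

```shell
# Raise the RPC ask timeout for this job only, leaving the
# cluster-wide default untouched
spark-submit \
  --conf spark.rpc.askTimeout=600s \
  --class com.example.MyJob \
  my-job.jar
```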
05-14-2020
03:53 AM
Hello @rvillanueva , You can check how many threads a user is currently using by running ps -L -u <username> | wc -l. If the user's limits are reached (max user processes, ulimit -u, which caps thread creation, or open files, ulimit -n), the user cannot spawn any more threads. The most likely causes in this case are: the same user is running other jobs and already holds threads and open files on the node where it tries to launch the container, or system threads were not accounted for. Check which applications are running and what their current open-file usage is. Also check the application log (application_XXX), if available, to see in which phase the exception is thrown and on which node the issue occurs.
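The checks above can be sketched as follows (run as, or substitute, the affected user):

```shell
# Count the threads (lightweight processes) the current user is running;
# each output line from ps -L is one thread, plus one header line
ps -L -u "$(id -un)" | wc -l

# Per-user limits that cap thread creation:
ulimit -u   # max user processes (threads count against this)
ulimit -n   # max open file descriptors
```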
05-13-2020
11:40 PM
Hello @Amn_468 , To better assist you with this issue, it would be great if you could provide the following additional information: 1) Is this issue occurring for all jobs or only some jobs? 2) If the issue only started recently, does it coincide with any code or configuration changes in the job itself, or configuration changes in the cluster?
05-13-2020
12:47 PM
Hi @Yong , You can try setting ACLs on HDFS. The doc below gives more details. https://docs.cloudera.com/runtime/7.0.3/hdfs-acls/topics/hdfs-acls-features.html
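A short sketch of the ACL commands involved (the path and the username "yong" are examples):

```shell
# Grant an extra user read/execute access on a directory
hdfs dfs -setfacl -m user:yong:r-x /data/reports

# Add a default ACL so newly created children inherit the same access
hdfs dfs -setfacl -m default:user:yong:r-x /data/reports

# Inspect the resulting ACL
hdfs dfs -getfacl /data/reports
```

Note that dfs.namenode.acls.enabled must be set to true for setfacl to work.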