2206 Posts
230 Kudos Received
82 Solutions
About
My expertise is not in Hadoop but rather online communities, support, and social media. Interests include: photography, travel, movies, and watching sports.
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 438 | 05-07-2025 11:41 AM |
| | 904 | 02-27-2025 12:49 PM |
| | 2774 | 06-29-2023 05:42 AM |
| | 2361 | 05-22-2023 07:03 AM |
| | 1725 | 05-22-2023 05:42 AM |
08-18-2016 05:04 AM
Looking over the CCA Spark and Hadoop Developer Certification page, at the bottom there is an exam delivery and cluster information section with the following information.
All other websites, including Google/search functionality, are disabled. You may not use notes or other exam aids.
08-17-2016 01:54 PM
Sorry about the delay in responding, @Megh. If you look at the DE575 certification page on the Cloudera website, you can see that the cluster setup includes Cloudera HUE.
08-12-2016 05:42 AM
I'm happy to see that you were able to resolve the issue. I'm also impressed with your use of the profile card option. It really makes you stand out in the crowd. 🙂
08-04-2016 07:52 AM
Hi. When I run 'sqoop job --show Jobname', it shows an IllegalArgumentException.
07-23-2016 05:07 AM
3 Kudos
I spoke with some of my contacts about this one and here is their response. I hope it helps.
This warning message indicates a potential performance problem which may occur for different reasons, from disk/network latency to high CPU load to GC pauses, to mention a few. Based on our earlier experience, I suggest checking/verifying the following:
1. The latency of the network services the Standby NameNode uses (LDAP/AD, NTP, DNS).
2. Possible disk overload. Ideally, dedicate individual disks to separate the I/O loads of the QuorumJournalNode (edit log storage), NameNode (checkpointing!), and ZooKeeper (znode persistence) services; NFS-mounted storage should therefore be avoided.
3. The GC activity of the Standby NameNode process ('jstat' command, service logs), by running the following two commands in parallel on the Standby NameNode until after you receive another alert in Cloudera Manager: jstat -gc -t -h30 <SBNN JVM PID> 2s and jstat -gcutil -t -h30 <SBNN JVM PID> 2s
4. If GC activity is occasionally high, you may need to increase the heap size on both NameNodes.
5. The RPC handler counts, which should be set to match the occasional large list loads (similar to 'hadoop fsck /') that can increase latencies if run too often.
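The GC check in step 3 above can be sketched as a small script. Using jps to find the JVM PID is my own assumption; any method of locating the Standby NameNode process works.

```shell
# Sketch: find the Standby NameNode JVM PID and sample its GC activity.
# Assumes 'jps' and 'jstat' are on PATH and the NameNode runs on this host.
SBNN_PID=$(jps | awk '/NameNode/ { print $1; exit }')
# Timestamped samples every 2 s, header repeated every 30 rows; leave these
# running until the next Cloudera Manager alert fires, then inspect GC pauses.
jstat -gc -t -h30 "$SBNN_PID" 2s &
jstat -gcutil -t -h30 "$SBNN_PID" 2s
```

Running both in parallel, as the post suggests, gives raw sizes (-gc) alongside utilization percentages (-gcutil) over the same window.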
Generally speaking, the increased RPC latency has two parts: the average time the requests spend in the queue (controlled by the NameNode Handler Count property) and the time needed to process the requests. The latter depends on the performance of the HDFS metadata (edit logs, fsimage) directory. The Cloudera Manager Health Check alert message contains both the queue and the processing times.
In cases of extremely high activity, such as an attempt to decommission and then recommission multiple DataNodes, a large number of YARN reducers, Flume/Sqoop data ingestion processes, or an HBase bulk data load, the Active NameNode can generate a lot of edit logs. Synchronizing the edits with each JournalNode, sending them to the Standby NameNode, and the Standby NameNode's checkpointing can all be highly I/O hungry. While the Standby NameNode is checkpointing, it does not accept edits from the JournalNodes. The JournalNodes may then have trouble keeping in sync, which delays edits being relayed to the Standby NameNode. This in turn can result in network latency/delays on the Standby NameNode.
The "rpc_call_queue_len_avg" graphs for the NameNode can also be checked for continuous spikes or curves. Ideally the value should be 0, indicating that the handlers are sufficient. If not, the 'dfs.datanode.handler.count', 'dfs.namenode.handler.count', and 'dfs.namenode.service.handler.count' properties can be bumped. The values of 'dfs.namenode.handler.count' and 'dfs.namenode.service.handler.count' should both be ln(# of cluster nodes) * 20, while 'dfs.datanode.handler.count' should be one tenth of that value.
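As a rough sketch of that sizing rule (the node count here is a made-up example; substitute your own cluster size):

```shell
# Hypothetical cluster size; replace with your actual node count.
NODES=50
# Recommended handler count: ln(nodes) * 20, per the guidance above.
# awk's log() is the natural logarithm; "%d" truncates to an integer.
HANDLERS=$(awk -v n="$NODES" 'BEGIN { printf "%d", log(n) * 20 }')
echo "dfs.namenode.handler.count=$HANDLERS"
echo "dfs.namenode.service.handler.count=$HANDLERS"
# dfs.datanode.handler.count is one tenth of that value.
echo "dfs.datanode.handler.count=$((HANDLERS / 10))"
```

For a 50-node cluster this works out to roughly 78 NameNode handlers and 7 DataNode handlers.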
Finally, there is another special condition under which the Cloudera Manager health check emits this alert: when the NameNode health check interferes with regular NameNode checkpointing.
07-22-2016 05:42 AM
The first thing to look at is the amount of RAM allocated to the VM. If you are using Cloudera Manager, you need a minimum of 8 GB of RAM. Depending on what you are doing with the VM, you may need to go above the minimum.
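One quick way to sanity-check the allocation from inside a Linux VM is to read /proc/meminfo; this check is my own sketch, not part of the original advice, and other operating systems report memory differently.

```shell
# Sketch: warn if the VM has less than the 8 GB of RAM that
# Cloudera Manager needs as a minimum (Linux; reads /proc/meminfo).
MEM_KB=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
if [ "$MEM_KB" -lt $((8 * 1024 * 1024)) ]; then
  echo "Warning: less than 8 GB RAM; Cloudera Manager may struggle."
fi
```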
07-22-2016 04:52 AM
I am happy to hear that you are now up and running. Best of luck.
07-12-2016 11:12 AM
Great to hear you have both gotten over the speed bump here. Best of luck with the Quickstart VM. 🙂
06-23-2016 10:40 AM
@Stewart12586, this thread may be of assistance in your situation. 🙂