Member since 02-18-2016
64 Posts
12 Kudos Received
1 Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
7282 | 05-31-2016 01:39 PM |
08-03-2016
02:50 PM
Thank you very much, Ryan.
08-03-2016
06:55 PM
@Kuldeep Kulkarni I have all these properties in place. Tez View is working fine; however, it has not shown any jobs since we implemented Kerberos.
01-29-2018
02:04 PM
Hi Rahul, the change had no effect. Please advise.
05-31-2016
01:39 PM
Thank you, Ravi. The HA ZooKeeper connection state showed "Auto Failover is not enabled"; after enabling it, it worked.
05-13-2016
09:36 PM
If there are no HA components (NameNode HA and 2 hiveserver2 instances), then there is no dependency on ZooKeeper. Check the hiveserver2 log at /var/log/hiveserver2/hiveserver2.log to see if there are any errors. If you have 2 hiveserver2 instances, they will register with ZooKeeper, which may be where they are running into issues.
05-12-2016
10:39 AM
Hi @kavitha velaga, for this kind of monitoring I'd suggest using an external monitoring framework, something like Munin, Ganglia, or whatever framework you already use within your organization. Most of these frameworks can record Round Trip Times (RTT) from hosts to something like an S3 endpoint. Hope that helps.
05-11-2016
04:14 PM
1 Kudo
I haven't run into any 'cons' of using YARN ACLs. ACLs will help you as more users and groups of users end up using the cluster. If you are using hiveserver2 with doAs set to 'false', then all your jobs run as the user hive. In that situation, you have to be careful about which queues the jobs are submitted to, since YARN ACLs will not help you there.
05-19-2016
09:55 AM
4 Kudos
@kavitha velaga
1. The number of mappers depends on the InputSplits of the file, and Hadoop launches as many mappers as required. Users have no direct control over the number of mappers through a property.
2. To control the number of mappers, the user has to control the number of InputSplits, which is not necessary unless custom logic requires it.
3. The user can control the number of reducers for an MR job by calling job.setNumReduceTasks(numOfReducer); numOfReducer can be 0 or any positive integer. If you choose 0, the MR job is a mapper-only job (no reducers means no aggregation). In some use cases a reducer is not necessary, so setting numOfReducer=0 lets the MR job finish more quickly (the job avoids the shuffle and sort).
4. Container size depends on how much memory your program requires in general.
5. DistCp: the ticket https://issues.apache.org/jira/browse/HDFS-7535 improved distcp performance. To make distcp run faster, we can disable post-copy checks such as the checksum, but then we trade off reliability.
Hope this helps.
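To make points 1 and 2 a bit more concrete, here is a rough sketch of how the mapper count for a single splittable file follows from its size. It mirrors my understanding of the split arithmetic in Hadoop's FileInputFormat (split size = max(minSize, min(maxSize, blockSize)), with a 10% slop allowed on the last split). The function names and default values are hypothetical and only for illustration, and the sketch ignores empty files, non-splittable compression, and combining input formats.

```python
# Rough model of FileInputFormat-style split counting for ONE splittable file.
# Function names and defaults are illustrative assumptions, not Hadoop APIs.

SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size

MIB = 1024 * 1024


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split size is the block size clamped between min_size and max_size."""
    return max(min_size, min(max_size, block_size))


def estimate_splits(file_length, block_size, min_size=1, max_size=2**63 - 1):
    """Estimate the number of input splits (roughly, mappers) for one file."""
    if file_length <= 0:
        raise ValueError("this sketch only models non-empty files")
    split_size = compute_split_size(block_size, min_size, max_size)
    remaining = file_length
    splits = 0
    # Carve off full-size splits while more than 1.1 split sizes remain.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left (at most 1.1 * split_size) becomes the final split.
    if remaining != 0:
        splits += 1
    return splits


if __name__ == "__main__":
    # 1 GiB file on 128 MiB blocks, default min/max split sizes.
    print(estimate_splits(1024 * MIB, 128 * MIB))
    # Same file with the maximum split size capped at 64 MiB.
    print(estimate_splits(1024 * MIB, 128 * MIB, max_size=64 * MIB))
```

So in this model the way to influence the mapper count is to change the minimum or maximum split size (or the block size), not to set a mapper count directly, which matches point 2 above.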
05-11-2016
04:14 PM
Thank you, Kuldeep and Jeff.
05-05-2016
09:02 AM
Hi,
The Cloudera "documentation" you reference here is actually an 8-year-old blog post. I would defer to the more current docs.