Member since 11-01-2017 | 35 Posts | 0 Kudos Received | 0 Solutions
10-05-2018
01:28 PM
If you are running MapReduce on a single node, it will take longer than a sequential application because of the job creation overhead: MapReduce has to submit the job, copy the code and dependencies into a YARN container, and start the job. As you scale out to several nodes or more, you will see the performance benefits of MapReduce.

In general, however, MapReduce is used less often on the platform now. Hive runs on Tez rather than MapReduce, and lately I've only seen MapReduce used for things like bulk loading data into HBase/Druid. In-memory processing, which both Hive/LLAP and Spark provide, can give you a significant performance boost, depending on what you're trying to accomplish and which tool is best suited for the job.
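To make the single-node overhead point concrete, here is a rough back-of-the-envelope sketch. The numbers (120 seconds of work, 30 seconds of fixed job overhead) and the function names are made up for illustration, and the model assumes the work divides evenly across nodes, which real jobs rarely do.

```python
# Toy cost model, not a benchmark: all figures below are hypothetical.

def sequential_time(work_seconds: float) -> float:
    """A plain sequential program only pays for the work itself."""
    return work_seconds


def mapreduce_time(work_seconds: float, nodes: int, overhead_seconds: float) -> float:
    """Fixed job overhead (submit, ship code to containers, start the job)
    plus the work, assumed here to split evenly across the nodes."""
    return overhead_seconds + work_seconds / nodes


WORK = 120.0      # hypothetical amount of processing, in seconds
OVERHEAD = 30.0   # hypothetical fixed job startup cost, in seconds

for nodes in (1, 2, 4, 8):
    mr = mapreduce_time(WORK, nodes, OVERHEAD)
    print(f"{nodes} node(s): MapReduce ~{mr:.0f}s vs sequential {sequential_time(WORK):.0f}s")
```

With one node the fixed overhead is pure loss; it only pays off once the work is actually split across several nodes.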
10-05-2018
02:30 PM
@Maryem Mary, did this work for you? Please take a moment to log in and "Accept" the answer if this helped. This will be really useful for other community users 🙂
11-23-2017
12:23 PM
So you are missing the zkServer.sh script? Can you share the output of this command?
# ls -lrth /usr/hdp/current/zookeeper-server/bin/
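If it helps, the same check can be sketched in Python. This is only an illustration: the helper name `describe_permissions` is made up, and the full script path is an assumption built from the directory above plus zkServer.sh. It reports the mode bits and whether the current user can execute the file, which is one possible cause of a "Permission denied" error (others, such as running as a different user, are not covered here).

```python
import os
import stat


def describe_permissions(path: str) -> str:
    """Return a one-line report, similar in spirit to one line of `ls -l`."""
    if not os.path.exists(path):
        return f"{path}: not found"
    mode = os.stat(path).st_mode
    # os.access checks the permission for the user running this script
    executable = os.access(path, os.X_OK)
    return f"{stat.filemode(mode)} {path} (executable by current user: {executable})"


# Assumed location: the directory from the post plus the script name
print(describe_permissions("/usr/hdp/current/zookeeper-server/bin/zkServer.sh"))
```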
11-23-2017
11:05 AM
I get this error while trying to start ZooKeeper:
/usr/hdfp/2.6.3-235/zookeeper/bin/zkServer.sh start
/zkServer.sh: Permission denied