Member since: 10-24-2015
Posts: 171
Kudos Received: 379
Solutions: 23
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2744 | 06-26-2018 11:35 PM |
| | 4447 | 06-12-2018 09:19 PM |
| | 2932 | 02-01-2018 08:55 PM |
| | 1499 | 01-02-2018 09:02 PM |
| | 6909 | 09-06-2017 06:29 PM |
07-14-2017
12:29 AM
Oh, I am sorry, the question got posted twice. The website was slow earlier and I must have clicked twice. Thanks for your answer.
12-18-2017
09:46 AM
Don't forget to make the changes on Zeppelin > Interpreter > Livy (and Livy2) as well. Add zeppelin.livy.ssl.trustStore = /etc/path/to/your/truststore.jks and zeppelin.livy.ssl.trustStorePassword = <password1234>. Also change http to https in the zeppelin.livy.url property, for example https://your-host:8998. These properties apply to both the livy and livy2 interpreters; see the consolidated sketch below.
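For reference, a consolidated sketch of the three properties as they would appear in the Livy (and Livy2) interpreter settings; the host, truststore path, and password are the placeholders from above:

```
zeppelin.livy.url                      https://your-host:8998
zeppelin.livy.ssl.trustStore           /etc/path/to/your/truststore.jks
zeppelin.livy.ssl.trustStorePassword   <password1234>
```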
06-06-2017
08:55 AM
I installed the Python interpreter and now everything works fine. Thank you.
07-13-2018
08:54 PM
1 Kudo
Exit code 143 can occur for multiple reasons. Yesterday I got it in a Sqoop job because of a task timeout; adding -Dmapreduce.task.timeout=0 to the Sqoop job resolved the issue (a sketch of the invocation follows after the log excerpt).

18/07/12 06:40:28 INFO mapreduce.Job: Job job_1530133778859_8931 running in uber mode : false
18/07/12 06:40:28 INFO mapreduce.Job: map 0% reduce 0%
18/07/12 06:45:57 INFO mapreduce.Job: Task Id : attempt_1530133778859_8931_m_000005_0, Status : FAILED
AttemptID:attempt_1530133778859_8931_m_000005_0 Timed out after 300 secs
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143.
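As a hedged sketch only (the connection string, credentials, table, and target directory below are hypothetical placeholders; only the -Dmapreduce.task.timeout=0 flag comes from the post), the flag is passed as a generic Hadoop option right after the tool name:

```sh
# Hypothetical Sqoop import; -D generic options must precede the tool-specific arguments.
# -Dmapreduce.task.timeout=0 disables the 300-second task timeout seen in the log above.
sqoop import \
  -Dmapreduce.task.timeout=0 \
  --connect jdbc:mysql://db-host/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders
```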
09-04-2017
04:49 PM
This worked for me, thanks.
03-22-2017
11:40 PM
2 Kudos
Mateusz Grabowski, queue configuration ensures the capacity distribution. However, containers from different queues can still run on the same NodeManager host, and in that case the execution time of a container may be affected. So isolating queues is not sufficient: you will also need to configure CGroups for CPU isolation (a rough configuration sketch follows below). Some good links on CGroups:
https://hortonworks.com/blog/managing-cpu-resources-in-your-hadoop-yarn-clusters/
https://hortonworks.com/blog/apache-hadoop-yarn-in-hdp-2-2-isolation-of-cpu-resources-in-your-hadoop-yarn-clusters/
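A rough yarn-site.xml sketch of the CGroups-related settings (property names and defaults vary by Hadoop/HDP version, and LinuxContainerExecutor needs additional OS-level setup, so verify against the links above):

```xml
<!-- yarn-site.xml: core settings for CPU isolation via CGroups (sketch, not a full setup) -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
  <value>/hadoop-yarn</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
  <value>false</value>
</property>
```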
03-22-2017
06:31 AM
2 Kudos
local[*], e.g. new SparkConf().setMaster("local[2]")
- Runs the job in local mode.
- Specifically used to test code on a small amount of data in a local environment.
- Does not provide the advantages of a distributed environment.
- The number inside the brackets (or * for all available cores) is the number of CPU cores allocated to the local run.
- Helps in debugging the code by applying breakpoints while running from Eclipse or IntelliJ.

yarn-client, i.e. --master yarn --deploy-mode client
- The driver program runs on the client where you type the command to submit the Spark application (which may not be a machine in the YARN cluster).
- Although the driver runs on the client machine, the tasks are executed on the executors in the NodeManagers of the YARN cluster.

yarn-cluster, i.e. --master yarn --deploy-mode cluster
- The most advisable pattern for submitting Spark jobs in production.
- The driver program runs inside the YARN cluster (in the ApplicationMaster on one of the cluster nodes), not on the machine where you type the submit command.

A minimal submit sketch for each mode follows below.
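The sketch assumes a hypothetical application class com.example.MyApp packaged as my-app.jar; the spark-submit invocations are shown as comments so the whole sketch stays in one Scala snippet:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Local mode: 2 cores on the local machine; use "local[*]" for all available cores.
val conf = new SparkConf()
  .setAppName("my-app")
  .setMaster("local[2]")
val sc = new SparkContext(conf)

// YARN client mode (driver on the submitting machine, executors in the cluster):
//   spark-submit --master yarn --deploy-mode client --class com.example.MyApp my-app.jar
// YARN cluster mode (driver runs inside the cluster; recommended for production):
//   spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp my-app.jar
```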
03-03-2017
11:36 PM
4 Kudos
@Will Dailey, the JVM heap can be preconfigured with memory boundaries: the initial heap size (defined by the -Xms option) and the maximum heap size (defined by the -Xmx option). Used memory, from the perspective of the JVM, is working set + garbage. Committed memory is a measure of how much memory the JVM heap is really consuming. If -Xms < -Xmx and used memory equals the current heap size, the JVM is likely to grow its heap after a full garbage collection. However, if the current heap size has reached -Xmx (either through heap growth or because -Xms == -Xmx), the heap cannot grow any further. Both of these values are important when debugging DataNode OutOfMemory, PermGen, and GC errors (a small sketch of how to read them at runtime follows below). Reference link: https://pubs.vmware.com/vfabric52/index.jsp?topic=/com.vmware.vfabric.em4j.1.2/em4j/conf-heap-management.html
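As an illustration (not from the original post), the committed heap, the -Xmx ceiling, and the used portion can be read at runtime via java.lang.Runtime:

```scala
object HeapInfo extends App {
  val rt = Runtime.getRuntime
  val mb = 1024 * 1024

  val committed = rt.totalMemory() / mb                     // current (committed) heap size
  val max       = rt.maxMemory() / mb                       // ceiling set by -Xmx
  val used      = (rt.totalMemory() - rt.freeMemory()) / mb // working set + garbage

  println(s"Used: $used MB, Committed: $committed MB, Max (-Xmx): $max MB")
  // If used is close to committed and committed < max, a full GC may grow the heap;
  // once committed == max, the heap cannot grow any further.
}
```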
02-11-2017
05:28 AM
1 Kudo
@Shigeru Takehara, you can definitely have nodes with different memory/CPU in a cluster. You can have a machine with 5 GB of memory and 2 cores as one of your nodes without lowering the maximum to 5 GB globally: set yarn.nodemanager.resource.memory-mb=20000 on machines with 20 GB of memory and yarn.nodemanager.resource.memory-mb=5000 on the machine with 5 GB (a sketch follows below). You can also manage different configurations on different NodeManagers using Ambari; it's called host config groups: https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.1.0/bk_ambari-user-guide/content/using_host_config_groups.html
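A minimal yarn-site.xml sketch for the 5 GB node (the 20 GB nodes would use 20000 instead; the cpu-vcores value is an added assumption matching the 2-core example):

```xml
<!-- yarn-site.xml on the 5 GB / 2-core NodeManager -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>5000</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>2</value>
</property>
```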
02-09-2017
01:24 AM
4 Kudos
@yvora, it's because the %livy.sql interpreter runs in yarn-cluster mode whereas the %sql interpreter runs in yarn-client mode. Hence %sql can find the local file on the client machine, whereas %livy.sql won't be able to find it. Try putting the file in HDFS and use LOAD DATA INPATH rather than LOAD DATA LOCAL INPATH; it should work (see the sketch below).
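A minimal sketch, assuming a hypothetical file /tmp/mydata.csv already copied into HDFS (e.g. with hdfs dfs -put) and a hypothetical table mytable:

```sql
-- Run in a %livy.sql paragraph; the path is an HDFS path, not a local one.
LOAD DATA INPATH '/tmp/mydata.csv' INTO TABLE mytable;
```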