Member since: 10-24-2015
Posts: 171
Kudos Received: 379
Solutions: 23
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2744 | 06-26-2018 11:35 PM |
| | 4447 | 06-12-2018 09:19 PM |
| | 2932 | 02-01-2018 08:55 PM |
| | 1499 | 01-02-2018 09:02 PM |
| | 6909 | 09-06-2017 06:29 PM |
07-14-2017
12:29 AM
Oh, I am sorry, the question got posted twice. The website was slow earlier and I must have clicked twice. Thanks for your answer.
12-18-2017
09:46 AM
Don't forget to make the changes on Zeppelin > Interpreter > Livy (and Livy2) as well. Add zeppelin.livy.ssl.trustStore = /etc/path/to/your/truststore.jks and zeppelin.livy.ssl.trustStorePassword = <password1234>. Also change http to https in the zeppelin.livy.url property, for example https://your-host:8998. These properties apply to both the livy and livy2 interpreters; see the consolidated sketch below.
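For reference, a consolidated sketch of the three properties as they would appear in the Livy (and Livy2) interpreter settings; the host, truststore path, and password are the placeholders from above:

```
zeppelin.livy.url                      https://your-host:8998
zeppelin.livy.ssl.trustStore           /etc/path/to/your/truststore.jks
zeppelin.livy.ssl.trustStorePassword   <password1234>
```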
06-06-2017
08:55 AM
I installed the Python interpreter and now everything works fine. Thank you.
07-13-2018
08:54 PM
1 Kudo
Exit code 143 can occur for multiple reasons. Yesterday I got it in a Sqoop job because of a task timeout; adding -Dmapreduce.task.timeout=0 to the Sqoop job resolved the issue (a sketch of the invocation follows after the log excerpt).

18/07/12 06:40:28 INFO mapreduce.Job: Job job_1530133778859_8931 running in uber mode : false
18/07/12 06:40:28 INFO mapreduce.Job: map 0% reduce 0%
18/07/12 06:45:57 INFO mapreduce.Job: Task Id : attempt_1530133778859_8931_m_000005_0, Status : FAILED
AttemptID:attempt_1530133778859_8931_m_000005_0 Timed out after 300 secs
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143.
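As a hedged sketch only (the connection string, credentials, table, and target directory below are hypothetical placeholders; only the -Dmapreduce.task.timeout=0 flag comes from the post), the flag is passed as a generic Hadoop option right after the tool name:

```sh
# Hypothetical Sqoop import; -D generic options must precede the tool-specific arguments.
# -Dmapreduce.task.timeout=0 disables the 300-second task timeout seen in the log above.
sqoop import \
  -Dmapreduce.task.timeout=0 \
  --connect jdbc:mysql://db-host/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders
```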
09-04-2017
04:49 PM
This worked for me, thanks.
03-22-2017
11:40 PM
2 Kudos
Mateusz Grabowski, queue configuration ensures the capacity distribution. However, containers from different queues can still run on the same NodeManager host, and in that case the execution time of a container may be affected. So isolating queues is not sufficient: you will also need to configure CGroups for CPU isolation (a rough configuration sketch follows below). Some good links on CGroups:
https://hortonworks.com/blog/managing-cpu-resources-in-your-hadoop-yarn-clusters/
https://hortonworks.com/blog/apache-hadoop-yarn-in-hdp-2-2-isolation-of-cpu-resources-in-your-hadoop-yarn-clusters/
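A rough yarn-site.xml sketch of the CGroups-related settings (property names and defaults vary by Hadoop/HDP version, and LinuxContainerExecutor needs additional OS-level setup, so verify against the links above):

```xml
<!-- yarn-site.xml: core settings for CPU isolation via CGroups (sketch, not a full setup) -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
  <value>/hadoop-yarn</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
  <value>false</value>
</property>
```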
03-22-2017
06:31 AM
2 Kudos
local[*], e.g. new SparkConf().setMaster("local[2]")
- Runs the job in local mode.
- Specifically used to test code on a small amount of data in a local environment.
- Does not provide the advantages of a distributed environment.
- The number inside the brackets (or * for all available cores) is the number of CPU cores allocated to the local run.
- Helps in debugging the code by applying breakpoints while running from Eclipse or IntelliJ.

yarn-client, i.e. --master yarn --deploy-mode client
- The driver program runs on the client where you type the command to submit the Spark application (which may not be a machine in the YARN cluster).
- Although the driver runs on the client machine, the tasks are executed on the executors in the NodeManagers of the YARN cluster.

yarn-cluster, i.e. --master yarn --deploy-mode cluster
- The most advisable pattern for submitting Spark jobs in production.
- The driver program runs inside the YARN cluster (in the ApplicationMaster on one of the cluster nodes), not on the machine where you type the submit command.

A minimal submit sketch for each mode follows below.
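The sketch assumes a hypothetical application class com.example.MyApp packaged as my-app.jar; the spark-submit invocations are shown as comments so the whole sketch stays in one Scala snippet:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Local mode: 2 cores on the local machine; use "local[*]" for all available cores.
val conf = new SparkConf()
  .setAppName("my-app")
  .setMaster("local[2]")
val sc = new SparkContext(conf)

// YARN client mode (driver on the submitting machine, executors in the cluster):
//   spark-submit --master yarn --deploy-mode client --class com.example.MyApp my-app.jar
// YARN cluster mode (driver runs inside the cluster; recommended for production):
//   spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp my-app.jar
```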
03-03-2017
11:36 PM
4 Kudos
@Will Dailey, the JVM heap can be preconfigured with memory boundaries: the initial heap size (defined by the -Xms option) and the maximum heap size (defined by the -Xmx option). Used memory, from the perspective of the JVM, is working set + garbage. Committed memory is a measure of how much memory the JVM heap is really consuming. If -Xms < -Xmx and used memory equals the current heap size, the JVM is likely to grow its heap after a full garbage collection. However, if the current heap size has reached -Xmx (either through heap growth or because -Xms == -Xmx), the heap cannot grow any further. Both of these values are important when debugging DataNode OutOfMemory, PermGen, and GC errors (a small sketch of how to read them at runtime follows below). Reference link: https://pubs.vmware.com/vfabric52/index.jsp?topic=/com.vmware.vfabric.em4j.1.2/em4j/conf-heap-management.html
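As an illustration (not from the original post), the committed heap, the -Xmx ceiling, and the used portion can be read at runtime via java.lang.Runtime:

```scala
object HeapInfo extends App {
  val rt = Runtime.getRuntime
  val mb = 1024 * 1024

  val committed = rt.totalMemory() / mb                     // current (committed) heap size
  val max       = rt.maxMemory() / mb                       // ceiling set by -Xmx
  val used      = (rt.totalMemory() - rt.freeMemory()) / mb // working set + garbage

  println(s"Used: $used MB, Committed: $committed MB, Max (-Xmx): $max MB")
  // If used is close to committed and committed < max, a full GC may grow the heap;
  // once committed == max, the heap cannot grow any further.
}
```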
02-11-2017
05:28 AM
1 Kudo
@Shigeru Takehara, you can definitely have nodes with different memory/CPU in a cluster. You can have a machine with 5 GB of memory and 2 cores as one of your nodes without lowering the maximum to 5 GB globally: set yarn.nodemanager.resource.memory-mb=20000 on machines with 20 GB of memory and yarn.nodemanager.resource.memory-mb=5000 on the machine with 5 GB (a sketch follows below). You can also manage different configurations on different NodeManagers using Ambari; it's called host config groups: https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.1.0/bk_ambari-user-guide/content/using_host_config_groups.html
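A minimal yarn-site.xml sketch for the 5 GB node (the 20 GB nodes would use 20000 instead; the cpu-vcores value is an added assumption matching the 2-core example):

```xml
<!-- yarn-site.xml on the 5 GB / 2-core NodeManager -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>5000</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>2</value>
</property>
```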
02-09-2017
01:24 AM
4 Kudos
@yvora, it's because the %livy.sql interpreter runs in yarn-cluster mode whereas the %sql interpreter runs in yarn-client mode. Hence %sql can find the local file on the client machine, whereas %livy.sql won't be able to find it. Try putting the file in HDFS and use LOAD DATA INPATH rather than LOAD DATA LOCAL INPATH; it should work (see the sketch below).
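A minimal sketch, assuming a hypothetical file /tmp/mydata.csv already copied into HDFS (e.g. with hdfs dfs -put) and a hypothetical table mytable:

```sql
-- Run in a %livy.sql paragraph; the path is an HDFS path, not a local one.
LOAD DATA INPATH '/tmp/mydata.csv' INTO TABLE mytable;
```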