About majnam

majnam · ‎05-18-2018

Hi! I'm using Spark2 on YARN and I have a maybe weird question. I want to turn off the Spark History Server, but I have Spark jobs running on YARN. Can you tell me what will happen if I turn it off? Will my applications die, or will I be unable to submit more jobs? Thanks in advance, I really appreciate any help 🙂

majnam · ‎01-31-2018

@Gour Saha Can you please help me?

majnam · ‎12-05-2017

Thank you very much sir. You solved my case 🙂

majnam · ‎12-05-2017

Hi, I'm trying to make an HA Hadoop client for a Spark job (needed for the Spark warehouse) which will switch from NN1 to NN2 if NN1 breaks down.

    import java.io.IOException;

    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.HdfsConfiguration;

    public class ConfigFactoryTest {
        public static void main(String[] args) throws IOException {
            HdfsConfiguration conf = new HdfsConfiguration(true);
            conf.set("fs.defaultFS", "hdfs://bigdata5.int.ch:8020");
            conf.set("fs.default.name", conf.get("fs.defaultFS"));
            conf.set("dfs.nameservices", "hdfscluster");
            conf.set("dfs.ha.namenodes.nameservice1", "nn1,nn2");
            conf.set("dfs.namenode.rpc-address.hdfscluster.nn1", "bigdata1.int.ch:8020");
            conf.set("dfs.namenode.rpc-address.hdfscluster.nn2", "bigdata5.int.ch:8020");
            conf.set("dfs.client.failover.proxy.provider.hdfscluster",
                    "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

            FileSystem fs = FileSystem.get(conf);
            // List the root directory in a loop so a NameNode failover can be observed.
            while (true) {
                FileStatus[] fsStatus = fs.listStatus(new Path("/"));
                for (int i = 0; i < fsStatus.length; i++) {
                    System.out.println(fsStatus[i].getPath().toString());
                }
            }
        }
    }

I followed the examples, but when I tried to turn off NN1 while this client was running, I got an exception saying that NN1 isn't available anymore, and the application shut down. Can someone point me in the right direction? Thank you
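Two things in the snippet above look inconsistent with the usual HDFS HA client key layout: fs.defaultFS points at a single host instead of the logical nameservice, and the NameNode list is registered under dfs.ha.namenodes.nameservice1 while every other key uses hdfscluster. A minimal sketch of how the keys would line up is shown here, assuming the nameservice is meant to be hdfscluster. The class HaClientProps and its build method are hypothetical helpers, not part of Hadoop; they only assemble key/value strings, which could then be applied with conf.set(key, value).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a Hadoop class): builds the client-side HA
// properties so that every key uses the same nameservice ID.
public class HaClientProps {

    static Map<String, String> build(String nameservice, Map<String, String> rpcAddresses) {
        Map<String, String> props = new LinkedHashMap<>();
        // The default filesystem is the logical nameservice, not one NameNode host.
        props.put("fs.defaultFS", "hdfs://" + nameservice);
        props.put("dfs.nameservices", nameservice);
        // The NameNode ID list is keyed by the same nameservice ID.
        props.put("dfs.ha.namenodes." + nameservice, String.join(",", rpcAddresses.keySet()));
        for (Map.Entry<String, String> e : rpcAddresses.entrySet()) {
            props.put("dfs.namenode.rpc-address." + nameservice + "." + e.getKey(), e.getValue());
        }
        props.put("dfs.client.failover.proxy.provider." + nameservice,
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return props;
    }

    public static void main(String[] args) {
        // Host names are taken from the post above.
        Map<String, String> nameNodes = new LinkedHashMap<>();
        nameNodes.put("nn1", "bigdata1.int.ch:8020");
        nameNodes.put("nn2", "bigdata5.int.ch:8020");
        build("hdfscluster", nameNodes).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

With these keys the client would address hdfs://hdfscluster and let the failover proxy provider pick the active NameNode. Whether this alone resolves the exception depends on the cluster, so treat it as a starting point.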

majnam · ‎11-22-2017

Do you need to pay for SmartSense or not? @Artem Ervits

majnam · ‎11-06-2017

Very frequently, when I try to run a Spark application, this is the kind of error I run into:

17/11/06 13:58:57 WARN DFSClient: DFSOutputStream ResponseProcessor exception for block BP-1246657973-10.60.213.61-1495788390217:blk_1076301910_2561450
java.io.EOFException: Premature EOF: no length prefix available
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2464)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:843)
17/11/06 13:58:57 WARN DFSClient: Error Recovery for block BP-1246657973-10.60.213.61-1495788390217:blk_1076301910_2561450 in pipeline DatanodeInfoWithStorage[xx.xxx.xx.xx6:50010,DS-dc399cb9-1705-4471-aad7-db328b1a4d94,DISK], DatanodeInfoWithStorage[xx.xxx.xx.xx6:50010,DS-0541333f-cf15-4c2b-af07-ce5aa75ef21a,DISK]: bad datanode DatanodeInfoWithStorage[xx.xxx.xx.xx6:50010,DS-dc399cb9-1705-4471-aad7-db328b1a4d94,DISK]
17/11/06 13:59:01 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.io.IOException: All datanodes DatanodeInfoWithStorage[xx.xxx.xx.xx6:50010,DS-0541333f-cf15-4c2b-af07-ce5aa75ef21a,DISK] are bad. Aborting...
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1227)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:999)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:506)
17/11/06 13:59:11 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.io.IOException: All datanodes DatanodeInfoWithStorage[xx.xxx.xx.xx6:50010,DS-0541333f-cf15-4c2b-af07-ce5aa75ef21a,DISK] are bad. Aborting...
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1227)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:999)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:506)

Can someone please explain to me why this is happening? I can't find anything helpful, because all nodes look fine on the dashboard. None of the datanodes report any problem, and neither does *hdfs fsck*. Any ideas? I'm really struggling 😕

majnam · ‎11-03-2017

Spark AM logs? Can you point me to them, please? :S

majnam · ‎10-30-2017

I decreased yarn.scheduler.minimum-allocation-mb to 256 MB. The Spark submit options are now the following:

--executor-memory 256m --executor-cores 1 --num-executors 1 --driver-memory 512m

I had to set --driver-memory to 512 MB, since the application wouldn't start otherwise. With this configuration the application takes 2 GB of RAM and, to answer your question, the job does run across 2 containers as you assumed, each taking 1024 MB.

UPDATE: In the INFO output of the Spark job I can see this:

17/10/30 17:57:10 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
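The numbers in this post can be checked with a little arithmetic. The sketch below assumes Spark's documented default memory overhead for YARN (the larger of 384 MB and 10% of the requested heap) and assumes the scheduler rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb. The rounding rule is an assumption about this cluster's scheduler, and the class YarnContainerSize is a hypothetical helper, not a Spark or YARN API.

```java
// Hypothetical helper: estimates YARN container sizes for Spark requests.
public class YarnContainerSize {
    static final int MIN_OVERHEAD_MB = 384;      // assumed Spark default minimum overhead
    static final double OVERHEAD_FACTOR = 0.10;  // assumed Spark default overhead factor

    // Heap plus overhead, i.e. what Spark asks YARN for.
    static int requestedMb(int heapMb) {
        int overheadMb = Math.max(MIN_OVERHEAD_MB, (int) (heapMb * OVERHEAD_FACTOR));
        return heapMb + overheadMb;
    }

    // Round the request up to the next multiple of the allocation step.
    static int roundedMb(int requestedMb, int stepMb) {
        int units = (requestedMb + stepMb - 1) / stepMb;
        return units * stepMb;
    }

    public static void main(String[] args) {
        int am = requestedMb(512);        // 512 MB heap, as in the AM log line
        int executor = requestedMb(256);  // --executor-memory 256m
        for (int step : new int[] {256, 1024}) {
            System.out.println("step " + step + " MB: AM " + am + " -> " + roundedMb(am, step)
                    + " MB, executor " + executor + " -> " + roundedMb(executor, step) + " MB");
        }
    }
}
```

Under these assumptions the AM request is 512 + 384 = 896 MB, matching the log line, and it rounds up to 1024 MB with either step. The executor request is 256 + 384 = 640 MB, which would round to 768 MB with a 256 MB step and to 1024 MB with a 1024 MB step, so which container size actually appears depends on the rounding step the scheduler is really using.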

majnam · ‎10-30-2017

I really have the feeling that YARN is overriding the parameters I'm passing. I also tried setting --num-executors to 2, and it used 3, as you can see in the first picture above.

majnam · ‎10-30-2017

Yeah, I did that of course, as suggested. I tested once more, and the same job is still taking 3 GB. This is how my config looks now: screenshot-7.jpg

Last Visited	‎06-11-2018 02:57 PM

Member Since	‎12-15-2016 03:44 PM
Posts	54
Kudos received	4

Cloudera Community

Re: How to purge everything after deleting service

Spark History server dependecy

Re: SPARK job taking more memory then it is given

Re: HA Hadoop core

HA Hadoop core

Re: How to create Smartsense id, custoemer id whil...

All datanodes are bad aborting

Re: SPARK job taking more memory then it is given

Re: SPARK job taking more memory then it is given

Re: SPARK job taking more memory then it is given

Re: SPARK job taking more memory then it is given