Member since: 11-30-2016
Posts: 33
Kudos Received: 5
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 15994 | 02-28-2017 04:58 PM |
11-02-2017
06:54 AM
Thanks for the quick response. One such issue I can point to is https://issues.apache.org/jira/browse/SPARK-20922, which affected Spark 2.1.1, and I found a few more through https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=spark, which in many places recommends using Spark version 2.2.0. Thanks,
11-02-2017
05:29 AM
Dear All, We are planning to upgrade our HDP cluster from 2.4.2 to 2.6.2 because our project needs Spark 2.2.0. Our security team has asked us to use Spark 2.2.0 since it is the latest vulnerability-free version. So the question is: is the Spark 2.1.1 that ships with the HDP 2.6.2 package free of known vulnerabilities? From what we have read and understood, the latest vulnerability-free version of Apache Spark is 2.2.0. Thanks in advance, Param.
08-21-2017
03:02 PM
Hi,
We have multiple models to load at run time to make predictions based on the trained models, and ours is a multithreaded application. If I load a model using org.apache.spark.ml.PipelineModel.load("path of the model"), will this work properly? Below is why I am asking:
1. I have not seen it documented anywhere that org.apache.spark.ml.PipelineModel is thread safe.
2. Even if it is not, can I still use it? Two or more threads could use the same org.apache.spark.ml.PipelineModel object at the same time; will that cause any problems?
Thanks,
Param.
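For what it's worth, a minimal sketch (class name and model paths are hypothetical) that loads each model once and shares the loaded PipelineModel across threads. transform() does not mutate the model, but since thread safety is not documented, treat this as an assumption rather than a guarantee:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.spark.ml.PipelineModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

public class ModelRegistry {
    // Load each model once and cache it; computeIfAbsent avoids duplicate loads.
    private final Map<String, PipelineModel> models = new ConcurrentHashMap<>();

    public PipelineModel get(String path) {
        return models.computeIfAbsent(path, PipelineModel::load);
    }

    // Each thread calls transform() on the shared model; transform() only reads the model.
    public Dataset<Row> predict(String path, Dataset<Row> input) {
        return get(path).transform(input);
    }
}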
05-09-2017
07:34 AM
@yvora Thanks for the response.
1. "Can you try setting spark.yarn.stagingDir to hdfs:///user/tmp/?" - This is not working.
2. "Can you please share which spark config are you trying to set which require RM address?" - I am running the Spark application from a Java program, so when the master is yarn it connects to the resource manager at 0.0.0.0:8032 by default. To override this I set the same in the Spark configuration, i.e.
sparkConf.set("spark.hadoop.yarn.resourcemanager.hostname", resourcemanagerHostname);
sparkConf.set("spark.hadoop.yarn.resourcemanager.address", resourcemanagerAddress);
But the problem is that the resource manager is HA enabled, so how do I connect to it? One idea I have found for my question is that the Hadoop site files can be added to the Spark context, as below:
JavaSparkContext jsc = new JavaSparkContext(sparkConf);
jsc.hadoopConfiguration().addResource(new Path(hadoopClusterSiteFilesBasePath + "core-site.xml"));
jsc.hadoopConfiguration().addResource(new Path(hadoopClusterSiteFilesBasePath + "hdfs-site.xml"));
jsc.hadoopConfiguration().addResource(new Path(hadoopClusterSiteFilesBasePath + "mapred-site.xml"));
jsc.hadoopConfiguration().addResource(new Path(hadoopClusterSiteFilesBasePath + "yarn-site.xml"));
But the resource manager and staging directory configuration are needed even before the context is created, so that does not solve the problem. What I am looking for is something like the above for the SparkConf class/object.
Thanks,
Param.
05-08-2017
10:24 AM
Hi All, Running a Spark application in YARN mode against HDFS works fine when I provide the properties below:
sparkConf.set("spark.hadoop.yarn.resourcemanager.hostname", resourcemanagerHostname);
sparkConf.set("spark.hadoop.yarn.resourcemanager.address", resourcemanagerAddress);
sparkConf.set("spark.yarn.stagingDir", stagingDirectory);
But there are two problems:
1. Since my HDFS is NameNode HA enabled, it does not work when I set spark.yarn.stagingDir to the common nameservice URL, for example hdfs://hdcluster/user/tmp/; it fails with "unknown host hdcluster". It works fine when I give the URL as hdfs://<ActiveNameNode>/user/tmp/, but we do not know in advance which NameNode will be active, so how do I resolve this? I have also noticed that SparkContext accepts a Hadoop configuration, but the SparkConf class has no method that accepts one.
2. How do I provide the resource manager address when the Resource Managers are running in HA?
Thanks in advance, Param.
05-02-2017
05:01 AM
Thank you for the response, I agree with you completely. But my question is that even the partial result is not getting stored correctly. Thanks, Param.
04-27-2017
02:07 PM
Dear All, I am using Spark to train a model and save it to the file system using org.apache.spark.ml.PipelineModel.save(). This works fine when I run Spark in local mode, but when I run Spark in standalone mode with 2 workers, it trains the model correctly yet stores only a partial result when saving to the file system. The file system I am using is the local one, i.e. file://. Could someone please point out what the issue is here? And is it a problem to store the trained PipelineModel on the local file system? FYI, the same code works when I use HDFS. The Spark version I am using is 2.0.0. Thanks in advance, Param.
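For illustration, a small sketch (the input path, columns, and the single Tokenizer stage are all placeholders) that saves the fitted PipelineModel to a shared file system; with a file:// path each executor writes its part files onto its own local disk, which is consistent with seeing only partial output on any one machine:
import org.apache.spark.ml.Pipeline;
import org.apache.spark.ml.PipelineModel;
import org.apache.spark.ml.PipelineStage;
import org.apache.spark.ml.feature.Tokenizer;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder().appName("save-model-example").getOrCreate();
Dataset<Row> training = spark.read().json("hdfs://namenode:8020/data/train.json");  // placeholder input

Pipeline pipeline = new Pipeline().setStages(new PipelineStage[]{
        new Tokenizer().setInputCol("text").setOutputCol("words")   // stand-in stage
});
PipelineModel model = pipeline.fit(training);

// Saving to a file system every node shares keeps all part files in one place
model.write().overwrite().save("hdfs://namenode:8020/models/example");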
04-10-2017
06:00 PM
Hi All, As per my understanding, PipelineModel.save saves the model to the file system based on the URI; before storing, it collects the individual results from the workers and writes them to the file system provided. When I run Spark in standalone mode with 2 workers and the save path is on the local Linux FS, say /tmp/examplemodel/, the model data is stored on the worker nodes as well. My question is: the data stored on the workers' file systems is intermediate data, while the complete model data is on the driver, so why does Spark not clean up the data on the workers once it is of no further use? And when I use HDFS, where is the workers' intermediate data stored, given that the final result ends up in one place? TIA, Param.
04-05-2017
06:09 AM
@yvora Actually I am submitting my Spark application from a program. I have done the same thing programmatically by adding sparkConfig.setJars(new String[]{"myjar.jar"}); but it is still not working.
04-04-2017
05:36 PM
Hi All, When I run Spark in local mode my code works fine, but when I run it in standalone mode with deploy mode client, I get a class not found exception, "Caused by: java.lang.ClassNotFoundException:", for my custom class. I am distributing the jar that contains the class to the workers by setting the property in the Spark configuration, sparkConfig.setJars(new String[]{"myjar.jar"});, and I can see the jar being distributed to the worker nodes, yet I still get the exception. Why is the executor not able to find the class in the jar on the worker node? Could someone please tell me what mistake I am making here? Thanks in advance, Param.
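A minimal sketch, assuming a hypothetical jar location and master URL; SparkConf.setJars takes paths (or URIs) that the driver can read, so an absolute path is usually safer than a bare file name:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

SparkConf sparkConfig = new SparkConf()
        .setAppName("standalone-classpath-example")
        .setMaster("spark://master-host:7077")              // placeholder master URL
        .setJars(new String[]{"/opt/app/lib/myjar.jar"});    // absolute path readable by the driver
JavaSparkContext jsc = new JavaSparkContext(sparkConfig);    // the jar is shipped to executors when the context starts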
03-31-2017
05:06 AM
@amankumbare Thanks for responding. Sumit and I work in the same team. It is happening for all topics, and the JDK we are using is 7. The main issue here is that the broker's znode content has no host and port information: ["PLAINTEXTSASL://xxxx.domain.com:9092"],"host":null,"version":2,"port":-1}
- Is this the expected behavior?
- It happens when the listener value is PLAINTEXTSASL://xxxx.domain.com:9092, and everything is fine when it is just PLAINTEXT.
Thanks in advance, Param.
03-17-2017
02:42 PM
Hi All, How do I close a question in the Hortonworks community by accepting an answer? Thanks, Param.
03-15-2017
03:24 PM
@Sandeep Nemuri Thanks, it worked. I had tried this already but forgot to create the file on each node; now it is fine. And I have one more question: if I run the Spark app in YARN mode I can set the memory parameter through the Spark configuration using the spark.yarn.driver.memoryOverhead property. Is something similar available for standalone and local mode? Thanks in advance, Param.
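A minimal sketch (master URL is a placeholder); standalone and local mode have no direct equivalent of spark.yarn.driver.memoryOverhead, but the general memory properties can be set on SparkConf, keeping in mind that spark.driver.memory only takes effect if it is set before the driver JVM starts:
import org.apache.spark.SparkConf;

SparkConf conf = new SparkConf()
        .setAppName("memory-example")
        .setMaster("spark://master-host:7077")    // or "local[*]"
        .set("spark.executor.memory", "4g")        // per-executor heap in standalone mode
        .set("spark.driver.memory", "2g");         // only effective if set before the driver JVM starts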
03-15-2017
11:31 AM
Hi All, When executing a Spark application on a YARN cluster, can I access the local file system (the underlying OS file system), even though YARN points to HDFS? Thanks, Param.
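A minimal sketch, assuming a hypothetical path that exists at the same location on every node: an explicit file:// URI reads from the local OS file system even when the cluster's default file system is HDFS:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

SparkConf conf = new SparkConf().setAppName("local-fs-example");   // master/deploy mode supplied at submit time
JavaSparkContext jsc = new JavaSparkContext(conf);
// The file:// scheme bypasses the default HDFS file system; the path must exist
// on every node that runs a task (hypothetical path shown).
JavaRDD<String> lines = jsc.textFile("file:///opt/data/input.txt");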
03-15-2017
11:25 AM
Sorry for the delayed reply, I got busy with some work. @Artem Ervits thanks a lot for all the responses. I was able to achieve this by setting the Spark configuration as below:
sparkConfig.set("spark.hadoop.yarn.resourcemanager.hostname","XXXXX");
sparkConfig.set("spark.hadoop.yarn.resourcemanager.address","XXXXX:8032");
sparkConfig.set("spark.yarn.access.namenodes", "hdfs://XXXXX:8020,hdfs://XXXX:8020");
sparkConfig.set("spark.yarn.stagingDir", "hdfs://XXXXX:8020/user/hduser/");
sparkConfig.set("--deploy-mode", deployMode);
Thanks,
Param.
03-08-2017
05:11 PM
Hi All, I am trying to describe a Kafka consumer group with the command below:
bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server broker1:9092,broker2:9092 --describe --group console-consumer-44261 --command-config conf/security_user_created
But I am getting the error: Consumer group `console-consumer-44261` does not exist or is rebalancing. I have checked ZooKeeper and the consumer group exists, yet I always get this error. Could someone point out what mistake I am making here? The Kafka version I am using is 0.9. Thanks in advance, Param.
02-28-2017
05:17 PM
Hi All, When I check the lag on a Kerberized Kafka cluster I get the error below.
Command I am using:
bin/kafka-consumer-groups.sh --describe --zookeeper XXXXX:2181 --group test-consumer-group --security-protocol SASL_PLAINTEXT
Output I am getting:
GROUP, TOPIC, PARTITION, CURRENT OFFSET, LOG END OFFSET, LAG, OWNER
Error while executing consumer group command null
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:122)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:114)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$sendRequest(SimpleConsumer.scala:99)
at kafka.consumer.SimpleConsumer.getOffsetsBefore(SimpleConsumer.scala:165)
at kafka.admin.ConsumerGroupCommand$ZkConsumerGroupService$anonfun$getLogEndOffset$1.apply(ConsumerGroupCommand.scala:205)
at kafka.admin.ConsumerGroupCommand$ZkConsumerGroupService$anonfun$getLogEndOffset$1.apply(ConsumerGroupCommand.scala:202)
at scala.Option.map(Option.scala:145)
at kafka.admin.ConsumerGroupCommand$ZkConsumerGroupService.getLogEndOffset(ConsumerGroupCommand.scala:202)
at kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.kafka$admin$ConsumerGroupCommand$ConsumerGroupService$describePartition(ConsumerGroupCommand.scala:133)
Could someone point out what the issue is here? Thanks, Param.
02-28-2017
05:08 PM
Hi All, When checking the offsets for Kerberized Kafka with the command:
bin/kafka-consumer-offset-checker.sh --zookeeper XXXX:2181 --group test-consumer --security-protocol SASL_PLAINTEXT
I get null output, i.e.:
Group Topic Pid Offset logSize Lag Owner
Exiting due to: null.
Could someone tell me whether there is an issue with the command I am using? The Kafka version I am using is 0.9.0.2.4. Thanks in advance, Param.
02-28-2017
04:58 PM
@Artem Ervits, Thanks a lot for your time and help. I was able to achieve my objective by setting the Hadoop and YARN properties in the Spark configuration:
sparkConfig.set("spark.hadoop.yarn.resourcemanager.hostname","XXX");
sparkConfig.set("spark.hadoop.yarn.resourcemanager.address","XXX:8032");
sparkConfig.set("spark.yarn.access.namenodes", "hdfs://XXXX:8020,hdfs://XXXX:8020");
sparkConfig.set("spark.yarn.stagingDir", "hdfs://XXXX:8020/user/hduser/");
Regards, Param.
02-27-2017
05:54 PM
@Artem Ervits, Thanks again! And sorry if I am asking too many questions here. What I am actually looking for is this: per the project requirement I should not use the spark-submit script, so I am passing the cluster configuration through the Spark config as given below.
SparkConf sparkConfig = new SparkConf().setAppName("Example App of Spark on Yarn");
sparkConfig.set("spark.hadoop.yarn.resourcemanager.hostname","XXXX");
sparkConfig.set("spark.hadoop.yarn.resourcemanager.address","XXXXX:8032");
It is able to identify the Resource Manager, but it is failing because it does not identify the file system, even though I am setting the HDFS file system configuration as well:
sparkConfig.set("fs.defaultFS", "hdfs://xxxhacluster");
sparkConfig.set("ha.zookeeper.quorum", "xxx:2181,xxxx:2181,xxxx:2181");
It still assumes the local file system, and the error I am getting in the Resource Manager is: exited with exitCode: -1000 due to: File file:/tmp/spark-0e6626c2-d344-4cae-897f-934e3eb01d8f/__spark_libs__1448521825653017037.zip does not exist
Thanks and regards, Param.
02-27-2017
06:44 AM
@Artem Ervits Thank you very much for the response. I am able to submit the job to YARN through the spark-submit command, but what I am actually looking for here is to do the same thing through a program. It would be great if you could share a template for that, preferably in Java. -Param.
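Not from this thread, but one common option is the SparkLauncher API that ships with Spark; a sketch assuming a hypothetical application jar and main class, and that SPARK_HOME and HADOOP_CONF_DIR point at a valid client configuration:
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class YarnSubmitExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical jar path and main class; SparkLauncher invokes spark-submit under the hood
        SparkAppHandle handle = new SparkLauncher()
                .setAppResource("/opt/app/lib/myapp.jar")
                .setMainClass("com.example.MySparkJob")
                .setMaster("yarn")
                .setDeployMode("cluster")
                .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                .startApplication();

        // Poll until the YARN application reaches a terminal state
        while (!handle.getState().isFinal()) {
            Thread.sleep(1000);
        }
        System.out.println("Final state: " + handle.getState());
    }
}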
02-27-2017
06:38 AM
@Sriharsha Chintalapani Could you please answer this question?
02-27-2017
06:36 AM
Thanks for the response, I very much agree with your answer.
02-26-2017
12:01 PM
1 Kudo
Hi All, I am new to Spark. I am trying to submit a Spark application from a Java program, and I am able to do so for a Spark standalone cluster. What I actually want to achieve is submitting the job to a YARN cluster, and I am able to connect to the YARN cluster by explicitly adding the Resource Manager property in the Spark config as below:
sparkConfig.set("spark.hadoop.yarn.resourcemanager.address","XXXX:8032");
But the application fails with:
exited with exitCode: -1000 due to: File file:/tmp/spark-0e6626c2-d344-4cae-897f-934e3eb01d8f/__spark_libs__1448521825653017037.zip does not exist
I got this from the Resource Manager log; what I found is that Spark assumes the file system is local and does not upload the required libraries.
Source and destination file systems are the same. Not copying file:/tmp/spark-1ed67f05-d496-4000-86c1-07fcf8526181/__spark_libs__1740543841989079602.zip
I got this from the Spark application side where I am running my program. The issue I suspect is that it assumes the file system is local and not HDFS; correct me if I am wrong. My questions are:
1. What is the actual reason for the job to fail, given the data and log info above?
2. Could you please tell me how to add resource files to the Spark configuration, like addResource in the Hadoop configuration?
Thanks in advance, Param.
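A sketch of one way to get addResource-like behavior (the site-file base path is a placeholder): load the cluster's XML files into a Hadoop Configuration and copy every entry into SparkConf with the spark.hadoop. prefix, which Spark forwards to the Hadoop configuration it builds:
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.spark.SparkConf;

// Load only the cluster site files (no defaults), then forward every entry to SparkConf
Configuration hadoopConf = new Configuration(false);
String base = "/etc/hadoop/conf/";                     // placeholder base path
hadoopConf.addResource(new Path(base + "core-site.xml"));
hadoopConf.addResource(new Path(base + "hdfs-site.xml"));
hadoopConf.addResource(new Path(base + "yarn-site.xml"));

SparkConf sparkConfig = new SparkConf().setAppName("yarn-from-java").setMaster("yarn");
for (Map.Entry<String, String> entry : hadoopConf) {
    // Keys prefixed with spark.hadoop. are copied into the Hadoop configuration Spark uses
    sparkConfig.set("spark.hadoop." + entry.getKey(), entry.getValue());
}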
02-26-2017
08:43 AM
1 Kudo
Hi All, When I try to get the lag from Kerberized Kafka I get an "Exiting due to: null" error. Command I am using:
bin/kafka-consumer-offset-checker.sh --zookeeper XXXXXXX:2181 --group XXXX-consumer --security-protocol PLAINTEXTSASL
Output:
Could not fetch offset for [metrics,0] due to kafka.common.NotCoordinatorForConsumerException.
Group Topic Pid Offset logSize Lag Owner
Exiting due to: null.
Could someone please point out where I am making a mistake here? Thanks in advance, Param.
02-26-2017
08:36 AM
1 Kudo
Hi All, When Kerberizing Kafka using Ambari, it sets the Kafka security protocol to PLAINTEXTSASL instead of SASL_PLAINTEXT, but everywhere in the documentation it is stated that it must be SASL_PLAINTEXT. I have a few questions about this:
1. Why is Ambari setting the security protocol to PLAINTEXTSASL; is it a bug?
2. We are able to produce and consume messages from a program written in Java, but the producer sets the security protocol to PLAINTEXTSASL while the consumer uses SASL_PLAINTEXT, and it works fine. How can that work when the actual protocol is just PLAINTEXTSASL?
Thanks in advance, Param.
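For reference, a sketch of the client-side settings the Java producer/consumer API expects (broker host is a placeholder, and a JAAS login configuration is assumed to be supplied separately via java.security.auth.login.config):
import java.util.Properties;

// Standard client-side security settings for a Kerberized cluster; the Java
// clients expect SASL_PLAINTEXT, while PLAINTEXTSASL appears to be the
// Ambari/HDP spelling used on the broker side.
Properties props = new Properties();
props.put("bootstrap.servers", "broker1.example.com:9092");
props.put("security.protocol", "SASL_PLAINTEXT");
props.put("sasl.kerberos.service.name", "kafka");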
12-28-2016
10:00 AM
Thank you for the response, the concept is clear to me. Actually my question is: if the maximum renewable lifetime of the ticket is 7 days, then the ticket can be renewed and used only on or before those 7 days. So when I make calls to HBase, the Hadoop client makes RPC calls and before that it executes the code below:
if (UserGroupInformation.isLoginKeytabBased()) {
    UserGroupInformation.getLoginUser().reloginFromKeytab();
} else if (UserGroupInformation.isLoginTicketBased()) {
    UserGroupInformation.getLoginUser().reloginFromTicketCache();
}
That matches what you said, but my question is also this: should I not get a ticket expiration exception once the user principal's maximum renewal time is reached? Why am I not getting that? And is UserGroupInformation.getLoginUser().reloginFromKeytab() actually changing the ticket times every time it re-logs in, i.e.:
Valid starting       Expires              Service principal
12/28/16 12:27:20    12/28/16 12:27:50    krbtgt/XYZ@XYZ.COM
        renew until 12/28/16 12:28:11
Finally, what is the difference between renewal (kinit -R on the command line) and re-login (UserGroupInformation.getLoginUser().reloginFromKeytab() in a program)?
Thanks, Param.
12-27-2016
05:16 PM
1 Kudo
Dear All, I am running a program that fetches records from a secured (Kerberized) HBase cluster. The user principal I am using in my program has a maximum ticket life of 30 seconds and a maximum renewable life of 1 minute. I am doing an experiment in a test program to understand how auto renewal works in Hadoop. When I make the thread sleep for one minute before fetching the records, it is still able to fetch them. My question: even though auto renewal of the ticket is working fine, the maximum renewable lifetime is 1 minute, so when the thread sleeps for a minute and then fetches the records, how is it still able to fetch them? This seems to violate the basic definition of the maximum renewable lifetime of a ticket. Is it because whenever it performs reloginFromKeytab before making an RPC call, the lifetime of the ticket is refreshed and advanced into the future, i.e. the current renewal time + maximum life time? And what is the difference between renewing a ticket and reloginFromKeytab? Thanks in advance, Param.
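A minimal sketch, assuming a hypothetical principal and keytab path: reloginFromKeytab (and checkTGTAndReloginFromKeytab) performs a fresh login and obtains a brand-new TGT from the KDC, whereas kinit -R only extends an existing ticket and therefore stops working once the maximum renewable lifetime is reached:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

Configuration conf = new Configuration();
conf.set("hadoop.security.authentication", "kerberos");
UserGroupInformation.setConfiguration(conf);

// Keytab-based login: a relogin acquires a fresh TGT rather than renewing the old one
UserGroupInformation.loginUserFromKeytab("hbaseclient@EXAMPLE.COM",
        "/etc/security/keytabs/hbaseclient.keytab");

// Before each batch of RPC calls, re-login if the current TGT is close to expiry
UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();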
12-14-2016
05:54 PM
1 Kudo
Hi Everyone, We are using an endpoint coprocessor to fetch records from our HBase cluster. We have a 3-node cluster and 180 regions in total. Calls to the endpoint coprocessor are taking more time than usual; after analysis, the property I suspect is hbase.regionserver.handler.count, which is 30 by default. My client code makes batch calls to the coprocessor, and there can be 10 such batch calls running simultaneously; each batch call creates 180 separate threads, so the total number of threads on the client side can sometimes be 1800. I changed hbase.regionserver.handler.count from 30 to 100 but am still not seeing much performance improvement. My questions:
1. What is a reasonable value for the property hbase.regionserver.handler.count?
2. How can I tell whether that property is impacting performance or not?
3. If I increase this property, what other settings should I modify for proper functioning?
Thanks in advance, Param.
12-07-2016
09:22 AM
Yes, I am creating the MultiRowRangeFilter, adding it to the filter list, and then setting the list on the Scan object.
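For reference, a minimal sketch of that setup with hypothetical row-key ranges, using the MultiRowRangeFilter and FilterList APIs:
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.MultiRowRangeFilter;
import org.apache.hadoop.hbase.filter.MultiRowRangeFilter.RowRange;
import org.apache.hadoop.hbase.util.Bytes;

static Scan buildRangeScan() throws IOException {
    // Hypothetical row-key ranges; start is inclusive, stop is exclusive here
    List<RowRange> ranges = new ArrayList<>();
    ranges.add(new RowRange(Bytes.toBytes("row0010"), true, Bytes.toBytes("row0020"), false));
    ranges.add(new RowRange(Bytes.toBytes("row0500"), true, Bytes.toBytes("row0600"), false));

    FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    filters.addFilter(new MultiRowRangeFilter(ranges));

    Scan scan = new Scan();
    scan.setFilter(filters);   // only rows inside the given ranges are returned
    return scan;
}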