Cannot connect to resource manager

Expert Contributor

Hi. I am running this code from Eclipse:

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object HelloWorld {
  def main(args: Array[String]): Unit = {
    // val master = "spark://master.royble.co.uk:7077"
    // yarn-client mode: the driver runs locally (here, inside Eclipse) and asks YARN for executors
    val sc = new SparkContext("yarn-client", "Plane data", System.getenv("SPARK_HOME"))

    val sqlContext = new SQLContext(sc)
    // Read the CSV files via the spark-csv package, taking column names from the header row
    val df = sqlContext.read
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .option("inferSchema", "true")
      .load("/tmp/airflightsdelays/")
    df.printSchema()
  }
}

but it is giving me this error:

16/09/30 15:39:45 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/09/30 15:39:46 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

I have tried changing yarn.resourcemanager.address from s1.royble.co.uk:8050 to s1.royble.co.uk:8032 but this did not fix it. Any other pointers are gratefully received. TIA!
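
The address 0.0.0.0:8032 in the log above is YARN's built-in default for yarn.resourcemanager.address, which usually suggests that the client process never loaded the cluster's yarn-site.xml (Hadoop reads it from the classpath). A minimal sketch for checking this from the same Eclipse run configuration, using only the JDK; the object and method names are made up for illustration and are not part of Spark or Hadoop:

```scala
// Hypothetical helper: reports where, if anywhere, a Hadoop config file
// is visible on the current classpath.
object ClasspathCheck {
  // Returns the URL of the named resource as a string, or None if it is not on the classpath.
  def locate(resource: String): Option[String] = {
    val loader = Option(Thread.currentThread.getContextClassLoader)
      .getOrElse(getClass.getClassLoader)
    Option(loader.getResource(resource)).map(_.toString)
  }

  def main(args: Array[String]): Unit = {
    for (name <- Seq("yarn-site.xml", "core-site.xml", "hdfs-site.xml")) {
      locate(name) match {
        case Some(url) => println(s"$name found at $url")
        case None      => println(s"$name NOT on classpath; built-in defaults will apply")
      }
    }
  }
}
```

If yarn-site.xml is reported as missing, editing the file on the cluster has no effect on the Eclipse client, which would explain why changing the port made no difference. Adding the directory that holds the cluster's configuration files to the run configuration's classpath is one thing to try.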

FYI:

Spark is 1.6.0.2.4 and HDP is 2.4.2.0 (2.4.2.0-258)

Java build path: [screenshot 8163-screenshot-from-2016-09-30-154911.png]

My yarn-site.xml is:

  <configuration>
    
    <property>
      <name>hadoop.registry.rm.enabled</name>
      <value>true</value>
    </property>
    
    <property>
      <name>hadoop.registry.zk.quorum</name>
      <value>master.royble.co.uk:2181,s2.royble.co.uk:2181,s3.royble.co.uk:2181</value>
    </property>
    
    <property>
      <name>yarn.acl.enable</name>
      <value>false</value>
    </property>
    
    <property>
      <name>yarn.admin.acl</name>
      <value>yarn</value>
    </property>
    
    <property>
      <name>yarn.application.classpath</name>
      <value>$HADOOP_CONF_DIR,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*</value>
    </property>
    
    <property>
      <name>yarn.client.nodemanager-connect.max-wait-ms</name>
      <value>60000</value>
    </property>
    
    <property>
      <name>yarn.client.nodemanager-connect.retry-interval-ms</name>
      <value>10000</value>
    </property>
    
    <property>
      <name>yarn.http.policy</name>
      <value>HTTP_ONLY</value>
    </property>
    
    <property>
      <name>yarn.log-aggregation-enable</name>
      <value>true</value>
    </property>
    
    <property>
      <name>yarn.log-aggregation.retain-seconds</name>
      <value>2592000</value>
    </property>
    
    <property>
      <name>yarn.log.server.url</name>
      <value>http://s1.royble.co.uk:19888/jobhistory/logs</value>
    </property>
    
    <property>
      <name>yarn.node-labels.enabled</name>
      <value>false</value>
    </property>
    
    <property>
      <name>yarn.node-labels.fs-store.retry-policy-spec</name>
      <value>2000, 500</value>
    </property>
    
    <property>
      <name>yarn.node-labels.fs-store.root-dir</name>
      <value>/system/yarn/node-labels</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.address</name>
      <value>0.0.0.0:45454</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.admin-env</name>
      <value>MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle,spark_shuffle</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
      <value>org.apache.spark.network.yarn.YarnShuffleService</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.bind-host</name>
      <value>0.0.0.0</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.container-executor.class</name>
      <value>org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.container-monitor.interval-ms</name>
      <value>3000</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.delete.debug-delay-sec</name>
      <value>0</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
      <value>90</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb</name>
      <value>1000</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
      <value>0.25</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.health-checker.interval-ms</name>
      <value>135000</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.health-checker.script.timeout-ms</name>
      <value>60000</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
      <value>hadoop-yarn</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
      <value>false</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
      <value>false</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.linux-container-executor.group</name>
      <value>hadoop</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
      <value>org.apache.hadoop.yarn.server.nodemanager.util.DefaultLCEResourcesHandler</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.local-dirs</name>
      <value>/hadoop/yarn/local</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.log-aggregation.compression-type</name>
      <value>gz</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.log-aggregation.debug-enabled</name>
      <value>false</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.log-aggregation.num-log-files-per-app</name>
      <value>30</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
      <value>-1</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.log-dirs</name>
      <value>/hadoop/yarn/log</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.log.retain-second</name>
      <value>604800</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.recovery.dir</name>
      <value>/var/log/hadoop-yarn/nodemanager/recovery-state</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.recovery.enabled</name>
      <value>true</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.remote-app-log-dir</name>
      <value>/app-logs</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
      <value>logs</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>6</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>11264</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
      <value>80</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>2.1</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.address</name>
      <value>s1.royble.co.uk:8050</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.admin.address</name>
      <value>s1.royble.co.uk:8141</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.am.max-attempts</name>
      <value>2</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.bind-host</name>
      <value>0.0.0.0</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.connect.max-wait.ms</name>
      <value>900000</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.connect.retry-interval.ms</name>
      <value>30000</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.fs.state-store.retry-policy-spec</name>
      <value>2000, 500</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.fs.state-store.uri</name>
      <value> </value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.ha.enabled</name>
      <value>false</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>s1.royble.co.uk</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.nodes.exclude-path</name>
      <value>/etc/hadoop/conf/yarn.exclude</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.recovery.enabled</name>
      <value>true</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.resource-tracker.address</name>
      <value>s1.royble.co.uk:8025</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.scheduler.address</name>
      <value>s1.royble.co.uk:8030</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.scheduler.monitor.enable</name>
      <value>false</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.state-store.max-completed-applications</name>
      <value>${yarn.resourcemanager.max-completed-applications}</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.store.class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.system-metrics-publisher.dispatcher.pool-size</name>
      <value>10</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
      <value>true</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.webapp.address</name>
      <value>s1.royble.co.uk:8088</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled</name>
      <value>false</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.webapp.https.address</name>
      <value>s1.royble.co.uk:8090</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
      <value>true</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.work-preserving-recovery.scheduling-wait-ms</name>
      <value>10000</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.zk-acl</name>
      <value>world:anyone:rwcda</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.zk-address</name>
      <value>master.royble.co.uk:2181,s2.royble.co.uk:2181,s3.royble.co.uk:2181</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.zk-num-retries</name>
      <value>1000</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.zk-retry-interval-ms</name>
      <value>1000</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.zk-state-store.parent-path</name>
      <value>/rmstore</value>
    </property>
    
    <property>
      <name>yarn.resourcemanager.zk-timeout-ms</name>
      <value>10000</value>
    </property>
    
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>11264</value>
    </property>
    
    <property>
      <name>yarn.scheduler.maximum-allocation-vcores</name>
      <value>6</value>
    </property>
    
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>1024</value>
    </property>
    
    <property>
      <name>yarn.scheduler.minimum-allocation-vcores</name>
      <value>1</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.address</name>
      <value>s3.royble.co.uk:10200</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.bind-host</name>
      <value>0.0.0.0</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.client.max-retries</name>
      <value>30</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.client.retry-interval-ms</name>
      <value>1000</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.enabled</name>
      <value>true</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.entity-group-fs-store.active-dir</name>
      <value>/ats/active/</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.entity-group-fs-store.cleaner-interval-seconds</name>
      <value>3600</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.entity-group-fs-store.done-dir</name>
      <value>/ats/done/</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.entity-group-fs-store.group-id-plugin-classes</name>
      <value>org.apache.tez.dag.history.logging.ats.TimelineCachePluginImpl</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.entity-group-fs-store.retain-seconds</name>
      <value>604800</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.entity-group-fs-store.scan-interval-seconds</name>
      <value>60</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.entity-group-fs-store.summary-store</name>
      <value>org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.generic-application-history.store-class</name>
      <value>org.apache.hadoop.yarn.server.applicationhistoryservice.NullApplicationHistoryStore</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.http-authentication.simple.anonymous.allowed</name>
      <value>true</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.http-authentication.type</name>
      <value>simple</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.leveldb-state-store.path</name>
      <value>/hadoop/yarn/timeline</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.leveldb-timeline-store.path</name>
      <value>/hadoop/yarn/timeline</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.leveldb-timeline-store.read-cache-size</name>
      <value>104857600</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.leveldb-timeline-store.start-time-read-cache-size</name>
      <value>10000</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.leveldb-timeline-store.start-time-write-cache-size</name>
      <value>10000</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.leveldb-timeline-store.ttl-interval-ms</name>
      <value>300000</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.recovery.enabled</name>
      <value>true</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.state-store-class</name>
      <value>org.apache.hadoop.yarn.server.timeline.recovery.LeveldbTimelineStateStore</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.store-class</name>
      <value>org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.ttl-enable</name>
      <value>true</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.ttl-ms</name>
      <value>2678400000</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.version</name>
      <value>1.5</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.webapp.address</name>
      <value>s3.royble.co.uk:8188</value>
    </property>
    
    <property>
      <name>yarn.timeline-service.webapp.https.address</name>
      <value>s3.royble.co.uk:8190</value>
    </property>
    
  </configuration>

1 ACCEPTED SOLUTION

Expert Contributor

I set the following values and it connected OK:

.set("spark.hadoop.yarn.resourcemanager.hostname", "resourcemanager.fqdn")
.set("spark.hadoop.yarn.resourcemanager.address", "resourcemanager.fqdn:8032")
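
For context, these .set calls go on a SparkConf that is then passed to the SparkContext. A minimal sketch of how that might look with the Spark 1.6 API used in the question; resourcemanager.fqdn is the placeholder from the post and must be replaced with the real ResourceManager host, and the object and method names are illustrative:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PlaneData {
  // Builds the configuration separately so it can be inspected before connecting.
  def buildConf(rmHost: String): SparkConf =
    new SparkConf()
      .setMaster("yarn-client")
      .setAppName("Plane data")
      // Properties prefixed with spark.hadoop. are copied into the Hadoop configuration Spark uses
      .set("spark.hadoop.yarn.resourcemanager.hostname", rmHost)
      .set("spark.hadoop.yarn.resourcemanager.address", s"$rmHost:8032")

  def main(args: Array[String]): Unit = {
    // "resourcemanager.fqdn" is a placeholder, as in the post above
    val sc = new SparkContext(buildConf("resourcemanager.fqdn"))
    try {
      println(s"Connected, application id: ${sc.applicationId}")
    } finally {
      sc.stop()
    }
  }
}
```

Use whichever port the cluster's yarn.resourcemanager.address actually specifies; 8032 is the value from the post.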

4 REPLIES

Super Guru

@ed day

Can you telnet to that port? You just need to check that the port is accessible. Are you using the sandbox or pointing to an actual cluster? If it is the sandbox and you cannot telnet, you need to add a port-forwarding rule for that port. If it is the latter, you may be on a different subnet and may have to work with your network admin.
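
Where telnet is not installed, the same reachability check can be done from the JVM. A minimal sketch using only java.net; the object name, the default host resourcemanager.fqdn, and the default port 8032 are placeholders, not values confirmed for any particular cluster:

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

object PortCheck {
  // Returns true if a TCP connection to host:port succeeds within timeoutMs.
  def isReachable(host: String, port: Int, timeoutMs: Int = 3000): Boolean = {
    val socket = new Socket()
    try {
      Try(socket.connect(new InetSocketAddress(host, port), timeoutMs)).isSuccess
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Host and port are placeholders; pass the ResourceManager address as arguments
    val host = args.headOption.getOrElse("resourcemanager.fqdn")
    val port = args.lift(1).map(_.toInt).getOrElse(8032)
    println(s"$host:$port reachable: ${isReachable(host, port)}")
  }
}
```

A successful connection only shows that something is listening on that port. It does not show that the Spark client is configured to use that address.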

Expert Contributor

Thanks. The ports were OK.

Master Mentor

@ed day

Please consider revising your posts to remove your server names. This is a public forum and you are compromising your security.