12-13-2016 09:27 AM
I am trying to run and debug a Spark program from Eclipse against a Cloudera cluster on AWS EC2. I tried both of these:
val conf = new SparkConf().setAppName("WordCount").setMaster("yarn-client")
val conf = new SparkConf().setAppName("WordCount").setMaster("local")
I found that I am facing an issue: the NameNode in the AWS EC2 cluster returns the private AWS IPs of the nodes,
such as 172.31.26.79, 172.31.26.80, etc., which my local Windows machine is not able to reach.
Any idea how to handle this?
12-13-2016 11:11 AM
It's also possible to establish an SSH tunnel in order to connect to a remote debug session. Take a look at the -L option for ssh: it opens a local port and forwards it to a remote host and port that you specify in the ssh command. This will work for private IPs as long as you can connect, via a public IP, to a server that has access to the private network. Note, though, that latency can be extreme and debugging can still be difficult in setups like this.
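As a rough sketch of the tunnel idea above, the Scala helper below only builds the two strings involved: the ssh -L command line and the JVM debug-agent option for the remote driver. The gateway user@host and port 5005 are hypothetical placeholders, not values from this thread; the private IP is the one quoted in the question. It assumes the driver JVM on the cluster is started with the agent option (for example via spark-submit's --driver-java-options) and that Eclipse then attaches with a "Remote Java Application" configuration pointing at localhost and the forwarded port.

```scala
// Sketch only: the gateway address and port 5005 are hypothetical placeholders.
object DebugTunnel {
  // JVM option that makes the remote JVM listen for a debugger on `port`
  // and wait (suspend=y) until one attaches.
  def jdwpOption(port: Int): String =
    s"-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=$port"

  // Arguments for `ssh -L`: forward localPort on this machine to
  // targetHost:targetPort as seen from the gateway. -N means "no remote command".
  def sshTunnelArgs(localPort: Int, targetHost: String, targetPort: Int, gateway: String): Seq[String] =
    Seq("ssh", "-N", "-L", s"$localPort:$targetHost:$targetPort", gateway)

  def main(args: Array[String]): Unit = {
    // Hypothetical gateway: any EC2 host with a public IP and access to the private network.
    val gateway = "ec2-user@public-gateway.example.com"
    println(sshTunnelArgs(5005, "172.31.26.79", 5005, gateway).mkString(" "))
    println(jdwpOption(5005))
  }
}
```

Run the printed ssh command in a terminal on the local machine and keep it open while debugging; the debugger then connects to localhost:5005 rather than to the private IP.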
01-17-2017 05:39 PM
Amazon Elastic MapReduce (EMR) builds proprietary versions of Apache Hadoop, Hive, and Pig optimized for running on Amazon Web Services. Amazon EMR provides a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (EC2) and Simple Storage Service (S3).