Member since 09-25-2015

24 Posts
9 Kudos Received
3 Solutions

My Accepted Solutions

| Title | Views | Posted | 
|---|---|---|
| | 2117 | 12-14-2015 05:59 PM |
| | 2354 | 12-13-2015 05:28 PM |
| | 4243 | 12-12-2015 10:04 PM |
			
    
	
		
		
01-07-2016 05:34 PM

@jspeidel Thanks John, your solution worked. I was indeed missing the Oozie HA property.
						
					
01-07-2016 05:33 PM

@rnettleton Yes, John's solution worked. Thanks a lot for your help!
						
					
01-05-2016 08:15 PM

Ah, thanks. Let me try it and see if it works.
						
					
01-05-2016 07:41 PM

Hi Ancil - I'm not using an external DB for Ambari. I set up Ambari with "ambari-server setup -s -j /path/to/jdk", which accepts all defaults and only overrides the JDK path with my custom one. The default Ambari server DB is embedded Postgres.

The Blueprint Processor class is responsible for substituting hostnames, and that should be a plain string replacement once the topology has been correctly resolved. So I'm not sure that choosing an external DB would affect it.
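
For illustration only, the kind of string replacement described above might look roughly like the sketch below. It is not Ambari's actual Blueprint Processor code: the function name `substitute_hostgroups`, the mapping, and the example hostname are all made up, and the token pattern is simply modeled on the `%HOSTGROUP::name%` strings that appear in this thread.

```python
import re

# Placeholder format modeled on the %HOSTGROUP::name% tokens seen in this
# thread. This is an assumption for illustration, not Ambari's implementation.
TOKEN = re.compile(r"%HOSTGROUP::([^%]+)%")


def substitute_hostgroups(value, group_to_host):
    """Replace each %HOSTGROUP::name% token with the host mapped to that group.

    Tokens whose group has no mapping are left untouched, so an unresolved
    group stays visible in the output instead of silently disappearing.
    """
    def repl(match):
        return group_to_host.get(match.group(1), match.group(0))

    return TOKEN.sub(repl, value)


# Made-up mapping from host group name to hostname.
mapping = {"master_2": "master2.example.com"}

print(substitute_hostgroups("%HOSTGROUP::master_2%:50070", mapping))
print(substitute_hostgroups("%HOSTGROUP::worker_1%:8020", mapping))
```

The first call yields a usable `host:port` value, while the second leaves the token in place because `worker_1` has no entry in the mapping, which is the shape of the broken values reported in the question.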
						
					
01-05-2016 07:27 PM

Ambari Server log Gist: https://gist.github.com/DhruvKumar/e2c06a94388c51e...
						
					
01-05-2016 07:27 PM

Hi John, the blueprint and cluster creation template are linked from the question's description. Please see the links at the end of the description.

I've also added the Ambari Server log just now.
						
					
01-05-2016 06:59 PM

I'm not sure that matters. To the best of my knowledge, the name of a host group can be anything; it is just a String, and the blueprint processor should substitute the correct hosts as long as the names match those in the cluster creation template. See Sean's HA blueprint here, which doesn't use the "host_groups" suffix: https://github.com/seanorama/ambari-bootstrap/blob/master/api-examples/blueprints/blueprint-hdfs-ha.json
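
A quick way to check that the names do match up is to compare the two JSON documents directly. The sketch below assumes that both the blueprint and the cluster creation template carry a top-level `host_groups` list whose entries have a `name` field; the helper names and the tiny sample documents are made up for illustration.

```python
import json


def host_group_names(doc):
    """Return the set of host group names found in a parsed JSON document."""
    return {group["name"] for group in doc.get("host_groups", [])}


def compare_host_groups(blueprint, template):
    """Return (names only in the blueprint, names only in the template)."""
    bp_names = host_group_names(blueprint)
    tpl_names = host_group_names(template)
    return sorted(bp_names - tpl_names), sorted(tpl_names - bp_names)


# Tiny made-up documents; real files would be read with json.load().
blueprint = json.loads(
    '{"host_groups": [{"name": "master_1"}, {"name": "worker"}]}'
)
template = json.loads(
    '{"host_groups": [{"name": "master_1",'
    ' "hosts": [{"fqdn": "m1.example.com"}]}]}'
)

only_in_blueprint, only_in_template = compare_host_groups(blueprint, template)
print("only in blueprint:", only_in_blueprint)
print("only in template:", only_in_template)
```

Any name that shows up in only one of the two lists is a candidate for a host group that cannot be mapped to physical hosts.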
						
					
01-05-2016 06:46 PM

I'm using Ambari 2.1.2 to install a highly available HDP 2.3.4 cluster. The services install successfully on the nodes, but they fail to start. Digging into the logs and the config files, I found that the %HOSTGROUP::name%:port placeholders were not replaced with the actual hostnames defined in the cluster creation template. As a result, the config files contain invalid URIs like these:

    <property>
      <name>dfs.namenode.http-address</name>
      <value>%HOSTGROUP::master_2%:50070</value>
      <final>true</final>
    </property>
    <property>
      <name>dfs.namenode.http-address.mycluster.nn1</name>
      <value>%HOSTGROUP::master_2%:50070</value>
    </property>
Sure enough, the errors while starting the services pointed to the same cause:

[root@worker1 azureuser]# cat /var/log/hadoop/hdfs/hadoop-hdfs-datanode-worker1.log
2016-01-05 02:24:22,601 INFO  datanode.DataNode (LogAdapter.java:info(45)) - STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = worker1.012g3iyhe01upgbu35npgl5l4a.gx.internal.cloudapp.net/10.0.0.9
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 2.7.1.2.3.4.0-3485
.
.
.
2016-01-05 02:34:27,068 FATAL datanode.DataNode (DataNode.java:secureMain(2533)) - Exception in secureMain
java.lang.IllegalArgumentException: Does not contain a valid host:port authority: %HOSTGROUP::master_2%:8020
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:198)
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
	at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNameserviceId(DFSUtil.java:687)
	at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNsIds(DFSUtil.java:655)
	at org.apache.hadoop.hdfs.DFSUtil.getNNServiceRpcAddressesForCluster(DFSUtil.java:872)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.refreshNamenodes(BlockPoolManager.java:155)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1152)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:430)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2411)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2298)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2345)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2526)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2550)
2016-01-05 02:34:27,072 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2016-01-05 02:34:27,076 INFO  datanode.DataNode (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at worker1.012g3iyhe01upgbu35npgl5l4a.gx.internal.cloudapp.net/10.0.0.9
************************************************************/
Interestingly, the Ambari server log reports that the hostname mapping succeeded for the master nodes, but I couldn't find the corresponding entries for the worker nodes.

05 Jan 2016 02:13:40,271  INFO [pool-2-thread-1] TopologyManager:598 - TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = master_5 has been fully resolved, as all 1 required hosts are mapped to 1 physical hosts.
05 Jan 2016 02:13:40,272  INFO [pool-2-thread-1] TopologyManager:598 - TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = master_1 has been fully resolved, as all 1 required hosts are mapped to 1 physical hosts.
05 Jan 2016 02:13:40,273  INFO [pool-2-thread-1] TopologyManager:598 - TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = master_2 has been fully resolved, as all 1 required hosts are mapped to 1 physical hosts.
05 Jan 2016 02:13:40,273  INFO [pool-2-thread-1] TopologyManager:598 - TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = master_3 has been fully resolved, as all 1 required hosts are mapped to 1 physical hosts.
05 Jan 2016 02:13:40,274  INFO [pool-2-thread-1] TopologyManager:598 - TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = master_4 has been fully resolved, as all 1 required hosts are mapped to 1 physical hosts.
(But even the master nodes had service startup failures.)

Here's the config Blueprint Gist: https://gist.github.com/DhruvKumar/355af66897e584b...

And here's the cluster creation template: https://gist.github.com/DhruvKumar/9b971be81389317...

Here's the blueprint exported from the Ambari server after installation (using /api/v1/clusters/clusterName?format=blueprint): https://gist.github.com/DhruvKumar/373cd7b05ca818c...

Edit: Ambari Server Log: https://gist.github.com/DhruvKumar/e2c06a94388c51e...

Note that my non-HA Blueprint, which doesn't contain the %HOSTGROUP% syntax, works without an issue on the same infrastructure.

Can someone please help me debug why the hostnames aren't being mapped correctly? Is it a problem in the HA Blueprint? I have all the logs from the installation and I'll keep the cluster alive for debugging.

Thanks.
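
When debugging this kind of failure, it can help to scan the generated config files for placeholders that were never substituted. The sketch below is one possible way to do that; the function name `find_unresolved` is made up, and the pattern is only modeled on the `%HOSTGROUP::master_2%` tokens quoted in the question.

```python
import re

# Matches leftover tokens such as %HOSTGROUP::master_2%. The pattern is an
# assumption based on the examples quoted in this post.
UNRESOLVED = re.compile(r"%HOSTGROUP::[^%]+%")


def find_unresolved(text):
    """Return (line number, token) pairs for every leftover placeholder."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for token in UNRESOLVED.findall(line):
            hits.append((lineno, token))
    return hits


# Sample taken from the config fragment quoted in the question.
sample = """<property>
  <name>dfs.namenode.http-address</name>
  <value>%HOSTGROUP::master_2%:50070</value>
</property>"""

for lineno, token in find_unresolved(sample):
    print(f"line {lineno}: {token}")
```

In practice the `text` argument would be the contents of a file such as hdfs-site.xml read from disk; an empty result means no placeholder of this form is left in that file.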
						
					
- Labels:
  - Apache Ambari
    
	
		
		
12-17-2015 05:18 PM

1 Kudo

Are you using VirtualBox? This might help: http://www.howtogeek.com/187535/how-to-copy-and-paste-between-a-virtualbox-host-machine-and-a-guest-machine/
						
					
12-14-2015 05:59 PM

To add the values of two RDDs, the general approach is:

0. Convert both RDDs to pair (key-value) RDDs. You can use zipWithIndex() for this if your RDDs don't have implicit keys.
1. Take the union of the two RDDs.
2. Call reduceByKey(_+_) on the resulting RDD.

Don't use collect(); it is slow, and you'll be limited by the driver's memory anyway.

Edit: see here for an example in Scala that you can adapt to Python: http://stackoverflow.com/questions/27395420/concatenating-datasets-of-different-rdds-in-apache-spark-using-scala
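
To make the three steps concrete, here is a plain-Python model of them that runs without Spark. The helpers `zip_with_index` and `reduce_by_key` are stand-ins written for this sketch, not Spark APIs; they only imitate what the RDD methods zipWithIndex() and reduceByKey() do, on ordinary lists.

```python
from operator import add


def zip_with_index(values):
    """Model of RDD.zipWithIndex(): pair each element with its position."""
    return [(value, index) for index, value in enumerate(values)]


def reduce_by_key(pairs, func):
    """Model of RDD.reduceByKey(func): merge all values that share a key."""
    merged = {}
    for key, value in pairs:
        merged[key] = func(merged[key], value) if key in merged else value
    return merged


def add_elementwise(left, right):
    # Step 0: key every element by its index. zipWithIndex yields
    # (value, index), so swap to (index, value) to make the index the key.
    keyed_left = [(i, v) for v, i in zip_with_index(left)]
    keyed_right = [(i, v) for v, i in zip_with_index(right)]
    # Step 1: the union is a plain concatenation of the keyed collections.
    combined = keyed_left + keyed_right
    # Step 2: sum the values that share an index.
    summed = reduce_by_key(combined, add)
    return [summed[i] for i in sorted(summed)]


print(add_elementwise([1, 2, 3], [10, 20, 30]))  # prints [11, 22, 33]
```

In PySpark the same chain would be roughly `a.zipWithIndex().map(lambda p: (p[1], p[0]))`, the same for `b`, then `.union(...)` and `.reduceByKey(add)`, with everything staying distributed instead of being pulled to the driver. Note that an index present in only one input passes through unchanged, so inputs of different lengths do not raise an error.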
						
					