Member since 06-17-2018

6 Posts | 0 Kudos Received | 0 Solutions

09-15-2018 08:53 PM

Well, my bad: it turned out to be a connection issue, as per the following log from the executor:

WARN DFSClient: Failed to connect to sandbox.hortonworks.com/<<IP>>:50010 for block, add to deadNodes and continue. java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused

Here is what I did: on HDP 2.5, logged in as root, I modified start_scripts/start_sandbox.sh to forward port 50010, then ran the following commands:

docker commit sandbox sandbox
docker stop sandbox
docker rm sandbox
init 6   # restart the host

Now the Spark master on Dev box (A) can get the block from Dev box (B), which is my HDP 2.5 machine.
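
For anyone hitting the same wall: the fix amounts to adding one more port mapping to the docker run invocation inside start_scripts/start_sandbox.sh. A minimal sketch of the change is below; the real script maps many more ports and the image and container names may differ by sandbox version, so everything except the -p 50010:50010 line is illustrative:

# Hypothetical excerpt of start_scripts/start_sandbox.sh.
# Only the -p 50010:50010 mapping (the DataNode data-transfer port)
# is the actual change; the other flags stand in for whatever the
# script already passes.
docker run -d --name sandbox \
  --hostname sandbox.hortonworks.com \
  -p 50070:50070 -p 8088:8088 -p 10000:10000 \
  -p 50010:50010 \
  sandbox /usr/sbin/sshd -D

The docker commit above preserves changes made inside the running container before it is removed; the new mapping itself only takes effect once the script recreates the container with the extra -p flag after the reboot.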
						
					
06-17-2018 06:27 PM

I am trying to run a Spark job with Hive support enabled. It can run the command "show databases" successfully, but when it tries to read a Hive table (whose data is stored as text files on HDFS) it throws org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most recent failure:
Lost task 0.3 in stage 3.0 (TID 6, 192.168.8.134, executor 0): org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: <<block ID>>=<<path>>
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:984)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:642)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:882)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)

Here are the details of my dev environment: Dev box (A) is CentOS running in VMware, with Eclipse and the jars from Spark 2.2.1 with Hadoop 2.7 support added. Dev box (A) runs the Spark master and slave, configured against the Thrift server on Dev box (B). Dev box (B) runs Hortonworks HDP 2.5.

So why does the app running on Dev box (A) throw the missing-block exception when it queries the Hive table, even though the file is present in HDFS? Note that I have already run the following commands to check the blocks:

sudo -u hdfs hdfs dfsadmin -report
sudo -u hdfs hdfs fsck -list-corruptfileblocks

Thanks for any help!
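
A useful sanity check here is to separate "blocks are corrupt" from "DataNode is unreachable". A rough sketch, run from Dev box (A); the hostname and port are taken from the executor warning in the reply above, and nc being installed is an assumption:

# Can Dev box (A) reach the DataNode data-transfer port on Dev box (B)?
nc -vz sandbox.hortonworks.com 50010

# Does the NameNode still report the DataNode as live?
sudo -u hdfs hdfs dfsadmin -report

If fsck shows no corrupt blocks but the port probe is refused, the data is intact and the client simply cannot open a connection to the DataNode, which is what the reply above ended up confirming.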
						
					
Labels:
- Apache Hadoop
- Apache Hive
- Apache Spark