Member since 06-09-2016

Posts: 529 | Kudos Received: 129 | Solutions: 104

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1731 | 09-11-2019 10:19 AM |
| | 9321 | 11-26-2018 07:04 PM |
| | 2480 | 11-14-2018 12:10 PM |
| | 5310 | 11-14-2018 12:09 PM |
| | 3140 | 11-12-2018 01:19 PM |
09-11-2018 10:17 PM

@Jon Page Try these before running the spark-submit command:

export PYSPARK_DRIVER_PYTHON=/opt/anaconda2/bin/python
export PYSPARK_PYTHON=/opt/anaconda2/bin/python

/opt/anaconda2/bin/python should be the location of your Python 2.7 binary (this should be the same across all cluster nodes).

HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
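For context, here is how the full sequence might look around a submit; a minimal sketch, assuming a YARN client-mode job and a placeholder script name (your_job.py is not from the original question):

```
# Point both the driver and the executors at the Anaconda Python 2.7 binary
export PYSPARK_DRIVER_PYTHON=/opt/anaconda2/bin/python
export PYSPARK_PYTHON=/opt/anaconda2/bin/python

# Submit as usual; the exports only affect this shell session
spark-submit --master yarn --deploy-mode client your_job.py
```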
09-11-2018 02:35 PM

@Harshad M Good to hear you found the issue with the record. Please remember to login and mark the answer as accepted if it helped you in any way. Thanks!
09-11-2018 12:18 PM

@Daniel Müller Could you share the explain extended output for the above query? From the logical/physical plan details you can see whether the filter pushdown includes the limit. If this is Spark with LLAP integration, that is not supported prior to HDP 3.0. Starting with HDP 3.0 we have added the HWC (Hive Warehouse Connector) for Spark, which will work as expected.

HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
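If it helps, one way to capture that output is from the command line; a sketch, with a placeholder table and predicate standing in for the real query:

```
# EXPLAIN EXTENDED prints the parsed, analyzed, optimized, and physical plans,
# so you can see where the filter and the limit actually land
spark-sql -e "EXPLAIN EXTENDED SELECT * FROM your_table WHERE col = 'x' LIMIT 10"
```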
09-10-2018 01:16 PM

@Harshad M Perhaps the issue is data related. I see that showing 10 rows works fine, which means that when it needs to go over all the rows it fails at some point because the data may not be properly formatted. Could you check whether the underlying data has any additional commas or any other problem?
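As a quick check, something like this can flag malformed rows; a sketch assuming a comma-delimited file with an expected 5 fields per row (the path, delimiter, and field count are all placeholders to adjust for your data):

```
# Print the line number and content of every row whose field count is not 5
hdfs dfs -cat /data/your_table/part-00000 | awk -F',' 'NF != 5 {print NR": "$0}'
```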
09-07-2018 12:54 PM

1 Kudo

@Michael Bronson Check if the driver is doing full garbage collection or if there could be a network issue between executor and driver. You can check the GC pause times in the Spark UI, and you can also add the GC logs to be printed as part of the output of the driver and executors:

--conf "spark.driver.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails"
--conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails"

HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
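Put together, a submit with GC logging on both sides would look roughly like this (the application script name is a placeholder):

```
# Enable verbose GC logging for the driver and every executor
spark-submit \
  --conf "spark.driver.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails" \
  --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails" \
  your_job.py
# Driver GC lines show up in the driver output; executor GC lines
# end up in each executor's stdout log
```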
09-07-2018 12:49 PM

@Michael Bronson In YARN master mode, executors run inside a YARN container. Spark launches an Application Master that is responsible for negotiating the containers with YARN. That said, only nodes running a NodeManager are eligible to run executors.

First question: the executor logs you are looking for will be part of the YARN application logs for the container running on the specific node (yarn logs -applicationId <appId>).

Second question: the executor will log a notification in case its heartbeat fails to reach the driver due to a network problem/timeout. So this should be in the executor log that is part of the application logs.

HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
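For reference, a sketch of pulling those logs; the application ID and node address below are placeholders:

```
# Fetch the aggregated logs for the whole application
yarn logs -applicationId application_1536000000000_0042 > app_logs.txt

# Narrow to the containers that ran on one specific node
yarn logs -applicationId application_1536000000000_0042 -nodeAddress worker1.example.com:45454
```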
08-30-2018 02:40 PM

@Sudharsan Ganeshkumar Perhaps a snapshot or a copy of a file pointing to the same blocks. Please remember to login and accept the answer if you think it has addressed your question.
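To check whether a snapshot is still referencing the blocks, something like this (the directory path is a placeholder):

```
# List all directories where snapshots are enabled
hdfs lsSnapshottableDir

# Existing snapshots live under <dir>/.snapshot
hdfs dfs -ls /your/dir/.snapshot
```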
08-30-2018 02:03 PM

1 Kudo

@Sudharsan Ganeshkumar Actually, any file stored in HDFS is split into blocks (chunks of data), and each block is replicated 3 times by default. When you delete a file you remove the metadata pointing to the blocks, which is stored in the Namenode. Blocks are deleted when there is no reference to them in the Namenode metadata. This is important to mention since you could have snapshots, or files in Trash folders, still referencing the blocks; if this happens, those blocks won't be deleted until the snapshots or files under the Trash folders are also removed.

HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
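A few commands to see where a file's blocks live and whether Trash is still holding them; a sketch with placeholder paths:

```
# Show how a file maps to blocks and where the replicas are located
hdfs fsck /user/you/data.csv -files -blocks -locations

# Deleted files land in .Trash first and keep their blocks referenced
hdfs dfs -ls /user/you/.Trash
hdfs dfs -expunge   # remove Trash checkpoints older than the retention interval
```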
08-30-2018 01:33 PM

@vishal dutt Please remember to login and accept the answer if you think it has addressed your question.
08-30-2018 01:20 PM

@heta desai Based on this JIRA (https://issues.apache.org/jira/browse/HIVE-13290) and the Hive docs (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Constraints), this is supported from Hive 2.1.0 onwards only.

HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
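For illustration, the constraint DDL that Hive 2.1.0+ accepts; a sketch with placeholder table, column, and connection values. Constraints in this release are informational only, so they must be declared DISABLE NOVALIDATE:

```
# Create a table with an informational primary key constraint
beeline -u jdbc:hive2://your-hiveserver:10000 -e "
CREATE TABLE customers (
  id   INT,
  name STRING,
  PRIMARY KEY (id) DISABLE NOVALIDATE
);"
```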