Member since 09-23-2015
800 Posts · 898 Kudos Received · 185 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 7089 | 08-12-2016 01:02 PM |
|  | 2664 | 08-08-2016 10:00 AM |
|  | 3549 | 08-03-2016 04:44 PM |
|  | 6998 | 08-03-2016 02:53 PM |
|  | 1806 | 08-01-2016 02:38 PM |
01-25-2016 11:36 AM · 2 Kudos

Wrote as an answer because of the character limit:

Yes, first go into Ambari, or perhaps better the OS, and search for the tez.lib.uris property in the properties file:

less /etc/tez/conf/tez-site.xml

You should find something like this:

<value>/hdp/apps/${hdp.version}/tez/tez.tar.gz</value>

If this is not available you may have a different problem (Tez client not installed, some configuration issue). You can then check whether these files exist in HDFS with:

hadoop fs -ls /hdp/apps/

Find the version number, for example 2.3.2.0-2950:

[root@sandbox ~]# hadoop fs -ls /hdp/apps/2.3.2.0-2950/tez
Found 1 items
-r--r--r--   3 hdfs hadoop   56926645 2015-10-27 14:40 /hdp/apps/2.3.2.0-2950/tez/tez.tar.gz

You can check whether this file is somehow corrupted with:

hadoop fs -get /hdp/apps/2.3.2.0-2950/tez/tez.tar.gz

You can then try to untar it to see if that works. If the file doesn't exist in HDFS, you can find it in the HDP installation directory (/usr/hdp/2.3.2.0-2950/tez/lib/tez.tar.gz on the local filesystem) and put it into HDFS, as sketched below.
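
A minimal sketch of that last step (assuming HDP 2.3.2.0-2950; substitute your version):

# copy the tarball from the local HDP install back into the expected HDFS path
hadoop fs -mkdir -p /hdp/apps/2.3.2.0-2950/tez
hadoop fs -put /usr/hdp/2.3.2.0-2950/tez/lib/tez.tar.gz /hdp/apps/2.3.2.0-2950/tez/
# the file is normally owned by hdfs:hadoop and world-readable
hadoop fs -chown hdfs:hadoop /hdp/apps/2.3.2.0-2950/tez/tez.tar.gz
hadoop fs -chmod 444 /hdp/apps/2.3.2.0-2950/tez/tez.tar.gz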
						
					
    
	
		
		
01-25-2016 10:20 AM · 1 Kudo

There are different possibilities. Normally this means the Tez libraries are not present in HDFS. Are you using the sandbox?

You should check whether the Tez client is installed on your Pig client, whether the tez-site.xml contains the tez.lib.uris property, and whether the Tez libraries are actually in HDFS and valid (download them and untar to check):

/hdp/apps/<hdp_version>/tez/tez.tar.gz

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_installing_manually_book/content/ref-ffec9e6b-41f4-47de-b5cd-1403b4c4a7c8.1.html
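
A quick way to run those three checks from the Pig client host (a sketch assuming a standard HDP layout; the paths and version may differ on your cluster):

# 1. Tez client configured? the property should point at the tarball in HDFS
grep -A1 "tez.lib.uris" /etc/tez/conf/tez-site.xml
# 2. libraries actually in HDFS?
hadoop fs -ls /hdp/apps/2.3.2.0-2950/tez/
# 3. download and test-extract to rule out a corrupted archive
hadoop fs -get /hdp/apps/2.3.2.0-2950/tez/tez.tar.gz /tmp/tez.tar.gz
tar -tzf /tmp/tez.tar.gz > /dev/null && echo "archive OK"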
						
					
    
	
		
		
01-25-2016 10:00 AM

Hmmm, weird, the order shouldn't really make a difference. I assume he added a reducer by doing that; that's the only explanation I have. Adding a DISTRIBUTE BY would most likely also have helped (a sketch is below), but the sort is good for predicate pushdown, so as long as all is good ... 🙂
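
What I mean by adding a DISTRIBUTE BY, as a minimal sketch (table and column names are made up; assumes dynamic partitioning is enabled in nonstrict mode):

# DISTRIBUTE BY forces a reduce phase and routes each day's rows to one
# reducer; SORT BY keeps rows ordered, which helps ORC predicate pushdown
hive -e "INSERT OVERWRITE TABLE target PARTITION (day)
         SELECT id, value, day FROM source
         DISTRIBUTE BY day SORT BY id;"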
						
					
    
	
		
		
01-14-2016 06:23 PM · 1 Kudo

Apart from apreduce.reduce.java.opts=-Xmx4096m missing an m (it should read mapreduce.reduce.java.opts), which I don't think will be the problem:

How many days are you loading? You are essentially doing a dynamic partitioning, so the task needs to keep memory for every day you load into. If you have a lot of days, this might be the reason. Possible solutions:

a) Try to load one day and see if that makes it better.
b) Use dynamic sorted partitioning (slide 16); this theoretically should fix the problem if this is the reason. (A sketch follows below.)
c) Use manual distribution (slide 19).

http://www.slideshare.net/BenjaminLeonhardi/hive-loading-data
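
A minimal sketch of option b) (the setting is the standard Hive one; table and column names are hypothetical):

# sorted dynamic partitioning: Hive sorts on the partition key first, so each
# reducer writes one partition at a time instead of holding a buffer open for
# every day simultaneously
hive -e "set hive.optimize.sort.dynamic.partition=true;
         set hive.exec.dynamic.partition.mode=nonstrict;
         INSERT INTO TABLE sales PARTITION (day)
         SELECT customer, amount, day FROM staging;"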
						
					
    
	
		
		
01-13-2016 12:25 PM · 1 Kudo

That is very curious. I have seen lots of small stripes being created because of memory problems, but normally the writer only gets down to 5000 rows and then runs out of memory.

Which version of Hive are you using? What are your memory settings for the Hive tasks? And if the file is small, is it possible that the table is partitioned and the task is writing into a large number of partitions at the same time?

Can you share the LOAD command and the table layout?
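
To see how the stripes actually came out, the ORC file dump prints stripe sizes and row counts (the file path is a placeholder; take one from the table's warehouse directory):

# dump ORC metadata, including one entry per stripe with its row count
hive --orcfiledump /apps/hive/warehouse/mytable/000000_0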
						
					
    
	
		
		
01-11-2016 05:32 PM

Ah, nice undercover magic. I will try it and see what happens if I switch the active one off.
						
					
    
	
		
		
01-11-2016 05:29 PM

I have seen the question answered for HA NameNodes; however, HA Resource Managers still confuse me. In Hue, for example, you are told to add a second Resource Manager entry with the same logical name, i.e. Hue supports adding two Resource Manager URLs and will simply try both.

How does that work in Falcon? How can I enter an HA Resource Manager entry into the interfaces of the cluster entity document? For NameNode HA I would use the logical name, and the program would then read the hdfs-site.xml.

I have seen the other, similar questions for Oozie, but I am not sure they were answered, or I didn't really understand the answers.

https://community.hortonworks.com/questions/2740/what-value-should-i-use-for-jobtracker-for-resourc.html

So assuming my active Resource Manager is mycluster1.com:8050 and the standby is mycluster2.com:8050, what do I put in the interface entry sketched below?
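
For reference, this is the kind of entry I mean in the cluster entity (the execute interface is the one that points at the Resource Manager; the version value here is just an example):

<interface type="execute" endpoint="mycluster1.com:8050" version="2.2.0"/>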
						
					
    
	
		
		
01-07-2016 02:05 PM · 2 Kudos

You could use a shell action, add the keytab to the Oozie files (file tag), and do the kinit yourself before running the Java command. Obviously not that elegant, and you have a keytab sitting somewhere in HDFS, but it should work. I did something similar with a shell action running a Scala program, with a kinit before it (not against Hive, but running kinit and then connecting to HDFS). Ceterum censeo, I would always suggest using a Hive server with LDAP/PAM authentication: beeline and the hive2 action have a password-file option now, and it makes life so much easier. As a database guy, Kerberos for a JDBC connection always makes problems.

Here is the Oozie shell action, by the way:

<shell xmlns="uri:oozie:shell-action:0.1">
   <job-tracker>${jobTracker}</job-tracker>
   <name-node>${nameNode}</name-node>
   <exec>runJavaCommand.sh</exec>
   <file>${nameNode}/scripts/runJavaCommand.sh#runJavaCommand.sh</file>
   <file>${nameNode}/securelocation/user.keytab#user.keytab</file>
</shell>

Then just add a kinit into the script before running Java:

kinit -kt user.keytab user@EXAMPLE.COM
java org.apache.myprogram
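
And a minimal sketch of the complete runJavaCommand.sh (the principal, keytab, and class name are the placeholders from the snippet above):

#!/bin/bash
# the keytab arrives in the task's working directory via the <file> tag
kinit -kt user.keytab user@EXAMPLE.COM || exit 1
# with the ticket in the cache, the program can authenticate to the cluster
java org.apache.myprogram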
 
						
					
    
	
		
		
01-04-2016 01:59 PM

It looks like a very useful command for debugging. I had never used it before. A shame it seems to be broken.
						
					