Member since: 02-09-2016

- Posts: 559
- Kudos Received: 422
- Solutions: 98

My Accepted Solutions

| Title | Views | Posted | 
|---|---|---|
| | 2864 | 03-02-2018 01:19 AM |
| | 4593 | 03-02-2018 01:04 AM |
| | 3068 | 08-02-2017 05:40 PM |
| | 2871 | 07-17-2017 05:35 PM |
| | 2103 | 07-10-2017 02:49 PM |
			
    
	
		
		
06-21-2016 06:40 PM

Ranger and Knox are complementary. Knox is for perimeter security: it lets you control the entry point users take into your cluster, and you can put Knox behind a load balancer to shield users from direct access to specific servers in the cluster. Ranger and Knox also integrate well together, so you can use Ranger to grant users permissions for Knox. This tutorial walks you through setting up Knox: http://hortonworks.com/hadoop-tutorial/securing-hadoop-infrastructure-apache-knox/ And here is some good background on Knox: http://hortonworks.com/apache/knox-gateway/
						
					
06-17-2016 05:19 PM

Is the issue cache related? See https://drill.apache.org/docs/hive-metadata-caching/
						
					
06-17-2016 03:29 PM

In step 3, the script in the Vagrantfile could also include:

sudo service ntpd start

The chkconfig command ensures ntpd starts on boot; however, I found that ntpd did not start automatically the first time the instance was brought up with Vagrant. Subsequent boots of the VM work properly.
						
					
06-17-2016 03:15 PM

Step 5 installs Ambari 1.7, which is an older version. Use this command in that step instead to get the latest version (Ambari 2.2.2.0):

wget -nv http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.2.2.0/ambari.repo -O /etc/yum.repos.d/ambari.repo
						
					
06-14-2016 07:16 PM

You are welcome. Don't forget to accept the answer. 🙂 Hive uses the full path you specify after the DIRECTORY keyword. A common testing approach is to write to the tmp directory, e.g. '/tmp/my-output.csv', and check that the output is what you expect.
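For example, a minimal sketch of that kind of test query (the table and column names here are placeholders, not taken from the original thread):

-- write the query results to a local directory for a quick check
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/my-output'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT id, name
FROM my_table;

The output lands as one or more files under /tmp/my-output, which makes it easy to inspect before switching to the real target path.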
						
					
06-14-2016 07:06 PM · 1 Kudo

The LOCAL keyword tells Hive to write the data to the local filesystem, not HDFS.
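As a quick contrast (the paths and table name are just placeholders):

-- with LOCAL: the directory is created on the local filesystem of the machine running the Hive client
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/my-output' SELECT * FROM my_table;

-- without LOCAL: the same kind of path is interpreted as an HDFS directory
INSERT OVERWRITE DIRECTORY '/user/hive/my-output' SELECT * FROM my_table;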
						
					
06-13-2016 05:09 PM

Do you have JAVA_HOME configured in the environment for the user Ambari is using to install the software on the remote nodes?
						
					
06-09-2016 08:16 PM · 2 Kudos

I believe this is because you have designated your Hive table as EXTERNAL. External tables are most often used to manage data that already sits on HDFS, e.g. loaded as CSV files. In your case, you are creating a table stored as ORC and populating it via a SELECT clause. Is there a reason you need it to be external? If not, you can omit the "EXTERNAL" keyword from your "CREATE TABLE" clause and remove the "LOCATION" entry. The only real change for you is that the data will be stored under /apps/hive/warehouse on HDFS instead of the location you specified. This is what you would have:

CREATE TABLE IF NOT EXISTS mi_cliente_fmes (
  id_interno_pe bigint,
  cod_nrbe_en int,
  mi_nom_cliente string,
  fec_ncto_const_pe string,
  fecha_prim_rl_cl string,
  sexo_in string,
  cod_est_civil_indv string,
  cod_est_lab_indv string,
  num_hijos_in int,
  ind_autnmo_in string,
  cod_ofcna_corr string,
  cod_cpcdad_lgl_in int
)
CLUSTERED BY (cod_nrbe_en) INTO 60 BUCKETS
STORED AS ORC;
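A minimal sketch of populating it afterward (the source table name here is an assumption, not from the original question):

-- on older Hive releases this setting makes inserts honor the bucket definition; it is always on in Hive 2.x
SET hive.enforce.bucketing=true;

INSERT INTO TABLE mi_cliente_fmes
SELECT id_interno_pe, cod_nrbe_en, mi_nom_cliente, fec_ncto_const_pe,
       fecha_prim_rl_cl, sexo_in, cod_est_civil_indv, cod_est_lab_indv,
       num_hijos_in, ind_autnmo_in, cod_ofcna_corr, cod_cpcdad_lgl_in
FROM stg_mi_cliente_fmes;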
 
						
					
06-09-2016 03:05 PM · 5 Kudos

You are using the ORC format, which supports ACID transactions. Try using "INSERT INTO" instead of "INSERT OVERWRITE", and add a "TRUNCATE TABLE" before running the SELECT. From https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML: "As of Hive 0.14, if a table has an OutputFormat that implements AcidOutputFormat and the system is configured to use a transaction manager that implements ACID, then INSERT OVERWRITE will be disabled for that table. This is to avoid users unintentionally overwriting transaction history. The same functionality can be achieved by using TRUNCATE TABLE (for non-partitioned tables) or DROP PARTITION followed by INSERT INTO."
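A minimal sketch of that pattern (the table names are placeholders, not from the original question):

-- clear the existing rows (allowed on non-partitioned ACID tables)
TRUNCATE TABLE mi_cliente_fmes;

-- then append the new data instead of overwriting
INSERT INTO TABLE mi_cliente_fmes
SELECT * FROM stg_mi_cliente_fmes;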
						
					