Member since 06-28-2017

Posts: 279
Kudos Received: 43
Solutions: 24
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2560 | 12-24-2018 08:34 AM |
| | 6357 | 12-24-2018 08:21 AM |
| | 2948 | 08-23-2018 07:09 AM |
| | 11946 | 08-21-2018 05:50 PM |
| | 6183 | 08-20-2018 10:59 AM |
			
    
	
		
		
12-26-2018 07:53 AM

Hi @Armanur Rahman, from the error message this looks like a simple syntax error. I think this is the statement that fails:

CREATE TABLE IF NOT EXISTS `hr.f_company_xml_tab` ( `RECID` STRING, `XMLTYPE.GETSTRINGVAL(XMLRECORD)` STRING)

Your second column is being named 'XMLTYPE.GETSTRINGVAL(XMLRECORD)', which contains a '(' just as the error message claims. Can you rename the column to a simpler name, e.g. 'val', and try again?

Regards
Harald
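
PS: for illustration, a minimal sketch of the renamed DDL as it could be submitted from PySpark (assuming Spark with Hive support; 'val' is only a placeholder name, and the same statement also works directly in beeline or the Hive view):

from pyspark.sql import SparkSession

# Sketch only: assumes a cluster where Spark is built with Hive support.
spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Same table as above, but the second column gets a plain identifier
# instead of a name containing '(' and ')'.
spark.sql("""
CREATE TABLE IF NOT EXISTS hr.f_company_xml_tab (
  `RECID` STRING,
  `val`   STRING
)
""")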
						
					
12-24-2018 09:37 AM

Hi @A Sabatino, thanks for the info. It would be great if you clicked 'accept' on the answer. That helps everyone see that the issue is resolved, and it rewards both of us with reputation points 🙂

Regards
Harald
						
					
12-24-2018 08:34 AM (1 Kudo)

Hi @hr pyo,

This really depends, and you will have to understand SSL authentication to get all the details. The short version:

If you use self-signed certificates, or you sign the certificates with your own CA, you will see browser warnings about an insecure connection. Each time, the user has to confirm that they want to continue, until either the server certificate or the CA certificate is installed in the browser.

However, every browser ships with a set of preinstalled root CAs. If you get your certificate signed by one of those root CAs, you don't have to install the certificate itself: thanks to the chain of trust, the browser accepts the signed certificate without any further steps. To get a signed certificate free of charge, you can use 'Let's Encrypt'.

At enterprise level, you usually have an enterprise CA that is installed on all company machines, and you have your certificate signed by that enterprise CA.

Regards
Harald
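
PS: as a small illustration of the chain-of-trust point, here is a sketch in Python (the host name is only a placeholder): it succeeds when the server certificate chains to one of the preinstalled root CAs, and fails for a self-signed or unknown-CA certificate, which is the programmatic equivalent of the browser warning.

import socket
import ssl

host = "example.com"  # placeholder; replace with your server
context = ssl.create_default_context()  # loads the trusted root CAs

try:
    with socket.create_connection((host, 443), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            print("chain of trust OK:", tls.getpeercert()["subject"])
except ssl.SSLCertVerificationError as err:
    # Self-signed certificate or unknown CA: verification fails here,
    # just like the browser shows its warning.
    print("verification failed:", err)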
						
					
12-24-2018 08:21 AM

Hi @A Sabatino,

I am not sure why you expect that date from your epoch value. From what I can see, the conversion is fine; it is the value that is not what you expect.

The expression language guide (https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#dates) describes that the value is interpreted as milliseconds since 1 January 1970 00:00:00 GMT. Interpreted that way, '1 545 266 262' amounts to roughly 17.8 days, so a time on 18 January 1970 is the correct result. It looks to me as if you lost a factor of 1000 somewhere in your epoch value.

Regards
Harald
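
PS: you can check the arithmetic quickly outside NiFi, for example in Python, using the value from your post:

from datetime import datetime, timezone

epoch_value = 1545266262

# Treated as milliseconds, as the expression language does:
print(datetime.fromtimestamp(epoch_value / 1000, tz=timezone.utc))
# -> 1970-01-18 21:14:26.262000+00:00, about 17.8 days after the epoch

# Treated as seconds, i.e. with the missing factor of 1000 applied:
print(datetime.fromtimestamp(epoch_value, tz=timezone.utc))
# -> 2018-12-20 00:37:42+00:00, presumably the date you expected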
	
	 
     
						
					
12-23-2018 01:25 PM

I'm not entirely sure, but can you try the hdfs command instead? It should already be configured to include the necessary jars for execution:

hdfs dfs -copyFromLocal trial.txt hdfs://sandbox-hdp.hortonworks.com:8020/tmp/
						
					
12-23-2018 01:01 PM

To upload the file from your Windows machine to a Linux machine, you can use a tool like WinSCP. You configure the session for the Linux machine almost identically to the configuration in PuTTY, and it gives you a GUI for copying files.

If, on the other hand, you need to access the Windows machine from Linux, you have to configure an FTP (or better, SFTP) server on Windows that exposes your NTFS path. Alternatively, you can share the folder over the Windows network and install Samba, an implementation of Windows networking, on the Linux machine.
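
If you prefer a scripted transfer over a GUI, here is a rough sketch using the paramiko library (host, credentials and paths are placeholders, and paramiko has to be installed first); it uses the same SFTP protocol that WinSCP speaks:

import paramiko

# Placeholders only: adjust host, credentials and paths to your environment.
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("linux-host.example.com", username="someuser", password="...")

sftp = client.open_sftp()
sftp.put("C:/data/trial.txt", "/tmp/trial.txt")  # local Windows path -> remote Linux path
sftp.close()
client.close()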
						
					
12-22-2018 06:50 PM

I am guessing a little here, but I believe it's possible that the Hive metastore has statistics (i.e. information on the number of records per partition), so the count on the table might not actually read the complete table. The count on the file has to read the file in any case.

Still, I think 12 minutes is really long for processing 3.8 GB, even if that is the compressed size. Is the count the very first action on the data frame, so that Spark only executes all the previous statements (reading the file, uncompressing it, etc.) when the count runs?
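
To illustrate the last point, a rough PySpark sketch (the path and column name are made up, not taken from your job): every step before the action is lazy, so all of the reading and decompressing only happens when count() is called.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.read.option("header", "true").csv("/tmp/big_file.csv.gz")  # lazy, nothing read yet
filtered = df.filter(df["status"] == "OK")                            # still lazy

# The first action triggers the whole chain: read + decompress + filter.
print(filtered.count())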
						
					
12-22-2018 08:03 AM

Hi Rajeswaran,

I guess you are just using Ambari? Or have you implemented some Python code of your own anywhere? Could you perhaps post some details on what action you are trying to execute?

Regards
Harald
						
					
12-22-2018 07:58 AM

Hi Ajay,

Here is a sizing guide that seems to address exactly your questions: https://community.hortonworks.com/articles/135337/nifi-sizing-guide-deployment-best-practices.html

Still, I personally wouldn't start with 8 GB RAM per node but with at least 16 GB (2 GB per core). In any case, you will have to be clear about the throughput you need (Gb/sec), not only about the overall volume.

Regards
Harald
						
					
12-22-2018 07:34 AM

Can you perhaps also let us know how you are trying to read the file and the Hive table? And where is the file stored?
						
					