Member since: 06-07-2016

923 Posts
322 Kudos Received
115 Solutions
My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
|  | 4076 | 10-18-2017 10:19 PM |
|  | 4324 | 10-18-2017 09:51 PM |
|  | 14809 | 09-21-2017 01:35 PM |
|  | 1831 | 08-04-2017 02:00 PM |
|  | 2410 | 07-31-2017 03:02 PM |
			
    
	
		
		
07-25-2017 10:12 PM
@PJ These directories exist on the JournalNodes, if that is what you are using, or on whichever disk you specify in Ambari for the NameNode when you do your install. I think you will find the following link helpful: https://hortonworks.com/blog/hdfs-metadata-directories-explained/
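For reference, these locations are controlled by properties along the following lines in hdfs-site.xml; the paths shown are example values only, the real ones are whatever you choose during the Ambari install:

```xml
<!-- Example values only; the actual directories are whatever you pick in Ambari -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/hadoop/hdfs/namenode</value>   <!-- fsimage and edits on the NameNode host -->
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/hadoop/hdfs/journal</value>    <!-- shared edits when NameNode HA uses JournalNodes -->
</property>
```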
						
					
07-26-2017 02:35 PM
							 Thanks @Matt Clarke  
						
					
11-19-2018 10:21 PM
Here's a detailed implementation of a Slowly Changing Dimension Type 2 in Hive using an exclusive-join approach, assuming the source sends a complete data file, i.e. old, updated and new records.

Steps:

1. Load the latest file data into the STG table.

2. Select all the already-expired records from the HIST table:

select * from HIST_TAB where exp_dt != '2099-12-31'

3. Select all the records which have not changed, using an inner join between STG and HIST and a filter on HIST.column = STG.column:

select hist.*
from HIST_TAB hist
inner join STG_TAB stg
  on hist.key = stg.key
where hist.column = stg.column

4. Select all the new and changed records from STG_TAB using an exclusive left join with HIST_TAB, setting the effective date to the load date and the expiry date to '2099-12-31':

select stg.*, current_date as eff_dt, '2099-12-31' as exp_dt
from STG_TAB stg
left join (select * from HIST_TAB where exp_dt = '2099-12-31') hist
  on hist.key = stg.key
where hist.key is null
   or hist.column != stg.column

5. Select the old versions of the updated records from the HIST table by left joining with the STG table, and close them out by setting their expiry date to the load date:

select hist.key, hist.column, hist.eff_dt, current_date as exp_dt
from (select * from HIST_TAB where exp_dt = '2099-12-31') hist
left join STG_TAB stg
  on hist.key = stg.key
where stg.key is not null
  and hist.column != stg.column

6. UNION ALL the results of steps 2-5 and INSERT OVERWRITE the result into the HIST table (a consolidated sketch of this final step follows below).

A more detailed implementation of SCD Type 2 can be found here: https://github.com/sahilbhange/slowly-changing-dimension

Hope this helps!
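To make step 6 concrete, here is a minimal sketch of the final merge, using id for the business key and attr for the tracked attribute (stand-ins for the key/column placeholders above), and assuming HIST_TAB carries eff_dt and exp_dt columns:

```sql
-- Hypothetical consolidated refresh: UNION ALL of steps 2-5 written back to HIST_TAB.
-- Hive stages the query result before the overwrite, so reading HIST_TAB here works;
-- writing to a temporary table first is a safer variant.
INSERT OVERWRITE TABLE HIST_TAB
SELECT id, attr, eff_dt, exp_dt
FROM (
  -- step 2: rows that were already expired, carried over unchanged
  SELECT id, attr, eff_dt, exp_dt
  FROM HIST_TAB
  WHERE exp_dt != '2099-12-31'

  UNION ALL

  -- step 3: current rows whose attribute did not change
  SELECT hist.id, hist.attr, hist.eff_dt, hist.exp_dt
  FROM HIST_TAB hist
  INNER JOIN STG_TAB stg ON hist.id = stg.id
  WHERE hist.exp_dt = '2099-12-31'
    AND hist.attr = stg.attr

  UNION ALL

  -- step 4: new and changed rows from staging, opened as the current version
  SELECT stg.id, stg.attr, current_date AS eff_dt, '2099-12-31' AS exp_dt
  FROM STG_TAB stg
  LEFT JOIN (SELECT * FROM HIST_TAB WHERE exp_dt = '2099-12-31') hist
    ON hist.id = stg.id
  WHERE hist.id IS NULL
     OR hist.attr != stg.attr

  UNION ALL

  -- step 5: previous versions of changed rows, closed out with the load date
  SELECT hist.id, hist.attr, hist.eff_dt, current_date AS exp_dt
  FROM (SELECT * FROM HIST_TAB WHERE exp_dt = '2099-12-31') hist
  INNER JOIN STG_TAB stg ON hist.id = stg.id
  WHERE hist.attr != stg.attr
) merged;
```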
						
					
07-11-2017 08:01 AM
Sorry, I attached the wrong code. Please find it here: nodereadfrommongo.txt
						
					
05-29-2017 06:17 AM
@mqureshi The cluster currently only has one active NameNode.
Is there a better way to find out which node is active? I used the following as well, but it does not distinguish between the two NameNodes:

[ayguha@dh01 ~]$ curl --user admin:admin http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/host_components?HostRoles/component_name=NAMENODE&metrics/dfs/FSNamesystem/HAState=active
[1] 16533
-bash: metrics/dfs/FSNamesystem/HAState=active: No such file or directory
[ayguha@dh01 ~]$ {
  "href" : "http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/host_components?HostRoles/component_name=NAMENODE",
  "items" : [
    {
      "href" : "http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/hosts/dh01.int.belong.com.au/host_components/NAMENODE",
      "HostRoles" : {
        "cluster_name" : "belong1",
        "component_name" : "NAMENODE",
        "host_name" : "dh01.int.belong.com.au"
      },
      "host" : {
        "href" : "http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/hosts/dh01.int.belong.com.au"
      }
    },
    {
      "href" : "http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/hosts/dh02.int.belong.com.au/host_components/NAMENODE",
      "HostRoles" : {
        "cluster_name" : "belong1",
        "component_name" : "NAMENODE",
        "host_name" : "dh02.int.belong.com.au"
      },
      "host" : {
        "href" : "http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/hosts/dh02.int.belong.com.au"
      }
    }
  ]
}
  Also hdfs-site.xml does not have the property dfs.namenode.rpc-address.
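For what it is worth, in the session above the unquoted & makes bash put curl in the background and treat the HAState part as a separate command, so the active-state filter never reaches Ambari. Quoting the URL should send the whole predicate; a sketch using the same host, cluster and credentials as above (the fields parameter just trims the response down to the host names):

```sh
# Quote the URL so '&' stays part of the query string instead of backgrounding curl
curl --user admin:admin \
  'http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/host_components?HostRoles/component_name=NAMENODE&metrics/dfs/FSNamesystem/HAState=active&fields=HostRoles/host_name'
```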
 
						
					
05-22-2017 09:05 PM
@mqureshi The raw file that was written has the same problem when I view it in the Files View in Ambari. Perhaps the problem is only in how it is displayed, even though the encoding is correctly UTF-8.
						
					
05-22-2017 08:19 PM
1 Kudo
I was able to figure it out. I used the EvaluateJsonPath processor to grab the 'Raw_Json' and 'partition_date' columns, then used the AttributesToJSON processor to turn those two attributes back into JSON. Afterwards the InferAvroSchema processor was able to infer the 'Raw_Json' column as a string, and the data is now going into the Hive table correctly via Hive Streaming.
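For anyone recreating this flow, the relevant processor settings look roughly like the following; the attribute names match the post, everything else is an illustrative configuration rather than an export of the actual flow:

```
EvaluateJsonPath
  Destination      : flowfile-attribute
  Raw_Json         : $.Raw_Json          (dynamic property)
  partition_date   : $.partition_date    (dynamic property)

AttributesToJSON
  Attributes List  : Raw_Json,partition_date
  Destination      : flowfile-content

InferAvroSchema  ->  PutHiveStreaming
```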
						
					
05-21-2017 11:48 AM
Here is my interface configuration, from both my Linux host and the Hortonworks sandbox where NiFi runs. Thanks again!
						
					
06-08-2017 01:15 PM
Dear Vinay Upala, please join me on Skype to help with my request. I have the same issue; the commands won't run.
						
					
05-12-2017 10:44 PM
1 Kudo
@Shiv Kabra I think there might be some confusion about what NiFi does. I also think you are making this more complex than it needs to be.

First things first: there is a ReplaceText processor which you *might* be able to use to mask data, by matching content and replacing it with your masking values. It supports regular expressions (a sketch of a masking configuration follows below).

Now, since you are new to NiFi, I will try to give you an overview of what NiFi is purpose-built for. NiFi is a data-flow management tool. It helps you create a data flow in a few minutes without writing a single line of code. NiFi enables you to ingest data from multiple sources using different protocols, where the data might be in different formats, and to process that data: enriching metadata, changing formats (for example JSON to Avro), filtering records, tracking lineage, moving data across data centers (cloud and on-prem) securely, sending it to different destinations, and much more. Companies use NiFi to manage enterprise data flow. Its rich features include queuing (at each processor level), back pressure and lineage.

2. Can I pass the tables list as an input parameter to the process?

To do what? Which processor? Check the list of processors here: https://nifi.apache.org/docs.html

3. Can I restart a process in case there is any failure during execution?

One of the best features of NiFi. When a failure occurs, you can replay records, stop the flow at a processor level, make changes and restart it.

4. Does it have any built-in processor to handle such requests, i.e. masking sensitive information in tables?

I think ReplaceText should do what you are looking for. NiFi is extensible, so you can also write your own processor if one of the 200-plus existing ones is not enough for you. There is also an ExecuteScript processor that you can use to call outside scripts.
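As a minimal illustration of the ReplaceText masking idea (the pattern and values below are made-up examples, not from this thread; property names are as in recent NiFi versions):

```
ReplaceText
  Replacement Strategy : Regex Replace
  Evaluation Mode      : Line-by-Line
  Search Value         : \b\d{3}-\d{2}-\d{4}\b      (hypothetical SSN-like pattern)
  Replacement Value    : ***-**-****
```

Each line of content that matches the pattern has the matched value replaced with the mask before the flow file moves on to the next processor.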
						
					