Member since 01-11-2016

Posts: 355
Kudos Received: 232
Solutions: 74

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 9270 | 06-19-2018 08:52 AM |
|  | 3915 | 06-13-2018 07:54 AM |
|  | 4576 | 06-02-2018 06:27 PM |
|  | 5290 | 05-01-2018 12:28 PM |
|  | 6832 | 04-24-2018 11:38 AM |

11-08-2017 08:15 AM

Hi @Salda Murrah

You can set "Compression format" to "use mime.type". This way, the processor looks for an attribute called mime.type and dynamically infers the format, and hence the decompression algorithm. For this to work, you need an UpdateAttribute processor to add a mime.type attribute and set its value according to your logic. Keep in mind that UpdateAttribute has rule logic in its Advanced configuration that can be useful for your use case: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-update-attribute-nar/1.4.0/org.apache.nifi.processors.attributes.UpdateAttribute/additionalDetails.html
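
For illustration only, here is the kind of filename-to-MIME-type mapping you would mirror as rules in UpdateAttribute's Advanced tab (the extensions and MIME strings below are assumptions on my side; check the CompressContent documentation for the exact values your NiFi version expects):

```python
# Sketch of the logic to mirror in UpdateAttribute rules: pick a mime.type
# per file so CompressContent ("use mime.type") can choose the right codec.
# Extensions and MIME strings here are illustrative assumptions.
EXTENSION_TO_MIME = {
    ".gz": "application/gzip",
    ".bz2": "application/x-bzip2",
}

def infer_mime_type(filename):
    """Return the mime.type value to set on the flow file, or None if unknown."""
    for ext, mime in EXTENSION_TO_MIME.items():
        if filename.endswith(ext):
            return mime
    return None

print(infer_mime_type("export.csv.gz"))   # -> application/gzip
print(infer_mime_type("export.csv.bz2"))  # -> application/x-bzip2
```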
						
					
11-07-2017 06:57 PM

Hi @Michael Jonsson

NiFi is not packaged in HDI, which is why you cannot find it in Ambari under Add Service. You can install an HDP + HDF cluster on Azure in IaaS mode to have both platforms. You can also provision Azure VMs, install Ambari + HDF only, and use it with HDI; this way you have two separate clusters, HDF and HDI. You can also use Cloudbreak for an easier installation. Theoretically, you should be able to install NiFi manually on the HDI nodes, but this won't be supported nor managed by Ambari (no monitoring, configuration, upgrades, etc.), so it may only make sense for testing/POC. I've never tried it, though.
						
					
11-07-2017 05:07 PM (2 Kudos)

Hi @dhieru singh

QueryDatabaseTable queries the database on the defined schedule. Even if you don't customize the processor's scheduling, there is a default schedule in the Scheduling tab. The processor is intended to run on the primary node only to avoid ingesting the same data several times. It doesn't accept an incoming connection, so you cannot drive it dynamically from earlier parts of the flow. If you run it on all nodes, each node will ingest the exact same data (data duplication).
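
As a rough sketch of the Scheduling tab settings involved (names follow the NiFi UI; the values are illustrative, not required defaults):

```python
# Illustrative sketch of the Scheduling tab settings described above
# (names follow the NiFi UI; exact defaults can vary by NiFi version).
query_database_table_scheduling = {
    "Scheduling Strategy": "Timer driven",  # a schedule exists even if you never open the tab
    "Run Schedule": "1 min",                # how often the table is queried (example value)
    "Execution": "Primary node",            # run on one node only to avoid duplicate ingestion
}
print(query_database_table_scheduling)
```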
						
					
11-07-2017 10:57 AM (1 Kudo)

Hi @pranayreddy bommineni

Have you seen this article describing how to use the NiFi REST API to add a processor and then configure it? https://community.hortonworks.com/articles/87217/change-nifi-flow-using-rest-api-part-1.html Look at step 7 for the configuration part only.
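
Not taken from the article, but a minimal sketch of that configuration step, assuming an unsecured NiFi at http://localhost:8080; the processor id and the property shown are placeholders for your own flow:

```python
# Minimal sketch: update an existing processor's properties via the NiFi REST API.
# Assumes an unsecured NiFi on localhost:8080; id and property are placeholders.
import requests

NIFI = "http://localhost:8080/nifi-api"
PROCESSOR_ID = "replace-with-your-processor-uuid"

# 1. Fetch the processor entity to obtain its current revision (required for updates).
entity = requests.get(f"{NIFI}/processors/{PROCESSOR_ID}").json()

# 2. PUT the new configuration back together with that revision.
update = {
    "revision": entity["revision"],
    "component": {
        "id": PROCESSOR_ID,
        "config": {"properties": {"File Size": "1 KB"}},  # example property of GenerateFlowFile
    },
}
requests.put(f"{NIFI}/processors/{PROCESSOR_ID}", json=update).raise_for_status()
```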
						
					
11-06-2017 02:02 PM (7 Kudos)

Introduction

This is part 3 of a series of articles on Data Enrichment with NiFi:

- Part 1: Data flow enrichment with LookupRecord and SimpleKV Lookup Service is available here
- Part 2: Data flow enrichment with LookupAttribute and SimpleKV Lookup Service is available here
- Part 3: Data flow enrichment with LookupRecord and MongoDB Lookup Service is available here

Enrichment is a common use case when working on data ingestion or flow management. Enrichment means getting data from an external source (database, file, API, etc.) to add more details, context, or information to the data being ingested. In Parts 1 and 2 of this series, I showed how to use LookupRecord and LookupAttribute to enrich the content/metadata of a flow file with the Simple Key/Value Lookup Service. Using this lookup service let us implement an enrichment scenario without deploying any external system, which is perfect for scenarios where the reference data is not too big and doesn't evolve much. However, managing entries in the SimpleKV service can become cumbersome if our reference data is dynamic or large.

Fortunately, NiFi 1.4 introduced an interesting new lookup service with NIFI-4345: MongoDBLookupService. This lookup service can be used in NiFi to enrich data by querying a MongoDB store in real time. With this service, your reference data can live in MongoDB and be updated by external applications. In this article, I describe how we can use this new service to implement the use case described in Part 1.

Scenario

We will use the same retail scenario described in Part 1 of this series. However, our store reference data will be hosted in MongoDB rather than in NiFi's SimpleKV Lookup Service.
For this example, I'll be using a hosted MongoDB (DBaaS) on MLab. I created a database "bigdata" and added a collection "stores" into which I inserted 5 documents.

Each Mongo document contains information on a store, as described below:

{
  "id_store" : 1,
  "address_city" : "Paris",
  "address" : "177 Boulevard Haussmann, 75008 Paris",
  "manager" : "Jean Ricca",
  "capacity" : 464600
}

The complete database contains five such documents.
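
The screenshots of the collection are not reproduced here. As an illustrative sketch only (not the lookup service's internals), this is how such a collection could be populated with pymongo, and roughly the id_store query the enrichment relies on; the URI, user, and host are placeholders:

```python
# Illustrative only: populate the "stores" reference collection from this
# article and show the kind of query the enrichment effectively issues.
# The connection URI is a placeholder; requires `pip install pymongo`.
from pymongo import MongoClient

client = MongoClient("mongodb://user:password@hostname:27017")
stores = client["bigdata"]["stores"]

# One of the five reference documents described above.
stores.insert_one({
    "id_store": 1,
    "address_city": "Paris",
    "address": "177 Boulevard Haussmann, 75008 Paris",
    "manager": "Jean Ricca",
    "capacity": 464600,
})

# Roughly what the lookup does for a flow file with id_store = 1:
# fetch the first matching document and keep only address_city.
doc = stores.find_one({"id_store": 1}, {"address_city": 1, "_id": 0})
print(doc)  # {'address_city': 'Paris'}
```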

Implementation

We will use the exact same flow and processors as in Part 1. The only difference is using a MongoDBLookupService instead of the SimpleKVLookupService with LookupRecord; the LookupRecord processor's configuration simply references the MongoDBLookupService. Now let's see how to configure this service to query my MongoDB database and get the city of each store. I'll query MongoDB by the id_store that I read from each flow file.

Data enrichment

If not already done, add a MongoDBLookupService and configure it as follows:

- Mongo URI: the URI used to access your MongoDB database, in the format mongodb://user:password@hostname:port
- Mongo Database Name: the name of your database (bigdata in my case)
- Mongo Collection Name: the name of the collection to query for enrichment (stores in my case)
- SSL Context Service and Client Auth: use your preferred security options
- Lookup Value Field: the name of the field you want the lookup service to return. For me, it's address_city, since I want to enrich my events with the city of each store. If you don't specify a field, the whole Mongo document is returned, which is useful if you want to enrich your flow with several attributes.

Results

To verify that the enrichment is working, let's look at the content of the flow files using the data provenance feature in our global flow. The attribute city has been added to the content of my flow file: the city Paris has been added to store 1, which corresponds to my data in MongoDB. What happened here is that the lookup service extracted the id_store (1) from my flow file, generated a query to MongoDB to get the address_city field of the store having id_store 1, and added the result to the field city in my newly generated flow files. Note that if the query returns several results from MongoDB, only the first document is used.

By setting an empty Lookup Value Field, I can retrieve the complete document corresponding to the query { "id_store" : "1" }.

Conclusion

Lookup services in NiFi are a powerful feature for real-time data enrichment. The Simple Key/Value Lookup Service is straightforward for non-dynamic scenarios and doesn't require an external data source. For more complex scenarios, NiFi now supports lookups from external data sources such as MongoDB (available in NiFi 1.4) and HBase (NIFI-4346, available in NiFi 1.5).
						
					
11-04-2017 10:00 PM

@Andre Labbe If you found that this answer addressed your question, please take a moment to click "Accept" below.
						
					
11-04-2017 09:51 PM (1 Kudo)

Hi @manisha jain

You can have several users working on the UI simultaneously, and it will refresh automatically. You can also organize your flow into process groups to make your flows easier to manage and edit.
						
					
11-04-2017 09:19 PM

Hi @manisha jain

The approach described above (Avro -> JSON -> Avro) is no longer required with the new record-based processors in NiFi. You can use the UpdateRecord processor to add a new field to your flow files whatever their format is (Avro, JSON, CSV, etc.). Just define your Avro schema with the new field and use it in the Avro writer, then use UpdateRecord to add the value, as in the example where I add a city field with the static value Paris (a sketch of such a configuration follows below). Note that in my example the data is in JSON format, but the same approach works for Avro; just use the Avro reader/writer instead of the JSON ones. If you are new to record-based processors, read these two articles:

https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi
https://community.hortonworks.com/articles/138632/data-flow-enrichment-with-nifi-lookuprecord-proces.html

I hope this is helpful.
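
As a rough sketch of the UpdateRecord configuration implied above (property names follow the NiFi UI for recent releases; the reader and writer service names are placeholders for your own controller services):

```python
# Sketch of the UpdateRecord properties that add a static "city" field.
# Names follow the NiFi UI; the reader/writer are placeholders for your services.
update_record_properties = {
    "Record Reader": "JsonTreeReader",        # or an Avro reader for Avro input
    "Record Writer": "JsonRecordSetWriter",   # writer schema must include the new field
    "Replacement Value Strategy": "Literal Value",
    "/city": "Paris",                         # dynamic property: record path -> static value
}
print(update_record_properties)
```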
						
					
11-02-2017 02:40 PM

@Wesley Bohannon Please find attached the template enrichlookuprecord.xml.
						
					
11-02-2017 02:04 PM (1 Kudo)

@pranayreddy bommineni You can add LIMIT 1 to your SQL query in ExecuteSQL to get only one row for schema inference.
						
					