Member since 
    
	
		
		
		04-05-2016
	
	
	
	
	
	
	
	
	
	
	
	
	
	
			
      
                139
            
            
                Posts
            
        
                144
            
            
                Kudos Received
            
        
                16
            
            
                Solutions
            
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 43785 | 02-14-2019 02:53 PM |
| | 3177 | 01-04-2019 08:39 PM |
| | 12476 | 11-05-2018 03:38 PM |
| | 6281 | 09-27-2018 04:21 PM |
| | 3415 | 07-05-2018 02:56 PM |
			
    
	
		
		
		05-23-2018
	
		
		03:31 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 Nice thanks, I figured there had to be a way to tell it that it was a solo node but I just wasn't phrasing it right for google apparently.  Though the problem ended up being solved with a simple delete/reinstall. 
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		04-09-2018
	
		
		05:56 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		3 Kudos
		
	
				
		
	
		
					
 Objective
 To import a versioned flow or revert local changes in a versioned flow, a user must have access to all the components in the versioned flow. As such, it is recommended that restricted components be created at the root process group level if they are to be used in versioned flows. This tutorial illustrates the benefits of this configuration and demonstrates a new feature introduced in Apache NiFi 1.6.0: granular restricted component categories (NIFI-4885). Users can be given access to all restricted components or to specific categories of restricted components.
 Note: This tutorial assumes you are familiar with setting up a secure Apache NiFi instance and integrating it with a secure Apache NiFi Registry.
 Environment
 This tutorial was tested using the following environment and components:
 Mac OS X 10.11.6
 Apache NiFi 1.6.0
 Apache NiFi Registry 0.1.0
 User Setup
 Assume the following:
 There are two users, "sys_admin" and "test_user", who have access to both view and modify the root process group.
 "sys_admin" has access to all restricted components.
 "test_user" has access to restricted components requiring 'read filesystem' and 'write filesystem'.
 Restricted Controller Service Created in Root Process Group
 In this first example, sys_admin creates a KeytabCredentialsService controller service (NIFI-4917) at the root process group level:
 The KeytabCredentialsService controller service is a restricted component that requires 'access keytab' permissions:
 Sys_admin creates a process group ABC containing a flow with GetFile and PutHDFS processors:
 The GetFile processor is a restricted component that requires 'write filesystem' and 'read filesystem' permissions:
 PutHDFS is a restricted component that requires 'write filesystem' permissions:
 The PutHDFS processor is configured to use the root process group level KeytabCredentialsService controller service:
 Sys_admin saves the process group as a versioned flow:
 Test_user changes the flow by removing the KeytabCredentialsService controller service:
 If test_user chooses to revert this change, the revert is successful:
 Additionally, if test_user chooses to import the ABC versioned flow, the import is successful:
 Restricted Controller Service Created in Process Group
 Now, consider a second scenario where the controller service is created at the process group level.
 Sys_admin creates a process group XYZ:
 Sys_admin creates a KeytabCredentialsService controller service at the process group level:
 The same GetFile and PutHDFS flow is created in the process group:
 However, PutHDFS now references the process group level controller service:
 Sys_admin saves the process group as a versioned flow.
 Test_user changes the flow by removing the KeytabCredentialsService controller service. However, with this configuration, if test_user attempts to revert this change, the revert is unsuccessful because test_user does not have the 'access keytab' permissions required by the KeytabCredentialsService controller service:
 Similarly, if test_user tries to import the XYZ versioned flow, the import fails:
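 The difference between the two scenarios can be modeled as a simple set comparison. The following is a toy sketch in Python, not NiFi's actual authorization code: the `Component` class and the function names are hypothetical, and the only facts taken from the tutorial are the permission categories each component requires and which components sit inside each versioned flow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical stand-in for a restricted NiFi component."""
    name: str
    required: frozenset  # restricted categories the component requires


def missing_permissions(user_permissions, flow_components):
    """Return the restricted categories the user lacks for this flow."""
    needed = set()
    for component in flow_components:
        needed |= component.required
    return needed - set(user_permissions)


def can_import_or_revert(user_permissions, flow_components):
    """A user needs access to every component inside the versioned flow."""
    return not missing_permissions(user_permissions, flow_components)


GET_FILE = Component("GetFile", frozenset({"read filesystem", "write filesystem"}))
PUT_HDFS = Component("PutHDFS", frozenset({"write filesystem"}))
KEYTAB_SERVICE = Component("KeytabCredentialsService", frozenset({"access keytab"}))

TEST_USER = {"read filesystem", "write filesystem"}

# ABC: the controller service lives at the root level, outside the versioned flow.
FLOW_ABC = [GET_FILE, PUT_HDFS]
# XYZ: the controller service lives inside the process group, so it is versioned too.
FLOW_XYZ = [GET_FILE, PUT_HDFS, KEYTAB_SERVICE]
```

 In this model, `can_import_or_revert(TEST_USER, FLOW_ABC)` succeeds while the same call with `FLOW_XYZ` does not, and `missing_permissions` reports 'access keytab' as the category test_user lacks.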
  
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
						
						
		
	
					
			
		
	
	
	
	
				
		
	
	
			
    
	
		
		
		03-14-2018
	
		
		03:30 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 Looks like this is being handled/answered in a different but related question:  https://community.hortonworks.com/questions/177353/i-am-a-newbie-in-nifi-i-am-using-nifi-in-docker-ho.html?childToView=176700#answer-176700 
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		03-14-2018
	
		
		02:56 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
 Hi @Akananda Singhania,
 I suspect the network configuration on your Docker Engine host is incorrect. Running the image you listed works as anticipated in a few of the environments available to me. Let's try to confirm this suspicion by running the following:
 docker run busybox ping -c 1 files.grouplens.org
 You should receive output similar to the following. If not, the configured DNS server is not resolving external hostnames correctly.
 PING files.grouplens.org (128.101.34.235): 56 data bytes
 64 bytes from 128.101.34.235: seq=0 ttl=37 time=39.263 ms
 --- files.grouplens.org ping statistics ---
 1 packets transmitted, 1 packets received, 0% packet loss
 round-trip min/avg/max = 39.263/39.263/39.263 ms
 Could you provide more details about the environment in which you are running Docker? Of interest would be the output of:
 cat /etc/resolv.conf
 Another option is to explicitly specify a DNS server, such as those that Google makes available, via a command such as:
 docker run --dns 8.8.8.8 -d -p 8080:8080 apache/nifi
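 If Python happens to be available where you are debugging, the same two checks can be scripted. This is only a sketch under that assumption: `parse_nameservers` and `can_resolve` are hypothetical helper names, and the parser handles just the `nameserver` lines of a resolv.conf file.

```python
import socket


def parse_nameservers(resolv_conf_text):
    """Extract the nameserver addresses from the text of a resolv.conf file."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped[0] in "#;":
            continue
        parts = stripped.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def can_resolve(hostname):
    """Return True if the system resolver can look up the hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except (OSError, UnicodeError):
        return False
```

 For example, passing the contents of /etc/resolv.conf to `parse_nameservers` lists the DNS servers in use, and `can_resolve("files.grouplens.org")` reports whether the lookup that the ping relies on succeeds.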
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		04-10-2019
	
		
		06:24 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 Hi @dhieru singh, thank you for the post, but I am unable to get the values of $.component.backPressureObjectThreshold and $.status.aggregateSnapshot.flowFilesQueued after processing with EvaluateJsonPath.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		02-08-2018
	
		
		01:11 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		6 Kudos
		
	
				
		
	
		
					
 Objective
 This tutorial walks you through how to install and secure a NiFi Registry using client certificates. A quick example of modifying user privileges in the Registry is also included. A video version of this tutorial can be seen here: https://youtu.be/qD03ao3R-a4
 Note: To learn the basics of setting up an unsecured Registry and integrating with Apache NiFi, see the HCC article Versioned DataFlows with Apache NiFi 1.5 and Apache NiFi Registry 0.1.0.
 Environment
 This tutorial was tested using the following environment and components:
 Mac OS X 10.11.6
 Apache NiFi Registry 0.1.0
 Apache NiFi Toolkit 1.5.0
 Secure NiFi Registry Configuration
 Download & Extract Tarballs
 Download the tarball for the 0.1.0 Registry release:  
 nifi-registry-0.1.0-bin.tar.gz  
 and the tarball for the 1.5.0 NiFi Toolkit:  
 nifi-toolkit-1.5.0-bin.tar.gz  
 Extract the tars: 
  tar xzvf nifi-registry-0.1.0-bin.tar.gz
 tar xzvf nifi-toolkit-1.5.0-bin.tar.gz
  Generate Configuration and Certificate Files  
 We will use the Apache NiFi TLS Toolkit to generate the necessary keystore, truststore, and client certificates.  In this tutorial, we will create certs for two users:  "sys_admin" and "test_user".  The user “sys_admin” will have full access to the registry while “test_user” will be configured to have targeted access in the registry.  
 In the directory of your NiFi Toolkit install, run the following command: 
  ./bin/tls-toolkit.sh standalone -n "localhost" -C "CN=sys_admin, OU=NIFI" -o target
  
 Note: To see the usage information for the TLS Toolkit, run: ./bin/tls-toolkit.sh standalone -h
 TLS Toolkit generates the following in the target directory:
 CN=sys_admin_OU=NIFI.p12
 CN=sys_admin_OU=NIFI.p12.password
 localhost
 nifi-cert.pem
 nifi-key.key
 The localhost directory contains:
 keystore.jks
 nifi.properties
 truststore.jks
 Registry Configuration
 Copy the keystore and truststore to the conf directory of your Registry install.
 Copy the values of the keystore and truststore properties from the nifi.properties file:
  nifi.security.keystore=./conf/keystore.jks
 nifi.security.keystoreType=jks
 nifi.security.keystorePasswd=taceJshGdkyBRy4B7mwaSnM3AkbN7ffewjn3nVIGidw
 nifi.security.keyPasswd=taceJshGdkyBRy4B7mwaSnM3AkbN7ffewjn3nVIGidw
 nifi.security.truststore=./conf/truststore.jks
 nifi.security.truststoreType=jks
 nifi.security.truststorePasswd=WJwg6F2jmUcvpxRHDiseNRc/VV59WOS+SdrZ5amtnsE
  
 into the values for the equivalent properties in the  nifi-registry.properties  file: 
  nifi.registry.security.keystore=./conf/keystore.jks
 nifi.registry.security.keystoreType=jks
 nifi.registry.security.keystorePasswd=taceJshGdkyBRy4B7mwaSnM3AkbN7ffewjn3nVIGidw
 nifi.registry.security.keyPasswd=taceJshGdkyBRy4B7mwaSnM3AkbN7ffewjn3nVIGidw
 nifi.registry.security.truststore=./conf/truststore.jks
 nifi.registry.security.truststoreType=jks
 nifi.registry.security.truststorePasswd=WJwg6F2jmUcvpxRHDiseNRc/VV59WOS+SdrZ5amtnsE
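
 The copy step is a straight rename of the `nifi.security.` prefix to `nifi.registry.security.` for the seven properties listed. If you prefer to script it, the following Python sketch shows one way; the function name is hypothetical and the parsing is deliberately minimal (no line continuations or escapes).

```python
SECURITY_KEYS = {
    "keystore", "keystoreType", "keystorePasswd", "keyPasswd",
    "truststore", "truststoreType", "truststorePasswd",
}


def to_registry_properties(nifi_properties_text):
    """Map nifi.security.* entries to their nifi.registry.security.* names."""
    source_prefix = "nifi.security."
    target_prefix = "nifi.registry.security."
    result = {}
    for line in nifi_properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, since generated passwords may contain "=".
        key, value = line.split("=", 1)
        key = key.strip()
        if key.startswith(source_prefix):
            suffix = key[len(source_prefix):]
            if suffix in SECURITY_KEYS:
                result[target_prefix + suffix] = value.strip()
    return result
```

 Calling the function on the contents of the generated nifi.properties returns a dictionary whose entries can be pasted into nifi-registry.properties.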
  
 While you are in  nifi-registry.properties , modify the HTTP and HTTPS web properties as follows: 
  nifi.registry.web.http.host=
 nifi.registry.web.http.port=
 nifi.registry.web.https.host=localhost
 nifi.registry.web.https.port=18443
  
 In the same Registry conf directory, modify authorizers.xml in two places. First, in the userGroupProvider section, add the "sys_admin" DN to the "Initial User Identity 1" property: 
  <property name="Initial User Identity 1">CN=sys_admin, OU=NIFI</property>
  
 Then in the accessPolicyProvider section, add the "sys_admin" DN to the "Initial Admin Identity" property: 
  <property name="Initial Admin Identity">CN=sys_admin, OU=NIFI</property>
  
 Note: During this step, it is crucial that you specify the exact DN string used when the TLS Toolkit was invoked. A common error is entering "CN=sys_admin,OU=NIFI", which will not work because it is missing the space after the comma.
 Add Certificate to Keychain
 Double-click on the .p12 file that was generated by the TLS Toolkit. When prompted, provide the password from the .password file.
 Start the Registry
 In a terminal window, navigate to the directory where NiFi Registry was installed and run: 
  ./bin/nifi-registry.sh start
  Open Registry UI  
 Navigate to the registry UI in your web browser (Chrome used in the following examples):  
 https://localhost:18443/nifi-registry  
 When prompted, select the "sys_admin" cert to add to your browser:  
     
 When prompted, enter your "login" keychain password:  
     
 You should now be able to view the Registry UI as the "CN=sys_admin, OU=NIFI" user:  
 Registry Administration
 The "sys_admin" user has full access to the registry. Here are some examples of administration functions immediately available.
 Bucket Creation
 Select the Settings icon in the top right corner of the screen. In the Buckets window that appears, select the "New Bucket" button.
     
 In the dialog that appears, enter the bucket name "ABC" and select the "Create" button.  
     
 The "ABC" bucket is created:  
 User Administration
 Select "Users" at the top of the UI to access the user administration area of the Registry:
 Select the pencil icon next to the "CN=sys_admin, OU=NIFI" user. This will open a side nav that shows the Special Privileges and Group Membership:
 You can see that "sys_admin" was given all special privileges as the Initial Admin Identity (IAI). The privileges for the IAI are not editable. Let's create a second user to see how bucket access can be restricted by modifying these privileges.
 Second User Creation
 Close the side nav and select the "Add User" button.  
     
 Enter "CN=test_user, OU=NIFI" in the Identity field and select the "Add" button:
 The "CN=test_user, OU=NIFI" user is created:
 Second User Certificate
 Next we need a client certificate for "test_user".  
 Return to the directory of your NiFi Toolkit installation and run: 
  ./bin/tls-toolkit.sh standalone -C "CN=test_user, OU=NIFI" -o target
  
 Note: The output directory must be set to target in order for the existing CA certificate in that directory to be used.
 TLS Toolkit generates the following additional files in the target directory:
 CN=test_user_OU=NIFI.p12
   
 CN=test_user_OU=NIFI.p12.password  
 Add the .p12 cert to the Keychain as described earlier.  However, choose a different browser this time to access the UI (Safari in the following examples):  
 https://localhost:18443/nifi-registry  
 Add the client certificate to the browser:  
     
 You should now be able to view the Registry UI as the "CN=test_user, OU=NIFI" user:  
     
 You can see that "test_user" has no access to Settings.  
 Return to the Chrome browser where "sys_admin" is the user. Give "test_user" read-only bucket privileges:  
     
 Return to the Safari browser where "test_user" is the user. Reload the browser. Select the Settings icon, which is now available. The ABC bucket is now visible, but note that the Action to delete the bucket is not enabled, which is consistent with the privileges given to this user:
 Additional Help
 If you would like to learn more about NiFi Registry functionality and working with versioned flows in NiFi, see the following articles:
 Versioned DataFlows with Apache NiFi 1.5 and Apache NiFi Registry 0.1.0
 Apache NiFi - How do I deploy my flow?
 Or documentation:
 Apache NiFi Registry User Guide
 Apache NiFi Registry System Administrator's Guide
 Versioning a DataFlow (Apache NiFi User Guide)
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
						
						
		
	
					
			
		
	
	
	
	
				
		
	
	
			
    
	
		
		
		01-25-2018
	
		
		08:10 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		5 Kudos
		
	
				
		
	
		
					
 Objective
 This article highlights some of the latest UI enhancements added in Apache NiFi 1.5.0.
 Environment
 The examples shown in the article utilized the following environment and components:
 Mac OS X 10.11.6
 Apache NiFi 1.5.0
 "Primary Node" Processors Identification
 In a NiFi Cluster, processors that have been configured for "Primary node" execution are now identified in the UI by a "P". On the canvas, the "P" is visible next to the processor icon:
 The "P" is also shown in the Processors tab on the Summary page, specifically in the Name column:
 Finding Processors Quickly in the Summary Page
 If your flow has hundreds of processors, it can be difficult to differentiate between them in the Summary page (accessible from the top-right Global menu). On the Processors tab, a "Process Group" column has been added to display the name of the parent process group containing the component:
 Additionally, when hovering over the "Go to location" button, the tooltip now includes the path of the component.
 NiFi Registry Integration
 NiFi 1.5.0 is the first release to integrate with the Apache NiFi Registry. NiFi dataflows can now be versioned at the process group level and easily deployed across different NiFi instances. More information can be found in the HCC article "Versioned DataFlows with Apache NiFi 1.5 and Apache NiFi Registry 0.1.0" and in the "Versioning a Dataflow" section of the NiFi User Guide. However, here are some related UI changes to highlight.
 Connecting a Registry Client
 The NiFi Settings window (accessible from Controller Settings in the top-right Global menu) now has a "Registry Clients" tab where you can connect NiFi to a NiFi Registry:
 Importing a Flow
 If your NiFi instance is connected to an active Registry, when adding a process group to the canvas there is also an option to "Import" a versioned flow:
 Selecting "Import" prompts the user to choose a version of a flow to add to the canvas:
 Version States
 There are new icons that show:
 the version state of an individual process group
 the count of the statuses of versioned process groups within a process group
 the count of the statuses of versioned process groups in the root process group
 Here are the meanings of each icon/state:
 Up to date
 Locally modified
 Stale
 Locally modified and stale
 Sync failure
 Version state information is also shown in the "Process Groups" tab of the Summary Page:
 As mentioned previously, more information regarding NiFi and NiFi Registry integration can be found in the "Versioning a Dataflow" section of the NiFi User Guide.
  
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
						
						
		
	
					
			
		
	
	
	
	
				
		
	
	
			
    
	
		
		
		01-19-2018
	
		
		08:13 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		4 Kudos
		
	
				
		
	
		
					
 Objective
 This tutorial walks you through how to install and set up a local Apache NiFi Registry to integrate with Apache NiFi and start using versioned NiFi dataflows. It assumes basic experience with NiFi but little to no experience with NiFi Registry. A video version of this tutorial can be seen here: https://youtu.be/X_qhRVChjZY
 Environment
 This tutorial was tested using the following environment and components:
 Mac OS X 10.11.6
 Apache NiFi 1.5.0
 Apache NiFi Registry 0.1.0
 Note: Apache NiFi 1.5.0 is the first NiFi release to support integration with the NiFi Registry. NiFi Registry 0.1.0 is the first and currently only version of the application.
 Apache NiFi Registry Configuration
 Registry Installation
 Download the tarball of the 0.1.0 Registry release:  
 nifi-registry-0.1.0-bin.tar.gz  
 Extract the tar: 
 tar xzvf nifi-registry-0.1.0-bin.tar.gz
 Start the Registry
 In a terminal window, navigate to the directory where NiFi Registry was installed. Run:
 bin/nifi-registry.sh start
 Open Registry UI
 Navigate to the registry UI in your browser:
 http://localhost:18080/nifi-registry
 Note: By default the registry is unsecured. The default port is 18080; it can be changed by editing the nifi.registry.web.http.port property in the nifi-registry.properties file in the NiFi Registry conf directory.
 Bucket Creation
 A Bucket is a container that stores and organizes flows in the Registry. The Registry is empty as there are no buckets/flows yet.
 To create a bucket, select the Settings icon in the top right corner of the screen. In the Buckets window that appears, select the "New Bucket" button.
 Enter the bucket name "Test" and select the "Create" button.
 The "Test" bucket is created:
 There are no permissions configured by default, so anyone is able to view, create and modify buckets in this instance. For information on securing the Registry, see the NiFi Registry System Administrator's Guide.
 Apache NiFi Configuration
 Connect NiFi to the Registry
 With the Registry running, we can tell NiFi about it.
 In NiFi, select "Controller Settings" from the top-right Global menu:
 Select the Registry Clients tab and the "+" button to add a new Registry Client. Enter a name and the URL of the Registry instance (http://localhost:18080):
 Versioned DataFlows
 Start Version Control on a Process Group
 NiFi can now place a process group under version control, which saves it as a flow resource in the Registry.
 Right-click on a process group and select "Version→Start version control" from the context menu:
 The local registry instance and "Test" bucket are chosen by default to store your flow since they are the only registry connected and bucket available. Enter a flow name, flow description, comments and select "Save":
 As indicated by the Version State icon in the top left corner of the component, the process group is now saved as a versioned flow in the registry.
 Go back to the Registry UI and return to the main page to see the versioned flow you just saved (a refresh may be required):
 Save Changes to a Versioned Flow
 Changes made to the versioned process group can be reviewed, reverted or saved.
 For example, if changes are made to the ABCD flow, the Version State changes to "Locally modified". The right-click menu will now show the options "Commit local changes", "Show local changes" or "Revert local changes":
 Select "Show local changes" to see the details of the changes made:
 Return to the context menu and select "Commit local changes". Enter comments and select "Save" to save the changes:
 Version 2 of the flow is saved:
 Note: Some actions made to the versioned process group are not considered local changes. More information can be found in the Managing Local Changes section of the NiFi User Guide.
 Import a Versioned Flow
 With a flow existing in the Registry, we can use it to illustrate how to import a versioned process group.
 In NiFi, select Process Group from the Components toolbar and drag it onto the canvas:
 Instead of entering a name, click the Import link:
 Choose the version of the flow you want imported and select "Import":
 A second identical process group is now added:
 Help
 To learn more about NiFi Registry functionality and working with versioned flows in NiFi, see the following links:  
 
 Apache NiFi Registry User Guide  
 Apache NiFi Registry System Administrator's Guide  
 Versioning a DataFlow (Apache NiFi User Guide)  
 Apache NiFi - How do I deploy my flow?  
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
						
						
		
	
					
			
		
	
	
	
	
				
		
	
	
			
    
	
		
		
		11-07-2017
	
		
		08:03 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		4 Kudos
		
	
				
		
	
		
					
 Objective
 This is the second of a two-article series on the ValidateRecord processor. The first walks you through a NiFi flow that converts a CSV file into JSON format and validates the data against a given schema.
 This article discusses the effects of enabling/disabling the "Strict Type Checking" property of the ValidateRecord processor.
 Note: The ValidateRecord processor was introduced in NiFi 1.4.0.
 Environment
 This tutorial was tested using the following environment and components:
 Mac OS X 10.11.6
 Apache NiFi 1.4.0
 Strict Type Checking Property
 A useful property of the ValidateRecord processor is "Strict Type Checking". If the incoming data has a Record where a field is not of the correct type, this property determines how to handle the Record. If set to "true", the Record will be considered invalid. If set to "false", the Record will be considered valid.
 To demonstrate both cases, we need to ingest data that can distinguish between different types (which our CSV data from the first article could not). Let's grab a snippet of the JSON candy data and make some changes. Specifically, let's put a string value for the "chocolate" field (which is of type int) and a decimal value for the "competitorname" field (which is of type string):
 [ {
  "competitorname" : "One dime",
  "chocolate" : "0",
  "fruity" : 0,
  "caramel" : 0,
  "peanutyalmondy" : 0,
  "nougat" : 0,
  "crispedricewafer" : 0,
  "hard" : 0,
  "bar" : 0,
  "pluribus" : 0,
  "sugarpercent" : 0.011,
  "pricepercent" : 0.116,
  "winpercent" : 32.261086
  }, {
  "competitorname" : 3.14159,
  "chocolate" : 1,
  "fruity" : 0,
  "caramel" : 0,
  "peanutyalmondy" : 0,
  "nougat" : 0,
  "crispedricewafer" : 1,
  "hard" : 0,
  "bar" : 0,
  "pluribus" : 1,
  "sugarpercent" : 0.87199998,
  "pricepercent" : 0.84799999,
  "winpercent" : 49.524113
} ]
  
 Here is the JSON file: type-checking.txt (Change the extension from .txt to .json after downloading)
 Place the type-checking.json file in your input directory:
 In order to process the JSON file, the ValidateRecord processor needs to use a JSON Record Reader. Go to the configuration window for the processor and select "Create new service..." for the Record Reader:
 Select JsonTreeReader, then "Create":
 and then select the Arrow icon next to the reader:
 Save the changes made before going to the Controller Service.
 Go to the configuration window of the JsonTreeReader controller service, select "AvroSchemaRegistry" for the Schema Registry and then select Apply:
 Enable the JsonTreeReader service. The flow is ready to run.
 Start the GetFile, UpdateAttribute and ValidateRecord processors. With "Strict Type Checking" set to "true", the 2 records are considered invalid and are routed to that connection:
 Start the LogAttribute processor to clear the queue. Stop all processors. Place the type-checking.json file in your input directory again.
 Now let's change the Strict Type Checking property to "false":
 Running the flow this time, the 2 records are considered valid and are routed to that connection:
 Note: The documentation for the Strict Type Checking property states that when set to false, the relevant record fields will be coerced into the correct type. This functionality is currently broken (see NIFI-4579). 
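
 The valid/invalid routing described above can be approximated with a small Python model. This is a simplified sketch, not how ValidateRecord is implemented: the schema covers only the two fields that were altered, Python's `int` and `str` stand in for the schema types, missing fields are treated as invalid by assumption, and the sketch only decides validity (it does not coerce values).

```python
SCHEMA = {"competitorname": str, "chocolate": int}

RECORDS = [
    {"competitorname": "One dime", "chocolate": "0"},
    {"competitorname": 3.14159, "chocolate": 1},
]


def is_coercible(value, expected_type):
    """Return True if the value can be converted to the expected type."""
    try:
        expected_type(value)
        return True
    except (TypeError, ValueError):
        return False


def is_valid(record, schema, strict):
    """Check each schema field; strict mode rejects any type mismatch."""
    for name, expected_type in schema.items():
        if name not in record:
            return False  # assumption: every schema field is required
        value = record[name]
        if type(value) is expected_type:
            continue
        if strict or not is_coercible(value, expected_type):
            return False
    return True
```

 With `strict=True` both sample records are rejected, and with `strict=False` both are accepted, which mirrors the two runs of the flow. A value such as "abc" for "chocolate" would be rejected in either mode in this model because it cannot be read as an integer.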
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
						
						
		
	
					
			
		
	
	
	
	
				
		
	
	
			
    
	
		
		
		10-23-2017
	
		
		07:01 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		2 Kudos
		
	
				
		
	
		
					
 Objective
 This tutorial demonstrates how to use the QueryDatabaseTable and PutKudu processors to read data from a MySQL database and put it into Kudu. Thanks to @Cam Mach for his assistance with this article.
 Note: The PutKudu processor was introduced in NiFi 1.4.0.
 Environment
 This tutorial was tested using the following environment and components:
 Mac OS X 10.11.6
 Apache NiFi 1.4.0
 Apache Kudu 1.5.0
 MySQL 5.7.13
 PutKudu (AvroReader)
 Demo Configuration
 MySQL Setup
 In your MySQL instance, choose a database ("nifi_db" in my instance) and create the table "users": 
 unix> mysql -u root -p
unix> Enter password:<enter>
mysql> use nifi_db;
mysql>CREATE TABLE `users` (
  `id` mediumint(9) NOT NULL AUTO_INCREMENT,
  `title` text,
  `first_name` text,
  `last_name` text,
  `street` text,
  `city` text,
  `state` text,
  `zip` text,
  `gender` text,
  `email` text,
  `username` text,
  `password` text,
  `phone` text,
  `cell` text,
  `ssn` text,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=103 DEFAULT CHARSET=latin1;
  
 Add data to the "users" table: 
 mysql>INSERT INTO `users` (`id`, `title`, `first_name`, `last_name`, `street`, `city`, `state`, `zip`, `gender`, `email`, `username`, `password`, `phone`, `cell`, `ssn`)
VALUES (1, 'miss', 'marlene', 'shaw', '3450 w belt line rd', 'abilene', 'florida', '31995', 'F', 'marlene.shaw75@example.com', 'goldenpanda70', 'naughty', '(176)-908-6931', '(711)-565-2194', '800-71-1872'),
(2, 'ms', 'letitia', 'jordan', '2974 mockingbird hill', 'irvine', 'new jersey', '64361', 'F', 'letitia.jordan64@example.com', 'lazytiger614', 'aaaaa1', '(860)-602-3314', '(724)-685-3472', '548-93-7031'),
(3, 'mr', 'todd', 'graham', '5760 spring hill rd', 'garden grove', 'north carolina', '81790', 'M', 'todd.graham39@example.com', 'purplekoala484', 'paintball', '(230)-874-6532', '(186)-529-4912', '362-31-5248'),
(4, 'mr', 'seth', 'martinez', '4377 fincher rd', 'chandler', 'south carolina', '73651', 'M', 'seth.martinez82@example.com', 'bigbutterfly149', 'navy', '(122)-782-5822', '(720)-778-8541', '200-80-9087'),
(5, 'mr', 'guy', 'mckinney', '4524 hogan st', 'iowa park', 'ohio', '24140', 'M', 'guy.mckinney53@example.com', 'blueduck623', 'office', '(309)-556-7859', '(856)-764-9146', '973-37-9077'),
(6, 'ms', 'anna', 'smith', '5047 cackson st', 'rancho cucamonga', 'pennsylvania', '56486', 'F', 'anna.smith74@example.com', 'goldenfish121', 'albion', '(335)-388-7351', '(485)-150-6348', '680-20-6440'),
(7, 'mr', 'johnny', 'johnson', '7250 bruce st', 'gresham', 'new mexico', '83973', 'M', 'johnny.johnson73@example.com', 'crazyduck127', 'toast', '(142)-971-3099', '(991)-131-1582', '683-26-4133'),
(8, 'mrs', 'robin', 'white', '7882 northaven rd', 'orlando', 'connecticut', '40452', 'F', 'robin.white46@example.com', 'whitetiger371', 'elizabeth', '(311)-659-3812', '(689)-468-6420', '960-70-3399'),
(9, 'miss', 'allison', 'williams', '7648 edwards rd', 'edison', 'louisiana', '52040', 'F', 'allison.williams82@example.com', 'beautifulfish354', 'sanfran', '(328)-592-3520', '(550)-172-4018', '164-78-8160');
  
 Kudu Setup
 For my setup, I followed the Apache Kudu Quickstart instructions to easily set up and run a Kudu VM.
 To check that your VM is running: 
 unix> VBoxManage list runningvms
"kudu-demo" {b39279b5-3dd6-478a-ac9d-2204bf88e7b9}
  
 To see what IP Kudu is running on: 
 unix> VBoxManage guestproperty get kudu-demo /VirtualBox/GuestInfo/Net/0/V4/IP
Value: 192.168.58.100
  
 The Kudu web client runs on port 8051:  
     
 Create a table in Kudu by first connecting to Impala in the virtual machine: 
 unix> ssh demo@quickstart.cloudera -t impala-shell
demo@quickstart.cloudera's password:
[quickstart.cloudera:21000] >
  
 (Note: The username and password for the Quickstart VM is "demo".)
 Create the Kudu table with the same columns and data types as the MySQL table: 
  [quickstart.cloudera:21000] > CREATE TABLE users_kudu
 (
 id BIGINT,
 title STRING,
 first_name STRING,
 last_name STRING,
 street STRING,
 city STRING,
 state STRING,
 zip STRING,
 gender STRING,
 email STRING,
 username STRING,
 password STRING,
 cell STRING,
 ssn STRING,
 PRIMARY KEY(id)
 )
 PARTITION BY HASH PARTITIONS 16
 STORED AS KUDU;
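
 Before running the flow, it can be worth checking that the two table definitions line up. The Python sketch below simply compares the column names copied from the two CREATE TABLE statements above; `column_diff` is a hypothetical helper. With the statements as listed, it reports that `phone` appears only in the MySQL definition. Whether that matters depends on how the processor handles fields that are absent from the target table, which this sketch does not address.

```python
MYSQL_COLUMNS = [
    "id", "title", "first_name", "last_name", "street", "city", "state",
    "zip", "gender", "email", "username", "password", "phone", "cell", "ssn",
]

KUDU_COLUMNS = [
    "id", "title", "first_name", "last_name", "street", "city", "state",
    "zip", "gender", "email", "username", "password", "cell", "ssn",
]


def column_diff(source_columns, target_columns):
    """Report columns that exist on only one side, preserving order."""
    return {
        "only_in_source": [c for c in source_columns if c not in target_columns],
        "only_in_target": [c for c in target_columns if c not in source_columns],
    }
```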
  
 NiFi Flow Setup
 Use the following detailed instructions to set up the flow. Alternatively, a template of the flow can be downloaded here: putkudu-querydatabasetable.xml
 1. Start NiFi.  Two controller services are needed for the flow. Click the "Configuration" button (gear icon) from the Operate palette:  
     
 This opens the NiFi Flow Configuration window.  Select the "Controller Services" tab.  Click the "+" button and add a DBCPConnectionPool controller service:  
     
 Configure the controller service as follows (adjusting the property values to match your own MySQL instance and environment):  
     
 Next, add an AvroReader controller service:  
     
 Apply the default configuration:  
     
 Select the "lightning bolt" icon for each controller service to enable them:  
     
 2. Return to the NiFi canvas.  Add a QueryDatabaseTable processor:  
     
 Configure the processor as follows:  
     
 where:  
 
 The DBCPConnectionPool controller service created earlier is selected for Database Connection Pooling Service  
 "users" is entered for the Table Name  
 "id" is entered for the Maximum-value Columns   
 3. Add a PutKudu processor and connect the two processors:  
     
 Configure the PutKudu processor as follows:  
     
 where:  
 
 "192.168.58.100:7051" is entered for the Kudu Masters IP and port (7051 is the default port)  
 "impala::default.users_kudu" is entered for the Table Name  
 Skip head line property is set to "false"  
 The AvroReader controller service created earlier is selected for Record Reader   
 Auto-terminate the Success relationship:  
     
 On the canvas, make a "failure" relationship connection from the PutKudu processor to itself:  
     
 4. The flow is ready to run.
 Run Flow
 Start the QueryDatabaseTable processor.  
     
 Looking at the contents of the FlowFile in the queue, the data from the MySQL table has been ingested and converted to Avro format:  
     
 Start the PutKudu processor to put the data into Kudu:  
     
 This can be confirmed via a SELECT query:
 With the flow still running, add another row of data to the MySQL "users" table:
     
 The flow processes this data and the new row appears in Kudu:  
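
 The new row is picked up because "id" was set as the Maximum-value Column: the processor remembers the largest "id" it has seen and only fetches rows above it on later runs. The idea can be sketched in Python as follows. This is an illustration only, using an in-memory SQLite database as a stand-in for MySQL and a plain dictionary as a stand-in for processor state; `fetch_new_rows` is a hypothetical name, and the table and column names are assumed to be trusted identifiers.

```python
import sqlite3


def fetch_new_rows(conn, table, max_column, state):
    """Fetch rows whose max_column value exceeds the last value seen."""
    last_seen = state.get(max_column)
    if last_seen is None:
        cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {max_column}")
    else:
        cursor = conn.execute(
            f"SELECT * FROM {table} WHERE {max_column} > ? ORDER BY {max_column}",
            (last_seen,),
        )
    names = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    if rows:
        index = names.index(max_column)
        # Remember the highest value so the next call skips these rows.
        state[max_column] = max(row[index] for row in rows)
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "marlene"), (2, "letitia")])

state = {}
first_run = fetch_new_rows(conn, "users", "id", state)   # both existing rows
conn.execute("INSERT INTO users VALUES (?, ?)", (10, "new_user"))
second_run = fetch_new_rows(conn, "users", "id", state)  # only the added row
```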
     Helpful Links  
 Here are some links to check out if you are interested in other flows which utilize the record-oriented processors and controller services in NiFi:  
 
 Convert CSV to JSON, Avro, XML using ConvertRecord  
 Installing a local Hortonworks Registry to use with Apache NiFi  
 Running SQL on FlowFiles using QueryRecord Processor  
 Using PublishKafkaRecord_0_10 (CSVReader/JSONWriter) in Apache NiFi 1.2+  
 Using PutElasticsearchHttpRecord (CSVReader)  
 Using PartitionRecord (GrokReader/JSONWriter) to Parse and Group Log Files  
 Geo Enrich NiFi Provenance Event Data using LookupRecord  
 Using PutMongoRecord to put CSV into MongoDB  
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
						
						
		
	
					
			
		
	
	
	
	
				
		
	
	
 
         
					
				













