Member since 03-16-2016

- 707 Posts
- 1753 Kudos Received
- 203 Solutions
My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
| | 6883 | 09-21-2018 09:54 PM |
| | 8613 | 03-31-2018 03:59 AM |
| | 2545 | 03-31-2018 03:55 AM |
| | 2740 | 03-31-2018 03:31 AM |
| | 6151 | 03-27-2018 03:46 PM |

Posted 03-31-2018 12:01 AM · 4 Kudos

@rdoktorics Richard, Cloudbreak 1.16.5 is the version presented on the hortonworks.com website in the Software Download / HDP section. However, the documentation shows Cloudbreak 2.4 (see https://docs.hortonworks.com/). Is that right? Where should Leszek go to download Cloudbreak 2.4 for his HDP 2.6.4 installation requirement?

Posted 03-30-2018 07:00 PM · 4 Kudos

@Leszek Leszczynski It could be a bug. The reason is well explained in https://community.hortonworks.com/articles/12981/impact-of-hdfs-using-cloudbreak-scale-dwon.html; however, in your case, the DataNode services are not present as expected. I'll escalate the question to the Cloudbreak team.

Posted 03-30-2018 06:24 PM · 5 Kudos

@Alex Woolford Take a look at:

- https://cwiki.apache.org/confluence/display/KAFKA/KIP-103%3A+Separation+of+Internal+and+External+traffic
- https://issues.apache.org/jira/browse/KAFKA-4565

If helpful, please vote and accept the best answer.

Posted 03-28-2018 01:32 PM · 3 Kudos

@Saikrishna Tarapareddy The community processor mentioned by Tim is a good example of how to write a custom processor. It is limited to the Put action and quite old; you would have to rebuild it using more up-to-date libraries. Note that community processors are not supported by Hortonworks.

Posted 03-28-2018 04:22 AM · 3 Kudos

@Mushtaq Rizvi Yes. Please follow the instructions on how to add HDF components to an existing HDP 2.6.1 cluster: https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.1/bk_installing-hdf-on-hdp/content/upgrading_ambari.html

This is not the latest HDF, but it is compatible with HDP 2.6.1, and I was quite happy with its stability, so I recommend it. You would be able to add not only Apache NiFi 1.5 but also Schema Registry. NiFi Registry is part of the latest HDF 3.1.x; however, you would have to install it in a separate cluster, and it is not worth the effort for what you are trying to achieve right now. I would proceed with the HDP upgrade when you are ready for HDF 3.2, which will probably launch in the next couple of months.

If you can't add another node to your cluster for NiFi, use one of the existing nodes that has low CPU utilization and some disk available for NiFi lineage data storage. The amount depends on how much lineage you want to preserve, but several tens of GB should be fine for starters.

If this response helped, please vote and accept the answer.

Posted 03-28-2018 04:13 AM · 5 Kudos

@Saikrishna Tarapareddy Unfortunately, there is no specialized processor to connect to Google BigQuery and execute queries. There have been some discussions about a set of new processors to support various Google Cloud services, but those processors are still to be planned into a release. Until then, you can use the ExecuteScript processor.

Here is an example of how to write such a script using Python: https://cloud.google.com/bigquery/create-simple-app-api#bigquery-simple-app-print-result-python. At https://cloud.google.com/bigquery/create-simple-app-api you can find examples in other languages that are also supported by the ExecuteScript processor.

There is always the option of developing your own processor, leveraging the Java example in the Google documentation. Here is an example of how to build a custom NiFi processor: https://community.hortonworks.com/articles/4318/build-custom-nifi-processor.html

If this response reasonably addressed your question, please vote and accept the answer.
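
For illustration, here is a minimal standalone Python sketch of the query step, assuming the google-cloud-bigquery client library and application default credentials; the public dataset and field names are only placeholders borrowed from Google's samples:

```python
# Minimal sketch: run a query against BigQuery and print the results.
# Assumes: pip install google-cloud-bigquery, and GOOGLE_APPLICATION_CREDENTIALS
# pointing at a service account with BigQuery access.
from google.cloud import bigquery


def run_query():
    client = bigquery.Client()  # project and credentials are read from the environment
    # Placeholder query against one of Google's public sample datasets
    query = """
        SELECT name, SUM(number) AS total
        FROM `bigquery-public-data.usa_names.usa_1910_2013`
        GROUP BY name
        ORDER BY total DESC
        LIMIT 10
    """
    query_job = client.query(query)   # starts the query job
    for row in query_job.result():    # waits for completion, then iterates result rows
        print("{}: {}".format(row.name, row.total))


if __name__ == "__main__":
    run_query()
```

One caveat: ExecuteScript runs Python through Jython, which cannot load C-extension packages such as the Google client libraries, so a script like this may need to be invoked as an external process (for example via ExecuteProcess) rather than pasted directly into ExecuteScript.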

Posted 03-27-2018 09:37 PM · 3 Kudos

@Christian Lunesa As you probably know, the 500 Internal Server Error is a very general HTTP status code: it means something has gone wrong on the server, but the server could not be more specific about the exact problem. There could be many reasons, so please provide more information:

1) Does it happen with other tables?
2) Is your cluster Kerberized?
3) Did you check the Ambari server log for more details?

Posted 03-27-2018 03:46 PM · 5 Kudos

@Mushtaq Rizvi As you already know, in addition to the API, Atlas uses Apache Kafka as a notification server for communication between hooks and downstream consumers of metadata notification events. There is no other notification-server capability, such as SMTP. You would have to write your own filter over the events for the tables you are interested in; that is your option 2.

You may not like it, but this is the best answer as of now. If you had NiFi, you could easily build that notification service by filtering the events against a lookup list of tables. With the latest versions of NiFi you can take advantage of powerful processors such as LookupRecord and QueryRecord, as well as processors for SMTP and email.
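
For illustration, here is a minimal Python sketch of that filtering idea, assuming the kafka-python package and that Atlas publishes entity notifications to the ATLAS_ENTITIES topic; the broker address, table list, and exact message layout are placeholders and vary by Atlas version:

```python
# Minimal sketch: consume Atlas entity notifications from Kafka and keep only
# events for a lookup list of tables. Broker, topic, and field names are assumptions.
import json
from kafka import KafkaConsumer

WATCHED_TABLES = {"default.orders", "default.customers"}   # hypothetical lookup list

consumer = KafkaConsumer(
    "ATLAS_ENTITIES",                         # Atlas notification topic (assumed)
    bootstrap_servers="broker1:6667",         # placeholder broker address
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="latest",
)

for message in consumer:
    event = message.value
    entity = event.get("entity", {})          # payload layout varies by Atlas version
    qualified_name = entity.get("attributes", {}).get("qualifiedName", "")
    table = qualified_name.split("@")[0]      # Hive qualifiedName looks like "db.table@cluster"
    if table in WATCHED_TABLES:
        # hand off to your notification mechanism (e.g. SMTP) here
        print("notify:", event.get("operationType"), table)
```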

Posted 03-27-2018 03:34 PM · 6 Kudos

@Gubbala Sasidhar No, Kafka and HBase alone are not enough. Kafka is your transport layer and HBase is your target data store; you need a few more components to connect to the source, post to Kafka, and post to HBase.

In order to read the Oracle DB log files, you need a tool capable of performing Change Data Capture (CDC) from Oracle DB logs. That tool then writes to a Kafka topic; that is your "Kafka Producer" application. You then need an application that reads from the Kafka topic and puts the data into HBase; that is your "Kafka Consumer" application. Examples of CDC-capable tools are GoldenGate, SharePlex, Attunity, etc.

If you need a tool that will be used enterprise-wide to connect to various source types (e.g. Oracle, SQL Server, MySQL) and access database logs instead of issuing expensive queries on the source databases, then Attunity is probably your best bet. However, if you don't plan to acquire one and you already have GoldenGate or SharePlex, use those; SharePlex, for example, writes directly to Kafka. Another option with Oracle is to use its Change Data Capture feature (https://docs.oracle.com/cd/B28359_01/server.111/b28313/cdc.htm) and then write the Kafka Producer application that gathers the data from the source and writes it to a Kafka topic. Your consumer application then picks up the data and puts it into HBase.

Apache NiFi will add a CDC processor for Oracle this year; currently, NiFi has only the MySQL CDC processor.

If you want to make your life easier, use Apache NiFi (part of Hortonworks DataFlow) to implement the Kafka producer, the Kafka consumer, and the write to HBase. I see that you tagged your question with kafka-streams, so you probably plan to write that Kafka producer and consumer using Kafka Streams. That is an alternative to NiFi, but it requires more programming and you have to deal with the HA and security aspects yourself, while NiFi provides them out of the box and developing a NiFi flow is much easier. NiFi also has a Registry component that lets you manage flow versions like source code, and Hortonworks Schema Registry gives your Kafka producer and consumer applications a way to share schemas.

If this response helped, please vote and accept it as the best answer, if appropriate.
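
For illustration, here is a minimal Python sketch of the "Kafka Consumer" side described above, assuming the kafka-python and happybase packages; the topic, broker, HBase Thrift host, table, column family, and record layout are placeholders:

```python
# Minimal sketch: read CDC records from a Kafka topic and put them into HBase.
# Requires the HBase Thrift server for happybase; all names below are placeholders.
import json
from kafka import KafkaConsumer
import happybase

consumer = KafkaConsumer(
    "oracle-cdc-events",                      # hypothetical topic fed by the CDC tool
    bootstrap_servers="broker1:6667",         # placeholder broker address
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

connection = happybase.Connection("hbase-thrift-host")   # placeholder Thrift host
table = connection.table("customer_changes")             # hypothetical HBase table

for message in consumer:
    record = message.value
    row_key = str(record["primary_key"])                  # placeholder key field
    table.put(row_key, {
        b"cf:operation": record.get("op", "").encode("utf-8"),   # insert/update/delete flag
        b"cf:payload": json.dumps(record).encode("utf-8"),        # full record as JSON
    })
```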

Posted 03-12-2018 02:25 AM · 1 Kudo

@Jane Becker Happy it worked out. Enjoy the rest of the weekend!