Member since: 06-26-2015
Posts: 515
Kudos Received: 137
Solutions: 114
02-17-2022
02:41 PM
1 Kudo
Hi, @spserd , Please have a look at this section of the NiFi documentation. It says: "Wildcard certificates (i.e. two nodes node1.nifi.apache.org and node2.nifi.apache.org being assigned the same certificate with a CN or SAN entry of *.nifi.apache.org) are not officially supported and not recommended. There are numerous disadvantages to using wildcard certificates, and a cluster working with wildcard certificates has occurred in previous versions out of lucky accidents, not intentional support. Wildcard SAN entries are acceptable if each cert maintains an additional unique SAN entry and CN entry." Even though you are not using an asterisk wildcard, your single certificate doesn't meet the requirement of unique SAN and CN entries per node, so it is neither recommended nor supported. You should have a separate certificate for each host. Cheers, André
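For reference, one common way to generate a separate keystore per node is the NiFi TLS Toolkit in standalone mode. This is just a sketch using the hostnames from your example; the output directory is a placeholder:

```
# Generate a separate keystore/truststore per node, each with its own
# unique CN and SAN entry (output directory is a placeholder)
./bin/tls-toolkit.sh standalone \
  -n 'node1.nifi.apache.org,node2.nifi.apache.org' \
  -o ./target
```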
02-17-2022
01:40 AM
@georg , There's not much info in your description for me to give you a precise answer, but given you mentioned "container", I'm assuming that the other system is running in a Docker container on the same host as NiFi. If that's the case, you can use "docker inspect" to find out what the exposed port of the container is, and then your URL will be something like "http://localhost:<port>" or "http://host.example.com:<port>". If TLS is being used, replace "http" with "https". I hope this helps. If not, please provide more detailed information about the components you're talking about. Cheers, André
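For example (the container name below is a placeholder):

```
# Show the port mappings of the container
docker port my-container

# Or dump the full mapping as JSON
docker inspect --format '{{json .NetworkSettings.Ports}}' my-container

# If the output shows e.g. "8080/tcp -> 0.0.0.0:32768",
# the URL would be http://localhost:32768
```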
02-17-2022
01:34 AM
1 Kudo
This is the default when you deploy a new insecure cluster. When you implement security, you need to review these defaults and tighten security, including employing Ranger to protect, among other things, the HDFS data in external directories. Cheers, André
02-17-2022
01:17 AM
1 Kudo
Thanks for the link to the video, @MartinTerreni . In a traditional NiFi flow that reads from Kafka and writes to Kafka, the offsets of the source topic are indeed stored by the consumer in Kafka itself. With this, if NiFi crashes or restarts, the flow will continue to read the source topic from the last offset stored in Kafka.

The problem, which @mpayne explained in the video, is that there is a decoupling between the consumer and the producer in a traditional flow, and this can cause data loss or data duplication, depending on the scenario.

For example, the ConsumeKafka processor commits offsets to the source Kafka cluster in batches at regular intervals. It is possible that some messages read by the consumer are written by the producer to the destination topic before the offsets of those records are committed to the source Kafka cluster. If the flow stops abruptly before the commit happens, when it starts again it will start reading from the previously committed offset and it will write duplicate messages to the target topic.

On the other hand, since there's no tight coupling between consumer and producer, the consumer could read some messages and commit their offsets to the source cluster before the PublishKafka processor is able to deliver those messages to the target cluster. If there's a crash before those messages are sent and some of the flowfiles are lost (e.g. one node of the cluster burned down), that data will never be read again by the consumer and there will be a data gap at the destination.

The new feature explained in the video addresses all of these issues to guarantee Exactly Once Delivery semantics from Kafka to Kafka. I hope this helps clarify it a bit more 🙂 Cheers, André
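To make the timing issue concrete, here is a minimal Python sketch of a read-from-Kafka/write-to-Kafka loop where the commit and the produce are decoupled. This is plain Kafka client code (the kafka-python package), not NiFi's actual implementation, and the broker address and topic names are placeholders:

```python
from kafka import KafkaConsumer, KafkaProducer  # pip install kafka-python

# Placeholders: adjust the bootstrap servers and topic names.
consumer = KafkaConsumer(
    "source-topic",
    bootstrap_servers="broker:9092",
    group_id="my-flow",
    enable_auto_commit=False,  # commit manually so we control the timing
)
producer = KafkaProducer(bootstrap_servers="broker:9092")

for record in consumer:
    producer.send("target-topic", record.value)
    # Window A: committing *before* the send is acknowledged and then
    # crashing means the message may never reach the target -> data loss.
    # Window B: committing *after* the send but crashing in between means
    # the message is re-read on restart -> duplicates.
    producer.flush()   # wait for the broker to acknowledge the send
    consumer.commit()  # only then commit the source offset
```

Even with the flush-then-commit ordering above you only get at-least-once; closing both windows at the same time requires Kafka transactions, which, as I understand it, is what enables the exactly-once behavior described in the video.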
02-16-2022
10:36 AM
1 Kudo
You're correct, @MartinTerreni . If you set the Consumer Group Id property on the processor that consumes from Kafka, the offset will be maintained across NiFi crashes and restarts. Note, though, that the Consumer Group Id in NiFi is specified per processor, not per record reader. Would you please share the link to the said video? Cheers, André
02-15-2022
04:02 PM
@wichovalde , The error below is a server-side error that should (hopefully) be logged in the Atlas server log:

Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, URL: https://<atlas-server>:31443/api/atlas/v2/entity/uniqueAttribute/type/hive_db?attr%3AqualifiedName=inventarios%40cm&ignoreRelationships=true&minExtInfo=true&user.name=superuser, status: 500, message: Server Error

Try to find the entries in the Atlas server log that match this call; they could tell you a bit more about the problem. If the Atlas log doesn't seem to have the corresponding information, try setting its log threshold to DEBUG in Cloudera Manager, restarting the service, and repeating the test. André
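For example, something along these lines on the Atlas host (the log path below is a typical default and may differ in your installation):

```
# Search the Atlas server log for entries matching the failing API call
grep -i "uniqueAttribute/type/hive_db" /var/log/atlas/application.log
```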
02-15-2022
02:46 AM
Hi @drgenious , When you open the Oozie Editor, notice that the default line of icons shown at the top of the editor is "Documents" rather than "Actions". Documents let you embed content you created previously in your Oozie workflows (saved Hive queries, Spark programs, Java programs, etc.). If you click on the word "DOCUMENTS", you will see a drop-down with an "Actions" option. Clicking on it will replace the line of Document icons with the Action icons you're looking for. HTH, André
02-14-2022
11:57 PM
The configuration of the ElasticSearchLookupService controller service is pretty straightforward, as you have noticed. It uses another controller service, an ElasticSearchClientServiceImpl service, to connect to the Elasticsearch service. Most of the connectivity details need to be specified in the ElasticSearchClientServiceImpl service configuration. The best places for reference are the documentation pages for ElasticSearchLookupService and ElasticSearchClientServiceImpl. Cheers, André
02-14-2022
11:36 PM
@ABBI @yamaga , The lookup service requires that the lookup key be unique for it to work correctly. If there are duplicates, you can choose to ignore them, but only one value will ever be returned. One thing you can do is to consolidate records with the same key on the same line, so all the values will be returned and you can then deal with them (e.g. split the values) in the NiFi flow. In your example, we could change the lookup file like below:

col2,col5,col6,col7,col8
2,abc,China,123,col8
4,def,USA,118,col8
8,"qwe,zyx","Canada,England","118,118","col8,col8"

Regards, André
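If you want to automate that consolidation, here is a minimal Python sketch (file names are placeholders) that groups rows by the lookup key and joins the remaining columns with commas, letting the csv module handle the quoting:

```python
import csv

# Group rows of the lookup file by the key column (col2).
groups = {}
with open("lookup.csv", newline="") as f:
    reader = csv.DictReader(f)
    fields = reader.fieldnames
    for row in reader:
        groups.setdefault(row["col2"], []).append(row)

# Write one consolidated row per key; multi-valued columns are
# comma-joined and automatically quoted by the csv module.
with open("lookup_consolidated.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fields)
    writer.writeheader()
    for key, rows in groups.items():
        merged = {col: ",".join(r[col] for r in rows) for col in fields}
        merged["col2"] = key  # keep the key itself unduplicated
        writer.writerow(merged)
```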
02-14-2022
03:04 PM
1 Kudo
Can you share your code?