About araujo

iamfromsky · ‎04-05-2022

@araujo, do you have any suggestions for this case?

Corasmum · ‎03-31-2022

You could also have 2 flows under 1 group: the first flow gets the token and caches it, and the second retrieves the cached token to use where needed.

templarian · ‎03-31-2022

Hi André, that is what I needed. Thanks a lot. For those who are facing the same problem on Windows: @araujo's solution also works with PowerShell.

araujo · ‎03-30-2022

@PabloO, In Kafka's terminology, a topic is a "distributed log". The data for each of a topic's partitions is saved in what are called "log segment files". So, the "log.dirs" and "log.dir" parameters point to the directories where the actual messages are saved, *not* the "error logs". For example, if your "log.dirs" is set to "/data1" and you have a topic named "mytopic", the data for partition 0 of that topic will be saved in files under the directory "/data1/mytopic-0". Cheers, André
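
The directory naming described above can be sketched in a few lines of Python. This is only an illustration of the "<topic>-<partition>" convention from the answer, not Kafka code; the helper name partition_dir is made up for the example.

```python
import posixpath


def partition_dir(log_dir: str, topic: str, partition: int) -> str:
    """Return the directory that holds the log segment files of one partition."""
    # Each partition gets its own directory named "<topic>-<partition>"
    # under one of the directories listed in log.dirs.
    return posixpath.join(log_dir, f"{topic}-{partition}")


# Values taken from the example in the answer above.
print(partition_dir("/data1", "mytopic", 0))
```

With the values from the answer, this prints /data1/mytopic-0.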

araujo · ‎03-29-2022

Hi @inyongkim , At the moment this connector has no controls to adjust the flushing mechanism. We're aware of that and Cloudera is working on making that more configurable so that it does not create a small file problem in your destination cluster. Cheers, André

araujo · ‎03-29-2022

@sheep , You need to add an EvaluateJsonPath processor before your PutHBaseJson to extract the value that you need and save it as an attribute in the flowfile. For example, you could get the value from $.field1.nestedfield and save that as the attribute mynestedfieldvalue. You can then refer to that attribute in your PutHBaseJson processor as ${mynestedfieldvalue}. Please check out this other answer to a similar question: https://community.cloudera.com/t5/Support-Questions/Hash-key-value-missing-putdynamodb-nifi/m-p/339622/highlight/true#M233139 Cheers, André
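
To make the extraction step concrete, here is a rough Python sketch of what the EvaluateJsonPath step does conceptually: it resolves $.field1.nestedfield against the flowfile content and stores the result as an attribute. This is not NiFi code. The helper extract_nested and the sample JSON are assumptions made up for illustration, and the helper only handles plain dotted paths.

```python
import json


def extract_nested(document: str, path: str):
    """Resolve a simple dotted JSONPath such as "$.field1.nestedfield"."""
    value = json.loads(document)
    # Drop the leading "$" root marker and walk the remaining keys in order.
    for key in path.lstrip("$").strip(".").split("."):
        value = value[key]
    return value


# Hypothetical flowfile content; only the path comes from the answer above.
flowfile_content = '{"field1": {"nestedfield": "row-123"}, "field2": 42}'

# The extracted value is stored under the attribute name used in the answer,
# which PutHBaseJson would then reference as ${mynestedfieldvalue}.
attributes = {
    "mynestedfieldvalue": extract_nested(flowfile_content, "$.field1.nestedfield")
}
print(attributes)
```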

dutras · ‎03-29-2022

Thanks, everyone, for the help.

bdworld2 · ‎03-27-2022

Hi @araujo, many thanks for the explanation. Please note that VirtualBox is not available on the M1, which is why I chose Docker from the 2nd link you provided. Do you have any alternative to VirtualBox? The 1st option is a no, since it is only a 60-day trial.

araujo · ‎03-27-2022

@mystefied_ , You can download the CDP Trial version from the Cloudera website below: https://www.cloudera.com/downloads/cdp-private-cloud-trial/cdp-private-cloud-base-trial.html Cheers, André

araujo · ‎03-25-2022

@Boss, These are upper-bound values that ensure the services running on the machine won't run into limits on the number of processes or open file descriptors. IMO, these parameters are really pertinent when you have gateway servers that tens or hundreds of users connect to in order to run their own processes, and you want to make sure no single user can run rogue processes that starve everyone else of resources. The hosts in a CDP cluster environment are typically not hosts that users should connect to directly. The services and processes that run on those hosts are well known and managed by the administrator. In this scenario, these parameters are not as critical, and we usually set them to values that get them "out of the way", so that we never reach them. Specifically to answer your question, though: "nofile" is the limit on open file descriptors. Note that file descriptors are not only associated with files; for example, they are also used to refer to open network sockets/ports and pipes. You can check the file descriptors currently open using the command "lsof". "nproc" is the limit on running processes. You can check those with the command "ps". Cheers, André
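
As a complement to lsof and ps, the limits themselves can be read from inside a process. The sketch below uses Python's standard resource module and assumes a Linux or other Unix-like host where RLIMIT_NPROC is defined; the helper name current_limits is made up for the example.

```python
import resource


def current_limits() -> dict:
    """Return the (soft, hard) limits that apply to the current process."""
    return {
        # "nofile": maximum number of open file descriptors
        "nofile": resource.getrlimit(resource.RLIMIT_NOFILE),
        # "nproc": maximum number of processes for the current user
        "nproc": resource.getrlimit(resource.RLIMIT_NPROC),
    }


for name, (soft, hard) in current_limits().items():
    # A value equal to resource.RLIM_INFINITY means "unlimited".
    print(f"{name}: soft={soft} hard={hard}")
```

The soft limit is the one actually enforced; a process can raise it up to the hard limit without extra privileges.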

Last Visited	‎07-21-2025 10:25 PM

Member Since	‎06-26-2015 11:59 AM
Posts	515
Kudos received	140

Cloudera Community

Re: Is it possible to use Single User authenticati...

Re: Dynamically Assign an XSD File

Re: "error": "There is no mapped role for the grou...

Re: Read xml file content into an Attribute: How t...

Re: Nifi Lookup CSV values with SQL NULL values

Re: Exception while trying to get password for ali...

Re: NIFI HTTP

Re: How to get the last n lines of a text file ("...

Re: Location in disk where data is flushed Kafka

Re: How to change flush duration of cloudera hdfs ...

Re: Nested field as Hbase row identifier

Re: 7.2.12 - Streams Messaging Light Duty Error to...

Re: cloudera sandbox 6.3.0 docker run no services ...

Re: Hive query for functions

Re: How to set ulimit values