About stevenmatison

vatodorov19 · ‎10-29-2020

Thanks @stevenmatison. Do you by chance know the answer to this question? https://community.cloudera.com/t5/Support-Questions/Extract-string-nested-in-JSON-value/m-p/305099 It's probably something very easy, but nothing I have tried works. Valentin

robnew666 · ‎10-29-2020

Thanks. I am now trying to parse the data using the JoltSpec/JoltTransformJSON processor, which could help me with this issue. Thanks for the help; hopefully I can get things running more smoothly soon. 🙂

stevenmatison · ‎10-29-2020

@amey84 Yes. Although yum install still provides the bundled postgres, you can choose to install it or another database separately. During ambari-server setup, answer y at this prompt:

    Enter advanced database configuration [y/n] (n)? y

The following link has more information about ambari + postgres: https://docs.cloudera.com/HDPDocuments/Ambari-2.6.1.5/bk_ambari-administration/content/using_ambari_with_postgresql.html

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create a separate topic and feel free to tag me in your post.

Thanks, Steven

stevenmatison · ‎10-29-2020

@Kaur It appears that your NiFi node does not have enough system RAM to allow you to use the 2g and 4g settings. I suggest increasing the node specification to at least 8 GB or 16 GB of system RAM and testing the bootstrap config with 2g/4g or 4g/8g respectively.

Thanks, Steven
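As a rough illustration of the suggestion above, here is a minimal Python sketch that rewrites the heap lines in the text of a bootstrap.conf file. It assumes the heap is set through `java.arg.N=-Xms...` and `java.arg.N=-Xmx...` entries, as in a default NiFi 1.x bootstrap.conf; the function name `set_heap` and the sample file content are made up for the example, so check your own file before relying on it.

```python
import re


def set_heap(conf_text: str, xms: str, xmx: str) -> str:
    """Return conf_text with the -Xms / -Xmx values replaced.

    Assumption: heap sizes appear on lines such as
    'java.arg.2=-Xms512m' and 'java.arg.3=-Xmx512m'.
    Lines that do not match (including comments) are left unchanged.
    """
    conf_text = re.sub(
        r"(?m)^(java\.arg\.\d+=)-Xms\S+", rf"\g<1>-Xms{xms}", conf_text
    )
    conf_text = re.sub(
        r"(?m)^(java\.arg\.\d+=)-Xmx\S+", rf"\g<1>-Xmx{xmx}", conf_text
    )
    return conf_text


# Hypothetical sample content, not taken from the original thread.
sample = "# JVM memory settings\njava.arg.2=-Xms512m\njava.arg.3=-Xmx512m\n"
print(set_heap(sample, "2g", "4g"))
```

On a node with 16 GB of system RAM, the same call with "4g" and "8g" would match the second pairing mentioned in the post. NiFi needs a restart for bootstrap changes to take effect.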

stevenmatison · ‎10-19-2020

The solution you are looking for is ReplaceText: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.12.0/org.apache.nifi.processors.standard.ReplaceText/

You can find loads of examples here in the forum with this search: https://community.cloudera.com/t5/forums/searchpage/tab/message?advanced=false&allow_punctuation=false&q=replaceText

Thanks, Steven

HansH · ‎10-16-2020

Since I was not able to find a solution that would be easy to implement inside of NiFi, I've written a small perl (yuk) script that can be used to adjust timestamps in a CSV file to be in ISO8601 format. Maybe it is useful to someone else:

    #!/bin/perl -w
    # This perl script adds timezone information to timestamps without a
    # timezone. All timestamps in the input file that follow the format
    # "YYYY-MM-DD HH:MM:SS" are converted to ISO8601 timestamps.

    use strict;
    use DateTime::Format::Strptime;

    my $time_zone = 'Europe/Amsterdam';

    # Parses naive timestamps and interprets them in $time_zone.
    my $parser = DateTime::Format::Strptime->new(
        pattern   => '%Y-%m-%d %T',
        time_zone => $time_zone
    );

    # Prints timestamps as ISO8601 with a numeric UTC offset.
    my $printer = DateTime::Format::Strptime->new(
        pattern   => '%FT%T%z',
        time_zone => $time_zone
    );

    # Rewrite every double-quoted timestamp on each input line.
    while (<>) {
        s/(?<=")(\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d)(?=")/
            my $dt = $parser->parse_datetime($1);
            $printer->format_datetime($dt);
        /ge;
        print;
    }
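For anyone who would rather avoid Perl, the same idea can be sketched in Python. This is only a rough equivalent under some assumptions: it needs Python 3.9+ for `zoneinfo` plus time zone data on the system, and the names `add_timezone` and `main` are made up for this example. Like the Perl script, it only touches timestamps that sit directly between double quotes.

```python
import re
import sys
from datetime import datetime, tzinfo

# Same pattern as the Perl script: a "YYYY-MM-DD HH:MM:SS" value
# enclosed in double quotes (the quotes themselves are not consumed).
TIMESTAMP = re.compile(r'(?<=")(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?=")')


def add_timezone(line: str, tz: tzinfo) -> str:
    """Rewrite quoted naive timestamps in line as ISO8601 with a UTC offset."""

    def convert(match: re.Match) -> str:
        naive = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        # Interpret the naive value as local time in tz, then format it.
        return naive.replace(tzinfo=tz).strftime("%Y-%m-%dT%H:%M:%S%z")

    return TIMESTAMP.sub(convert, line)


def main() -> None:
    """Filter stdin to stdout, assuming Europe/Amsterdam as in the Perl script."""
    from zoneinfo import ZoneInfo  # Python 3.9+, requires system tz data

    zone = ZoneInfo("Europe/Amsterdam")
    for line in sys.stdin:
        sys.stdout.write(add_timezone(line, zone))
```

Calling `main()` at the end of the file turns this into a stdin-to-stdout filter, much like the Perl version. Passing the zone as a parameter keeps the conversion function easy to try with other zones or fixed offsets.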

gnavarro54 · ‎10-01-2020

Hi Steven. Thanks for the quick response. I'm running this HDP cluster on SUSE 12 SP2. This node has 32 GB RAM and is using just 4; free RAM is 27 GB.

The Yarn configuration is like this:

    ResourceManager Java heap size = 2048
    NodeManager Java heap size = 1024
    AppTimelineServer Java heap size = 8072

The ulimit used by the RM process is:

    core file size          (blocks, -c) unlimited
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 128615
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 32768
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 8192
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 65536
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited

From the RM log file:

    2020-09-29 17:15:00,825 INFO scheduler.AbstractYarnScheduler (AbstractYarnScheduler.java:getMinimumAllocation(1367)) - Minimum allocation = <memory:1024, vCores:1>
    2020-09-29 17:15:00,825 INFO scheduler.AbstractYarnScheduler (AbstractYarnScheduler.java:getMaximumAllocation(1379)) - Maximum allocation = <memory:24576, vCores:3

No matter how much memory is assigned to the RM, it always fails with this Java OOM. What would be a recommended Java memory configuration for the Yarn components?

stevenmatison · ‎10-01-2020

@aniket5003 NiFi can be added to ambari using one of the HDF Management packs. Depending on your relationship with Cloudera, you may need to use your account to get the NiFi 1.12.1 management pack. I do know other versions are out on the open internet (1.9 and below), but the newest versions require a cloudera username and password to access repos and artifacts. Once you have a management pack added to ambari, you should be able to install nifi and other HDF components in an HDP cluster. Additionally, you can get 1.12.1 from nifi.apache.org and install it outside of the ambari interface if you need something quick.

Thanks, Steven

stevenmatison · ‎09-30-2020

@Elf IMO anything is possible with ambari. That said, out of the box it may not appear to be possible without some advanced ambari admin skills. I took a look at the link you provided, and it is an example of how to spin up a single machine with many of the services you may already have in your ambari cluster. To install griffin in an ambari cluster you would need to pick a node, install griffin and any missing requirements (services/components not in your cluster), and thoughtfully modify the configuration to use the existing services from the ambari cluster. For example, feed griffin your configuration locations for hadoop, hdfs, hive, etc., and do NOT follow the sample documentation's specific directions for installing those parts. If you do decide to go down this path, please update here with your progress or create new Questions with any specific errors you hit.

stevenmatison · ‎09-30-2020

@ujay Of course. The xml referenced in the link is a template file. Click through, get the raw xml code, and save it to a file. From here you import the template. Then, in the upper navigation, grab the template icon and drag it to the nifi canvas. It should automatically choose the last template uploaded. Once the template is on the canvas, click through into the process group that was created. You will need to do some work in Controller Services, so check out the notes in the Red Box. The flow is an example of how to generate many flow files and detect duplicates. Be sure to do some research on the processor (google search) to understand how others have resolved working with the processor as you begin to integrate this into your own flow. This community is also a great research tool.

Name	Steven Matison
Location	Florida
Member Since	‎07-19-2018 04:45 PM
Last Visited	‎06-01-2022 03:47 PM
Posts	613
Kudos received	101

Cloudera Community

Re: Apache nifi - how to convert a file .txt into ...

Re: Apache Nifi - Using PutParquet, the HDFS file ...

Re: How to extract csv column record and used it f...

Re: Could not connect to Distributed Map Cache ser...

Re: NiFi InvokeHTTP POST JSON

Re: HBase lookup via NiFi

Re: How to use SQL within a QueryRecord Processor ...

Re: How to install Ambari 2.7.4 without using bund...

Re: Nifi: JAVA Heap Space

Re: How to modify text in csv file in Apache Nifi

Re: Specify timezone when parsing CSV data?

Re: Resource Manager - java.lang.OutOfMemoryError:...

Re: apache nifi 1.12.1

Re: What is recommendation data quality tool for ...

Re: Editing Nifi queue