About ashsskum

Shakti · ‎06-12-2024

The template is not working now. Are there any alternatives?

bhavan12 · ‎03-12-2021

What is the correction we have done?

bbende · ‎04-27-2017

When to use "primary node only" depends on whether the operation makes sense to run on all nodes, or whether it's something that only makes sense to happen once. Here are some examples:

ListHDFS - this should be primary node only, because otherwise you are going to perform the same listing on all nodes.
ConsumeKafka - this can be run on all nodes, because each one will be consuming different data.
GetFile - this can be run on all nodes, because each node will pick up different data from a local directory.

In your Kafka scenario, the number of instances of a processor equals what you see on the graph times the number of nodes in the cluster. So if you have a two-node cluster with one ConsumeKafka_0_10 on the canvas, there are two instances of ConsumeKafka_0_10. If you increase concurrent tasks to 3, there are 3 threads executing each instance on each node, so 6 in total. Since you have 6 partitions, each of these 6 threads should consume from a separate partition.
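The arithmetic in this reply can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the idle-thread calculation assumes the usual Kafka consumer-group behavior in which a partition is read by at most one consumer in the group at a time.

```python
def total_consumer_threads(nodes: int, processors_on_canvas: int, concurrent_tasks: int) -> int:
    """Threads across the cluster = instances (canvas processors x nodes) x concurrent tasks."""
    instances = nodes * processors_on_canvas
    return instances * concurrent_tasks


def idle_threads(threads: int, partitions: int) -> int:
    """Threads left without a partition, assuming one consumer per partition at most."""
    return max(0, threads - partitions)


# The scenario from the reply: 2 nodes, 1 ConsumeKafka_0_10, 3 concurrent tasks, 6 partitions.
threads = total_consumer_threads(nodes=2, processors_on_canvas=1, concurrent_tasks=3)
print(threads)                               # 6
print(idle_threads(threads, partitions=6))   # 0
```

With more threads than partitions (for example, 4 concurrent tasks on the same cluster), the extra threads would have nothing to consume under that assumption.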

madisonquinn · ‎06-15-2017

Hello Kumar, Configuring the HDFS Handler to write to many HDFS files due to many source replication tables or extensive use of partitioning can result in degraded performance. Oracle GoldenGate does not support DDL replication for all database implementations. You should consult the Oracle GoldenGate documentation for their database implementation to understand if DDL replication is supported. This may not be an issue if you have a steady stream of Replication data and do not require low levels of latency for analytic data from HDFS. Thanks.

gkeys · ‎12-19-2016

@Kumar, as tests by me and @Devin Pinkston show, it is the actual file content that you need to look at (not the UI). Thus ... no fears of the processor adding a new line.

sunile_manjee · ‎12-20-2016

Do you have Ranger audit enabled? If so, please provide what the log shows when NiFi tries to hit /tmp.

andrewg · ‎12-16-2016

You need to explicitly allow the reflect function; it's blacklisted by default. See e.g. this discussion: https://community.hortonworks.com/questions/25828/udf-reflect-is-not-allowed-beeline.html

andrewg · ‎12-15-2016

ExecuteProcess doesn't take input. It is a source of data (e.g. from that process). What you're looking for is ExecuteStreamCommand, which allows for passing in a flowfile and evaluating attributes dynamically: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteStreamCommand/index.html
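As a rough sketch of the kind of command ExecuteStreamCommand could call: the processor pipes the incoming flowfile content to the command's standard input and captures its standard output. The script below is hypothetical (the transformation and the `tag` argument are made up for illustration), and it assumes an attribute value is handed over as a command-line argument, e.g. via an expression such as `${filename}` in the processor's command arguments.

```python
import sys


def transform(content: str, tag: str = "") -> str:
    """Hypothetical transformation: upper-case each line and prefix it with a tag."""
    return "\n".join(f"{tag}{line.upper()}" for line in content.splitlines())


def main() -> None:
    # First argument (if any) is assumed to carry a flowfile attribute value.
    tag = sys.argv[1] if len(sys.argv) > 1 else ""
    # Flowfile content arrives on stdin; whatever is written to stdout is the result.
    sys.stdout.write(transform(sys.stdin.read(), tag))


# Only read stdin when something is actually piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

It can be tried outside NiFi with something like `echo "hello" | python script.py "myfile:"`, where `script.py` is whatever name the file is saved under.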

knarayanan · ‎12-15-2016

You could use a simple Python script with ExecuteScript to achieve that. http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html

gkeys · ‎12-12-2016

There is no out-of-the-box UDF to do this in Hive. You need to build it yourself using Hive's map function. Examples:
http://stackoverflow.com/questions/23025380/how-to-transpose-pivot-data-in-hive
http://stackoverflow.com/questions/37436710/is-there-a-way-to-transpose-data-in-hive
http://hadoopmania.blogspot.com/2015/12/transposepivot-table-in-hive.html
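The linked answers work in HiveQL; the following is only a Python sketch of the general idea behind a map-based pivot, not Hive code. It assumes the input is a set of (id, key, value) rows, collects each id's key/value pairs into a map, and then reads the wanted keys out as columns. The function name and sample data are hypothetical.

```python
from collections import defaultdict


def pivot(rows, keys):
    """Turn (id, key, value) rows into one row per id with one column per key."""
    maps = defaultdict(dict)
    for row_id, key, value in rows:
        maps[row_id][key] = value  # build a per-id map of key -> value
    # Look each requested key up in the map; missing keys become None.
    return [(row_id, *[m.get(k) for k in keys]) for row_id, m in sorted(maps.items())]


rows = [(1, "a", "x"), (1, "b", "y"), (2, "a", "z")]
print(pivot(rows, ["a", "b"]))  # [(1, 'x', 'y'), (2, 'z', None)]
```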

Member Since	‎07-09-2016 06:23 AM
Last Visited	‎03-06-2019 11:26 PM
Posts	83
Kudos received	17

Cloudera Community

Re: NiFi: ReplaceTextWithMapping processor

Re: NiFi: Hive dynamic variable

Re: NiFi: Flowfile retries

Re: Phoenix: Error connecting through Java

Re: NiFi: Isolated Processors in a clustered

Re: Compaction: ORACLE GoldenGate replication in H...

Re: NiFi: FetchFile processor appends new line at ...

Re: Hive: INSERT OVERWRITE does not work

Re: NiFi: PutHiveQL reflect UDF not working

Re: NiFi: Cannot map Input port to processor Execu...

Re: NiFi: Write to custom delimitted file

Re: Hive:Transpose the set of rows