Member since 07-09-2016
83 Posts
17 Kudos Received
2 Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1292 | 12-08-2016 06:46 AM |
 | 2277 | 12-08-2016 06:46 AM |
06-12-2024
02:49 AM
1 Kudo
The template is not working now. Are there any alternatives?
04-27-2017
12:41 PM
1 Kudo
When to use "primary node only" depends on whether the operation makes sense to run on all nodes, or whether it's something that only makes sense to run once. Here are some examples:

- ListHDFS: this should be primary node only, because otherwise you are going to perform the same listing on every node.
- ConsumeKafka: this can run on all nodes, because each one will be consuming different data.
- GetFile: this can run on all nodes, because each node will pick up different data from a local directory.

In your Kafka scenario, the number of processor instances equals what you see on the graph times the number of nodes in the cluster. So if you have a two-node cluster with one ConsumeKafka_0_10 on the canvas, there are two instances of ConsumeKafka_0_10. If you increase concurrent tasks to 3, there are 3 threads executing each instance on each node, so 6 in total. Since you have 6 partitions, each of these 6 threads should consume from a separate partition.
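The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the round-robin spread is an assumption made for the example. In a real cluster, Kafka's consumer group protocol decides which thread gets which partition.

```python
def total_consumer_threads(nodes: int, processors_on_canvas: int, concurrent_tasks: int) -> int:
    """Each node runs its own instance of every processor on the canvas,
    and each instance runs `concurrent_tasks` threads."""
    return nodes * processors_on_canvas * concurrent_tasks


def spread_partitions(partitions: int, threads: int) -> dict:
    """Hypothetical round-robin spread of partitions over threads.
    Illustration only; Kafka performs the real assignment."""
    assignment = {t: [] for t in range(threads)}
    for p in range(partitions):
        assignment[p % threads].append(p)
    return assignment


# Two-node cluster, one ConsumeKafka_0_10 on the canvas, 3 concurrent tasks.
threads = total_consumer_threads(nodes=2, processors_on_canvas=1, concurrent_tasks=3)
print(threads)
print(spread_partitions(partitions=6, threads=threads))
```

With 6 partitions and 6 threads, every thread ends up with exactly one partition. With fewer partitions than threads, some threads would sit idle.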
06-15-2017
10:52 AM
Hello Kumar, Configuring the HDFS Handler to write to many HDFS files, due to many source replication tables or extensive use of partitioning, can result in degraded performance. Oracle GoldenGate does not support DDL replication for all database implementations. You should consult the Oracle GoldenGate documentation for your database implementation to understand whether DDL replication is supported. This may not be an issue if you have a steady stream of replication data and do not require low latency for analytic data from HDFS. Thanks.
12-19-2016
12:12 PM
@Kumar, as tests by me and @Devin Pinkston show, it is the actual file content that you need to look at (not the UI). Thus ... no fears of the processor adding a new line.
12-20-2016
02:43 AM
Do you have Ranger audit enabled? If so, please provide what the log shows when NiFi tries to hit /tmp.
12-16-2016
01:21 PM
1 Kudo
You need to explicitly allow the reflect function; it is blacklisted by default. See e.g. this discussion: https://community.hortonworks.com/questions/25828/udf-reflect-is-not-allowed-beeline.html
12-15-2016
06:52 PM
1 Kudo
ExecuteProcess doesn't take input. It is a source of data (e.g. the output of that process). What you're looking for is ExecuteStreamCommand, which accepts an incoming flowfile and evaluates attributes dynamically: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteStreamCommand/index.html
12-15-2016
07:08 PM
You could use a simple Python script with ExecuteScript to achieve that: http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html
12-12-2016
07:42 PM
There is no out-of-the-box UDF to do this in Hive. You need to build it yourself using Hive's map function. Examples:
http://stackoverflow.com/questions/23025380/how-to-transpose-pivot-data-in-hive
http://stackoverflow.com/questions/37436710/is-there-a-way-to-transpose-data-in-hive
http://hadoopmania.blogspot.com/2015/12/transposepivot-table-in-hive.html
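To make the map-based idea concrete, here is a rough Python sketch of the logic. It is not HiveQL, and it assumes a hypothetical input of (id, key, value) rows. It collects each id's key/value pairs into a map, then reads fixed columns out of that map, which is roughly what the linked answers do in Hive.

```python
from collections import defaultdict


def pivot(rows, columns):
    """Turn (row_id, key, value) rows into one wide row per row_id.
    `columns` lists the keys to expose as output columns (an assumption:
    the column names must be known up front)."""
    maps = defaultdict(dict)
    for row_id, key, value in rows:
        maps[row_id][key] = value  # build one key -> value map per id
    # Look each requested column up in the map; missing keys become None.
    return [(row_id, *(m.get(c) for c in columns)) for row_id, m in sorted(maps.items())]


# Hypothetical sample data for illustration.
rows = [(1, "a", 10), (1, "b", 20), (2, "a", 30)]
print(pivot(rows, ["a", "b"]))
```

An id that lacks one of the keys gets None in that column, much as a lookup on a missing map key yields NULL in Hive.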