Member since: 09-04-2019
Posts: 45
Kudos Received: 10
Solutions: 7
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 121 | 01-26-2022 06:25 AM
 | 150 | 01-21-2022 09:46 AM
 | 364 | 01-19-2022 10:03 AM
 | 2662 | 08-06-2020 11:19 AM
 | 467 | 12-04-2019 07:02 AM
02-05-2022
04:52 PM
Those parentheses in your search would be considered special characters in a regular expression unless escaped. I can get this to work using this: Active: active \(running\)
02-05-2022
04:16 PM
1 Kudo
You can set the Scheduling Strategy of the processor to CRON driven [1] and give it your cron expression in the Run Schedule. Note that this cron is not the typical OS cron syntax; it is based on the Quartz cron scheduler [2]. To build a cron expression, this is a good online tool [3].
[1] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-strategy
[2] http://www.quartz-scheduler.org/
[3] https://www.freeformatter.com/cron-expression-generator-quartz.html
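For illustration, a minimal sketch of a Quartz cron expression (this particular schedule is an assumption, not from the question):

0 0/15 * * * ?

The fields are second, minute, hour, day-of-month, month, day-of-week; this one fires every 15 minutes. Note the extra seconds field and the "?" placeholder, neither of which exist in typical OS cron syntax.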
01-26-2022
07:10 AM
Please also see this post: https://community.cloudera.com/t5/Support-Questions/Send-TCP-acknowledgement-in-NIFI/m-p/334438#M231765
01-26-2022
07:06 AM
Hi, could you elaborate more? The ACK is part of the TCP protocol.
01-26-2022
06:25 AM
Hello, most likely it is because on your CSV Reader you have Treat First Line as Header = false (the default). Change that to true.
01-25-2022
11:27 AM
1 Kudo
@zhangliang To accomplish that I would use UpdateRecord. Since your data is CSV and structured, we can use record manipulation to accomplish this. First I would treat all your values as strings and build an Avro schema to use:

{
"type":"record",
"name":"nifiRecord",
"namespace":"org.apache.nifi",
"fields":[
{"name":"test_a","type":["null","string"]},
{"name":"test_b","type":["null","string"]},
{"name":"test_c","type":["null","string"]},
{"name":"test_d","type":["null","string"]},
{"name":"test_e","type":["null","string"]}
]
}

Then I would configure my UpdateRecord to use a CSV Reader and a CSV Writer. I would configure the CSV Reader like this:
Schema Access Strategy = Use Schema Text Property
Schema Text = put your Avro schema there
Value Separator = |
On the CSV Writer, leave everything default except:
Value Separator = |
Finally, the UpdateRecord processor needs two user-defined properties, since in this case we want to update the fields "test_c" and "test_d". We can then use RecordPath manipulation, in particular the substringBefore function for this use case, to keep only everything before the dot "." (see the property sketch after the sample data). This will take an input like this:

test_a|test_b|test_c|test_d|test_e
a|b|3.0|4.0|5.0
a|b|3.0|4.0|5.0
a|b|3.0|4.0|5.0

and produce an output like this:

test_a|test_b|test_c|test_d|test_e
a|b|3|4|5.0
a|b|3|4|5.0
a|b|3|4|5.0
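The original post illustrated the UpdateRecord configuration with a screenshot. As a sketch of what it would contain (assuming Replacement Value Strategy = Record Path Value), the two user-defined properties are:

/test_c = substringBefore(/test_c, '.')
/test_d = substringBefore(/test_d, '.')

Each property name is the RecordPath of the field to update, and its value is the RecordPath expression that produces the new value.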
01-25-2022
06:19 AM
1 Kudo
I wonder what Java version you have on your Windows machine? NiFi supports Java 8 and 11.
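A quick way to check from a command prompt (the same on Windows and Linux):

java -version

If the reported major version is not 8 or 11, point NiFi at a supported JDK.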
01-24-2022
11:57 AM
Seems like you should follow this: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#proxy_configuration
01-24-2022
11:56 AM
ExecuteSQL uses the Database Connection Pooling Service controller service. From there you configure a JDBC connection string. The connection string depends on the specific driver jar you use, in this case Presto. Here is a reference on JDBC connections for Presto; this is an external database concern and not so much a NiFi issue: https://prestodb.io/docs/current/installation/jdbc.html
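For illustration, a URL following the format documented on that page (the host, port, catalog, and schema here are placeholders):

jdbc:presto://presto.example.net:8080/hive/sales

You would pair this with Database Driver Class Name = com.facebook.presto.jdbc.PrestoDriver and point Database Driver Location(s) at the presto-jdbc jar.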
01-24-2022
11:52 AM
1 Kudo
First confirm whether the NiFi JVM is running: search your logs for the phrase "UI is available". If it is, then I would look at your system's firewall (Windows Defender?). If you do not see that log entry, I would focus on why the NiFi JVM is not coming up.
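For example, assuming the default log location (the first command for Linux/macOS, the second for Windows):

grep "UI is available" logs/nifi-app.log
findstr /C:"UI is available" logs\nifi-app.log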
01-24-2022
11:44 AM
Old post, but for awareness: as reported in https://issues.apache.org/jira/browse/PARQUET-1827, the UUID logical type is now included in Apache Parquet 1.12. Apache NiFi uses Apache Parquet 1.12 starting from Apache NiFi 1.14.
01-21-2022
09:46 AM
1 Kudo
Without knowing how the FTP directory is populated, you might have an issue with the ListFTP state and might consider changing its Listing Strategy. However, assuming that is no issue, and you want to only perform an action based on any file with a naming structure like this:

xxxx-xxxx-xxxx-xxxx-20220121110000-xxxx-xxxx.csv

and only if it is older than two hours from "now", I would run the output of your ListFTP to an UpdateAttribute and add these two properties:

fileTime = ${filename:getDelimitedField(5,'-'):trim():toDate('yyyyMMddHHmmss'):toNumber()}
timeNow = ${now():toNumber():minus(7200000)}

Then route that to a RouteOnAttribute and add a new property:

2 hours old = ${fileTime:le(${timeNow})}

Then you can drag that connection to follow-on processing, and route the unmatched connection to other processing or terminate it.

Explanation:
fileTime grabs the timestamp out of your filename (the fifth '-'-delimited field) and converts it to a Date object so it can be converted to its epoch representation.
timeNow is the current time minus 2 hours (7,200,000 ms).
${fileTime:le(${timeNow})} is true when fileTime is less than or equal to timeNow, meaning the file is at least 2 hours old.
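To make the comparison concrete, a worked example using the sample filename (epoch values assume a server running in UTC):

filename = xxxx-xxxx-xxxx-xxxx-20220121110000-xxxx-xxxx.csv
getDelimitedField(5,'-') → 20220121110000
fileTime → 1642762800000 (2022-01-21 11:00:00)
now() at 13:30:00 → 1642771800000, so timeNow = 1642771800000 - 7200000 = 1642764600000 (11:30:00)
fileTime ≤ timeNow, so the file is at least two hours old and routes to "2 hours old".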
01-20-2022
12:05 PM
1 Kudo
I understand the issue: EvaluateXPath does not accept attribute (expression language) syntax and fails at validation, since it validates the property as a literal XPath. I wonder if there is a way to use QueryRecord instead.
01-20-2022
10:58 AM
Have you considered the ListFTP property Minimum File Age, setting it to 2 hours?
01-20-2022
09:43 AM
Deleting that state directory should not be a normal maintenance function. What you initially described is a very odd case. Regarding your node 003 going down: if it was the primary node or the cluster coordinator, internally there would have been an election to nominate a new cluster member node to perform those functions. In your case node 003 is a member of the cluster but is not connected. Why it is not connected could be due to any number of reasons; typically the node is down or it was manually disconnected. When you see that message, how many member nodes do you have? I expect the UI to show 2/3 because node 3 is not connected. The solution is to connect it, either by fixing whatever is keeping the node down or by connecting it through the UI.
01-19-2022
10:03 AM
Reading your sample log messages closer, I can see that the coordinator received a heartbeat from node 3: "2022-01-19 16:04:52,573 INFO [Process Cluster Protocol Request-30] o.a.n.c.p.impl.SocketProtocolListener Finished processing request 35f2ed1a-ca6f-4cc6-ab4a-6c0774fc9c6d (type=HEARTBEAT, length=2837 bytes) from nifi-hatest-03:9091 in 19 millis". So I also wonder if we have a caching/rendering issue; can you see what the UI shows using incognito mode? Finally, if this is a DEV environment, you can also try deleting your local state directory; that value is set in the file "state-management.xml". Deleting state will clear out any local state your processors might depend on if configured as such, so remove with caution. It will also clear out the cluster node IDs it locally knows of.
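For reference, the directory in question is configured on the local provider in conf/state-management.xml; the default entry, trimmed to the relevant property, looks like this:

<local-provider>
    <id>local-provider</id>
    <class>org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider</class>
    <property name="Directory">./state/local</property>
</local-provider>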
01-19-2022
09:35 AM
This to me sounds like a hostname issue. Could you confirm the values in nifi.properties for: nifi.cluster.node.address= nifi.web.https.host= Those values should match the name of the host. If nothing there stands out, check the latest log entries containing the strings below: "org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator: Status" "org.apache.nifi.controller.StandardFlowService: Setting Flow Controller's Node ID:" "org.apache.nifi.web.server.HostHeaderHandler"
01-19-2022
09:01 AM
Can I clarify: is what you want to do dealing with XML and XPath? Or, based on this: "I'm trying to pass an output value of a processor dynamically into another and use it as a new property" — how output values get into flowfile attributes depends on the previous processor. If you are not dealing with XML, then the processors you most likely need are ExtractText followed by a RouteOnAttribute.
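As a sketch of that pattern (the property name and regex are illustrative assumptions, not from the question): on ExtractText, add a dynamic property

order.id = orderId=(\d+)

which, on a match, writes the first capture group into the order.id attribute. Then on RouteOnAttribute, add a dynamic property such as

hasOrder = ${order.id:isEmpty():not()}

and wire the hasOrder relationship to the follow-on processing.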
08-06-2020
07:27 PM
Hi @yogesh_shisode, awesome that you are exploring NiFi. Just to be clear, Apache ZooKeeper can be considered an external service that helps with state management / NiFi clustering. With that said, and to make things "flow" better, NiFi allows us to start an embedded ZooKeeper cluster. Given the IP examples, it seems that is what you are trying to connect to, i.e. you are trying to use NiFi's embedded ZooKeeper capability.

So let's delve a little into ZooKeeper: we have ZooKeeper the service, which can run single node or multi node. When multi node, we have a ZooKeeper ensemble, and when we have that, we need to maintain a quorum. This answer explains it very eloquently: https://stackoverflow.com/questions/25174622/difference-between-ensemble-and-quorum-in-zookeeper

With that said, please make sure you follow this guide: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#embedded_zookeeper It discusses how to configure NiFi to start up as an embedded ZooKeeper server and the settings needed to accomplish this. For clarity, port 2181 is the ZooKeeper listening port. Whether NiFi starts a ZooKeeper service depends on this nifi.properties entry:

nifi.state.management.embedded.zookeeper.start=false

If it is set to true, then NiFi will start a ZooKeeper service too, which in turn depends on this setting:

nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties

This is all explained in the admin guide linked above. AND once you decide which nodes are the members of your ZooKeeper ensemble, all the NiFis, regardless of whether they are ZooKeeper servers or not, should have this property set, the same on all NiFis:

nifi.zookeeper.connect.string=

Using your IP examples, if you want 3 servers to be ZooKeeper servers, I would expect this setting to be:

nifi.zookeeper.connect.string=192.168.0.10:2181,192.168.0.20:2181,192.168.0.30:2181

And on those servers:

nifi.state.management.embedded.zookeeper.start=true

plus the additional configurations from the Apache NiFi admin guide linked above.
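As a sketch, per the embedded ZooKeeper section of the admin guide, conf/zookeeper.properties on the three ZooKeeper nodes would list the ensemble members along these lines (2888:3888 are ZooKeeper's quorum/election ports; with ZooKeeper 3.5+ the client port follows a semicolon):

server.1=192.168.0.10:2888:3888;2181
server.2=192.168.0.20:2888:3888;2181
server.3=192.168.0.30:2888:3888;2181

Each of those nodes also needs a myid file in its ZooKeeper state directory containing its own server number (1, 2, or 3).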
08-06-2020
12:05 PM
Just installing the right Java should be good enough. And if you have more than one Java, just point JAVA_HOME to a version suitable for NiFi.
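For example (the JDK paths are placeholders):

export JAVA_HOME=/usr/lib/jvm/java-11-openjdk    (Linux/macOS)
set JAVA_HOME=C:\Program Files\Java\jdk-11       (Windows command prompt)

The NiFi start scripts read JAVA_HOME to pick the JVM.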
08-06-2020
11:19 AM
@SAMSAL NiFi will only run on Java 8 or 11: "NiFi requires Java 8 or 11. It is recommended that you have installed one of these Java versions prior to installing NiFi via Homebrew. Homebrew is aware NiFi depends on Java but not the specific supported versions, so it may prompt to install an incompatible JDK as part of its NiFi install." https://nifi.apache.org/docs/nifi-docs/html/getting-started.html#downloading-and-installing-nifi
08-06-2020
10:58 AM
@SAMSAL thanks for the confirmation. What Java version is the NiFi JVM using? You can find it multiple ways, but maybe the easiest is to look at: Global Configuration Menu (top right) > Cluster > Versions tab. Or, if single node, maybe just find the Java version Windows has, assuming there is not more than one.
08-06-2020
07:48 AM
Hi @SAMSAL, Could you provide a bit more info? What version of NiFi are you on? Can you provide the script? If you are using ExecuteScript, make sure you are setting Groovy as the Script Engine. You do not need to install Groovy separately for ExecuteScript because it runs on the internal JVM using JSR-223 scripting. So if you are setting Module Directory just because you think you need to point it at Groovy, try unsetting that. And can you run a simple script to test this out:

def flowFile = session.get()
// session.get() can return null when no flowfile is queued; guard before transferring
if (flowFile == null) return
session.transfer(flowFile, REL_SUCCESS)
07-20-2020
07:41 AM
To enable Kerberos DEBUG logging for HDF-managed Schema Registry (SR) clusters, do the following, depending on your environment:
Navigate to Ambari > Schema Registry > Configs > Advanced > registry-env template.
Add the line below:
export REGISTRY_OPTS="$REGISTRY_OPTS -Dsun.security.krb5.debug=true"
Restart SR.
You will then be able to view Kerberos DEBUG messages under /var/log/registry (default).
07-15-2020
12:18 PM
When NiFi is secured for TLS server authentication, at UI login time it first tries to use TLS certificates if loaded in the browser, then it tries SPNEGO authentication, and finally it falls back to your configured login provider.
If you Kerberize the cluster via Ambari and want to use login providers like LDAP or Kerberos, Ambari automatically sets the following properties, which enable SPNEGO authentication:
nifi.kerberos.spnego.keytab.location
nifi.kerberos.spnego.principal
Furthermore, the SPNEGO properties are greyed out in Ambari (screenshot omitted), so they must be blanked from the command line via configs.py.
COMMAND:
From your Ambari manager host, change the setting for NiFi, where the host name, cluster name, and credentials are tailored to your unique environment:
nifi.kerberos.spnego.keytab.location to be blank:
./configs.py -a set -s http -l c2288-node1.squadron.support.hortonworks.com -t 8080 -n c2288 -u admin -p AdminPassword -c nifi-properties -k 'nifi.kerberos.spnego.keytab.location' -v ''
nifi.kerberos.spnego.principal to be blank
./configs.py -a set -s http -l c2288-node1.squadron.support.hortonworks.com -t 8080 -n c2288 -u admin -p AdminPassword -c nifi-properties -k 'nifi.kerberos.spnego.principal' -v ''
-a set
-s http or https
-l FQDN of the Ambari host
-t port number Ambari is listening on
-n Ambari cluster name (you can get that from the top right of the UI)
-u user that has edit privileges on Ambari
-p the password for that user
-c the config type, in this case nifi-properties
-k the key to change
-v the value to change
You can also do this for NiFi Registry with the following sample commands:
./configs.py -a set -s http -l c2288-node1.squadron.support.hortonworks.com -t 8080 -n c2288 -u admin -p AdminPassword -c nifi-registry-properties -k 'nifi.registry.kerberos.spnego.keytab.location' -v ''
./configs.py -a set -s http -l c2288-node1.squadron.support.hortonworks.com -t 8080 -n c2288 -u admin -p AdminPassword -c nifi-registry-properties -k 'nifi.registry.kerberos.spnego.principal' -v ''
Restart NiFi and/or NiFi Registry and ensure that you clear your browser cache.
You should then see the properties blank in the Ambari config sections of NiFi and/or NiFi Registry (screenshot omitted).
05-19-2020
09:41 AM
Can you give sample data? In one part you say that the payload has the chunks: "1 file consists of n chunks (average 3 to 4 chunks per file)". But further down you imply that it is in the metadata of the REST "response": "Chunk-data from xml-Response are set into an attribute. And now?"
05-06-2020
02:13 PM
I do not think that is the issue, since my working Phoenix jar has that too. Did you restart your NiFi?
04-29-2020
06:42 AM
@DamD and others, please follow this article: https://my.cloudera.com/knowledge/ERRORCannot-create-PoolableConnectionFactory-row-SYSTEMCATALOG?id=273198 In short, the article tells you to add your:

hbase-site.xml
core-site.xml
hdfs-site.xml

to your phoenix-client.jar:

jar uf <phoenix client jar> hbase-site.xml core-site.xml hdfs-site.xml

and then point to the phoenix-client.jar from NiFi's DBCPConnectionPool.
04-17-2020
08:07 AM
1 Kudo
In this article, I will document how to use CFM 1.0.1.0 to interact with Apache Impala. This article still applies if using HDF / Apache NiFi.
The latest official JDBC driver that works with NiFi is 2.6.4.
At the time of this writing, any driver above that causes class conflicts between the NiFi JVM and the driver's own bundled log4j.
Pre-requisite
Downloading and extracting the JDBC driver:
The JDBC drivers can be found at Impala JDBC Connector 2.6.15 for Cloudera Enterprise.
Select 2.6.4 * or if in the future a version higher than 2.6.15 is available, use that.
Download and extract 2.6.4 and make note of where it extracts to.
Ensure that the user that runs the NiFi JVM ( nifi ) has permission to that path.
The jar file that you will use is called ImpalaJDBC41.jar.
Create Impala table and load dataset sample to HDFS:
Use this data set tips.csv and add it to your HDFS.
hdfs dfs -put data/tips.csv /user/hive/warehouse/tips/
Create your impala table:
impala-shell -i <impala_daemon_hostname>:21000 -q '
CREATE TABLE default.tips (
`total_bill` FLOAT,
`tip` FLOAT,
`sex` STRING,
`smoker` STRING,
`day` STRING,
`time` STRING,
`size` TINYINT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ","
LOCATION "hdfs:///user/hive/warehouse/tips/";'
* These steps were taken from this article.
Configure NiFi to interact with Impala:
On NiFi, drag in the ExecuteSQL processor.
Configure Database Connection Pooling Service on the ExecuteSQL processor.
This is a pointer to the DBCPConnectionPool controller service that you will need to configure.
The driver documentation is really good at explaining the different settings you can pass. If you will interact with an Impala that is TLS secured and/or Kerberized, there are options for that. In my example, I am interacting with a TLS and Kerberized Impala.
In the Controller Services section, configure your DBCPConnectionPool and add the following:
Database Connection URL
My example:
jdbc:impala://YourImpalaHostFQDN:YourPort
Database Driver Class Name
com.cloudera.impala.jdbc41.Driver
Database Driver Location(s)
The path to the JDBC driver (ImpalaJDBC41.jar) that you downloaded and extracted earlier.
Back in the ExecuteSQL processor, add your SQL command. For this example, we are running a simple select query by configuring: SQL select query = select * from default.tips
That should be all you need.
If interacting with a TLS and / or Kerberos Impala, then you will need to look at the driver documentation for the options that apply to you. For reference, my connect string looked like below when connecting to a TLS and Kerberos Impala:
jdbc:impala://MyImpalaHost:21050;AuthMech=1;KrbHostFQDN=MyImpalaHostFQDN;KrbServiceName=impala;ssl=1;SSLTrustStore=/My/JKS/Trustore;SSLTrustStorePwd=YourJKSPassword
12-04-2019
07:02 AM
I'm on mobile, but could it be that you are not wrapping it in single quotes?