Member since
11-16-2015
905
Posts
665
Kudos Received
249
Solutions
11-21-2016
06:38 PM
Paramiko uses Crypto which is a native module, so this is not pure Python either and cannot be used in ExecuteScript. ExecuteProcess or ExecuteStreamCommand should work though.
11-21-2016
03:14 PM
According to the Python documentation, passing key_file and cert_file is deprecated; the recommendation is to pass in a context instead (one that has been configured by calling load_cert_chain). You'll still need a certfile and a keyfile there, which you can get using various openssl commands (assuming you have openssl installed).

For example, to export a client private key from a PKCS12 keystore to a PEM file (the keystore file name contains a space, so quote it):

openssl pkcs12 -in "CN=<something_you_typed>_OU=Apache NiFi.p12" -nodes -nocerts -out client.key

Or to export a server private key from a JKS keystore to a PEM file:

keytool -importkeystore -srckeystore <keystore.jks> -destkeystore keystore.p12 -deststoretype PKCS12
openssl pkcs12 -in keystore.p12 -nodes -nocerts -out nifi.key

Or to export a CA cert from a JKS keystore to a PEM file:

keytool -export -alias <your_alias> -file ca.der -keystore <truststore.jks>
openssl x509 -inform der -in ca.der -out ca.pem
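
As a rough illustration of the context-based approach, here is a minimal Python 3 sketch using only the standard library. The helper names build_client_context and open_connection are made up for this example, and the file names in the usage note are placeholders for PEM files you export yourself; adjust for your own setup.

```python
import http.client
import ssl


def build_client_context(ca_file=None, cert_file=None, key_file=None):
    """Return an SSLContext that verifies the server and, optionally,
    presents a client certificate (hypothetical helper for this sketch)."""
    # With ca_file=None the system's default trusted CAs are used.
    context = ssl.create_default_context(cafile=ca_file)
    if cert_file:
        # Load the client certificate and its private key (both PEM).
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def open_connection(host, port, context):
    """Create (but do not yet connect) an HTTPS connection using the context."""
    return http.client.HTTPSConnection(host, port, context=context)
```

With PEM files exported by commands like the ones above, a call might look like build_client_context("ca.pem", "client.pem", "client.key"), where client.pem is an assumed name for the client certificate file.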
11-21-2016
01:31 PM
2 Kudos
PutSQL has a mechanism for batching together statements that were split by processors such as SplitText. Set the "Support Fragmented Transactions" property to true, and PutSQL will wait until all flow files with the same fragment.identifier have arrived, then it will process them all as a single batch. There has also been talk of implementing the same improvement for PutSQL as is being done for PutHiveQL (NIFI-3031), to support multiple statements from a single flow file. Please feel free to file a Jira for this if you like.
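
To make the waiting behavior concrete, here is a toy Python model of the idea. It is not PutSQL's actual code. It assumes each split carries fragment.count and fragment.index attributes alongside fragment.identifier, and the class name FragmentBatcher is invented for this sketch.

```python
from collections import defaultdict


class FragmentBatcher:
    """Toy model: hold statements until every fragment of a group has arrived."""

    def __init__(self):
        # fragment.identifier -> {fragment.index: statement}
        self._pending = defaultdict(dict)

    def offer(self, attributes, statement):
        """Register one split; return the ordered batch once the group is
        complete, otherwise None (meaning: keep waiting)."""
        group = attributes["fragment.identifier"]
        expected = int(attributes["fragment.count"])
        index = int(attributes["fragment.index"])
        self._pending[group][index] = statement
        if len(self._pending[group]) < expected:
            return None
        fragments = self._pending.pop(group)
        # Emit the statements in their original order.
        return [fragments[i] for i in sorted(fragments)]
```

Splits may arrive out of order; the batch is only released, in index order, once the last one shows up.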
11-18-2016
01:13 PM
QueryCassandra does not support user-defined types; it converts those values to strings instead. As a workaround, you can use ExecuteScript to parse the strings into values. Here is an example Groovy script to accomplish this:

import groovy.json.*

def flowFile = session.get()
if (!flowFile) return

// The user-defined type value, previously extracted into an attribute as a string
def directReport = flowFile.getAttribute('direct_report')
// LAX parsing tolerates unquoted keys and single-quoted strings
def json = new JsonSlurper().setType(JsonParserType.LAX).parseText(directReport)
// Add one attribute per key/value pair; attribute values must be strings
json.each { key, value ->
    flowFile = session.putAttribute(flowFile, key as String, value as String)
}
session.provenanceReporter.modifyAttributes(flowFile)
session.transfer(flowFile, REL_SUCCESS)

This script assumes you have used something like EvaluateJsonPath to extract $.results[0].directReports[0] into an attribute named 'direct_report'. It parses the JSON object and adds an attribute to the flow file for each key/value pair in the object. You can adjust this to work with content rather than attributes; I have examples of various scripts on my blog.
11-18-2016
12:10 PM
1 Kudo
In addition to @Pierre Villard's suggestion, PutHDFS transfers flow files that have been successfully written to HDFS to the "success" relationship. You can put a processor downstream from PutHDFS (along the "success" relationship); at that point you can be sure that the file has been successfully written to HDFS, and can proceed accordingly.
11-16-2016
10:19 PM
Might need an "AS blob_contents", can't remember
11-16-2016
10:19 PM
2 Kudos
DB2 might be returning a different JDBC type for BLOB than what the processor is expecting, such that it tries to convert it to a String or something else rather than a byte array. For your workaround, try a column alias for the cast() function, so you can set the name of that column to something Avro will like, such as "blob_contents":

SELECT cast(BLOBTBL.BLOB_CONTENTS as varchar(2000)) blob_contents FROM BLOB_DECOMP BLOBTBL fetch first 10 rows only with UR
11-15-2016
04:57 PM
1 Kudo
Does it work if you replace the back-slashes with forward slashes in the Database Driver Jar Url property?
11-15-2016
01:26 PM
3 Kudos
What does your current schema look like? If you have a field with a type of something like ["null","int"] then it is being declared as a "nullable union", meaning the value can be null or a valid integer. If instead you use simply "int" for the type, then it should enforce non-null values for that field. If it does not, then the CSV reader from the Kite SDK (used to parse the CSV in the ConvertCSVtoAvro processor) likely treats missing values as empty or default rather than null. If this is the behavior you're seeing, please feel free to file a Jira to improve the handling of missing CSV values.
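
For reference, the two field declarations look like this when written out. This is only a sketch: the record name Person and field name age are made up, and accepts_null is a hand-written helper that inspects the declaration; it is not part of Avro or the Kite SDK.

```python
import json

# Nullable union: the value may be null or a valid integer.
NULLABLE_FIELD = {"name": "age", "type": ["null", "int"]}

# Plain type: a null value should be rejected for this field.
REQUIRED_FIELD = {"name": "age", "type": "int"}


def accepts_null(field):
    """Return True if the field declaration allows null values."""
    declared = field["type"]
    return declared == "null" or (isinstance(declared, list) and "null" in declared)


# A complete (hypothetical) record schema using the non-null variant,
# serialized as JSON for use wherever an Avro schema is expected.
schema = {"type": "record", "name": "Person", "fields": [REQUIRED_FIELD]}
schema_json = json.dumps(schema, indent=2)
```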
11-14-2016
06:31 PM
For sufficiently small JSON files, you can use EvaluateJsonPath or ExtractText to get the full body of the document into an attribute before the SplitJson. Keep in mind that this loads the document into memory (rather than leaving it in the content repository and only referencing it), and if you modify the flow file, both the original and the new flow file will have a copy in memory. This can get unwieldy pretty quickly.

If instead you can determine a smaller portion of the document that is needed, EvaluateJsonPath (with the appropriate JSON Path expression) can store just that portion as an attribute. Alternatively, you might be able to store the original document with PutDistributedMapCache and then fetch it into an attribute only when it is needed (in which case using UpdateAttribute to delete it when finished is recommended).

A different approach, if you are comfortable with a scripting language such as Javascript or Groovy, is to use ExecuteScript to invert the behavior of SplitJson; that is, keep the flow file content identical to the original content, and instead store each split value as an attribute in its own flow file. This maintains the original content in each flow file, and the content itself will not be "moved" or copied; each flow file maintains a reference to the content (which would be unchanged from the original in this case).

If you'd like to see this "inverse" behavior supported in SplitJson (so you can choose whether the splits go in attributes or content), please feel free to file a Jira for this capability.
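
The "inverse split" idea can be sketched in plain Python, modeling each flow file as a dict. This is not the NiFi scripting API; the function name inverse_split and the attribute name split.value are invented for the example, and it assumes the document's top level is a JSON array.

```python
import json


def inverse_split(content, attribute_name="split.value"):
    """Return one modeled flow file per array element. Each keeps the
    original content untouched and carries its element as an attribute."""
    elements = json.loads(content)
    if not isinstance(elements, list):
        raise ValueError("expected a JSON array at the top level")
    return [
        {
            # Same content for every output; nothing is rewritten.
            "content": content,
            "attributes": {
                attribute_name: json.dumps(element),
                "fragment.index": str(index),
            },
        }
        for index, element in enumerate(elements)
    ]
```

In a real ExecuteScript version, each output would be created as a child of the incoming flow file so that it references the same content, and only the attributes would differ.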