Member since 11-16-2015
Posts: 902
Kudos Received: 664
Solutions: 249
10-14-2016
06:31 PM
You'd need multiple calls to the dis.readXYZ() methods, call toHexString() on each result, then concatenate them before storing in "attr" (or whatever variable ends up going into the first16hex attribute). For 36 hex characters, it's probably two dis.readLong() calls followed by a dis.readUnsignedShort().
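For example, a rough (untested) sketch along those lines; the attribute name 'first36hex' and the padLeft() zero-padding are just my choices here, adjust to fit your flow:

import java.io.DataInputStream
import org.apache.nifi.processor.io.InputStreamCallback

def flowFile = session.get()
if(!flowFile) return
def attr = ''
session.read(flowFile, {inputStream ->
    def dis = new DataInputStream(inputStream)
    // 8 + 8 + 2 bytes = 18 bytes = 36 hex characters; pad each piece so leading zeros aren't dropped
    attr = Long.toHexString(dis.readLong()).padLeft(16, '0')
    attr += Long.toHexString(dis.readLong()).padLeft(16, '0')
    attr += Integer.toHexString(dis.readUnsignedShort()).padLeft(4, '0')
} as InputStreamCallback)
flowFile = session.putAttribute(flowFile, 'first36hex', attr)
session.transfer(flowFile, REL_SUCCESS)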
10-13-2016
01:58 PM
As of NiFi 1.0.0 (HDF 2.0.0), there is no way to directly evaluate Avro fields; instead you might use ConvertAvroToJSON and apply the same approach as described in your other question. If/when NIFI-962 is implemented, such operations might be available directly on the Avro content.
10-10-2016
05:02 PM
Where are you getting your JSON from? If it is a static value, you could use GenerateFlowFile -> ReplaceText to set the value for the body; if it comes from a file, you can use GetFile or ListFile -> FetchFile. If the mime.type attribute is not already set on the flow file (containing the JSON), you can use UpdateAttribute to set mime.type to "application/json" and your auth key (called maybe "auth.key") to your key. Then in InvokeHttp you can list mime.type,auth.key in the "Attributes to Send" property, and make sure "Send Message Body" is true. These steps will ensure that the headers are sent and the JSON content of the flow file is sent as the body.
10-08-2016
05:40 PM
2 Kudos
You could use the ExecuteScript processor if you are comfortable with Groovy, Javascript, Jython, JRuby, or Lua. Here's an example of a Groovy script that I think will do what you're asking:

import java.io.DataInputStream
import org.apache.nifi.processor.io.InputStreamCallback

def flowFile = session.get()
if(!flowFile) return
def attr = ''
session.read(flowFile, {inputStream ->
    // Read the first two bytes of the content and convert them to a hex string
    def dis = new DataInputStream(inputStream)
    attr = Integer.toHexString(dis.readUnsignedShort())
} as InputStreamCallback)
flowFile = session.putAttribute(flowFile, 'first16hex', attr)
session.transfer(flowFile, REL_SUCCESS)

This maintains the content in the flow file but adds an attribute called 'first16hex' that contains a string representation of the first 16 bits of the incoming flow file content. Please let me know if I've misunderstood anything here, and I will try to help. I should mention that a full hexdump processor could be helpful; feel free to raise a Jira for this feature.
10-05-2016
01:00 PM
1 Kudo
At the time of this writing (after HDP 2.5 was released), the sandbox download still has the incorrect .ovf extension. Renaming it to .ova still works.
10-04-2016
01:08 PM
1 Kudo
Great answer! Just to add a caveat, if you are using HDF 2.0 and HDP 2.5, please see the following: https://community.hortonworks.com/questions/59681/puthivestreaming-nifi-processor-various-errors.html
10-04-2016
12:36 PM
7 Kudos
The issue for Hive Streaming between HDF 2.0 and HDP 2.5 is captured as NIFI-2828 (albeit under a different title, it is the same cause and fix). In the meantime, as a possible workaround, I have built a Hive NAR that you can try if you wish: just save off your existing one (from the lib/ folder, with a version like 1.0.0.2.0.0-159 or similar) and replace it with this one.
10-03-2016
06:23 PM
I've left a possible solution as a separate answer. Doing all the processing with a Python script is not ideal, as you'd need your own Hadoop/Hive client libraries, and all you'd be using NiFi for is executing the external Python script. However, if you just need some custom processing during the flow, you can use ExecuteScript (link in my other answer) with Jython; I have some examples on my blog.
10-03-2016
06:21 PM
Wherever the error happens in the flow (sounds like PutHDFS in your example), there is likely a "failure" relationship (or something of the kind) for that processor. You can route failed flow files to a separate branch, where you can perform your error handling. For your example, you can have PutHDFS route "failure" to an UpdateAttribute that sets some attribute like "status" to "error", and PutHDFS could route "success" to an UpdateAttribute that sets "status" to "success".

Assuming your Hive table is created atop CSV files, at this point you could route both back to a ReplaceText that creates a comma-separated line with the values, using Expression Language to get the date, filename, and the value of the status attribute, so something like:

${now()},${filename},${status}

You should avoid having small files in HDFS, so you wouldn't want to write each individual line as a file to HDFS. Instead, consider the MergeContent processor to concatenate many rows together, then use PutHDFS to stage the larger file in Hadoop for use by Hive. If MergeContent et al. doesn't give you the file(s) you need, you can always use an ExecuteScript processor for any custom processing needed (see the sketch below). If your Hive table expects Avro or ORC format for the files, there are processors for these conversions as well (although you may have to convert to an intermediate format such as JSON first; see the documentation for more details).
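For reference, here is a rough ExecuteScript (Groovy) sketch of that last option, doing essentially what the ReplaceText step would do, i.e. replacing the flow file content with a timestamp,filename,status line built from attributes. It's only an illustration and assumes the "status" attribute was set upstream as described; the ReplaceText + Expression Language approach is usually the simpler choice:

import org.apache.nifi.processor.io.OutputStreamCallback

def flowFile = session.get()
if(!flowFile) return
// Build the CSV line from attributes (assumes "status" was set by the upstream UpdateAttribute)
def filename = flowFile.getAttribute('filename')
def status = flowFile.getAttribute('status') ?: 'unknown'
String line = new Date().format('yyyy-MM-dd HH:mm:ss') + ',' + filename + ',' + status + '\n'
// Replace the flow file content with the single CSV line
flowFile = session.write(flowFile, {outputStream ->
    outputStream.write(line.getBytes('UTF-8'))
} as OutputStreamCallback)
session.transfer(flowFile, REL_SUCCESS)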
10-03-2016
06:05 PM
It appears your URL has been scrubbed (which is fine); can you find the character at (0-based) index 73? The URL above looks ok (underscores, semicolons, the at symbol, etc. should all be recognized). Also, if you are using the default database, try explicitly putting 'default' in the URL, so jdbc:hive2://host.name.net:10000/default;principal=hive/_HOST@EXAMPLE.COM. You might also try adding "auth=KERBEROS" to the URL parameters, although I don't think that's required (setting the principal is all that's supposed to be needed).