Created 10-08-2016 04:40 PM
I need a way to get the hexdump of a file using NiFi. I have already used an ExecuteStreamCommand processor and it works, but it uses a lot of processing power because each file has to be written to the file system. Is there a processor that could achieve this? Or does anyone have a custom NAR I could use to extract the hexdump of a file? Also, I only need the first 16 bits of the file.
Thank you!!
Created 10-08-2016 05:40 PM
You could use the ExecuteScript processor if you are comfortable with Groovy, JavaScript, Jython, JRuby, or Lua. Here's an example of a Groovy script that I think will do what you're asking:
import java.io.DataInputStream
def flowFile = session.get()
if(!flowFile) return
def attr = ''
session.read(flowFile, {inputStream ->
    dis = new DataInputStream(inputStream)
    attr = Integer.toHexString(dis.readUnsignedShort())
} as InputStreamCallback)
flowFile = session.putAttribute(flowFile, 'first16hex', attr)
session.transfer(flowFile, REL_SUCCESS)
This maintains the content in the flow file but adds an attribute called 'first16hex' that contains a string representation of the first 16 bits of the incoming flow file content.
Please let me know if I've misunderstood anything here, and I will try to help. I should mention that a full hexdump processor could be helpful; feel free to raise a Jira for this feature.
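To try the two-byte read outside NiFi, here is a minimal stand-alone Groovy sketch. It assumes nothing about the NiFi session and just feeds a byte array through the same DataInputStream call; firstTwoBytesHex is a hypothetical helper name, not part of NiFi. It also shows that Integer.toHexString drops leading zeros.

```groovy
// Stand-alone sketch: no NiFi session here, just the same stream read.
// firstTwoBytesHex is a hypothetical helper name, not a NiFi API.
String firstTwoBytesHex(InputStream inputStream) {
    def dis = new DataInputStream(inputStream)
    // readUnsignedShort() consumes exactly two bytes, big-endian
    return Integer.toHexString(dis.readUnsignedShort())
}

// Sample data: bytes 0x50 0x4B 0x03 0x04
def sampleA = [0x50, 0x4B, 0x03, 0x04] as byte[]
println firstTwoBytesHex(new ByteArrayInputStream(sampleA))   // 504b

// Leading zeros are dropped: 0x00 0x1F comes out as "1f", not "001f"
def sampleB = [0x00, 0x1F] as byte[]
println firstTwoBytesHex(new ByteArrayInputStream(sampleB))   // 1f
```

If a fixed-width result matters, padding the string to four characters (for example with padLeft(4, '0')) would be one way to handle the second case.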
Created 10-12-2016 01:23 AM
Hi Matt!
I now understand what you meant by using the ExecuteScript processor!
How would I edit this code to capture more of the hex output? Right now I only get the first 4 hex characters.
Thank you!
Created 10-12-2016 02:41 AM
import java.io.DataInputStream
def flowFile = session.get()
if(!flowFile) return
def attr = ''
session.read(flowFile, {inputStream ->
    dis = new DataInputStream(inputStream)
    attr = Long.toHexString(dis.readLong())
} as InputStreamCallback)
flowFile = session.putAttribute(flowFile, 'first16hex', attr)
session.transfer(flowFile, REL_SUCCESS)
Hi Matt, I modified the code to use a long instead of a short, and it now gets the first 16 hex characters. How would I get the first 36 hex characters?
Created 10-14-2016 06:31 PM
You'd need multiple calls to dis.readXYZ(), call toHexString() on each result, then concatenate them before storing in "attr" or whatever variable holds the result (going into the first16hex attribute). For 36 hex characters (18 bytes), it's probably two dis.readLong() calls followed by a dis.readUnsignedShort().
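The suggestion above can be sketched as a stand-alone Groovy snippet that runs without NiFi. The helper name first36Hex and the sample bytes are made up for illustration, and the zero padding is an addition so that each part keeps its full width.

```groovy
// Hypothetical helper: first 18 bytes (36 hex characters) of a stream.
String first36Hex(InputStream inputStream) {
    def dis = new DataInputStream(inputStream)
    // 8 bytes + 8 bytes + 2 bytes = 18 bytes = 36 hex characters
    def part1 = Long.toHexString(dis.readLong()).padLeft(16, '0')
    def part2 = Long.toHexString(dis.readLong()).padLeft(16, '0')
    def part3 = Integer.toHexString(dis.readUnsignedShort()).padLeft(4, '0')
    return part1 + part2 + part3
}

// Sample data: the 18 bytes 0x00 through 0x11
def sample = new byte[18]
for (int i = 0; i < 18; i++) {
    sample[i] = (byte) i
}

def hex = first36Hex(new ByteArrayInputStream(sample))
println hex            // 000102030405060708090a0b0c0d0e0f1011
println hex.length()   // 36
```

Note that readLong() and readUnsignedShort() throw an EOFException when the stream ends early, so content shorter than 18 bytes would need separate handling.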
Created 10-19-2016 05:40 PM
Hi Matt,
I got it to work with some help from a teammate!
See the code below:
import java.io.DataInputStream
def flowFile = session.get()
if(!flowFile) return
def attr = ''
session.read(flowFile, {inputStream ->
    dis = new DataInputStream(inputStream)
    attr = Long.toHexString(dis.readLong())
    attr2 = Long.toHexString(dis.readLong())
} as InputStreamCallback)
flowFile = session.putAttribute(flowFile, 'first16hex', attr+attr2)
session.transfer(flowFile, REL_SUCCESS)
Created 10-19-2016 05:41 PM
A source of Hex Filetype headers:
Created 10-19-2016 05:46 PM
The hexdump command (http://www.tutorialspoint.com/unix_commands/hexdump.htm) could be called from ExecuteStreamCommand.
Created 05-25-2017 06:12 PM
Hey guys,
I know this is an old post and the original question was only about a hex dump of the first 16 bits. However, since I came across this post and it gave me some direction, I've tried to improve it with the following things in mind:
1) If the flowfile has, say, 93 bytes, readLong and readUnsignedShort can only advance 8 or 2 bytes at a time, making it impossible to read an odd number of bytes. Thus, I've used readUnsignedByte instead.
2) Left-padding with zeros was missing, so each byte is now padded to two hex digits.
3) The whole flowfile is read and dumped into an attribute, which I've called 'raw'.
import java.io.DataInputStream
def flowFile = session.get()
if(!flowFile) return
def raw = ''
def aux = ''
boolean eof = false
session.read(flowFile, {inputStream ->
    dis = new DataInputStream(inputStream)
    while (!eof) {
        try {
            aux = Integer.toHexString(dis.readUnsignedByte());
            raw = raw + aux.padLeft(2,'0');
        } catch (EOFException e) {
            eof = true;
        }
    }
} as InputStreamCallback)
flowFile = session.putAttribute(flowFile, 'raw', raw)
session.transfer(flowFile, REL_SUCCESS)
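As a possible variation on the byte-by-byte loop, the stand-alone Groovy sketch below reads in chunks, appends to a StringBuilder instead of concatenating strings, and stops after a caller-chosen number of bytes. The helper name toHex and the maxBytes parameter are assumptions for illustration, not anything NiFi provides. The cap is there because a hex string is twice the size of the content it encodes, which may be a concern for large flowfiles.

```groovy
// Hypothetical helper: hex-encode at most maxBytes bytes from a stream.
String toHex(InputStream inputStream, int maxBytes) {
    def sb = new StringBuilder()
    def buffer = new byte[4096]
    int remaining = maxBytes
    while (remaining > 0) {
        int n = inputStream.read(buffer, 0, Math.min(buffer.length, remaining))
        if (n < 0) break   // end of stream
        for (int i = 0; i < n; i++) {
            // mask to 0..255 so negative bytes still format as two digits
            sb.append(String.format('%02x', buffer[i] & 0xff))
        }
        remaining -= n
    }
    return sb.toString()
}

// Sample data with an odd length: 0x00 0x0A 0xFF 0x7F 0x80
def sample = [0, 10, -1, 127, -128] as byte[]

println toHex(new ByteArrayInputStream(sample), 1024)   // 000aff7f80
println toHex(new ByteArrayInputStream(sample), 3)      // 000aff
```

Inside ExecuteScript, the same helper could presumably be called from the session.read callback, with the result stored in the 'raw' attribute as in the script above.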
As I am a newbie, please feel free to comment.
Cheers,
Gus