Member since: 05-02-2016
Posts: 154
Kudos Received: 54
Solutions: 14

My Accepted Solutions
Title | Views | Posted
---|---|---
| 4169 | 07-24-2018 06:34 PM
| 5778 | 09-28-2017 01:53 PM
| 1431 | 02-22-2017 05:18 PM
| 14198 | 01-13-2017 10:07 PM
| 3946 | 12-15-2016 06:00 AM
12-15-2016
03:42 PM
@regie canada No, you don't have to; it should work out of the box.
12-15-2016
03:38 PM
1 Kudo
https://community.hortonworks.com/articles/71719/using-snappy-and-other-compressions-with-nifi-hdfs.... I posted the above link on how you can use compression; try it and let me know how it goes. I don't think you can use LZO, as it is not shipped as part of NiFi. You can try doing a yum install of LZO, then use ExecuteStreamCommand to compress the file, and then do PutHDFS with the compression codec set to NONE.
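A rough sketch of that flow, assuming the yum install gives you the lzop command (the path and the -c flag are illustrative; adjust to whatever LZO tool you end up with). ExecuteStreamCommand pipes the flow file content to the command's stdin and replaces the content with the command's stdout:

ExecuteStreamCommand:
  Command Path: /usr/bin/lzop   (assumed install location)
  Command Arguments: -c          (assumed: write compressed output to stdout)

Then route the output stream relationship to PutHDFS with the compression codec set to NONE, since the data is already compressed.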
12-15-2016
03:11 PM
5 Kudos
A question that comes up often is how to use Snappy and other compression codecs when loading data into HDFS with the NiFi PutHDFS processor. A common error users run into is "java.lang.UnsatisfiedLinkError". This error occurs because Snappy and the other codecs are implemented in native Linux binaries, which the Java codec libraries call to do the actual compression, and those binaries are not on the JVM's library path. Since the JVM cannot find them, the binding fails and you get a java.lang.UnsatisfiedLinkError. Follow these steps to resolve the issue.

1. Copy the native folder containing the compression libraries from one of your Hadoop nodes:

cd /usr/hdp/x.x.x.x-xx/hadoop/lib/
tar -cf ~/native.tar native/

2. scp the native.tar from your Hadoop node to your NiFi node and untar it to a location of your choice. In my case I use /home/myuser/hadoop/:

cd ~
mkdir hadoop
cd hadoop
tar -xf /path/to/native.tar

3. Go to your NiFi folder, open conf/bootstrap.conf, and add a JVM argument setting java.library.path to the folder containing the native Hadoop binaries (/home/myuser/hadoop/native in my case):

java.arg.15=-Djava.library.path=/home/myuser/hadoop/native/

I used 15 because that was the number of the last JVM argument in my bootstrap.conf. Alternatively, you can edit bin/nifi-env.sh and add:

export LD_LIBRARY_PATH=/home/myuser/hadoop/native/

4. Restart NiFi.
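If you want to sanity-check the copied libraries before restarting NiFi, here is a minimal Python sketch; the exact shared-object file name (libsnappy.so.1 here) varies by distribution and is an assumption:

import ctypes
import os

native_dir = "/home/myuser/hadoop/native"              # folder untarred in step 2
lib_path = os.path.join(native_dir, "libsnappy.so.1")  # file name may differ on your system

try:
    # CDLL succeeds only if the shared object and its dependencies can be resolved
    ctypes.CDLL(lib_path)
    print("native snappy library loaded OK")
except OSError as e:
    print("failed to load native library: %s" % e)

If this fails, PutHDFS will likely hit the same UnsatisfiedLinkError, so fix the path or file permissions first.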
12-15-2016
06:36 AM
You may need header information to convert this to JSON. Technically, even the XSL output you have is CSV, just pipe-delimited. I think you can use the ExecuteScript processor directly to call a Python script and go from XSL to JSON.
GetFile (read the file with the XSL data) -> SplitText (split the data into lines) -> ExecuteScript (with the script below to convert each line to JSON) -> MergeContent (merge contents based on the fragment.identifier attribute set by SplitText) -> PutFile (writes out the JSON files)
--- example script for converting to JSON:

import json
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

class PyStreamCallback(StreamCallback):
    def __init__(self):
        pass
    def process(self, inputStream, outputStream):
        # the header for the XSL; these become the JSON field names (extend to match your columns)
        header = ["column1", "column2", "column3"]
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        output = {}
        index = 0  # counter to keep track of the column, so we can assign a name to each value
        for column in text.split("|"):
            output[header[index]] = column
            index += 1
        outputStream.write(bytearray(json.dumps(output, indent=4).encode('utf-8')))

flowFile = session.get()
if flowFile is not None:
    flowFile = session.write(flowFile, PyStreamCallback())
    flowFile = session.putAttribute(flowFile, "filename", flowFile.getAttribute('filename').split('.')[0] + '_translated.json')
    session.transfer(flowFile, REL_SUCCESS)
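For example, a split line like value1|value2|value3 with the header above would produce the following JSON (key order may vary):

{
    "column1": "value1",
    "column2": "value2",
    "column3": "value3"
}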
12-15-2016
06:16 AM
I guess, based on your comment, you can use the ReplaceText processor to replace all occurrences of | with ,.
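A minimal configuration sketch (values are illustrative; note the pipe must be escaped in the regex):

ReplaceText:
  Search Value: \|
  Replacement Value: ,
  Replacement Strategy: Regex Replace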
12-15-2016
06:00 AM
1 Kudo
@regie canada You want to convert XSL to CSV? Just wanting to confirm you didn't mean XLS.
12-14-2016
12:35 AM
Any chance you are on OS X?
12-13-2016
02:56 PM
In your error logs, it says that com.mysql.jdbc.Driver could not be located. Can you check whether the MySQL JDBC jar is in the location used by the service check? Per the logs, it should be at /usr/hdp/current/hive-server2/lib/mysql-connector-java.jar.
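A quick way to check, using the path from the log message above:

ls -l /usr/hdp/current/hive-server2/lib/mysql-connector-java.jar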
12-12-2016
07:08 PM
1 Kudo
You will have to include the path to the LZO codec binaries in the NiFi bootstrap script. Add an entry like so in the bootstrap.conf file: java.arg.15=-Djava.library.path=/path/to/your/lzocodec.so (note that java.library.path expects the directory containing the .so file, not the file itself).
12-07-2016
05:25 PM
1 Kudo
You can use expression language. In your SelectHiveQL processor, put your query as select * from tmp where last_name='${name}'. ${name} will be replaced by the attribute value from your previous processor's flow file. So add an UpdateAttribute before the SelectHiveQL processor and add an attribute named name with whatever value you want to set.
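A minimal sketch of the two processors (the attribute value "smith" is just an example):

UpdateAttribute (add a dynamic property):
  name = smith

SelectHiveQL:
  HiveQL Select Query: select * from tmp where last_name = '${name}'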