Member since: 09-29-2015
Posts: 871
Kudos Received: 723
Solutions: 255
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 3348 | 12-03-2018 02:26 PM |
|  | 2302 | 10-16-2018 01:37 PM |
|  | 3615 | 10-03-2018 06:34 PM |
|  | 2392 | 09-05-2018 07:44 PM |
|  | 1814 | 09-05-2018 07:31 PM |
06-02-2016 03:18 PM
1 Kudo
Can you retry all these tests, but during the second cat, instead of "cat 'record02'", cat something longer, like "cat 'record123456789'"? I'd like to see whether tracking the file size is the issue, because record01 and record02 would be the same file size.
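For reference, here is a minimal sketch of the test I have in mind (your original commands aren't shown, so the file path and redirection below are illustrative assumptions):

    # Append a record, then a record of a clearly different length (assumed path):
    echo 'record01' >> /tmp/watched/data.txt
    # ... let the processor run and pick this up ...
    echo 'record123456789' >> /tmp/watched/data.txt
    # If this longer record is detected while 'record02' (same length as
    # 'record01') was missed, change detection is probably keyed on file size.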
05-25-2016 12:52 PM
This most likely means there is another JAR you need to add. If you look at the POM file for the hadoop-azure JAR (http://central.maven.org/maven2/org/apache/hadoop/hadoop-azure/2.7.0/hadoop-azure-2.7.0.pom), you can see all the dependencies it needs:

<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.microsoft.azure</groupId>
<artifactId>azure-storage</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<scope>compile</scope>
</dependency>
My guess would be that the azure-storage JAR is missing. This becomes a slippery slope, though, because azure-storage might have transitive dependencies of its own.
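If you have Maven available, one way to gather the whole transitive set at once is to resolve it straight from that POM (a sketch, assuming Maven can reach the repository; the output directory name is just an example):

    # Download the hadoop-azure POM, then let Maven resolve and copy every
    # transitive dependency into one directory.
    wget http://central.maven.org/maven2/org/apache/hadoop/hadoop-azure/2.7.0/hadoop-azure-2.7.0.pom
    mvn -f hadoop-azure-2.7.0.pom dependency:copy-dependencies -DoutputDirectory=./azure-deps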
05-25-2016 12:47 PM
1 Kudo
Generally, all processors execute within a single OS process started by a single user. The only case I can think of where one processor could execute at a higher privilege level is when using the ExecuteProcess/ExecuteStreamCommand processors: the command can be "sudo" and the args can be the command to execute. This assumes the user that started NiFi has sudo privileges.
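As a rough sketch of that approach (the script path, user name, and sudoers entry below are illustrative assumptions, not NiFi defaults):

    # Hypothetical ExecuteProcess configuration:
    #   Command:           sudo
    #   Command Arguments: /usr/local/bin/rotate-logs.sh
    # Matching passwordless sudoers entry for the account running NiFi
    # (edit with visudo):
    #   nifi ALL=(root) NOPASSWD: /usr/local/bin/rotate-logs.sh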
05-24-2016 01:58 PM
2 Kudos
The DistributedMapCache is a NiFi concept used to store information for later retrieval, either by the current processor or by another processor. There are two components: the DistributedMapCacheServer, which runs on one node if you are in a cluster, and the DistributedMapCacheClientService, which runs on all nodes in a cluster and communicates with the server. Both are Controller Services, configured in NiFi through the controller section in the top-right toolbar. Processors use the client service to store and retrieve data from the cache server. In this case, DetectDuplicate uses the cache to store information about what it has seen and determine whether an incoming FlowFile is a duplicate.
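As a configuration sketch for a hypothetical three-node cluster (host names are illustrative; 4557 is the server's usual default port):

    # DistributedMapCacheServer -- enable on one node, e.g. node1
    #   Port: 4557
    # DistributedMapCacheClientService -- enabled on every node
    #   Server Hostname: node1
    #   Server Port:     4557
    # DetectDuplicate
    #   Distributed Cache Service: the client service above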
05-20-2016 05:05 PM
1 Kudo
I'm not totally sure this is the problem, but given that NiFi has NARs with isolated class loading, adding something to the classpath usually isn't as simple as dropping it in the lib directory. The hadoop libraries NAR is unpacked to this location:

work/nar/extensions/nifi-hadoop-libraries-nar-<VERSION>.nar-unpacked/META-INF/bundled-dependencies/

You could try putting the hadoop-azure.jar there, keeping in mind that if the work directory were removed, NiFi would unpack the original NAR again without your added JAR. Some have had success creating a custom version of the hadoop libraries NAR to switch to other libraries: https://github.com/bbukacek/nifi-hadoop-libraries-bundle Right now Apache NiFi is based on Apache Hadoop 2.6.2.
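For example (a sketch, assuming a hadoop-azure JAR already downloaded to the current directory; the JAR version and <VERSION> are placeholders to fill in for your install):

    cp hadoop-azure-2.7.0.jar \
      work/nar/extensions/nifi-hadoop-libraries-nar-<VERSION>.nar-unpacked/META-INF/bundled-dependencies/
    # Restart NiFi afterwards; the copy is lost if the work directory is
    # removed, since NiFi re-unpacks the pristine NAR.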
05-19-2016 11:46 AM
I'm not 100% sure how LZO works, but in a lot of cases the codec ends up needing a native library. On a Unix system you would set LD_LIBRARY_PATH to include the location of the .so files for the LZO codec, or put them in the JAVA_HOME/jre/lib/native directory. You could do something like:

export LD_LIBRARY_PATH=/usr/hdp/2.2.0.0-1084/hadoop/lib/native
bin/nifi.sh start

That should let PutHDFS know about the appropriate libraries.
05-18-2016 08:11 PM
1 Kudo
Have you seen the Kerberos section of the NiFi admin guide?

https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#kerberos_login_identity_provide
05-18-2016 01:19 PM
Can you try what Matt suggested above and remove "io.compression.codecs" from core-site.xml? I agree with him that this is likely related to the compression codecs; you can see in the stack trace that the relevant lines are:

org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2058) ~[na:na]
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:128) ~[na:na]
at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:175) ~[na:na]
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.getCompressionCodec(AbstractHadoopProcessor.java:375) ~[na:na]
at org.apache.nifi.processors.hadoop.PutHDFS.onTrigger(PutHDFS.java:220) ~[na:na]
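A quick way to confirm where that property is set (the path below is an assumption; use whichever core-site.xml your PutHDFS "Hadoop Configuration Resources" property points at, and back it up before editing):

    grep -n "io.compression.codecs" /etc/hadoop/conf/core-site.xml
    cp /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/core-site.xml.bak
    # Then delete the whole <property>...</property> block for
    # io.compression.codecs with an editor and restart the processor.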
05-18-2016 01:14 AM
1 Kudo
@bschofield Another idea for transferring large files over a high-latency network might be the following... On the sending side, use a SegmentContent processor to break a large FlowFile into many smaller segments, followed by a PostHTTP processor with Concurrent Tasks set higher than 1. This lets the sending side better utilize the network by sending segments concurrently. On the receiving side, use a ListenHTTP processor to receive the segmented FlowFiles, followed by a MergeContent processor with a Merge Strategy of Defragment. Defragment mode merges all the segments back together to recreate the original FlowFile.
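A sketch of that flow (segment size, port, host name, and thread count are illustrative; "contentListener" is ListenHTTP's default Base Path):

    # Sending side:
    #   SegmentContent -> Segment Size: 10 MB
    #   PostHTTP       -> URL: http://receiver:8080/contentListener
    #                     Concurrent Tasks: 4   (Scheduling tab)
    # Receiving side:
    #   ListenHTTP     -> Listening Port: 8080
    #   MergeContent   -> Merge Strategy: Defragment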
05-17-2016 05:01 PM
What version of NiFi is this?