NiFi: No FileSystem for scheme: wasb


How can I configure Nifi to connect to HDInsight?

I’m getting the error “No FileSystem for scheme: wasb” when running NiFi’s PutHDFS processor on a server that is attempting to connect to an HDInsight cluster.

I tried adding hadoop-azure.jar to NiFi’s classpath, but that caused a NoClassDefFoundError for org.apache.hadoop.fs.FileSystem.
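
(For reference, PutHDFS picks up whatever Hadoop configuration files are listed in its "Hadoop Configuration Resources" property. A minimal core-site.xml for a WASB-backed HDInsight cluster might look roughly like the sketch below; the property names come from the hadoop-azure module, while the account, container, and key values are placeholders, not real ones.)

<configuration>
  <!-- default filesystem: a WASB container on the cluster's storage account (placeholder values) -->
  <property>
    <name>fs.defaultFS</name>
    <value>wasb://mycontainer@myaccount.blob.core.windows.net</value>
  </property>
  <!-- access key for the storage account (placeholder) -->
  <property>
    <name>fs.azure.account.key.myaccount.blob.core.windows.net</name>
    <value>YOUR_STORAGE_ACCOUNT_KEY</value>
  </property>
  <!-- map the wasb scheme to the implementation shipped in hadoop-azure -->
  <property>
    <name>fs.wasb.impl</name>
    <value>org.apache.hadoop.fs.azure.NativeAzureFileSystem</value>
  </property>
</configuration>

Even with a configuration like this, the "No FileSystem for scheme: wasb" error persists until hadoop-azure and its dependencies are actually visible to the processor, which is what the replies below deal with.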

1 ACCEPTED SOLUTION

New Contributor

I ran into this issue on a recent project. The dependencies have to be incorporated into the NAR file. I've created a version incorporating the dependencies and submitted a pull request on the associated Jira issue at https://issues.apache.org/jira/browse/NIFI-1922. I performed some basic testing on a mix of HDInsight clusters and it appears to work OK. Note, though, that you will need to install NiFi on the cluster itself (due to the HDInsight/blob store security model) and will need to install Java 8.


9 REPLIES

Guru

Could you provide a snippet from your nifi-app log with the stack trace for this error? I suspect the problem is that your hadoop-azure.jar is built against the wrong version of hadoop. What is the source of this file?

Master Guru

I'm not totally sure if this is the problem, but given that NiFi has NARs with isolated class loading, adding something to the classpath usually isn't as simple as dropping it in the lib directory.

The hadoop libraries NAR would be unpacked to this location:

work/nar/extensions/nifi-hadoop-libraries-nar-<VERSION>.nar-unpacked/META-INF/bundled-dependencies/

You could try putting the hadoop-azure.jar there, keeping in mind that if the work directory was removed, NiFi would unpack the original NAR again without your added jar.

Some have had success creating a custom version of the hadoop libraries NAR to switch to other libraries:

https://github.com/bbukacek/nifi-hadoop-libraries-bundle

Right now Apache NiFi is based on Apache Hadoop 2.6.2.
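
If you go the custom-NAR route, the change is essentially an extra entry in that bundle's pom.xml so that Maven packages the Azure jar (and its transitive dependencies) into the NAR for you. The snippet below is only a sketch: the artifact coordinates are the public Maven ones, but the version is an assumption, and since hadoop-azure first shipped with Hadoop 2.7.0, the bundle's other Hadoop dependencies may need to move to a 2.7.x line as well.

<!-- hypothetical addition to a custom nifi-hadoop-libraries-nar pom.xml -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-azure</artifactId>
    <version>2.7.0</version> <!-- assumed version -->
</dependency>

Because the NAR plugin bundles transitive dependencies, rebuilding this one NAR and swapping it into NiFi's lib directory also survives a cleared work directory, unlike jars copied into the unpacked NAR by hand.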


Would dropping things in the bootstrap dir ensure they are on the system classpath, maybe?

@Andrew Grande

It looks like the hadoop-azure.jar is getting picked up ... but apparently there are other dependencies that are missing.


Hi Bryan, it turns out that after dropping the hadoop-azure.jar in the NAR directory ... I get a new error:

Caused by: java.lang.NoClassDefFoundError: com/microsoft/azure/storage/blob/BlobListingDetails
    at org.apache.hadoop.fs.azure.NativeAzureFileSystem.createDefaultStore(NativeAzureFileSystem.java:1064) ~[na:na]
    at org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1035) ~[na:na]
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596) ~[na:na]
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91) ~[na:na]
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630) ~[na:na]
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612) ~[na:na]
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370) ~[na:na]
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169) ~[na:na]
    at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor$1.run(AbstractHadoopProcessor.java:305) ~[na:na]
    at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor$1.run(AbstractHadoopProcessor.java:302) ~[na:na]
    at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_91]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_91]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656) ~[na:na]
    at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.getFileSystemAsUser(AbstractHadoopProcessor.java:302) ~[na:na]
    at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.resetHDFSResources(AbstractHadoopProcessor.java:274) ~[na:na]
    at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.abstractOnScheduled(AbstractHadoopProcessor.java:196) ~[na:na]
    at org.apache.nifi.processors.hadoop.PutHDFS.onScheduled(PutHDFS.java:177) ~[na:na]

Master Guru

This most likely means there is another JAR you need to add...

If you look at the pom file for the hadoop-azure JAR:

http://central.maven.org/maven2/org/apache/hadoop/hadoop-azure/2.7.0/hadoop-azure-2.7.0.pom

You can see all the dependencies it needs:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <scope>compile</scope>
</dependency>

<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-core</artifactId>
    <scope>compile</scope>
</dependency>

<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <scope>compile</scope>
</dependency>

<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-storage</artifactId>
    <scope>compile</scope>
</dependency>

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <scope>compile</scope>
</dependency>

My guess would be the azure-storage JAR is missing.

This becomes a slippery slope though, because then azure-storage might have transitive dependencies as well.
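
Continuing the pom sketch from earlier, the missing piece would be declared the same way. The version here is an assumption (hadoop-azure 2.7.0 is commonly paired with azure-storage 2.0.0; check the hadoop-azure pom linked above for the exact version it declares). If the jar goes into a custom NAR build rather than being copied by hand, Maven also resolves azure-storage's own compile-scope dependencies, which sidesteps the slippery slope.

<!-- hypothetical addition alongside hadoop-azure in the custom NAR's pom.xml -->
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-storage</artifactId>
    <version>2.0.0</version> <!-- assumed; use the version the hadoop-azure pom declares -->
</dependency>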


2016-05-24 23:53:52,326 WARN [StandardProcessScheduler Thread-1] org.apache.hadoop.util.NativeCodeLoader Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-05-24 23:53:52,498 INFO [Flow Service Tasks Thread-1] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@5c6cae55 // Another save pending = false
2016-05-24 23:53:52,767 ERROR [StandardProcessScheduler Thread-1] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=0a8eeb51-4937-4dfc-a7f4-9c2bce921d0c] PutHDFS[id=0a8eeb51-4937-4dfc-a7f4-9c2bce921d0c] failed to invoke @OnScheduled method due to java.lang.RuntimeException: Failed while executing one of processor's OnScheduled task.; processor will not be scheduled to run for 30000 milliseconds: java.lang.RuntimeException: Failed while executing one of processor's OnScheduled task.
2016-05-24 23:53:52,785 ERROR [StandardProcessScheduler Thread-1] o.apache.nifi.processors.hadoop.PutHDFS java.lang.RuntimeException: Failed while executing one of processor's OnScheduled task.
    at org.apache.nifi.controller.StandardProcessorNode.invokeTaskAsCancelableFuture(StandardProcessorNode.java:1405) ~[na:na]
    at org.apache.nifi.controller.StandardProcessorNode.access$100(StandardProcessorNode.java:89) ~[na:na]
    at org.apache.nifi.controller.StandardProcessorNode$1.run(StandardProcessorNode.java:1243) ~[na:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_91]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_91]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_91]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_91]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
    at java.util.concurrent.FutureTask.report(FutureTask.java:122) [na:1.8.0_91]
    at java.util.concurrent.FutureTask.get(FutureTask.java:206) [na:1.8.0_91]
    at org.apache.nifi.controller.StandardProcessorNode.invokeTaskAsCancelableFuture(StandardProcessorNode.java:1388) ~[na:na]
    ... 9 common frames omitted
Caused by: java.lang.reflect.InvocationTargetException: null
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_91]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_91]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_91]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_91]
    at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:137) ~[na:na]
    at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:125) ~[na:na]
    at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:70) ~[na:na]
    at org.apache.nifi.controller.StandardProcessorNode$1$1.call(StandardProcessorNode.java:1247) ~[na:na]
    at org.apache.nifi.controller.StandardProcessorNode$1$1.call(StandardProcessorNode.java:1243) ~[na:na]
    ... 6 common frames omitted
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FileSystem ...

New Contributor

I ran into this issue on a recent project. The dependencies have to be incorporated into the NAR file. I've created a version incorporating the dependencies and submitted a pull request on the associated Jira issue at https://issues.apache.org/jira/browse/NIFI-1922. I performed some basic testing on a mix of HDInsight clusters and it appears to work OK. Note, though, that you will need to install NiFi on the cluster itself (due to the HDInsight/blob store security model) and will need to install Java 8.


Hi Alex... do you have detailed instructions on how to build and deploy NiFi on the cluster?