Hello, I have a custom Sqoop plugin that I want to use in an Oozie Sqoop action. Whenever I run the workflow, I get the following error message in the Oozie job history log:
>>> Invoking Sqoop command line now >>>
1828 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2016-08-16 10:35:31,504 WARN [main] tool.SqoopTool (SqoopTool.java:loadPluginsFromConfDir(177)) - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code 
Oozie Launcher failed, finishing Hadoop job gracefully
I have tried running the same workflow but changing the name of the first argument (the tool name) to something random (i.e. running "sqoop asdfsdf --target-dir /tmp/"), and the error is the same. It therefore seems to me that Sqoop fails to load the .jar file that contains the plugin. As I was unsure where to place the plugin .jar file, I placed it in several directories:
1) HDFS: /user/oozie/share/lib/[library_folder]
2) NameNode: /usr/hdp/184.108.40.206-2557/oozie/share/lib/sqoop and /usr/hdp/220.127.116.11-2557/oozie/lib/
I also created the tools.d folder /usr/hdp/18.104.22.168-2557/oozie/share/conf/tools.d, containing an XML file with one line: "[package name]=[location of .jar file]". In the same logs I can see the .jar file appear in the list of files: Files in current dir:/grid/XXX/XXX/XXX/XXX
I have also looked in the YARN log files ("oozie:launcher"), which give me no further info. Do you have any advice on how to make this work?
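For context, the setup I am attempting looks roughly like this (a minimal sketch; the app name, tool name, and jar file name are placeholders, not my real values). My understanding is that jars placed in the workflow's lib/ directory on HDFS should be added to the launcher's classpath:

```xml
<!-- Hypothetical workflow.xml sketch; names and paths are placeholders -->
<workflow-app name="sqoop-plugin-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="sqoop-import"/>
  <action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- first token is the custom tool name registered by the plugin -->
      <command>mytool --target-dir /tmp/</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>Sqoop action failed</message></kill>
  <end name="end"/>
</workflow-app>
<!-- with my-sqoop-plugin.jar uploaded next to it, e.g. under <app-path>/lib/ -->
```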
Hi @Andread B, why do you want to run NiFi on the NameNode? If you are ingesting a lot of data, I would recommend running NiFi on a dedicated host, or at least on an edge node. Also, if you will ingest a lot of data with a single NiFi instance, you can use GenerateTableFetch (coming in NiFi 1.0) to divide your import into several chunks and distribute them across several NiFi nodes. This processor generates several FlowFiles based on the Partition Size property, where each FlowFile is a query that fetches one part of the data. You can try this by downloading the NiFi 1.0 beta: https://nifi.apache.org/download.html
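To illustrate the idea, here is a rough sketch of the partitioning scheme, not NiFi's actual implementation: the table name, ordering column, and the LIMIT/OFFSET query form are assumptions for illustration only. With a Partition Size of p over a table of n rows, you end up with one query per chunk of p rows:

```shell
# Hypothetical sketch of GenerateTableFetch-style partitioning (not NiFi code).
# Emits one range query per partition; each would become one FlowFile.
gen_queries() {
  local row_count=$1 partition_size=$2 offset
  for ((offset = 0; offset < row_count; offset += partition_size)); do
    echo "SELECT * FROM mytable ORDER BY id LIMIT $partition_size OFFSET $offset"
  done
}

# 100000 rows with Partition Size 25000 -> 4 queries, distributable across nodes
gen_queries 100000 25000
```

Each generated query can then be routed to a different NiFi node, so the fetch work is spread across the cluster instead of landing on one instance.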