Member since: 12-21-2015
Posts: 43
Kudos Received: 10
Solutions: 3
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
| | 3368 | 12-08-2016 12:33 AM |
| | 6462 | 01-29-2016 08:44 PM |
| | 4283 | 01-28-2016 10:48 PM |
01-29-2016 08:44 PM
1 Kudo
I figured it out by myself. Here are the steps:

1. Download the sandbox or use your existing sandbox (HDP 2.3.2).
2. Create a workflow in Hue's Oozie editor.
3. Click "Edit Properties" and add a property to the Oozie parameters: oozie.action.sharelib.for.spark = spark,hcatalog,hive
4. Click the Save button.
5. Add a shell action and fill in the Name field. The Shell command field may be required; enter any temporary string for now and save the shell action. We will come back and edit it later.
6. Close the workflow and open the File Browser; click oozie, then workspaces. Identify the _hue_xxx directory for the workflow you are creating.
7. Create a lib directory there.
8. Copy in the jar file that contains your Spark Java program.
9. Move up one directory and copy in a shell file (e.g. script.sh) that contains: spark-submit --class JDBCTest spark-test-1.0.jar (spark-test-1.0.jar is the jar you uploaded to the lib directory).
10. Go back to the workflow web page.
11. Open the shell action and set the Shell command by selecting the shell file (e.g. script.sh).
12. Also populate the Files field to add the shell file (e.g. script.sh) again.
13. Click Done.
14. Save the workflow.
15. Submit the workflow.
16. It should run.

My Java program does something like this (a fuller sketch follows below):

Statement stmt = con.createStatement();
String sql = "SELECT s07.description AS job_category, s07.salary, s08.salary, (s08.salary - s07.salary) AS salary_difference FROM sample_07 s07 JOIN sample_08 s08 ON (s07.code = s08.code) WHERE s07.salary < s08.salary SORT BY s08.salary - s07.salary DESC LIMIT 5";
ResultSet res = stmt.executeQuery(sql);

It uses the Hive JDBC driver.
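For anyone trying to reproduce this, here is a minimal, self-contained sketch of what such a JDBCTest class could look like. The HiveServer2 URL, port, and credentials are assumptions for the sandbox, not details from the post:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Minimal sketch of the driver class submitted via spark-submit above.
// The HiveServer2 URL, user, and password are assumptions for the
// HDP 2.3.2 sandbox, not details taken from the original post.
public class JDBCTest {
    public static void main(String[] args) throws Exception {
        // The Hive JDBC driver is provided at runtime by the hive sharelib
        // configured in step 3.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        try (Connection con = DriverManager.getConnection(
                "jdbc:hive2://sandbox.hortonworks.com:10000/default", "hive", "");
             Statement stmt = con.createStatement()) {

            String sql = "SELECT s07.description AS job_category, s07.salary, "
                    + "s08.salary, (s08.salary - s07.salary) AS salary_difference "
                    + "FROM sample_07 s07 JOIN sample_08 s08 ON (s07.code = s08.code) "
                    + "WHERE s07.salary < s08.salary "
                    + "SORT BY s08.salary - s07.salary DESC LIMIT 5";

            try (ResultSet res = stmt.executeQuery(sql)) {
                while (res.next()) {
                    System.out.println(res.getString("job_category")
                            + "\t" + res.getInt("salary_difference"));
                }
            }
        }
    }
}
```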
01-28-2016 10:54 PM
Thank you, Artem.
01-28-2016 10:48 PM
1 Kudo
This is my summary:

1. Download the HDP sandbox.
2. Create a workflow that does list-databases and run it from Hue. We expect it to fail.
3. To see the error log, open localhost:19888/jobhistory/?user.name=hue, click the job Id link, then click the "1" link in the Maps row at the bottom of the table. On the new page, click the logs link. You should see: java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.OracleDriver
4. Open an SSH shell as root and copy the Oracle JDBC jar from your local file system to /home/oozie.
5. Run su - oozie
6. Run hdfs dfs -put ojdbc7-12.1.0.1.0.jar /user/oozie/share/lib/lib_20151027124452/sqoop
7. Check from Ambari that the Oracle JDBC jar was copied: open the HDFS Files viewer (the second icon from the right on the Ambari page) and navigate to /user/oozie/share/lib/lib_20151027124452/sqoop.
8. Restart the sandbox.
9. Run the workflow created at step 2.
10. It should work.
11. Create another workflow that does a Hive import and run it from Hue.
12. You will see the warning message: 2016-01-28 21:10:23,398 WARN [main] tool.SqoopTool (SqoopTool.java:loadPluginsFromConfDir(177)) - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration. Intercepting System.exit(1)
13. Two things are needed (a programmatic sketch of these job properties follows after this post):
(1) Add oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop. To do this from the Hue Oozie workflow page, click the "Edit properties" link and add this property to the Oozie parameters. oozie.use.system.libpath is already set to true by default, so just add the sharelib property.
(2) Copy hive-site.xml to the lib directory. To do this from Hue, click File Browser, then click oozie, which takes you to /user/hue/oozie. Click workspaces, click the _hue_xxx directory for the current workflow, create a lib directory there, and copy in a hive-site.xml file containing something like the following. (You don't need to change the JDBC connection string; it looks like only the hive.metastore.uris value is needed, so you may be able to delete the first property, though I have not tried that yet.)

<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://sandbox.hortonworks.com:9083</value>
</property>
</configuration>

Once this file is created, go back to the Oozie workflow editor, click the Add Path button, and select hive-site.xml. Save it.

14. Run the workflow. It should run.
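As an aside, the same two job properties from step 13 can also be set when submitting the workflow through Oozie's Java client API rather than the Hue editor. A minimal sketch, assuming the sandbox's default Oozie endpoint and a placeholder application path (the _hue_xxx directory stands in for your actual workspace):

```java
import java.util.Properties;

import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.WorkflowJob;

// Sketch: submit the Sqoop workflow with the sharelib properties set
// programmatically. The Oozie URL and application path are assumptions.
public class SubmitSqoopWorkflow {
    public static void main(String[] args) throws Exception {
        OozieClient oozie = new OozieClient("http://sandbox.hortonworks.com:11000/oozie");

        Properties conf = oozie.createConfiguration();
        conf.setProperty(OozieClient.APP_PATH,
                "hdfs://sandbox.hortonworks.com:8020/user/hue/oozie/workspaces/_hue_xxx");
        // The same two properties as step 13 (1) above.
        conf.setProperty("oozie.use.system.libpath", "true");
        conf.setProperty("oozie.action.sharelib.for.sqoop", "hive,hcatalog,sqoop");

        String jobId = oozie.run(conf);
        System.out.println("Submitted workflow: " + jobId);

        // Poll until the job leaves the RUNNING state.
        while (oozie.getJobInfo(jobId).getStatus() == WorkflowJob.Status.RUNNING) {
            Thread.sleep(5000);
        }
        System.out.println("Final status: " + oozie.getJobInfo(jobId).getStatus());
    }
}
```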
01-28-2016 07:51 PM
1 Kudo
Since I could see the detailed error message at http://localhost:19888/jobhistory/?user.name=hue, I was able to identify what was wrong. It was due to a temp dir for A_BASE; after I deleted it, the import worked. But since I touched so many things along the way, I will redo it from scratch and post my summary here. Thanks, Artem, for working through this with me.
01-28-2016 07:07 PM
Since a new directory is created every time we submit the workflow, that approach does not work. I ran the sqoop import with sudo -u yarn. It no longer returns code 1, which is good, but neither the HDFS file nor the Hive table is created. Any idea?
01-28-2016 06:53 PM
I ran:

sudo -u hdfs hdfs dfs -mkdir /user/yarn
sudo -u hdfs hdfs dfs -chown yarn:hdfs /user/yarn

I got:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=EXECUTE, inode="/user/yarn/.staging/job_1454000721165_0009/libjars/commons-compress-1.4.1.jar":yarn:hdfs:drwx------

How should we proceed from here?
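To make the permission problem easier to see, here is a small diagnostic sketch using the Hadoop FileSystem API; the path comes from the error above, and running it with sufficient HDFS privileges is an assumption:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Diagnostic sketch: print the ownership and permissions that the
// "Permission denied" message complains about. The path is taken from
// the log above; the caller's privileges are an assumption.
public class StagingDirCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // fs.defaultFS is picked up from core-site.xml on the classpath.
        try (FileSystem fs = FileSystem.get(conf)) {
            FileStatus st = fs.getFileStatus(new Path("/user/yarn/.staging"));
            // drwx------ owned by yarn:hdfs means only the yarn user can
            // traverse it, which matches the error for user=root.
            System.out.printf("%s %s:%s %s%n",
                    st.getPermission(), st.getOwner(), st.getGroup(), st.getPath());
        }
    }
}
```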
01-28-2016 06:50 PM
When I ran sqoop import --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11 --table A_BASE --verbose --target-dir /user/root/test1 -m 1, I got an interesting log:

at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode="/user/yarn/.staging":hdfs:hdfs:drwxr-xr-x
01-28-2016 05:57 PM
Additional information: I added the above working sqoop command to script.sh. I can see the script runs, but it fails at the sqoop import.
01-28-2016 05:47 PM
I got the following error (I used your sqoopshell example):

Stdoutput I AM: yarn
Stdoutput Running On: sandbox.hortonworks.com
Stdoutput CWD: /hadoop/yarn/local/usercache/root/appcache/application_1454000721165_0004/container_e21_1454000721165_0004_01_000002
Stdoutput Can I see these clients? Hive = /usr/bin/hive, Sqoop = /usr/bin/sqoop
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
01-27-2016 09:59 PM
Where should we set oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop? I'm using Hue's Oozie web interface. There is an "edit properties" link where I can set both Oozie parameters and Hadoop job properties. Since I don't know which one is correct, I set both, but the result is "failed". Could you tell me where it should be set? Also, the link above does not lead to an available web page.
... View more