Created 08-20-2017 07:17 AM
I have created a simple Java + Spark project that reads a JavaRDD and performs a calculation on it. I have set up the HDP Oracle VM on my machine. My question is where to upload the .jar (e.g. the Ambari Files View) and how to execute it.
Created 08-22-2017 05:25 AM
There seems to be some confusion here: is F: a Linux directory? The drive letter looks like a Windows path. If so, first copy the jar with WinSCP or FileZilla to /tmp on the sandbox or Linux box.
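WinSCP and FileZilla are graphical tools. If an scp client is available on the Windows side (for example through Git Bash), the same copy can be sketched on the command line. The host, port 2222, and root login below are assumptions about a default sandbox setup, not details from this thread, and upload_to_tmp is a made-up helper name. The /f/... form is how Git Bash usually exposes the F: drive.

```shell
# Assumptions (not from the thread): the sandbox accepts SSH on 127.0.0.1:2222
# as root. Override the variables if your VM is set up differently.
SANDBOX_HOST="${SANDBOX_HOST:-127.0.0.1}"
SANDBOX_PORT="${SANDBOX_PORT:-2222}"
SANDBOX_USER="${SANDBOX_USER:-root}"

# Copy one local file into /tmp on the sandbox (hypothetical helper name).
upload_to_tmp() {
  scp -P "$SANDBOX_PORT" "$1" "$SANDBOX_USER@$SANDBOX_HOST:/tmp/"
}

# Example call (uncomment to run):
# upload_to_tmp /f/dev/jars/sparktest.jar
```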
As root, switch to the hdfs user:
# su - hdfs
Make sure the permissions on /user/maria_dev are correct by checking its owner:
$ hdfs dfs -ls -d /user/maria_dev
The output should show the directory owned by the user who will run the Spark job, e.g. maria_dev:
drwxr-xr-x - maria_dev hdfs 0 2017-08-08 23:55 /user/maria_dev
Now copy the file to the HDFS directory as the hdfs user (the option name is case-sensitive):
$ hdfs dfs -copyFromLocal /tmp/sparktest.jar /user/maria_dev
The file should now be available in HDFS; you can list it:
$ hdfs dfs -ls /user/maria_dev
Now you can run your Spark job and follow its progress in the YARN UI; choose RUNNING in the left pane:
http://ambari_host:8088/cluster
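The same check can also be made from the command line. This is a minimal sketch, assuming the yarn client is on the PATH of the sandbox shell; the helper name app_ids is made up for illustration.

```shell
# Keep only the application IDs from `yarn application -list` output.
# app_ids is a hypothetical helper name; it reads the listing on stdin.
app_ids() {
  awk '$1 ~ /^application_/ { print $1 }'
}

# List the applications currently in the RUNNING state, if yarn is installed.
if command -v yarn >/dev/null 2>&1; then
  yarn application -list -appStates RUNNING | app_ids
fi
```

Each ID can then be passed to `yarn application -status <id>`, or to `yarn logs -applicationId <id>` after the job has finished, provided log aggregation is enabled.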
Hope that helps
Created 08-20-2017 09:41 AM
The jar can stay on the local filesystem, or you can upload it to HDFS. To upload it, you may first need to create your home directory under /user.
As root, switch to the hdfs user:
# su - hdfs
Check the existing directories:
$ hdfs dfs -ls /
Make a home directory for your user (toto)
$ hdfs dfs -mkdir /user/toto
Change ownership
$ hdfs dfs -chown toto:hdfs /user/toto
Copy your jar to HDFS, assuming it is in your local home directory at /home/toto/test.jar.
As the hdfs user, from the directory that holds the jar:
$ hdfs dfs -copyFromLocal test.jar /user/toto
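The steps above can be collected into one small script. This is only a sketch, assuming it runs as the hdfs superuser on a node that has the hdfs client; stage_jar is a made-up helper name, while the toto user and test.jar come from the example above.

```shell
# Sketch: create a user's HDFS home directory if needed, hand it to that
# user, and copy a local jar into it. stage_jar is a hypothetical helper name.
stage_jar() {
  user="$1"        # HDFS user who will own the directory, e.g. toto
  local_jar="$2"   # path to the jar on the local filesystem
  home="/user/$user"

  # -test -d returns 0 only when the path exists and is a directory.
  if ! hdfs dfs -test -d "$home"; then
    hdfs dfs -mkdir -p "$home" || return 1
  fi
  hdfs dfs -chown "$user:hdfs" "$home" || return 1

  # -f overwrites an older copy of the jar if one is already there.
  hdfs dfs -copyFromLocal -f "$local_jar" "$home/" || return 1
  hdfs dfs -ls "$home"
}

# Example call, run as the hdfs user (uncomment to execute):
# stage_jar toto /home/toto/test.jar
```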
You can now run the jar from HDFS, passing the HDFS paths of the input and output directories as arguments.
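A submission along those lines might look like the sketch below. It assumes spark-submit is on the PATH and that YARN is the cluster manager. The main class com.example.SparkTest and the input and output paths are placeholders, not names from this thread, and the run helper is made up so the command can be inspected before anything is submitted.

```shell
APP_JAR="hdfs:///user/toto/test.jar"
MAIN_CLASS="com.example.SparkTest"       # placeholder main class
INPUT_DIR="hdfs:///user/toto/input"      # placeholder input path
OUTPUT_DIR="hdfs:///user/toto/output"    # placeholder output path

# Print the command by default; set DRY_RUN=0 to really submit the job.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run spark-submit \
  --class "$MAIN_CLASS" \
  --master yarn \
  --deploy-mode cluster \
  "$APP_JAR" "$INPUT_DIR" "$OUTPUT_DIR"
```

Cluster deploy mode is chosen here so that the driver starts inside YARN rather than on the machine that runs spark-submit.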
Hope that helps
Created 08-22-2017 03:21 AM
Thanks for the quick response.
I have the jar in my local directory at F:/dev/jars/sparktest.jar and I would like to move it to the maria_dev directory in HDFS.
$hdfs dfs F:/dev/jars sparktest.jar /user/maria_dev
I tried the above command, but it is not working.
Created 08-22-2017 04:04 AM
If you want to copy the file "F:/dev/jars/sparktest.jar" from your local file system to the HDFS location "/user/maria_dev", you will need to do the following:
1. Switch to the "hdfs" user (or log in to the shell as "maria_dev", or as any other user who has write access to the "/user/maria_dev" directory):
# su - hdfs
2. Now use the "put" option of the dfs command to place the file, as follows:
# hdfs dfs -put F:/dev/jars/sparktest.jar /user/maria_dev
3. Now you should be able to list the file:
# hdfs dfs -ls /user/maria_dev
Created 08-22-2017 10:34 PM
Any feedback?
Created 08-23-2017 03:33 AM
Perfect! It worked for me. I used WinSCP to upload the jar to the /tmp folder, then used the -copyFromLocal command to move the jar from /tmp to the maria_dev user directory, and I was able to execute the jar.
Thanks a lot for the quick response.