Created 08-23-2018 09:52 PM
I have a job which copies data from the local file system to HDFS:
1) hadoop fs -copyFromLocal file1.dat /home/hadoop/file1.dat
2) How do I find the YARN application ID for this copyFromLocal command?
Thanks,
Created 08-24-2018 06:57 AM
Hi @zkfs
There isn't one. For the above example, you will notice an entry for a non-MapReduce job in the NameNode log, similar to this example:
hadoop-hdfs-namenode.log:2018-08-24 06:44:41,819 INFO hdfs.StateChange (FSNamesystem.java:completeFile(3759)) - DIR* completeFile: /user/hadoop/file1.dat._COPYING_ is closed by DFSClient_NONMAPREDUCE_956954044_1
What happens is: the client uses the create() operation defined in the DistributedFileSystem class, and then uses the DFSOutputStream class to write to an internal queue called the 'data queue'. That queue is consumed by the DataStreamer, which in turn asks the NameNode to allocate blocks for the data we want to write with the copyFromLocal command. There is no MapReduce/YARN job here, which you can see from the NONMAPREDUCE entry in the NameNode log. For some other tools, such as DistCp, you would see MapReduce involved.
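If you want to verify this on your own cluster, you can grep the NameNode log for the NONMAPREDUCE client that closed the file. The log path and file name below are assumptions; adjust them to wherever your NameNode writes its log:
#grep file1.dat /var/log/hadoop/hdfs/hadoop-hdfs-namenode*.log | grep DFSClient_NONMAPREDUCE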
Created 08-30-2018 01:42 AM
Entries will be updated in the logs; however, I am looking for a command to check the application ID for a Hadoop fs command, if such a command exists.
Example: for YARN we can check the list of running jobs by using the YARN command #yarn application -list
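For example, to list only the applications that are currently running (the -appStates filter is a standard option of the yarn application command):
#yarn application -list -appStates RUNNING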
Created 09-21-2018 03:16 AM
Updating late. After further checking, the information is as below.
1) hadoop fs -copyFromLocal file1.dat /home/hadoop/file1.dat
:- This is a local command on the Linux server. You can check its local process with #ps -ef | grep file1.dat | grep -i copyFromLocal, which will show you the process ID; again, it is a local process.
2) How to find the YARN application ID for this copyFromLocal command
:- Since it is a local command on the Linux server and uses the local server's resources, you will not be able to find an MR/YARN job for it. Resources are consumed while the data is copied, but only for the data copy itself.
Hence the "hadoop fs" command uses resources from the local Linux server, and from the Hadoop cluster only for the copy itself. Because the process is local only, it will not create an MR/YARN job.