How to find the YARN application ID for hadoop fs -copyFromLocal file1.dat /home/hadoop/file1.dat
Labels:
- Apache Hadoop
- Apache YARN
Created 08-23-2018 09:52 PM
I have a job which copies data from the local file system to HDFS:
1) hadoop fs -copyFromLocal file1.dat /home/hadoop/file1.dat
2) How do I find the YARN application ID for this copyFromLocal command?
Thanks.
Created 08-24-2018 06:57 AM
Hi @zkfs
There isn't one. For the above example, you will notice an entry for a non-MapReduce job in the NameNode log, similar to this example:
hadoop-hdfs-namenode.log:2018-08-24 06:44:41,819 INFO hdfs.StateChange (FSNamesystem.java:completeFile(3759)) - DIR* completeFile: /user/hadoop/file1.dat._COPYING_ is closed by DFSClient_NONMAPREDUCE_956954044_1
What happens is: the client calls the create() operation defined in the DistributedFileSystem class and then uses the DFSOutputStream class to write to an internal queue, called the 'data queue', which is consumed by the DataStreamer; the DataStreamer in turn allocates blocks for the data we want to write with the copyFromLocal command. There is no MapReduce/YARN job here, which you can tell from the NONMAPREDUCE entry in the NameNode log. For some other tools, such as DistCp, you would see MapReduce involved.
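
To make the write path concrete, here is a minimal Java sketch of roughly what the client side of copyFromLocal amounts to. It assumes fs.defaultFS in core-site.xml points at your HDFS; the class name and paths are illustrative only. The copy is served by the NameNode and DataNodes directly, so no YARN application is submitted.

    // Minimal sketch of what "hadoop fs -copyFromLocal" does on the client side.
    // Assumes fs.defaultFS in core-site.xml points at the target HDFS; class name and paths are illustrative.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CopyFromLocalSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);       // DistributedFileSystem for an hdfs:// default FS

            // create()/write happens inside copyFromLocalFile via DFSOutputStream;
            // blocks are streamed to the DataNodes directly -- no YARN application is submitted.
            fs.copyFromLocalFile(new Path("file1.dat"), new Path("/home/hadoop/file1.dat"));

            fs.close();
        }
    }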
Created 08-30-2018 01:42 AM
Entries will be updated in the logs; however, is there a command to check the application ID for a hadoop command? That is what I am looking for.
Example: for YARN we can check the list of running jobs with the YARN command # yarn application -list
Created 09-21-2018 03:16 AM
Updating late; after checking further, the information is as below.
1) hadoop fs -copyFromLocal file1.dat /home/hadoop/file1.dat
:- This is a local command on the Linux server. You can check the local process with # ps -ef | grep file1.dat | grep -i copyFromLocal, which gives you the process ID and again shows it is a local process.
2) How to find the YARN application ID for this copyFromLocal command
:- Because it is a local client command that uses local server resources, you will not find any MR/YARN job for it. Cluster resources are used while the data is copied, but only for the data copy itself.
Hence a "hadoop fs" command uses resources on the local Linux server, and on the Hadoop cluster only for the copy. Since the process is local, it does not create an MR/YARN job.
