
Cloudera Support, very strange issue: system users can't launch containers, but other users can.


Hi everybody, and Cloudera Support:

 

I have run into an issue that is very strange to me, and I am not sure whether there is a parameter that can resolve it. Please bear with me, since this will be a long story.

 

System users like hdfs, yarn, mapred, and hue cannot launch containers, but other users that I created manually can.

 

My environment is: CDH 5.2 (the latest version) + Kerberos + Sentry + OpenLDAP.

 

Yesterday, I was creating an Oozie workflow to import data from MySQL into Hive. The Sqoop job is:

 

sqoop import --connect jdbc:mysql://10.32.87.4:3306/xxxx --username admin --password xxxxxxxx --table t_phone --hive-table t_phone --hive-database xxxx --hive-import --hive-overwrite --hive-drop-import-delims -m 1

 

But the job failed. The errors are below:

 

INFO mapreduce.Job: Job job_1414579088733_0016 failed with state FAILED due to: Application application_1414579088733_0016 failed 2 times due to AM Container for appattempt_1414579088733_0016_000002 exited with exitCode: -1000 due to: Application application_1414579088733_0016 initialization failed (exitCode=139) with output:
.Failing this attempt.. Failing the application.

 

I have no doubt about the Sqoop job itself, since it works fine in our production environment (which runs CDH 5.1). Then I went to the OS level and ran the same Sqoop script as the hdfs user; it failed too, with the same error as above. Then I searched Google and found only a few issues like mine, but those had exit code 1 or something else, and the suggested solution was to set HADOOP_YARN_HOME or HADOOP_MAPRED_HOME. So I tried setting HADOOP_YARN_HOME and HADOOP_MAPRED_HOME and ran the job again; it failed as well.
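For reference, this is roughly what I exported before retrying (only a sketch; I am assuming the standard parcel-based CDH layout here, a package install would use /usr/lib/hadoop-yarn and /usr/lib/hadoop-mapreduce instead):

# assumed parcel-based CDH paths; adjust for a package install
export HADOOP_YARN_HOME=/opt/cloudera/parcels/CDH/lib/hadoop-yarn
export HADOOP_MAPRED_HOME=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce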

 

 

At that moment, I assumed it might be a file or directory permission issue (since I have encountered that kind of issue before).

Then I deleted /tmp, /user/history, /var/log, etc., restarted the whole cluster, and tried again. The mission failed too.
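To be concrete, the cleanup was roughly the following (a sketch; /tmp and /user/history here mean the HDFS directories, and I ran it as an HDFS superuser):

# remove the HDFS staging and job-history directories, bypassing the trash
hadoop fs -rm -r -skipTrash /tmp
hadoop fs -rm -r -skipTrash /user/history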

 

OK, at that point I had no idea any more, so I went home, cooked dinner, watched a movie, enjoyed some music, and had a good sleep.

 

This morning I did not try Sqoop any more, since I had lost confidence in it; instead, I tested the example MapReduce job.

 

The command is: hadoop jar hadoop-examples.jar pi 10 10

 

The mission failed, with the same errors. As you can see, the error includes a message like: exited with exitCode: -1000.

 

When I saw 1000, I remembered there is a setting in YARN (min.user.id in container-executor.cfg) that means, by default, a user whose user ID is below 1000 cannot launch containers; you either lower the 1000 to 0 or add the users with IDs below 1000 to the allowed list (allowed.system.users). So I went to check these settings, and everything looked OK.
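For reference, these settings belong to the LinuxContainerExecutor and live in container-executor.cfg on every NodeManager (managed through Cloudera Manager in my case). The values below only illustrate the typical defaults, not my exact file:

# container-executor.cfg (one per NodeManager)
yarn.nodemanager.linux-container-executor.group=yarn
min.user.id=1000                   # reject submissions from UIDs below this
allowed.system.users=nobody        # exceptions to the min.user.id check
banned.users=hdfs,yarn,mapred,bin  # users never allowed to launch containers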

 

Why? Why? Why? I asked myself many times, but found no answer. Still, I believe this 1000 has some connection to that -1000.

 

The test begins:

 

I created a user with my own name, with user ID 1500, and executed the Sqoop script: it succeeded. That made me believe my assumption even more strongly, since my own user imported the data via Sqoop successfully.

 

Then I created another user, test, with user ID 999, and the data import worked as well.
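For the record, the two test users were created roughly like this (a sketch; myuser is a placeholder for the user I created with my own name):

# a user with a UID above 1000
useradd -u 1500 myuser
# a user with a UID below 1000
useradd -u 999 test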

 

Then I tried the example MapReduce job: SUCCESSFUL...

 

So I went back to the hdfs user and tried Sqoop and MapReduce: failed. Then I tried yarn and hue: the mission failed again.

 

So I now think that none of the system users can launch containers, but all other users can, no matter what their user IDs are.

 

 

Later, I opened http://10.32.87.9:8088/cluster/nodes and http://10.32.87.49:8042/node/allContainers to monitor container activity. If the user is my own user, the container is launched and runs normally; but if the user is hdfs or another system user, no container is ever launched (no container ever appears in the RUNNING state).
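The same information can also be pulled from the YARN REST APIs instead of the web pages (the hosts and ports are the same ones as above):

# list the applications known to the ResourceManager
curl http://10.32.87.9:8088/ws/v1/cluster/apps
# list the containers currently on one NodeManager
curl http://10.32.87.49:8042/ws/v1/node/containers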

 

 

Just to show you an example, please look carefully at the highlighted words: this is the Linux user running the command.

 

[hdfs@datanode01 hadoop-0.20-mapreduce]$ hadoop jar hadoop-examples.jar pi 10 10

Number of Maps = 10
Samples per Map = 10

Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
14/10/29 22:23:25 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 65 for hdfs on ha-hdfs:cluster
14/10/29 22:23:25 INFO security.TokenCache: Got dt for hdfs://cluster; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:cluster, Ident: (HDFS_DELEGATION_TOKEN token 65 for hdfs)
14/10/29 22:23:25 INFO input.FileInputFormat: Total input paths to process : 10
14/10/29 22:23:25 INFO mapreduce.JobSubmitter: number of splits:10
14/10/29 22:23:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1414579088733_0010
14/10/29 22:23:25 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:cluster, Ident: (HDFS_DELEGATION_TOKEN token 65 for hdfs)
14/10/29 22:23:27 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1414579088733_0010 is still in NEW
14/10/29 22:23:29 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1414579088733_0010 is still in NEW
14/10/29 22:23:31 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1414579088733_0010 is still in NEW
14/10/29 22:23:33 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1414579088733_0010 is still in NEW
14/10/29 22:23:35 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1414579088733_0010 is still in NEW
14/10/29 22:23:36 INFO impl.YarnClientImpl: Submitted application application_1414579088733_0010
14/10/29 22:23:36 INFO mapreduce.Job: The url to track the job: http://namenode01.hadoop:8088/proxy/application_1414579088733_0010/
14/10/29 22:23:36 INFO mapreduce.Job: Running job: job_1414579088733_0010
14/10/29 22:24:00 INFO mapreduce.Job: Job job_1414579088733_0010 running in uber mode : false
14/10/29 22:24:00 INFO mapreduce.Job: map 0% reduce 0%
14/10/29 22:24:00 INFO mapreduce.Job: Job job_1414579088733_0010 failed with state FAILED due to: Application application_1414579088733_0010 failed 2 times due to AM Container for appattempt_1414579088733_0010_000002 exited with exitCode: -1000 due to: Application application_1414579088733_0010 initialization failed (exitCode=139) with output:
.Failing this attempt.. Failing the application.
14/10/29 22:24:00 INFO mapreduce.Job: Counters: 0
Job Finished in 35.389 seconds
java.io.FileNotFoundException: File does not exist: hdfs://cluster/user/hdfs/QuasiMonteCarlo_1414592602966_1277483233/out/reduce-out
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1083)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1075)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1075)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1749)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1773)
at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 

 

[test@datanode01 hadoop-0.20-mapreduce]$ hadoop jar hadoop-examples.jar pi 10 10
Number of Maps = 10
Samples per Map = 10
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
14/10/29 22:29:45 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 66 for test on ha-hdfs:cluster
14/10/29 22:29:45 INFO security.TokenCache: Got dt for hdfs://cluster; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:cluster, Ident: (HDFS_DELEGATION_TOKEN token 66 for test)
14/10/29 22:29:45 INFO input.FileInputFormat: Total input paths to process : 10
14/10/29 22:29:45 INFO mapreduce.JobSubmitter: number of splits:10
14/10/29 22:29:45 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1414579088733_0011
14/10/29 22:29:45 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:cluster, Ident: (HDFS_DELEGATION_TOKEN token 66 for test)
14/10/29 22:29:47 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1414579088733_0011 is still in NEW
14/10/29 22:29:49 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1414579088733_0011 is still in NEW
14/10/29 22:29:51 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1414579088733_0011 is still in NEW
14/10/29 22:29:53 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1414579088733_0011 is still in NEW
14/10/29 22:29:55 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1414579088733_0011 is still in NEW
14/10/29 22:29:56 INFO impl.YarnClientImpl: Submitted application application_1414579088733_0011
14/10/29 22:29:56 INFO mapreduce.Job: The url to track the job: http://namenode01.hadoop:8088/proxy/application_1414579088733_0011/
14/10/29 22:29:56 INFO mapreduce.Job: Running job: job_1414579088733_0011
14/10/29 22:30:40 INFO mapreduce.Job: Job job_1414579088733_0011 running in uber mode : false
14/10/29 22:30:40 INFO mapreduce.Job: map 0% reduce 0%
14/10/29 22:30:50 INFO mapreduce.Job: map 30% reduce 0%
14/10/29 22:31:09 INFO mapreduce.Job: map 50% reduce 0%
14/10/29 22:31:18 INFO mapreduce.Job: map 70% reduce 0%
14/10/29 22:31:21 INFO mapreduce.Job: map 100% reduce 0%
14/10/29 22:31:30 INFO mapreduce.Job: map 100% reduce 100%
14/10/29 22:31:30 INFO mapreduce.Job: Job job_1414579088733_0011 completed successfully
14/10/29 22:31:30 INFO mapreduce.Job: Counters: 50
File System Counters
FILE: Number of bytes read=92
FILE: Number of bytes written=1235676
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2570
HDFS: Number of bytes written=215
HDFS: Number of read operations=43
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Launched map tasks=10
Launched reduce tasks=1
Data-local map tasks=9
Rack-local map tasks=1
Total time spent by all maps in occupied slots (ms)=268763
Total time spent by all reduces in occupied slots (ms)=3396
Total time spent by all map tasks (ms)=268763
Total time spent by all reduce tasks (ms)=3396
Total vcore-seconds taken by all map tasks=268763
Total vcore-seconds taken by all reduce tasks=3396
Total megabyte-seconds taken by all map tasks=275213312
Total megabyte-seconds taken by all reduce tasks=3477504
Map-Reduce Framework
Map input records=10
Map output records=20
Map output bytes=180
Map output materialized bytes=339
Input split bytes=1390
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=339
Reduce input records=20
Reduce output records=0
Spilled Records=40
Shuffled Maps =10
Failed Shuffles=0
Merged Map outputs=10
GC time elapsed (ms)=423
CPU time spent (ms)=7420
Physical memory (bytes) snapshot=4415447040
Virtual memory (bytes) snapshot=16896184320
Total committed heap usage (bytes)=4080009216
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1180
File Output Format Counters
Bytes Written=97
Job Finished in 105.199 seconds
Estimated value of Pi is 3.20000000000000000000

 

 

 

 

Could you give me some suggestions on how to fix this issue? I have already opened a message; the link is:

 

http://community.cloudera.com/t5/Batch-Processing-and-Workflow/example-Mapreduce-FAILED-after-upgrad...

 

Please ignore that link; if anybody has an idea, please post your solution here. Thanks very much.

 

 
