Member since: 08-22-2018 | Posts: 10 | Kudos Received: 0 | Solutions: 0
09-05-2018
01:57 PM
I triggered an HDFS BDR job as the hdfs user. It failed with the error message below.

INFO distcp.DistCp: map 96% reduce 0% files 99% bytes 80% throughput 24.7 (MB/s) remaining time 25 mins 27 secs running mappers 1
INFO distcp.DistCp: map 96% reduce 32% files 99% bytes 80% throughput 24.7 (MB/s) remaining time 25 mins 27 secs running mappers 1
INFO ipc.Client: Retrying connect to server: FQDN/IP:PORT. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
INFO ipc.Client: Retrying connect to server: FQDN/IP:PORT. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
INFO ipc.Client: Retrying connect to server: FQDN/IP:PORT. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
WARN security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:PROXY) via <hdfs_principal_name> (auth:KERBEROS) cause:java.io.IOException: Job status not available
ERROR util.DistCpUtils: Exception encountered
java.io.IOException: Job status not available
    at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:334)
    at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:621)
    at com.cloudera.enterprise.distcp.DistCp.checkProgress(DistCp.java:471)
    at com.cloudera.enterprise.distcp.DistCp.execute(DistCp.java:461)
    at com.cloudera.enterprise.distcp.DistCp$1.run(DistCp.java:151)
    at com.cloudera.enterprise.distcp.DistCp$1.run(DistCp.java:148)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at com.cloudera.enterprise.distcp.DistCp.run(DistCp.java:148)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at com.cloudera.enterprise.distcp.DistCp.main(DistCp.java:843)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
18/09/05 16:32:09 INFO distcp.DistCp: Used diff: false
18/09/05 16:32:09 WARN distcp.DistCp: Killing submitted job job_1536157028747_0004
08-23-2018
03:43 PM
Since we are running BDR jobs as the hdfs user, I think '/user/history' has been set to 'hdfs:supergroup'. I changed it to mapred:hadoop and ran the job again, but it still fails with the same error. I see that a per-user directory, named after whoever runs a MapReduce job, gets created under '/user/history/done_intermediate' (here, '/user/history/done_intermediate/hdfs'). The job writes .jhist, .summary, and .xml files there with the following permissions and ownership:

-rwxrwx--- 1 hdfs supergroup

Can we change anything so that the files created in '/user/history/done_intermediate/hdfs' have sufficient read/write permissions?
08-23-2018
03:20 PM
Are you talking about the destination cluster? Do you know how to add the mapred user to supergroup? I don't see a supergroup entry at all in /etc/group, but I do see the mapred user in the hadoop group. I need to add the mapred user to supergroup because I am running BDR jobs as the hdfs user.
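To double-check the /etc/group observation, here is a rough Python sketch that lists the groups whose member field names a given user. The sample group entries and GIDs are made up for illustration; on a real host you would read the actual /etc/group. As far as I understand, with the default group mapping HDFS resolves a user's groups on the NameNode host, so that is the host where the check matters. Note this only looks at supplementary members; a user's primary group comes from /etc/passwd and is not covered here.

```python
def groups_for(user, group_file_text):
    """Return the names of groups whose member list includes `user`.

    Expects text in /etc/group format: name:password:gid:member1,member2
    """
    found = set()
    for line in group_file_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 4:
            continue  # skip malformed entries
        name, members = fields[0], fields[3]
        if user in [m for m in members.split(",") if m]:
            found.add(name)
    return found


# Hypothetical sample; replace with open("/etc/group").read() on a real host.
SAMPLE_GROUP_FILE = """\
hadoop:x:1001:hdfs,mapred,yarn
mapred:x:1002:
"""

mapred_groups = groups_for("mapred", SAMPLE_GROUP_FILE)
print(sorted(mapred_groups))
print("supergroup" in mapred_groups)
```

If supergroup is missing from the result, files owned by hdfs:supergroup with mode -rwxrwx--- would not be readable by mapred through the group bits.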
08-22-2018
06:57 PM
Hi,
The HDFS and Hive replication schedules are failing with the error below.
18/08/22 19:40:02 WARN security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:PROXY) via hdfs/<FQDN>@realm (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.YarnRuntimeException): org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Could not load history file hdfs://nameservice9:8020/user/history/done_intermediate/hdfs/job_1534964122122_0006-1534978377784-hdfs-HdfsReplication-1534981192465-20-1-SUCCEEDED-root.users.hdfs-1534978383044.jhist
    at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.getFullJob(CachedHistoryStorage.java:199)
    at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:218)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler$1.run(HistoryClientService.java:208)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler$1.run(HistoryClientService.java:204)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.verifyAndGetJob(HistoryClientService.java:204)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getJobReport(HistoryClientService.java:236)
    at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getJobReport(MRClientProtocolPBServiceImpl.java:122)
    at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:275)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2220)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2214)
Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Could not load history file hdfs://nameservice9:8020/user/history/done_intermediate/hdfs/job_1534964122122_0006-1534978377784-hdfs-HdfsReplication-1534981192465-20-1-SUCCEEDED-root.users.hdfs-1534978383044.jhist
    at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.loadFullHistoryData(CompletedJob.java:341)
    at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.<init>(CompletedJob.java:101)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo.loadJob(HistoryFileManager.java:479)
    at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.loadJob(CachedHistoryStorage.java:180)
    at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.access$000(CachedHistoryStorage.java:52)
    at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage$1.load(CachedHistoryStorage.java:103)
    at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage$1.load(CachedHistoryStorage.java:100)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969)
    at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829)
    at com.google.common.cache.LocalCache$LocalManualCache.getUnchecked(LocalCache.java:4834)
    at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.getFullJob(CachedHistoryStorage.java:193)
    ... 18 more
Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=mapred, access=READ, inode="/user/history/done_intermediate/hdfs/job_1534964122122_0006-1534978377784-hdfs-HdfsReplication-1534981192465-20-1-SUCCEEDED-root.users.hdfs-1534978383044.jhist":hdfs:supergroup:-rwxrwx---
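To make the last "Caused by" line easier to reason about, here is a small Python sketch that pulls the user, owner, group, and mode out of the "Permission denied" message and applies a simplified POSIX-style check (owner class, then group class, then other). It is only an approximation of what HDFS does: it ignores ACLs, the sticky bit, and superuser privileges, and the group sets passed in below are assumptions, not values read from the cluster.

```python
import re

# Matches the "Permission denied" message format seen in the trace above.
DENIED = re.compile(
    r'user=(?P<user>[^,]+), access=(?P<access>\w+), '
    r'inode="(?P<path>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):'
    r'(?P<mode>[-d][-rwx]{9})'
)


def parse_denied(message):
    """Extract user, access, path, owner, group and mode from the message."""
    match = DENIED.search(message)
    if match is None:
        raise ValueError("not a recognised Permission denied message")
    return match.groupdict()


def is_allowed(user, user_groups, owner, group, mode, access):
    """Simplified check: pick one class (owner, group, other) and test its bit."""
    offset = {"READ": 0, "WRITE": 1, "EXECUTE": 2}[access]
    perms = mode[1:]  # drop the leading file-type character
    if user == owner:
        triplet = perms[0:3]
    elif group in user_groups:
        triplet = perms[3:6]
    else:
        triplet = perms[6:9]
    return triplet[offset] != "-"


MESSAGE = (
    'Permission denied: user=mapred, access=READ, '
    'inode="/user/history/done_intermediate/hdfs/'
    'job_1534964122122_0006-1534978377784-hdfs-HdfsReplication-'
    '1534981192465-20-1-SUCCEEDED-root.users.hdfs-1534978383044.jhist"'
    ':hdfs:supergroup:-rwxrwx---'
)

info = parse_denied(MESSAGE)

# Assumption: mapred belongs only to the hadoop group.
print(is_allowed(info["user"], {"hadoop"}, info["owner"],
                 info["group"], info["mode"], info["access"]))  # False

# Assumption: mapred has also been added to supergroup.
print(is_allowed(info["user"], {"hadoop", "supergroup"}, info["owner"],
                 info["group"], info["mode"], info["access"]))  # True
```

Under these assumptions, mapred is neither the owner (hdfs) nor a member of supergroup, so it falls into the "other" class, which has no bits set in -rwxrwx---. That matches the READ denial in the trace.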
Labels: Apache Hive, Apache YARN, HDFS, Kerberos, MapReduce