12-21-2016 06:54 AM
I was able to resolve the issue. It occurs because of public-only network access from the client (edge node) to a multi-homed cluster environment (an Oracle Big Data Appliance in my case), and it is also related to the bug MAPREDUCE-6484. A patch is available for it, and in my case it was already included in CDH 5.7.1 (see the CDH 5.7.1 Release Notes). However, an additional YARN setting was needed to make it work:

1. Change the token service naming behavior via core-site.xml. Under CM > YARN > Configuration > Scope: YARN (Service-Wide) > Category: Advanced > "YARN Service Advanced Configuration Snippet (Safety Valve) for core-site.xml", add the property below:

<property>
  <name>hadoop.security.token.service.use_ip</name>
  <value>false</value>
</property>

2. Save the configuration change.

3. Deploy Client Configurations for YARN and restart the YARN services as needed.

The details on the above setting and the related discussion can be found in HADOOP-7510.
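To confirm the property actually reached the deployed client configuration on the edge node, a quick check along these lines should work (a sketch, not from the original post; the config path assumes a standard CM-managed layout):

# Read the effective value as the Hadoop client sees it
$ hdfs getconf -confKey hadoop.security.token.service.use_ip
# Or inspect the deployed safety-valve output directly
$ grep -A 1 'hadoop.security.token.service.use_ip' /etc/hadoop/conf/core-site.xml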
12-19-2016 07:50 AM
Edge nodes are managed by CM and have the following roles on them:
- HDFS Gateway
- HDFS NFS Gateway
- Hive Gateway
- Spark Gateway
- YARN (MR2 Included) Gateway
The latest client configurations have been deployed to them via CM, and the re-startable roles have been restarted.
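For reference, one way to confirm which deployed client configuration an edge node is actually using is the alternatives system that CM manages (a sketch; assumes the standard CM packaging on a RHEL-style host):

# Show which client configuration directory the hadoop commands resolve to
$ alternatives --display hadoop-conf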
12-19-2016 05:41 AM
That is correct: users with un-encrypted HDFS homes are not having this issue when running this test job from the edge node. As soon as a user gets their HDFS home folder encrypted (an HDFS encryption zone created for the respective /user/username folder), this issue starts affecting that user, and the standard test described will fail for them when run from the edge node. Also, as I mentioned in my initial post, this test is successful regardless of the user's HDFS home encryption status when executed from an actual cluster node (vs. the edge nodes).
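For anyone reproducing this, the encryption-zone status of a home folder can be confirmed along these lines (a sketch, not from the original post; sampleuser is a placeholder, and listing zones typically requires HDFS superuser credentials):

# List all encryption zones; the affected user's home should appear here
$ hdfs crypto -listZones
# Or inspect a specific file inside the suspected zone
$ hdfs crypto -getFileEncryptionInfo -path /user/sampleuser/hdfsdestfile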
12-19-2016 05:16 AM
The job I'm using for testing is the standard Hadoop Pi calculation job that ships with CDH and, as far as I know, doesn't require any additional settings. This job is used for cluster testing as per the CDH documentation - please see Step 17: Verify that Kerberos Security is Working for details. Here is an excerpt (I'm using a parcel-based CDH setup):

To verify that Kerberos security is working:
Acquire Kerberos credentials for your user account.
$ kinit USERNAME@YOUR-LOCAL-REALM.COM
Enter a password when prompted.
Submit a sample pi calculation as a test MapReduce job. .... If you have a parcel-based setup, use the following command instead:
$ hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-examples.jar pi 10 10000
Number of Maps = 10
Samples per Map = 10000
...
Job Finished in 30.958 seconds
Estimated value of Pi is 3.14120000000000000000
12-18-2016 09:32 PM
I'm having issues validating my Kerberos + HDFS TDE setup on CDH 5.7.1. I'm running a sample MapReduce job (Pi calculation) that fails with a Kerberos ticket error (claiming the user has no valid credentials) when initiated from the edge node only, after encrypting (HDFS Transparent Data Encryption) the user's home folder (/user/sampleuser) in HDFS on the kerberized cluster. The user executing the job is authenticated (via kinit) as a cluster-local MIT KDC principal (sampleuser@LOCALMITKDCREALM.COMPANY.COM).

The same user, with the same Kerberos ticket, can successfully connect to Hive (via beeline) and run HiveQL commands, as well as run hdfs dfs -ls /user/sampleuser, hdfs dfs -put tmp/localfile /user/sampleuser/hdfsdestfile, or hdfs dfs -get /user/sampleuser/hdfsdestfile tmp/localfile, from the same edge node where the sample Pi MR job mentioned above fails (which suggests that Kerberos and HDFS TDE must be working for HDFS and Hive). Also, the same user, authenticated (via kinit) as a cluster-local MIT KDC principal, can successfully run the same sample MapReduce job (Pi calculation) from any other node of the cluster. Users that do not have their HDFS home folder encrypted with HDFS TDE have no issues executing the sample MR Pi job from either the edge node or any other node on the cluster.

My environment is as follows:
- CDH 5.7.1, secured with a local MIT KDC with a one-way trust to the Active Directory KDC.
- Edge node: managed by Cloudera Manager. The latest cluster CDH client and Kerberos configs have been deployed to it. The JCE Unlimited Strength Jurisdiction Policy Files have been deployed to the edge node, and its JDK 1.8.0_92 matches the rest of the cluster.
- HDFS Transparent Data Encryption is enabled using an on-cluster Cloudera Navigator Key Trustee Server / Key Trustee KMS server. The user executing the job has full access to the encryption keys for his HDFS home folder encryption zone (no custom KMS ACLs have been deployed).

The error suggests that no valid Kerberos credentials have been provided, which is not true. Please advise if you have encountered this or have any thoughts. At this point I'm ready to give up on the Kerberos + encrypted (TDE) HDFS user home folder configuration as non-working when the edge node is the source of MR job submission.
$ kinit sampleuser@LOCALMITKDCREALM.COMPANY.COM
Password for sampleuser@LOCALMITKDCREALM.COMPANY.COM:
$ klist
Ticket cache: FILE:/tmp/krb5cc_10048
Default principal: sampleuser@LOCALMITKDCREALM.COMPANY.COM
Valid starting     Expires            Service principal
12/18/16 22:06:29  12/19/16 22:06:29  krbtgt/LOCALMITKDCREALM.COMPANY.COM@LOCALMITKDCREALM.COMPANY.COM
        renew until 12/25/16 22:06:29
$ hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-examples.jar pi 10 10000
Number of Maps  = 10
Samples per Map = 10000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
16/12/18 22:09:59 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 294 for sampleuser on ha-hdfs:CDHCLUSTER-ns
16/12/18 22:09:59 INFO security.TokenCache: Got dt for hdfs://CDHCLUSTER-ns; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:CDHCLUSTER-ns, Ident: (HDFS_DELEGATION_TOKEN token 294 for sampleuser)
16/12/18 22:09:59 WARN token.Token: Cannot find class for token kind kms-dt
16/12/18 22:09:59 INFO security.TokenCache: Got dt for hdfs://CDHCLUSTER-ns; Kind: kms-dt, Service: xx.xxx.xxx.16:16000, Ident: 00 03 62 64 64 04 79 61 72 6e 00 8a 01 59 15 45 98 c3 8a 01 59 39 52 1c c3 8e 01 12 7c
16/12/18 22:09:59 WARN token.Token: Cannot find class for token kind kms-dt
16/12/18 22:09:59 INFO security.TokenCache: Got dt for hdfs://CDHCLUSTER-ns; Kind: kms-dt, Service: xx.xxx.xxx.15:16000, Ident: 00 03 62 64 64 04 79 61 72 6e 00 8a 01 59 15 45 99 4d 8a 01 59 39 52 1d 4d 8e 01 13 7a
16/12/18 22:10:00 INFO input.FileInputFormat: Total input paths to process : 10
16/12/18 22:10:00 INFO mapreduce.JobSubmitter: number of splits:10
16/12/18 22:10:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1481262207801_0093
16/12/18 22:10:00 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:CDHCLUSTER-ns, Ident: (HDFS_DELEGATION_TOKEN token 294 for sampleuser)
16/12/18 22:10:00 WARN token.Token: Cannot find class for token kind kms-dt
16/12/18 22:10:00 WARN token.Token: Cannot find class for token kind kms-dt
Kind: kms-dt, Service: xx.xxx.xxx.15:16000, Ident: 00 03 62 64 64 04 79 61 72 6e 00 8a 01 59 15 45 99 4d 8a 01 59 39 52 1d 4d 8e 01 13 7a
16/12/18 22:10:00 WARN token.Token: Cannot find class for token kind kms-dt
16/12/18 22:10:00 WARN token.Token: Cannot find class for token kind kms-dt
Kind: kms-dt, Service: xx.xxx.xxx.16:16000, Ident: 00 03 62 64 64 04 79 61 72 6e 00 8a 01 59 15 45 98 c3 8a 01 59 39 52 1c c3 8e 01 12 7c
16/12/18 22:10:00 INFO impl.YarnClientImpl: Submitted application application_1481262207801_0093
16/12/18 22:10:00 INFO mapreduce.Job: The url to track the job: http://CDHCLUSTERn03.COMPANY.COM:8088/proxy/application_1481262207801_0093/
16/12/18 22:10:00 INFO mapreduce.Job: Running job: job_1481262207801_0093
16/12/18 22:10:05 INFO mapreduce.Job: Job job_1481262207801_0093 running in uber mode : false
16/12/18 22:10:05 INFO mapreduce.Job:  map 0% reduce 0%
16/12/18 22:10:05 INFO mapreduce.Job: Job job_1481262207801_0093 failed with state FAILED due to: Application application_1481262207801_0093 failed 2 times due to AM Container for appattempt_1481262207801_0093_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://CDHCLUSTERn03.COMPANY.COM:8088/proxy/application_1481262207801_0093/ Then, click on links to logs of each attempt.
Diagnostics: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:491)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:771)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:185)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:181)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:181)
    at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
    at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1420)
    at org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1490)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:311)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:305)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:305)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:778)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:367)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:265)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:306)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:196)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:127)
    at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.openConnection(DelegationTokenAuthenticatedURL.java:322)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:485)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:480)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:480)
    ... 29 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:285)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:261)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:261)
    ... 39 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:306)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:196)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:127)
    at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.openConnection(DelegationTokenAuthenticatedURL.java:322)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:485)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:480)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:480)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:771)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:185)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:181)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:181)
    at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
    at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1420)
    at org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1490)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:311)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:305)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:305)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:778)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:367)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:265)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:285)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:261)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:261)
    ... 39 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:285)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:261)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:261)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:196)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:127)
    at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.openConnection(DelegationTokenAuthenticatedURL.java:322)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:485)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:480)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:480)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:771)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:185)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:181)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:181)
    at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
    at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1420)
    at org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1490)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:311)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:305)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:305)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:778)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:367)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:265)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Failing this attempt. Failing the application.
16/12/18 22:10:05 INFO mapreduce.Job: Counters: 0
Job Finished in 6.231 seconds
java.io.FileNotFoundException: File does not exist: hdfs://CDHCLUSTER-ns/user/sampleuser/QuasiMonteCarlo_1482120585946_2008146201/out/reduce-out
    at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1219)
    at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1817)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1841)
    at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
    at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
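For anyone debugging the same symptom: the failing step is the SPNEGO handshake from the container-localizer to the KMS during key decryption. A minimal way to exercise that same path directly from the edge node is to hit the KMS REST API with a fresh Kerberos ticket (a sketch, not from the original post; the host/port placeholder mirrors the xx.xxx.xxx.16:16000 service shown in the log, and listing key names may itself be restricted by KMS ACLs):

# After kinit, attempt a SPNEGO-authenticated call to the KMS REST endpoint
$ curl --negotiate -u : "http://xx.xxx.xxx.16:16000/kms/v1/keys/names"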
10-10-2016 06:43 PM
I'm using Solr in SolrCloud mode (2 nodes out of 18 total) on a CDH 5.7 Enterprise cluster for some testing. I also have Hue 3.9 configured to enable Cloudera Search to use this SolrCloud on the same cluster. When using the Hue web GUI Indexer to index some test *.CSV files (200K rows or so), the resulting Solr collection/index is created with a single shard only, allocated to a random Solr node in the SolrCloud cluster. There is no option to specify the number of shards to use for a new index/collection at creation time in the Hue Search app. Is there a way to specify/force default sharding (say, to match the number of Solr nodes in the current SolrCloud configuration) for indexes/collections created via the Hue Search web application?
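For reference, outside of Hue the shard count can be specified at collection-creation time with solrctl (a sketch, not an answer from this thread; the collection and config names below are placeholders):

# Generate and upload an instance directory, then create a 2-shard collection
$ solrctl instancedir --generate $HOME/testcsv_config
$ solrctl instancedir --create testcsv_config $HOME/testcsv_config
$ solrctl collection --create testcsv -s 2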