Contributor
Posts: 32
Registered: ‎03-17-2017

Unable to access HDFS after enabling kerberos using Java

I tried several ways to access HDFS on our Kerberos-secured CDH 5.10 cluster, but to no avail. Below is the simple Java code that I tried to run from Eclipse on Windows:

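import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class AccessHdfs {

    public static void main(final String[] args) throws IOException {
        final Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "www..../");
        conf.set("hadoop.security.authentication", "kerberos");
        final FileSystem fs = FileSystem.get(conf);
        // Recursively list every file under the directory.
        final RemoteIterator<LocatedFileStatus> files = fs.listFiles(new Path("/hdfs/data-lake/prod/cvprod/csv"), true);
        while (files.hasNext()) {
            final LocatedFileStatus fileStatus = files.next();
            // do stuff with the file like ...
            System.out.println(fileStatus.getPath());
        }
        // Write a small test file; try-with-resources closes the stream.
        final byte[] contents = createContents();
        final String pathName = "/hdfs/data-lake/test/myfile.txt";
        try (FSDataOutputStream output = fs.create(new Path(pathName))) {
            output.write(contents);
            output.flush();
        }
    }

    static byte[] createContents() {
        final String contents = "This is a test of creating a file on hdfs";
        return contents.getBytes(StandardCharsets.UTF_8);
    }
}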

I ran the program with the following VM flags:

-Djava.security.auth.login.config=c:/iapima/jaas.conf -Djava.security.krb5.conf=c:/iapima/krb5.conf
-Djavax.security.auth.useSubjectCredsOnly=false
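One quick sanity check before digging further is to confirm that the two file-valued flags actually reach the JVM and point at readable files. The sketch below uses only the JDK; the class name KerberosFlagCheck and its method are hypothetical helpers made up for illustration, not part of Hadoop.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Hadoop or the JDK.
final class KerberosFlagCheck {

    // System properties set by the file-valued -D flags shown above.
    static final String[] FILE_PROPERTIES = {
        "java.security.auth.login.config",
        "java.security.krb5.conf"
    };

    /** Returns one message per property that is unset or not a readable regular file. */
    static List<String> findProblems() {
        List<String> problems = new ArrayList<>();
        for (String name : FILE_PROPERTIES) {
            String value = System.getProperty(name);
            if (value == null || value.isEmpty()) {
                problems.add(name + " is not set");
                continue;
            }
            Path file = Paths.get(value);
            if (!Files.isRegularFile(file) || !Files.isReadable(file)) {
                problems.add(name + " does not point at a readable file: " + value);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> problems = findProblems();
        if (problems.isEmpty()) {
            System.out.println("Both Kerberos config files are readable.");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

Running it with the same -D flags as the real program shows whether the JVM sees the paths you expect. It says nothing about whether the contents of those files are correct.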

 

I keep getting the following error:

Exception in thread "main" org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]

 

Any help or pointers would be appreciated.

Posts: 642
Topics: 3
Kudos: 121
Solutions: 67
Registered: ‎08-16-2016

Re: Unable to access HDFS after enabling kerberos using Java

I would test with the hdfs command first to ensure that HDFS with Kerberos is working.

On a node with the HDFS Gateway installed:

kinit
<enter password>
hdfs dfs -ls /

Can you share your jaas.conf file?

For the Java program, I believe there are a few more config settings that tell a client to use Kerberos, but I don't recall them off the top of my head. I would try loading the cluster's core-site.xml and hdfs-site.xml files into the Configuration object.
Contributor
Posts: 32
Registered: ‎03-17-2017

Re: Unable to access HDFS after enabling kerberos using Java

I added the following two statements:

    conf.addResource("/etc/hadoop/conf.cloudera.hdfs/core-site.xml");
    conf.addResource("/etc/hadoop/conf.cloudera.hdfs/hdfs-site.xml");
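One thing that may be worth checking here: according to the Hadoop Javadoc, Configuration.addResource(String) looks the name up on the classpath, while the addResource(Path) overload reads the local filesystem directly. With plain string paths like the ones above, the site files might therefore never be loaded. To see what a site file itself says, independent of how Hadoop loads it, a JDK-only sketch along these lines can read it directly. The class name SiteXmlReader is a hypothetical helper, and the code assumes the usual configuration/property/name/value layout of Hadoop site files.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; it does not use any Hadoop classes.
final class SiteXmlReader {

    /** Collects every <property> element's <name>/<value> pair into an ordered map. */
    static Map<String, String> readProperties(InputStream xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < props.getLength(); i++) {
                Element property = (Element) props.item(i);
                String name = childText(property, "name");
                if (name != null) {
                    result.put(name, childText(property, "value"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse site file", e);
        }
    }

    /** Returns the trimmed text of the first child element with the given tag, or null. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java SiteXmlReader <path to core-site.xml>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String key = "hadoop.security.authentication";
            System.out.println(key + " = " + readProperties(in).get(key));
        }
    }
}
```

If the file reports kerberos but the client still attempts SIMPLE authentication, that would suggest the setting is not reaching the part of the client that performs the login.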

I also created a jar and ran the program from an edge node:

java -Djava.security.auth.login.config=/security/jaas.conf -Djava.security.krb5.conf=/security/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false -jar spring-data-hadoop-all-1.0.jar

Here are the contents of my jaas.conf:

Client {
    com.sun.security.auth.module.Krb5LoginModule required
    doNotPrompt=true
    useTicketCache=false
    principal="iapima@AOC.NCCOURTS.ORG"
    useKeyTab=true
    keyTab="/home/iapima/security/iapima.keytab"
    debug=true;
};
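For reference, the same entry can also be built in code instead of being read from a file. As far as I can tell, Hadoop's UserGroupInformation builds its own JAAS configuration internally rather than consulting a Client section like this one, which might explain why the file works for other services but has no effect on the HDFS client. The sketch below is JDK-only; the class name KeytabJaasConfiguration and the principal and keytab path in main are placeholders, not values from this thread.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

// Hypothetical class for illustration: a programmatic equivalent of the jaas.conf entry above.
final class KeytabJaasConfiguration extends Configuration {

    private final String principal;
    private final String keytabPath;

    KeytabJaasConfiguration(String principal, String keytabPath) {
        this.principal = principal;
        this.keytabPath = keytabPath;
    }

    /** Returns the same single Krb5LoginModule entry for any section name. */
    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        Map<String, String> options = new HashMap<>();
        options.put("doNotPrompt", "true");
        options.put("useTicketCache", "false");
        options.put("principal", principal);
        options.put("useKeyTab", "true");
        options.put("keyTab", keytabPath);
        options.put("debug", "true");
        return new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options)
        };
    }

    public static void main(String[] args) {
        // Placeholder principal and keytab path, for illustration only.
        Configuration jaas = new KeytabJaasConfiguration("user@EXAMPLE.ORG", "/path/to/user.keytab");
        AppConfigurationEntry entry = jaas.getAppConfigurationEntry("Client")[0];
        System.out.println(entry.getLoginModuleName() + " " + entry.getOptions());
    }
}
```

An instance can be passed to the four-argument javax.security.auth.login.LoginContext constructor to attempt a real login, which needs a reachable KDC and a valid keytab.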

I am still getting the following exception:

Exception in thread "main" org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2103)
        at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:887)
        at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:870)
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:815)
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:811)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:811)
        at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1742)
        at org.apache.hadoop.fs.FileSystem$5.<init>(FileSystem.java:1863)
        at org.apache.hadoop.fs.FileSystem.listFiles(FileSystem.java:1860)
        at org.nccourts.hadoop.hdfs.AccessHdfs.main(AccessHdfs.java:34)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):

--

From the command line on the edge node where I ran the Java program, I am able to do all kinds of manipulation on HDFS: creating directories, copying files, deleting files, etc.

It is very frustrating: I can access secured Impala and secured Solr on our cluster, but I cannot seem to access the HDFS file system.

Contributor
Posts: 32
Registered: ‎03-17-2017

Re: Unable to access HDFS after enabling kerberos using Java

I added the following:

    UserGroupInformation.setConfiguration(conf);
    UserGroupInformation.loginUserFromKeytab("myId@OurCompany.ORG", "/myPathtoMyKeyTab/my.keytab");

I was able to connect and get a list of the files in the HDFS directory; however, the write operation failed with the following exception:

java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
        at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
        at java.io.FilterInputStream.read(FilterInputStream.java:83)
        at java.io.FilterInputStream.read(FilterInputStream.java:83)
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2270)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1701)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1620)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:772)
17/08/17 13:31:49 WARN hdfs.DFSClient: Abandoning BP-2081783877-10.91.61.102-1496699348717:blk_1074056717_315940
17/08/17 13:31:49 WARN hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[10.91.61.106:50010,DS-caf46aea-ebbb-4d8b-8ded-2e476bb0acee,DISK]

 

Any ideas? Pointers or help would be appreciated.