Cloudera HDFS client KDC DDoS

I investigated why my KDC server goes down whenever a Hadoop job is running.

I checked the KDC logs and found a very large number of TGS requests.

 

My usual HDFS workflow:

kdestroy
kinit cdh_test

# many similar hdfs operations, for example:
-sh-4.2$ hdfs dfs -ls /tmp
-sh-4.2$ hdfs dfs -ls /tmp

....

 

Then I run klist:

Ticket cache: FILE:/tmp/krb5cc_1796600024
Default principal: cdh_test@DEV.WINDOWS.LOCAL

Valid starting       Expires              Service principal
10/26/2021 01:09:39  10/27/2021 01:09:39  krbtgt/DEV.WINDOWS.LOCAL@DEV.WINDOWS.LOCAL
        renew until 11/02/2021 01:09:39

 

I see only the TGT in the cache, but no service ticket for HDFS. Why?

If I run psql or curl, I always see a service ticket for those services in the cache, but the ticket for the hdfs service is never there.
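
To make this check repeatable, something like the following helper could be used. It is only a sketch: the function name has_service_ticket is made up for illustration, and it assumes the MIT klist layout shown above, where the service principal is the last field on each ticket line.

```shell
# Hypothetical helper: succeeds (exit 0) if the klist output on stdin
# contains a ticket whose service principal starts with "<service>/".
has_service_ticket() {
  service="$1"
  # index(...) == 1 means the last field begins with "<service>/",
  # e.g. "hdfs/" matches hdfs/cloudera.ipa.dev.windows.local@REALM.
  awk -v svc="$service" '
    index($NF, svc "/") == 1 { found = 1 }
    END { exit !found }
  '
}
```

Usage would be along the lines of `klist | has_service_ticket hdfs && echo "hdfs ticket is cached"`. A TGT alone does not match, because its principal starts with krbtgt/.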

 

The KDC logs contain many TGS requests:

Oct 26 01:11:07 ldap.dev.windows.local krb5kdc[27198](info): TGS_REQ (6 etypes {aes256-cts-hmac-sha1-96(18), aes128-cts-hmac-sha1-96(17), UNSUPPORTED:des3-hmac-sha1(16), DEPRECATED:arcfour-hmac(23), UNSUPPORTED:des-cbc-crc(1), UNSUPPORTED:des-cbc-md5(3)}) 192.168.1.122: ISSUE: authtime 1635199779, etypes {rep=aes256-cts-hmac-sha1-96(18), tkt=aes256-cts-hmac-sha1-96(18), ses=aes256-cts-hmac-sha1-96(18)}, cdh_test@DEV.WINDOWS.LOCAL for hdfs/cloudera.ipa.dev.windows.local@DEV.WINDOWS.LOCAL
Oct 26 01:11:11 ldap.dev.windows.local krb5kdc[27199](info): TGS_REQ (6 etypes {aes256-cts-hmac-sha1-96(18), aes128-cts-hmac-sha1-96(17), UNSUPPORTED:des3-hmac-sha1(16), DEPRECATED:arcfour-hmac(23), UNSUPPORTED:des-cbc-crc(1), UNSUPPORTED:des-cbc-md5(3)}) 192.168.1.122: ISSUE: authtime 1635199779, etypes {rep=aes256-cts-hmac-sha1-96(18), tkt=aes256-cts-hmac-sha1-96(18), ses=aes256-cts-hmac-sha1-96(18)}, cdh_test@DEV.WINDOWS.LOCAL for hdfs/cloudera.ipa.dev.windows.local@DEV.WINDOWS.LOCAL

.....
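
To put a number on "many", the issued TGS requests can be counted per client and service. The snippet below is a rough sketch, not a polished tool: count_tgs_requests is a name I made up, and it relies on the krb5kdc line format shown above, where each issued request ends with "<client> for <service>".

```shell
# Hypothetical helper: reads krb5kdc log lines on stdin and prints
# "<count> <client> <service>" for every issued TGS request,
# most frequent pair first.
count_tgs_requests() {
  # $(NF-2) is the client principal and $NF the service principal,
  # because each issued request ends with "<client> for <service>".
  awk '/TGS_REQ/ && /ISSUE:/ { print $(NF-2), $NF }' \
    | sort | uniq -c | sort -rn
}
```

It would be run on the KDC host, for example as `count_tgs_requests < /var/log/krb5kdc.log`. That path is an assumption; the real location depends on the logging section of the KDC configuration.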

-----------------------------------------------

 

At first I assumed this was normal, because I could not find any description of the correct way to use Kerberos in Hadoop applications.

However, I found a few articles whose authors say that Java has never supported Kerberos other than through its own built-in implementation, so by default Java does not use the native, optimized Kerberos stack of Windows or Unix servers.

According to them, other Kerberos clients work through the platform's own GSS library, such as SSPI on Windows and MIT Kerberos on Unix.

As I understand it, the built-in Java module cannot write service tickets to the credential cache, so it requests a new service ticket for every new thread, even though the TGT and the previously issued service ticket are still valid.

In effect this is a DDoS: the KDC cannot serve that many HDFS connections and is at risk of going down.

Hadoop threads running as a single user generate many repeated ticket requests.

This seems wrong. When we log in to Windows, we do not enter a password for every operation or service. The whole point of SSO is to authenticate once and then work with each service using the service ticket held in the LSA cache.

 

 

I also found a workaround. All that is needed is:

export HADOOP_OPTS="$HADOOP_OPTS -Dsun.security.jgss.native=true -Djavax.security.auth.useSubjectCredsOnly=false"
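
If the export ends up in a profile script that may be sourced more than once, the flags would be appended repeatedly. A possible way to avoid that is sketched below. The function name enable_native_gss is hypothetical; only the two JVM system properties come from the workaround itself.

```shell
# Hypothetical helper: add the two JGSS flags to HADOOP_OPTS
# only if they are not already present, then export the variable.
enable_native_gss() {
  for flag in \
    -Dsun.security.jgss.native=true \
    -Djavax.security.auth.useSubjectCredsOnly=false
  do
    # Pad with spaces so a flag is matched only as a whole word.
    case " ${HADOOP_OPTS:-} " in
      *" $flag "*) ;;  # already set, nothing to do
      *) HADOOP_OPTS="${HADOOP_OPTS:+$HADOOP_OPTS }$flag" ;;
    esac
  done
  export HADOOP_OPTS
}
```

Calling `enable_native_gss` before running the hdfs commands should have the same effect as the export above, and calling it twice leaves HADOOP_OPTS unchanged.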

 

 

So the full sequence becomes:

export HADOOP_OPTS="$HADOOP_OPTS -Dsun.security.jgss.native=true -Djavax.security.auth.useSubjectCredsOnly=false"
kdestroy
kinit cdh_test
-sh-4.2$ hdfs dfs -ls /tmp
-sh-4.2$ hdfs dfs -ls /tmp

.....

-sh-4.2$ klist
Ticket cache: FILE:/tmp/krb5cc_1796600024
Default principal: cdh_test@DEV.WINDOWS.LOCAL

Valid starting       Expires              Service principal
10/26/2021 01:14:02  10/27/2021 01:14:02  krbtgt/DEV.WINDOWS.LOCAL@DEV.WINDOWS.LOCAL
        renew until 11/02/2021 01:14:02
10/26/2021 01:14:10  10/27/2021 01:14:02  hdfs/cloudera.ipa.dev.windows.local@DEV.WINDOWS.LOCAL
        renew until 11/02/2021 01:14:02

 

Now I see the service ticket in the cache, and the behavior matches psql and curl.

 

The KDC logs now contain only one TGS entry:

Oct 26 01:14:10 ldap.dev.windows.local krb5kdc[27198](info): TGS_REQ (8 etypes {aes256-cts-hmac-sha1-96(18), aes128-cts-hmac-sha1-96(17), aes256-cts-hmac-sha384-192(20), aes128-cts-hmac-sha256-128(19), UNSUPPORTED:des3-hmac-sha1(16), DEPRECATED:arcfour-hmac(23), camellia128-cts-cmac(25), camellia256-cts-cmac(26)}) 192.168.1.122: ISSUE: authtime 1635200042, etypes {rep=aes256-cts-hmac-sha1-96(18), tkt=aes256-cts-hmac-sha1-96(18), ses=aes256-cts-hmac-sha1-96(18)}, cdh_test@DEV.WINDOWS.LOCAL for hdfs/cloudera.ipa.dev.windows.local@DEV.WINDOWS.LOCAL

 

 

 

It seems strange that Cloudera does not fully support Kerberos on RHEL and Windows by default. Or is there some specific configuration that enables full support for those operating systems and makes the DDoS disappear?

Could you comment?
