Member since: 04-03-2019
Posts: 86
Kudos Received: 5
Solutions: 5
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1077 | 01-21-2022 04:31 PM |
| | 3729 | 02-25-2020 10:02 AM |
| | 1572 | 02-19-2020 01:29 PM |
| | 1510 | 09-17-2019 06:33 AM |
| | 3744 | 08-26-2019 01:35 PM |
04-21-2022
05:07 PM
1 Kudo
André, Thanks for the elegant solution. Regards,
04-20-2022
05:46 PM
I did a workaround by injecting the myfilepath element into the JSON string:

rdd = reader.map(lambda x: str(x[1])[0] + '"myfilepath":"' + x[0] + '",' + str(x[1])[1:])

It does not look like a very clean solution. Is there a better one? Thanks. Regards
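For reference, here is a possibly cleaner sketch (my own, reusing the reader and myschema from the question below; pairs and raw_df are names I made up). It keeps the sequence-file key and value as two columns and parses the JSON column with from_json instead of splicing the raw string:

from pyspark.sql.functions import from_json, col

# Keep (original file name, JSON text) as two columns, then parse the JSON
# column against the known schema instead of editing the raw string.
pairs = reader.map(lambda x: (x[0], str(x[1])))
raw_df = pairs.toDF(["myfilepath", "json_text"])
mydf = (raw_df
        .withColumn("data", from_json(col("json_text"), myschema))
        .select("myfilepath", "data.*"))
mydf.show(truncate=False)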
04-20-2022
04:05 PM
I saved thousands of small JSON files in SequenceFile format to resolve the "small file issue". I use the following PySpark code to parse the JSON data from the saved sequence files:

reader = sc.sequenceFile("/mysequencefile_dir", "org.apache.hadoop.io.Text", "org.apache.hadoop.io.Text")
rdd = reader.map(lambda x: x[1])
mydf = spark.read.schema(myschema).json(rdd)
mydf.show(truncate=False)

The code works. However, I do not know how to get the key from the sequence file, which is actually the original JSON file name, into the mydf dataframe. Please advise. Thank you. Regards,
- Tags:
- CDP
- SequenceFile
- Spark
04-14-2022
10:42 AM
1 Kudo
@mszurap Thanks for the response. I actually took the second option you mentioned, ingesting the data into a table that has only a single (string) column. But I am not sure whether it is the right approach, so I would appreciate the confirmation. Regards,
04-12-2022
03:46 PM
Here is the code:

create external table testtable1
(code string, codesystem string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "(.{27})(.{50})"
)
LOCATION '/data/raw/testtable1';

The error message is:

ParseException: Syntax error in line 3:undefined: ROW FORMAT SERDE 'org.apache.hadoop.hiv... ^ Encountered: IDENTIFIER Expected: DELIMITED CAUSED BY: Exception: Syntax error

It looks like Impala only accepts ROW FORMAT DELIMITED. How, then, can I create a Hive table with a fixed-width layout? Should I just create it outside Impala, through Hive, and then do other data operations on the table via Impala? Thanks.
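One possible workaround, a sketch of my own rather than anything from this thread (testtable1_parsed is a hypothetical table name): parse the fixed-width layout in PySpark with substring() and save the result as a regular table that Impala can then query.

from pyspark.sql.functions import substring, col

# Read each fixed-width line as a single string column named "value",
# then slice it by position (substring is 1-based in Spark).
raw = spark.read.text("/data/raw/testtable1")
parsed = raw.select(
    substring(col("value"), 1, 27).alias("code"),         # columns 1-27
    substring(col("value"), 28, 50).alias("codesystem"),  # columns 28-77
)
parsed.write.saveAsTable("testtable1_parsed")  # hypothetical target table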
Labels:
- Apache Hive
- Apache Impala
01-31-2022
07:23 PM
1 Kudo
@jeremymolina That is an excellent explanation. It makes total sense. Thank you very much. Regards,
01-24-2022
04:00 PM
I have seen this kind of notation/style using double curly braces everywhere in the HDP (Ambari) and CDP (CMS) UIs. Below is a configuration value under zeppelin.shiro.knox.main.block in the Zeppelin configuration. (This is a random sample I picked; this question is not about Zeppelin.) ++ krbRealm.signatureSecretFile={{CONF_DIR}}/http_secret ++ I understand that I can simply overwrite {{CONF_DIR}} with the actual path. However, I wonder: is {{CONF_DIR}} an Ansible variable? If yes, how do I define the variable CONF_DIR in CDP Cloudera Manager? https://docs.ansible.com/ansible/latest/user_guide/playbooks_variables.html#defining-simple-variables Regards,
Labels:
- Cloudera Manager
01-21-2022
04:43 PM
@Scharan By the way, under Zeppelin Shiro Urls Block, the original value is ++ /api/interpreter/** = authc, roles[{{zeppelin_admin_group}}] ++ Could you tell me what this notation {{zeppelin_admin_group}} is for? I have seen this kind of notation, double curly braces, frequently. Is it a token to be replaced? If yes, what kind of replacement is it waiting for? Thanks.
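For what it is worth, double curly braces are a common placeholder syntax that deployment tooling fills in before writing the final config file. A minimal Python sketch of the idea (an illustration only, not Cloudera Manager's actual substitution engine; zeppelin_admins is a made-up group name):

# Token replacement in miniature: swap each {{...}} placeholder for a
# configured value before the text reaches the config file on disk.
template = "/api/interpreter/** = authc, roles[{{zeppelin_admin_group}}]"
rendered = template.replace("{{zeppelin_admin_group}}", "zeppelin_admins")  # hypothetical value
print(rendered)  # /api/interpreter/** = authc, roles[zeppelin_admins]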
01-21-2022
04:31 PM
@Scharan I figured it out. The CDP Cloudera Manager UI does expose shiro.ini like Ambari, just via a different layout, which I should have realized earlier. Under "zeppelin.shiro.user.block", I added admin = admin, admin and it worked. Thanks.
01-21-2022
03:01 PM
On the Zeppelin node, under the directory /etc/zeppelin/conf, I found the following files. ++ configuration.xsl interpreter-list log4j.properties log4j_yarn_cluster.properties shiro.ini.template zeppelin-env.cmd.template zeppelin-env.sh.template zeppelin-site.xml.template ++ Should I create a shiro.ini file here?
01-21-2022
02:32 PM
@Scharan Thanks for the reply. I followed your recommendation and got the same permission error. I think the disconnect is this: I added a user called admin successfully, but the configuration /api/interpreter/** = authc, roles[admin] refers to a role called admin. The link between a user and a role lives inside shiro.ini, which I have no idea how to access. I used Zeppelin in HDP, and HDP Zeppelin exposes its shiro.ini via the Zeppelin configuration inside Ambari. In CDP I cannot find a similar configuration inside Cloudera Manager.
01-20-2022
07:02 PM
I am using CDP 7.1.7 and the cluster has not enabled Kerberos yet. Ranger is not enabled either. I followed the steps in this post https://community.cloudera.com/t5/Support-Questions/CDP-7-1-3-Zepplin-not-able-to-login-with-default-username/td-p/303717 to be able to log in as admin. But this "admin" account has no permission to access the configuration or interpreter pages. According to the CDP documentation, https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/configuring-zeppelin/topics/enabling_access_control_for_interpreter__configuration__and_credential_settings.html, I have to go through the Zeppelin web UI to configure shiro.ini for Zeppelin security. What should I do? Regards,
Labels:
- Apache Zeppelin
11-18-2021
01:30 PM
rbiswas1, I tried your code, but pssh returned a timeout error. It was waiting for the password, but I never got the prompt to enter it. Could you elaborate on your method? Thanks.
09-15-2021
10:32 PM
@RangaReddy The link is exactly what I need. Thanks for your help.
09-09-2021
01:18 AM
I am trying to parse a nested JSON document using an RDD rather than a DataFrame. The reason I cannot use a DataFrame (the typical code being spark.read.json) is that the document structure is very complicated: the schema detected by the reader is useless because child nodes at the same level have different schemas. So I tried the script below:

import json

s = '{"key1":{"myid": "123","myname":"test"}}'
rdd = sc.parallelize([s]).map(json.loads)  # note: [s], or the string is split into characters

My next step will be using a map transformation to parse the JSON, but I do not know where to start. I tried the script below, but it failed:

rdd2 = rdd.map(lambda j: (j[x]) for x in j)

I would appreciate any resource on using RDD transformations to parse JSON.
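A note on why the last line fails: Python parses rdd.map(lambda j: (j[x]) for x in j) as a generator expression over an undefined name j, not as a lambda applied to each record. A working sketch of the map step (my own, assuming each record is a dict after json.loads):

# Each record is already a Python dict, so the function can walk it like any
# dict. Here the top-level keys are flattened into (key, value) pairs; deeper
# nesting can be handled the same way inside the function.
rdd2 = rdd.flatMap(lambda j: [(k, v) for k, v in j.items()])
print(rdd2.collect())
# [('key1', {'myid': '123', 'myname': 'test'})]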
- Tags:
- Spark
Labels:
- Apache Spark
09-03-2021
05:08 PM
Vidya, Thanks for your reply. Could you help me clarify the issue further? Does Spark (or another MapReduce-style tool) create the container using the local host as its template (to some degree)?
08-26-2021
02:58 PM
I will use Spark2 in CDP and need to install Python 3. Do I need to install Python 3 on every node in the CDP cluster, or only on one particular node? A Spark2 job executes in JVM containers that can be created on any worker node. I wonder whether the container is created from a template? If so, how is the template created, and where is it? Thanks.
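A small sketch of why the install has to be cluster-wide (my own; the interpreter path is an example): the executor processes run your Python code on whichever workers host the containers, so every one of them must find the interpreter locally.

import os
import sys
from pyspark.sql import SparkSession

# Point the executors at the Python 3 interpreter. This only works if the
# interpreter exists at this path on every worker node.
os.environ["PYSPARK_PYTHON"] = "/usr/bin/python3"  # example path

spark = SparkSession.builder.appName("python3-check").getOrCreate()

# Quick check: run a task on the executors and report their interpreter.
print(spark.sparkContext.parallelize([0]).map(lambda _: sys.version).collect())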
Labels:
- Apache Spark
11-04-2020
10:59 PM
I resolved the error by following advice from this post. https://community.cloudera.com/t5/Support-Questions/Sharing-how-to-solve-HUE-and-HBase-connect-problem-on-CDH-6/td-p/82030
11-04-2020
03:10 PM
I got the same error with HappyBase. My code had been working fine for a few weeks; somehow the Thrift API stopped. I restarted the API, and then I got this error.
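In case it helps with diagnosis, a minimal HappyBase connectivity check (a sketch; thrift-host is a placeholder for your HBase Thrift server): if listing tables works, the Thrift API itself is reachable and the problem lies elsewhere.

import happybase

# Open a connection explicitly and list tables as a smoke test.
connection = happybase.Connection("thrift-host", port=9090, autoconnect=False)
connection.open()
print(connection.tables())
connection.close()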
07-30-2020
03:38 PM
The unpack command will not work without that extra dash. https://stackoverflow.com/questions/34573279/how-to-unzip-gz-files-in-a-new-directory-in-hadoop/43704452 I had another try with a file name as the destination:

hdfs dfs -cat /user/testuser/stage1.tar.gz | gzip -d | hdfs dfs -put - /user/testuser/test3/stage1

The file stage1 appeared in the test3 directory. There is something interesting. The stage1.tar.gz contains three empty txt files. "hdfs dfs -cat /user/testuser/test3/-" output nothing, and the file size is 0.1k. "hdfs dfs -cat /user/testuser/test3/stage1" output some text, including the original file names. Also, that file size is 10k.
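For context on that output: gzip -d only strips the gzip layer, so what lands in HDFS is still a tar stream, which is why the member file names (and tar's header overhead in the 10k size) show up inside. A sketch of an alternative that puts the individual files into HDFS (paths from the thread; stage1_unpacked is a directory name I made up):

import subprocess
import tarfile

# Unpack the archive locally, then upload the extracted members so the
# individual files, not a tar stream, land in HDFS.
with tarfile.open("stage1.tar.gz", "r:gz") as tar:
    tar.extractall("stage1_unpacked")

subprocess.run(
    ["hdfs", "dfs", "-put", "stage1_unpacked", "/user/testuser/test3"],
    check=True,
)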
07-30-2020
03:01 PM
@Shelton Thanks for the quick response. Here is my command to create the gz file:

tar cvzf ~/stage1.tar.gz ./*

I tried the following commands to upload and unzip it into an HDFS directory /user/testuser/test3:

hdfs dfs -copyFromLocal stage1.tar.gz /user/testuser
hdfs dfs -cat /user/testuser/stage1.tar.gz | gzip -d | hdfs dfs -put - /user/testuser/test3

However, what I got in /user/testuser/test3 is a file with the name "-", not the multiple files in stage1.tar.gz. Does your solution mean to concatenate all the files together? Please advise. Thanks.
07-30-2020
11:31 AM
I am copying a large number of small files (HL7 message files) from Linux local storage to HDFS. I wonder whether there is a performance difference between copying the files one by one (through a script) and using a single statement like "hadoop fs -put ./* /hadoop_path". Additional background: some files have spaces in their file names, and if I use the command "hadoop fs -put ./* /hadoop_path", I get the error "put: unexpected URISyntaxException" for those files. If there is no performance difference, I will just copy one file at a time and have my script replace each space with "%20". Otherwise, I have to rename all the files, replacing spaces with underscores, and then use the batch copy.
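A sketch of the rename-then-batch-copy route mentioned above (the source directory is hypothetical): replace spaces with underscores locally so a single "hadoop fs -put ./* /hadoop_path" can move everything at once.

import os

# Rename files in place, swapping spaces for underscores, so the batch
# put no longer hits "put: unexpected URISyntaxException".
src_dir = "/data/hl7_messages"  # hypothetical local directory
for name in os.listdir(src_dir):
    if " " in name:
        os.rename(os.path.join(src_dir, name),
                  os.path.join(src_dir, name.replace(" ", "_")))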
- Tags:
- Copy
- HDFS
- performance
Labels:
- HDFS
02-25-2020
10:02 AM
I got the following responses from Cloudera Certification. Regarding Question #1, the FAQ page has the most up-to-date information, so right now I'd better hold off on purchasing the exam until DE575 is relaunched. Regarding Question #2, the "Spark and Hadoop Developer" training course is the one I should take to prepare for DE575. Regarding Question #3, the environment for the exam is fixed and only available on CDH; candidates do not have the option to take the exam in an HDP environment. The skills tested are applicable to HDP development as well: the exam is in the developer track, so it should have nothing to do with the environment it runs in, and it is primarily interested in transforming data that sits on the cluster.
02-19-2020
01:29 PM
1 Kudo
Finally, I figured out what is going on. The root cause is that I only set up testuser on the edge nodes, not the NameNode. I looked into this page, https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/GroupsMapping.html, which explains that "For HDFS, the mapping of users to groups is performed on the NameNode. Thus, the host system configuration of the NameNode determines the group mappings for the users." After I created the user on the NameNode and ran the command

hdfs dfsadmin -refreshUserToGroupsMappings

the copy succeeded with no permission-denied error.
02-10-2020
11:51 AM
@GangWar Here it is.

$ id -Gn testuser
hadoop wheel hdfs
02-10-2020
09:05 AM
I have run the following test case several times and got the same result. Context: 1. My HDP cluster uses simple mode to determine user identity; Kerberos is not enabled. 2. Below are the permissions on the HDFS folder /data/test:
drwxrwxr-x - hdfs hadoop 0 2020-02-07 13:33 /data/test
So hdfs (the superuser) is the owner and hadoop is the owner group. Both the owner and the owner group have write permission on the /data/test folder.
Steps:
On an edge node, I used the id command to confirm that the logged-in user "testuser" is in the hadoop group.
$ id
uid=1018(testuser) gid=1003(hadoop) groups=1003(hadoop),10(wheel), 1002(hdfs) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
However, testuser still ran into "Permission Denied".
$ hadoop fs -put ./emptyfile1.txt /data/test
put: Permission denied: user=testuser, access=WRITE, inode="/data/test":hdfs:hadoop:drwxrwxr-x
Then I used the hdfs account to change the folder owner to testuser.
$ hadoop fs -chown testuser /data/test
From the same edge node, testuser then ran the put command successfully.
Here is my question: why couldn't testuser write to the HDFS folder via the owner group's permissions?
- Tags:
- HDFS
- hdfs-permissions
Labels:
- HDFS
- Hortonworks Data Platform (HDP)
01-31-2020
09:07 AM
@cjervis Thanks. I reviewed the FAQ page, but it does not answer my questions. I guess I'd better wait until tomorrow, because the page mentions the date February 1, 2020 several times for new launches and other changes.
01-31-2020
08:22 AM
I plan to get a Cloudera certification and need help with the following questions: Question #1: I reviewed the page https://www.cloudera.com/about/training/certification.html; it looks like CCP Data Engineer is the only certification that has not been suspended or retired. Am I right about this? Question #2: To prepare for DE575, the only recommended Cloudera course is the "Spark and Hadoop Developer" training course, according to this page: https://www.cloudera.com/about/training/certification/ccp-data-engineer.html. Should I consider other courses? Question #3: My workplace uses HDP. Do I need to get familiar with products like CDH before taking the exam?
Labels:
- Certification
01-15-2020
10:02 AM
@Shelton @EricL Thank you both. The correct ACL spec is group::r-x. Now the following command works:

sudo -u zeppelin hadoop fs -ls /warehouse/tablespace/managed/hive/test1

From what I just ran into, I feel that, by design, Hive takes extra effort to prevent users from accessing managed table files directly. I will follow that design and access Hive managed tables only through Hive.
01-14-2020
05:09 PM
I tried the following command:

# sudo -u hdfs hadoop fs -setfacl -m g::rx /warehouse/tablespace/managed/hive/test1

But I got the error:

-setfacl: Invalid type of acl in <aclSpec> :g::rx

The ACL spec is meant to change the owning group's permission to rx. Any suggestions?