Member since: 04-10-2022
Posts: 5
Kudos Received: 0
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3745 | 11-09-2022 07:44 PM
11-09-2022 07:44 PM
Just an update: this is resolved.

1. With Kerberos authentication enabled, you can go to the Spark service's Configuration tab and turn on "history_server_spnego_enabled", which "Enables user authentication using SPNEGO (requires Kerberos), and enables access control to application history data." After the restart, access to the SHS web UI requires authentication. Underneath, the SHS is restarted with the following configuration:

spark.history.kerberos.enabled=true
spark.history.kerberos.principal=xx
spark.history.kerberos.keytab=xxx
spark.org.apache.hadoop.security.authentication.server.AuthenticationFilter.param.type=kerberos
spark.org.apache.hadoop.security.authentication.server.AuthenticationFilter.param.kerberos.principal=xx
spark.org.apache.hadoop.security.authentication.server.AuthenticationFilter.param.kerberos.keytab=xx
spark.org.apache.hadoop.security.authentication.server.AuthenticationFilter.param.kerberos.name.rules=xxx
spark.history.ui.acls.enable=true
spark.ui.filters=org.apache.spark.deploy.yarn.YarnProxyRedirectFilter,org.apache.hadoop.security.authentication.server.AuthenticationFilter

2. If Kerberos is not enabled, you have to implement your own authentication filter (see the sketch below) and configure the following parameters:

spark.ui.filters=org.apache.spark.deploy.yarn.YarnProxyRedirectFilter,your-authentication-filter-name
spark.your-authentication-filter-name.param.parm-name=parm-value
spark.history.ui.acls.enable
spark.history.ui.admin.acls
spark.history.ui.admin.acls.groups
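For the non-Kerberos case, below is a minimal sketch of such a filter, assuming plain HTTP Basic authentication against a single credential pair. The package/class name (com.example.auth.BasicAuthFilter) and the "user"/"password" init parameters are hypothetical placeholders, not anything Spark ships:

```java
// Minimal sketch of a custom javax servlet filter for the Spark History Server UI.
// Assumptions: Spark 2.x on CDH 6 (javax.servlet API), HTTP Basic auth, and
// hypothetical init parameters "user" and "password" supplied via
// spark.<filter-class>.param.<name>=<value>.
package com.example.auth;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class BasicAuthFilter implements Filter {

  private String expectedHeader;

  @Override
  public void init(FilterConfig conf) {
    // Spark hands each spark.<filter-class>.param.<name> entry to the filter here.
    String user = conf.getInitParameter("user");
    String password = conf.getInitParameter("password");
    expectedHeader = "Basic " + Base64.getEncoder()
        .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
  }

  @Override
  public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
      throws IOException, ServletException {
    HttpServletRequest request = (HttpServletRequest) req;
    HttpServletResponse response = (HttpServletResponse) res;

    // Compare the browser-supplied Authorization header with the expected value.
    String header = request.getHeader("Authorization");
    if (expectedHeader.equals(header)) {
      chain.doFilter(req, res); // credentials match: let the request through
    } else {
      response.setHeader("WWW-Authenticate", "Basic realm=\"Spark History Server\"");
      response.sendError(HttpServletResponse.SC_UNAUTHORIZED);
    }
  }

  @Override
  public void destroy() {
    // nothing to clean up
  }
}
```

It could then be wired in like this (class and parameter names are the same placeholders):

spark.ui.filters=org.apache.spark.deploy.yarn.YarnProxyRedirectFilter,com.example.auth.BasicAuthFilter
spark.com.example.auth.BasicAuthFilter.param.user=admin
spark.com.example.auth.BasicAuthFilter.param.password=changeit

A real deployment would validate credentials against LDAP or another backend rather than a static pair compiled into the filter.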
08-23-2022 08:35 PM
Hi guys, I would like to know how to set up authentication for the Spark History Server, so that unauthorized users cannot view its web UI. Any help would be appreciated, thanks!

1. I do notice the following statement in the official Spark documentation: "Enabling authentication for the Web UIs is done using javax servlet filters. You will need a filter that implements the authentication method you want to deploy. Spark does not provide any built-in authentication filters."

2. There is also a thread on Stack Overflow regarding this: "You re-use Hadoop's jetty authentication filter for Kerberos/SPNEGO: spark.ui.filters=org.apache.hadoop.security.authentication.server.AuthenticationFilter and spark.org.apache.hadoop.security.authentication.server.AuthenticationFilter.params=type=kerberos,kerberos.principal=${spnego_principal_name},kerberos.keytab=${spnego_keytab_path}".

With Kerberos authentication enabled in CDH 6.3, I followed the instructions in the above Stack Overflow thread, but was unable to achieve the expected result: any user can still view the Spark History Server web UI.

Thanks, Michael
Labels:
- Apache Spark
07-21-2022 01:24 AM
Some more information to add: it seems that with DbTxnManager, failed or terminated SQL queries do not leave stale locks behind, because there is a heartbeat mechanism involved. Only when the HiveServer2 process stops functioning properly, for example when the HS2 host crashes, are stale locks left behind, and we have to clear them manually by logging into the metastore database.
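For reference, this housekeeping is governed by metastore-side settings along these lines (property names are from upstream Hive 2.x, which CDH 6 is based on; the values shown are the upstream defaults and are an assumption to verify against your release):

hive.txn.timeout=300s                   # a txn/lock is declared abandoned if the client misses heartbeats for this long
hive.timedout.txn.reaper.start=100s     # delay before the metastore's reaper thread first runs after startup
hive.timedout.txn.reaper.interval=180s  # how often the reaper looks for timed-out transactions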
07-19-2022 11:57 PM
It seems that when we use DbTxnManager in CDH 6.x, if a SQL query does not complete successfully or is terminated abruptly (e.g. with Ctrl+C), the locks it implicitly acquired are not automatically released, and new SQL queries that reference the same table/partition block while trying to acquire the necessary locks, and hence cannot be executed successfully.
We can't manually clear the locks by issuing UNLOCK statements under DbTxnManager, because that raises the error: "Current transaction manager does not support explicit lock requests. Transaction manager: org.apache.hadoop.hive.ql.lockmgr.DbTxnManager".
It seems that currently the only way to clear these obsolete locks is to log into the metastore database and delete the records from the HIVE_LOCKS table with SQL statements like the following:
SELECT hl_lock_ext_id FROM HIVE_LOCKS WHERE hl_table = 'prcs_task';
DELETE FROM HIVE_LOCKS WHERE hl_lock_ext_id = 125542;
I am wondering: is this the correct way to manually clear obsolete locks under DbTxnManager?
And has there been any progress in the Hive community on automatically clearing obsolete locks caused by failed or terminated SQL queries, such as some sort of timeout and housekeeping mechanism?
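One alternative I have seen suggested (assuming Hive 2.1+, the line CDH 6 is based on) is to inspect the locks from Beeline and, when a stale lock belongs to a still-open transaction, abort that transaction instead of deleting metastore rows by hand; would that be preferable?

SHOW LOCKS prcs_task;      -- lists current locks on the table, with lock and txn ids
SHOW TRANSACTIONS;         -- lists transactions known to the metastore
ABORT TRANSACTIONS 12345;  -- 12345 is a hypothetical txn id taken from the output above

Locks taken outside of any transaction would presumably still have to be removed via the metastore database as described above.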