Member since: 12-03-2016
Posts: 91
Kudos Received: 27
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 12182 | 08-27-2019 10:45 AM
 | 3452 | 12-24-2018 01:08 PM
 | 12134 | 09-16-2018 06:45 PM
 | 2677 | 12-12-2016 01:44 AM
07-08-2022
08:44 AM
1 Kudo
I thought it was something like this, but it was hard to believe! After two full installations of CDP Base, it seems clear that CDP may have been a big step forward for the end user, but it still has a lot of room for improvement on the sysadmin and DevOps side of the platform, especially for rolling back or recovering from many central configuration changes (Kerberos, TLS), where it really sucks, even compared with the now-ancient HDP 3.
06-20-2022
05:27 PM
1 Kudo
The solution above has been the accepted answer for a long time (since CM 6.x), but the procedure is not complete, at least for Cloudera Manager 7.6 with CDP Base 7.1.6 or above. After using Auto-TLS and then needing to disable it because I had to rename some nodes, I changed the web_tls and agent_tls settings in the database, changed config.ini to disable TLS in all the agents' configuration, and restarted all the cloudera-scm-* daemons (server/agents). I was then able to log in to the UI and reach the hosts, but under "Management >> Security" the Cloudera Manager UI keeps showing "Auto-TLS is Enabled" in the menu bar, and if I try to add a new host it fails with an error like "Unable to deploy host certificates". So it seems CM still has Auto-TLS enabled somewhere/somehow. I have made a database dump and double-checked that there isn't any TLS config parameter enabled in the DB, but there must be some other place from which CM assumes it is enabled.
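For reference, this is roughly what I applied; a minimal sketch assuming a PostgreSQL-backed scm database (the CONFIGS table and its ATTR/VALUE columns are as seen in CM 6.x/7.x — verify the schema and back up the database first):

# Stop the CM server before editing its database (assumes PostgreSQL backend)
systemctl stop cloudera-scm-server

# Disable the TLS flags stored in the CONFIGS table of the scm database
sudo -u postgres psql scm -c "UPDATE configs SET value='false' WHERE attr='web_tls';"
sudo -u postgres psql scm -c "UPDATE configs SET value='false' WHERE attr='agent_tls';"

# On every host: disable TLS in the agent config and restart the agent
sed -i 's/^use_tls=1/use_tls=0/' /etc/cloudera-scm-agent/config.ini
systemctl restart cloudera-scm-agent

systemctl start cloudera-scm-server

Even with all of the above applied, the UI still reports "Auto-TLS is Enabled", which is exactly the problem described.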
05-05-2022
04:36 PM
As a general statement this is not right by any means. LDAP provides secure and encrypted authentication (encrypted user password and SSL/TLS communication), together with user/group management. It's only that the Hadoop stack does not support this, and the only two authentication methods implemented across all the CDP components are the dummy "simple" auth (described above) and Kerberos authentication (used in combination with PAM or LDAP for user/group mappings). As an example, nothing less than Knox (the security gateway to HDP or CDP) implements full authentication using only LDAP (with TLS), and it only relies on Kerberos to authenticate a single service/proxy user that communicates with the rest of the cluster.
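To illustrate the point that an LDAP bind can be fully encrypted end to end, here is a quick check with ldapsearch over ldaps:// (the hostname, port and DNs below are placeholders for your own directory):

# Bind over TLS (ldaps); the password never crosses the wire in the clear
ldapsearch -H ldaps://ipa-server.example.com:636 \
  -D "uid=myuser,cn=users,cn=accounts,dc=example,dc=com" -W \
  -b "cn=accounts,dc=example,dc=com" "(uid=myuser)" dn uid memberOf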
12-01-2020
12:55 PM
The following map rule is wrong: RULE:[2:\$1@\$0](rm@MY_REALM)s/.*/rm/ — the user for the ResourceManager is not "rm" but "yarn", and that should be the replacement value. This is the same as for hadoop.security.auth_to_local in the Hadoop/HDFS configuration.
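The corrected rule, keeping the same principal pattern and just changing the replacement value:

RULE:[2:\$1@\$0](rm@MY_REALM)s/.*/yarn/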
11-12-2020
07:10 PM
To give some extra information (in case somebody wonders about it), I have also tested adding the -b and -c options to curl, in order to reuse the received cookies in subsequent calls. Something like this:

curl -i -k -u "admin:*****" -b cookies.txt -c cookies.txt --config headers-post.conf ...

and the options included in headers-post.conf are:

-X POST
-H "X-Requested-By:admin"
-H "Content-Type: application/json"
-H "X-XSRF-HEADER:valid"

But the problem is the same, and I still receive "HTTP/1.1 403 Forbidden" when trying to execute any statement with Livy over the session. Best regards
11-12-2020
06:21 PM
I have a kerberized HDP 3.1.4 cluster authenticating against an IPA server (Kerberos/LDAP). Most of the supported web services are working via Knox (WEBHDFS, HDFSUI, YARNUI, YARNUIV2, SPARKHISTORYUI, OOZIEUI, etc.), in some cases after fixing or replacing the service definitions with the ones from Knox 1.4.0 (some of the ones included with HDP 3 have errors or missing redirects). Now I am trying to expose the Livy REST API via Knox, and I have tried both the LIVY service definitions included in HDP 3 (it ships three versions: 0.4.0, 0.4.1, 0.4.2) and the one from Knox 1.4.0. As a side note, the service definition from upstream Knox 1.4.0 has a lower version number (0.4.0) but seems to be more recent, including many additional rewrite rules and not using the "/v1" hack from the preliminary versions.

The service definition in the ui.xml topology I'm using is:

<service>
  <role>LIVYSERVER</role>
  <url>http://hdp3-livy:8999</url>
</service>

I'm using curl for testing, and I'm able to initialize the "pyspark" session via Knox (authenticated as admin) to Livy, with "knox" as the session owner and the user authenticated to Knox (in this case "admin") as the "proxyUser":

$ curl -k -u admin:${SECRET} --config headers-post.conf --data '{"kind": "pyspark"}' $LIVY_URL/sessions
HTTP/1.1 200 OK
{
  "appId": null, "appInfo": {"driverLogUrl": null, "sparkUiUrl": null},
  "id": 19, "kind": "pyspark", "log": [],
  "owner": "knox", "proxyUser": "admin", "state": "starting"
}

However, when in the next step I try to execute some code, I get an empty response with a "403 Forbidden" code:

$ curl -k -u admin:${SECRET} --config livy-headers-post.conf $LIVY_URL/sessions/$SESSIONID/statements -d '{"code":"2 + 2"}'
HTTP/1.1 403 Forbidden

If I do the same from the Knox server, authenticated with a Kerberos keytab as the knox user and including "proxyUser": "admin" in the JSON request (as Knox does), I get the same response in the session setup, but the subsequent code statement works as expected and executes the code:

[knox@server ~]$ curl --negotiate -u : --config headers-post.conf $LIVY_URL/sessions/$SESSIONID/statements -d '{"code":"2 + 2"}'
HTTP/1.1 200 OK
{
  "code": "2 + 2", "id": 0, "output": null, "progress": 0.0, "state": "waiting"
}

As an extra diagnostic, I have captured the traffic for both executions (via Knox, and directly as knox using kinit + proxyUser), and the authentication headers received by Livy from the Knox server are clearly different with each method.

Authenticated as admin via Knox:

POST /sessions/19/statements?doAs=admin HTTP/1.1
X-Forwarded-Proto: https
X-Forwarded-Port: 10443
...
X-Forwarded-Context: /gateway/ui
X-Requested-By: admin
Accept: */*
User-Agent: curl/7.29.0
X-XSRF-HEADER: valid
Content-Type: application/json
Transfer-Encoding: chunked
Host: hdp3-livy:8999
Connection: Keep-Alive
Cookie: hadoop.auth="u=knox&p=knox/hdp3-knox@EXAMPLE.COM&t=kerberos&e=1605153545612s=xD7N1bfFduRqQPQ/qtOkg0OVVs6sXC2C2MnTDlUDrSo="
10
{"code":"2 + 2"}
0

Authenticated with Kerberos as the knox user (--negotiate):

POST /sessions/18/statements HTTP/1.1
User-Agent: curl/7.29.0
Host: hdp3-dtlk-mn01.dtlk.in.iantel.com.uy:8999
Accept: */*
X-Requested-By: admin
Content-Type: application/json
X-XSRF-HEADER:valid
Content-Length: 16
{"code":"2 + 2"}
@
HTTP/1.1 401 Authentication required
Date: Wed, 11 Nov 2020 19:18:50 GMT
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; HttpOnly
Cache-Control: must-revalidate,no-cache,no-store
...
Content-Length: 353
Server: Jetty(9.3.24.v20180605)
POST /sessions/18/statements HTTP/1.1
Authorization: Negotiate YIIC1AYJKoZIhvcSAQICAQBuggLD
......
@
HTTP/1.1 201 Created
Date: Wed, 11 Nov 2020 19:18:50 GMT
WWW-Authenticate: Negotiate YGoGCSqGSIb....
...
Content-Type: application/json;charset=utf-8
Location: /sessions/18/statements/0
Content-Length: 70
Server: Jetty(9.3.24.v20180605)
{"id":0,"code":"2 + 2","state":"waiting","output":null,"progress":0.0} Does anybody knows how to fix this and make Livy work with HTTPS and basic authentication via the Apache Knox Gateway server using Kerberos?
Labels:
- Apache Knox
- Apache Spark
- Kerberos
08-19-2020
07:42 AM
If you use the $ZK_HOST defined in infra-solr-env.sh, you should not need to include the /infra-solr prefix when getting solrconfig.xml:

source /etc/ambari-infra-solr/conf/infra-solr-env.sh
/usr/lib/ambari-infra-solr/server/scripts/cloud-scripts/zkcli.sh -z $ZK_HOST \
  -cmd getfile /configs/ranger_audits/solrconfig.xml solrconfig.xml

The same applies when uploading the edited config.
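For completeness, the upload side would look like this; a sketch using zkcli.sh's putfile command (verify the command name against your Solr/Ambari Infra version):

# Upload the edited solrconfig.xml back to the same ZooKeeper path
source /etc/ambari-infra-solr/conf/infra-solr-env.sh
/usr/lib/ambari-infra-solr/server/scripts/cloud-scripts/zkcli.sh -z $ZK_HOST \
  -cmd putfile /configs/ranger_audits/solrconfig.xml solrconfig.xml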
07-24-2020
08:09 AM
Maybe it's because of a different version (I'm using HDP and Hadoop 3), but this doesn't work as described here. First, if you try to set a variable in the "hiveConf:" namespace, you will get an error like this:

Error while processing statement: Cannot modify hiveConf:test at runtime

You have to use the "hivevar:" namespace instead, like:

:2> set hivevar:test=date_sub(current_date, 5);

But more important, Hive won't expand the variable value definition as shown here:

:2> set hivevar:test;
+-----------------------------------------+
| hivevar:test=date_sub(current_date, 5)  |
+-----------------------------------------+

So the INSERT will not be interpreted as you stated, but instead as:

INSERT INTO TABLE am_temp2 PARTITION (search_date=date_sub(current_date, 5))

and this for some reason is not supported in Hive and gives a compilation error:

FAILED: ParseException line 1:50 cannot recognize input near 'date_sub' '('

It would be very useful to insert data into static partitions using pre-calculated variable values like this, from functions or select queries, but I still haven't found a way to do this in HiveQL. As a reference, this seems to be (at least partially) related to this: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/hive-overview/content/hive_use_variables.html
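One workaround sketch (my own assumption, not from the original article): let the query compute the partition value with dynamic partitioning, so nothing has to be substituted into the PARTITION clause. The source table and column names here are hypothetical:

-- Compute the partition value in the SELECT instead of the PARTITION clause
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT INTO TABLE am_temp2 PARTITION (search_date)
SELECT col1, col2, date_sub(current_date, 5) AS search_date
FROM source_table;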
08-27-2019
10:45 AM
2 Kudos
This feature is not (well) documented anywhere, and the instructions for user impersonation in Zeppelin's manual only work for shell, or for Spark when not using Kerberos, so I will answer myself and show how I was able to make this work with Kerberos on HDP 3.1 (Zeppelin 0.8.0).

First, you DON'T have to change/uncomment ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER in zeppelin-env; leave it with the default value "true" (meaning that Zeppelin will use the --proxy-user option when impersonation is enabled in the spark2 interpreter).

Then, after you have kerberized the cluster, you will have to edit the Spark2 interpreter and change the following:

Set the interpreter to be instantiated "Per User | Isolated" and select the "User Impersonate" checkbox.

Remove the following properties from the interpreter configuration:

spark.yarn.keytab=/etc/security/keytabs/zeppelin.server.kerberos.keytab
spark.yarn.principal=zeppelin-mycluster@MYREALM.COM

Add the following properties to the interpreter (fix the domain and Kerberos realm names):

zeppelin.spark.keytab=/etc/security/keytabs/zeppelin.server.kerberos.keytab
zeppelin.spark.principal=zeppelin-mycluster@MYREALM.COM

Save and restart the interpreter. After that you should be able to run your %spark2.* jobs from Zeppelin as the logged-in user.
05-24-2019
02:55 PM
@jingyong zou you should not use AUTH_LDAP_APPEND_DOMAIN unless your users are authenticating with the fully qualified principal name, as in "username@mydomain.com", instead of simply "username". If you use uid or sAMAccountName as AUTH_LDAP_UID_FIELD (as is the case with OpenLDAP, IPA or AD), then this is not needed. Also check the values of the parameters AUTH_USER_REGISTRATION=True and AUTH_USER_REGISTRATION_ROLE, which should be set to a valid role in Superset (Public, Gamma, Alpha or Admin). Another not very well documented parameter, which may be important depending on your LDAP setup, is AUTH_LDAP_USERNAME_FORMAT; check this as well.

With the previous advice in mind, carefully check the following documentation articles and you may be able to find the appropriate combination of options to make LDAP work with Superset:

https://flask-appbuilder.readthedocs.io/en/latest/config.html
https://superset.incubator.apache.org/security.html
https://flask-appbuilder.readthedocs.io/en/latest/security.html#authentication-ldap

A tcpdump capture on your Superset server plus a Wireshark analysis may also be of great help to debug what your current Superset config is sending to the LDAP server. In my case this was the "final step" that made all the pieces fit.
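For reference, a minimal superset_config.py sketch combining these parameters (the server URL, search base and role are placeholder values for your environment; the keys themselves come from the flask-appbuilder docs linked above):

# Minimal LDAP authentication sketch for superset_config.py (placeholder values)
from flask_appbuilder.security.manager import AUTH_LDAP

AUTH_TYPE = AUTH_LDAP
AUTH_LDAP_SERVER = "ldaps://ipa-server.example.com:636"
AUTH_LDAP_SEARCH = "cn=users,cn=accounts,dc=example,dc=com"
AUTH_LDAP_UID_FIELD = "uid"  # or "sAMAccountName" for Active Directory
# AUTH_LDAP_APPEND_DOMAIN = "mydomain.com"  # only if users log in as user@mydomain.com

# Auto-register users on first login, with a default role
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = "Gamma"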