06-08-2018
07:17 AM
3 Kudos
Microservice applications often prefer OpenID single sign-on (SSO) through an existing identity provider (IDP), for example Keycloak. When these applications access Hadoop services with a JWT, Knox should be able to verify the token and proxy the request to internal services. This article describes how to configure the Knox JWT provider and customize it for your IDP's requirements.
There are two options.
KnoxSSO integrates directly with the IDP
KnoxSSO, the Knox Single Sign On service, integrates with the IDP, for example Keycloak. After the user is authenticated by the IDP, Knox signs a JWT for all microservices and Hadoop RESTful services. This solution is complex to configure, and Knox could become a performance bottleneck.
Knox JWTProvider accepts JWTs signed by another IDP
Microservices keep using the current IDP for single sign-on, and an extra Knox topology is configured to accept JWTs signed by that IDP.
The second solution has less impact on the existing architecture, and Knox is only used when accessing Hadoop services. This article focuses on the second solution and discusses how to configure Knox and what the limitations are.
Add a JWTProvider Knox topology
JWTProvider is explained at https://knox.apache.org/books/knox-0-12-0/user-guide.html#JWT+Provider
Ambari currently cannot edit extra Knox topologies, so the topology has to be added on the command line.
# ssh Knox node
$ sudo su - knox
# Add the new topology named jwt
$ vim jwt.xml
<topology>
<gateway>
<provider>
<role>identity-assertion</role>
<name>Default</name>
<enabled>true</enabled>
</provider>
<provider>
<role>authorization</role>
<name>XASecurePDPKnox</name>
<enabled>true</enabled>
</provider>
<provider>
<role>federation</role>
<name>JWTProvider</name>
<enabled>true</enabled>
<param>
<!-- knox.token.audiences is optional -->
<name>knox.token.audiences</name>
<value>tokenbased</value>
</param>
</provider>
</gateway>
<!-- Add Hadoop Services allowed jwt access, here use Yarn UI as an example -->
<service>
<role>YARNUI</role>
<url>http://{MASTER_NODE_1}:8088</url>
<url>http://{MASTER_NODE_2}:8088</url>
</service>
<service>
<role>RESOURCEMANAGER</role>
<url>http://{MASTER_NODE_1}:8088/ws</url>
<url>http://{MASTER_NODE_2}:8088/ws</url>
</service>
</topology>
Add the Knox Token Service to the knoxsso topology for testing
<service>
<role>KNOXTOKEN</role>
<param>
<name>knox.token.ttl</name>
<value>600000</value>
</param>
<!-- knox.token.audiences is optional; if set, it must match the value configured in JWTProvider -->
<param>
<name>knox.token.audiences</name>
<value>tokenbased</value>
</param>
</service>
Test the Knox jwt topology
Enable the Knox demo LDAP as the account source for knoxsso.
Get a knoxsso JWT token.
$ curl -ivku guest:guest-password https://{KNOX_NODE}:8443/gateway/knoxsso/knoxtoken/api/v1/token
{"access_token":"eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJndWVzdCIsImF1ZCI6InRva2VuYmFzZWQiLCJpc3MiOiJLTk9YU1NPIiwiZXhwIjoxNTI4MjgxODQ3fQ.i2Y3MMGbTW9M-wHStL7TuwjmL_rYmTGRjN_7QK0KB8EfLxKJzL2zRFEU8USxyFAchDJ-3vDdLaU8UPsTCVufo9UT5p8ywSlBgulFsOzIYuq-YVIqATpJZVZIJWCnoGHjXuTZHXeRreyjAs6cFsiiqsDwL8rCxnAmtBQeoX9fsAI","token_type":"Bearer ","expires_in":1528281847804}
Use this token to access the Yarn UI
$ curl -ivk -H "Authorization: Bearer eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJndWVzdCIsImF1ZCI6InRva2VuYmFzZWQiLCJpc3MiOiJLTk9YU1NPIiwiZXhwIjoxNTI4MjgxODQ3fQ.i2Y3MMGbTW9M-wHStL7TuwjmL_rYmTGRjN_7QK0KB8EfLxKJzL2zRFEU8USxyFAchDJ-3vDdLaU8UPsTCVufo9UT5p8ywSlBgulFsOzIYuq-YVIqATpJZVZIJWCnoGHjXuTZHXeRreyjAs6cFsiiqsDwL8rCxnAmtBQeoX9fsAI" https://hdp-e.field.hortonworks.com:8443/gateway/jwt/yarn/
The admin user should be able to view the UI HTML content, while guest would be blocked. The corresponding entries can also be found in Ranger Audit.
Decode the knoxsso JWT token
Decode the JWT at https://jwt.io/
HEADER
{
"alg": "RS256"
}
PAYLOAD
{
"sub": "guest",
"aud": "tokenbased",
"iss": "KNOXSSO",
"exp": 1528281847
}
Microservice Application JWT Verification
Synchronise certificates between Knox and IDP
The IDP signs the JWT with its private key, and Knox verifies the JWT with the corresponding public certificate. The key and the certificate must belong to the same pair for verification to succeed. The current Knox version does not accept a bare public key, so the public certificate has to be configured as in the following example. Knox only supports key-pair JWT signature algorithms: RS256, RS384, RS512, PS256, PS384, PS512. RS256 is the default and is not configurable in HDP 2.6.
Get the public certificate from the IDP.
$ cat knox-pub.pem
-----BEGIN CERTIFICATE-----
MIIENTCCAx2gAwIBAgIJAP4/owzmw1t4MA0GCSqGSIb3DQEBCwUAMIGlMQswCQYD
VQQGEwJVSzEABC0GA1UECAwGTG9uZG9uMQ8wDQYDVQQHDAZMb25kb24xFDASBgNV
BAoMC0hvbWUgT2ZmaWNlMUQwQgYDVQQDDDtwcmltYXJ5LXNlY3VyaXR5MC5ub25w
cm9kZi51ay5zZHAuZGlnaXRhbC5ob21lb2ZmaWNlLmdvdi51azEYMBYGA1UdEQwP
RE5TLjE9bG9jYWxob3N0MB4XDTE4MDUzMTE2MTA0MFoXDTI4MDUyODE2MTA0MFow
gaUxCzAJBgNVBAYTAlVLMQ8wDQYDVQQIDAZMb25kb24xDzANBgNVBAcMBkxvbmRv
bjEUMBIGA1UECgwLSG9tZSBPZmZpY2UxRDBCBgNVBAMMO3ByaW1hcnktc2VjdXJp
dHkwLm5vbnByb2RmLnVrLnNkcC5kaWdpdGFsLmhvbWVvZmZpY2UuZ292LnVrMRgw
FgYDVR0RDA9ETlMuMT1sb2NhbGhvc3QwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAw
ggEKAoIBAQDNb8qkTY0afOAUauLWbLZfF/1kys6Il/aRwKLaf0m+nTiaYowCGBKB
XLbzkXXDNSCOC2b6qPRIul2yF1fd8hAmnfUiVJY2hKbbfhdvbPp1oF9cC6QL+28x
ppj+utmyf2YNYpXGCMKHh7wZHNTTt28jqB/+Co8RC2xQgQ6FX6rCmUSB62/VuMHp
JBCb4Sf3Y9qgiZKyqtK9B9UiNIo4bJqeyF3Ql3qCFKmXIoni4k/3wcKvw9wNiWPy
MQweK0Q7G542K02Q3h7+caRNnZibFSTC/Qvs5jsRrxmkw4vH3npEJGASVr+y9JGq
dCjsi+ocrEdN9SMnGlQpWWtuB8LRlsitAgMBAAGjZjBkMBQGA1UdEQQNMAuCCWxv
Y2FsaG9zdDAdBgNVHQ4EFgQUPlLDl7837zNPS2Dvn2u7mtqDTZgwHwYDVR0jBBgw
FoAUPlLDl7837zNPS2Dvn2u7mtqDTZgwDAYDVR0TBAUwAwEB/zANBgkqhkiG9w0B
AQsFAAOCAQEAZa7ZJfc8MzwmYCCmVt9xcGQqFeAxC4saqKEFuS6PJAJqlZIK+D/A
y+3AT2qJ84Rs3nWFdnIsEmGzWwLbfve/xyFFAizM8d1tYF5DxXWp+7f2c1Ssah+S
t+ua80N9Q2EwdZekQrtnfba58gW5RdTORNGAVjXJjQvHSytwkn1YlRsVQjhvv4Cy
R6LBb5Xdd0R9DIqu2mpp0bGvX6hlx0yPJrsiYxd1DsHl+aFdTnQ3OkZVvxe2MfWi
yhTIWQoLfHrMwc2l1qjn2c3x4AIRsqLiLkMTfgrgUWC+T2IL1oO5jFBjbeV3ljOY
fOfkGmG6TCsdF38qkB/fl869jUGXIBAHjQ==
-----END CERTIFICATE-----
In the Knox JWT topology, configure the public certificate used for verification.
<provider>
<role>federation</role>
<name>JWTProvider</name>
<enabled>true</enabled>
<param>
<name>knox.token.verification.pem</name>
<value>
MIIENTCCAx2gAwIBAgIJAP4/owzmw1t4MA0GCSqGSIb3DQEBCwUAMIGlMQswCQYD
VQQGEwJVSzEABC0GA1UECAwGTG9uZG9uMQ8wDQYDVQQHDAZMb25kb24xFDASBgNV
BAoMC0hvbWUgT2ZmaWNlMUQwQgYDVQQDDDtwcmltYXJ5LXNlY3VyaXR5MC5ub25w
cm9kZi51ay5zZHAuZGlnaXRhbC5ob21lb2ZmaWNlLmdvdi51azEYMBYGA1UdEQwP
RE5TLjE9bG9jYWxob3N0MB4XDTE4MDUzMTE2MTA0MFoXDTI4MDUyODE2MTA0MFow
gaUxCzAJBgNVBAYTAlVLMQ8wDQYDVQQIDAZMb25kb24xDzANBgNVBAcMBkxvbmRv
bjEUMBIGA1UECgwLSG9tZSBPZmZpY2UxRDBCBgNVBAMMO3ByaW1hcnktc2VjdXJp
dHkwLm5vbnByb2RmLnVrLnNkcC5kaWdpdGFsLmhvbWVvZmZpY2UuZ292LnVrMRgw
FgYDVR0RDA9ETlMuMT1sb2NhbGhvc3QwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAw
ggEKAoIBAQDNb8qkTY0afOAUauLWbLZfF/1kys6Il/aRwKLaf0m+nTiaYowCGBKB
XLbzkXXDNSCOC2b6qPRIul2yF1fd8hAmnfUiVJY2hKbbfhdvbPp1oF9cC6QL+28x
ppj+utmyf2YNYpXGCMKHh7wZHNTTt28jqB/+Co8RC2xQgQ6FX6rCmUSB62/VuMHp
JBCb4Sf3Y9qgiZKyqtK9B9UiNIo4bJqeyF3Ql3qCFKmXIoni4k/3wcKvw9wNiWPy
MQweK0Q7G542K02Q3h7+caRNnZibFSTC/Qvs5jsRrxmkw4vH3npEJGASVr+y9JGq
dCjsi+ocrEdN9SMnGlQpWWtuB8LRlsitAgMBAAGjZjBkMBQGA1UdEQQNMAuCCWxv
Y2FsaG9zdDAdBgNVHQ4EFgQUPlLDl7837zNPS2Dvn2u7mtqDTZgwHwYDVR0jBBgw
FoAUPlLDl7837zNPS2Dvn2u7mtqDTZgwDAYDVR0TBAUwAwEB/zANBgkqhkiG9w0B
AQsFAAOCAQEAZa7ZJfc8MzwmYCCmVt9xcGQqFeAxC4saqKEFuS6PJAJqlZIK+D/A
y+3AT2qJ84Rs3nWFdnIsEmGzWwLbfve/xyFFAizM8d1tYF5DxXWp+7f2c1Ssah+S
t+ua80N9Q2EwdZekQrtnfba58gW5RdTORNGAVjXJjQvHSytwkn1YlRsVQjhvv4Cy
R6LBb5Xdd0R9DIqu2mpp0bGvX6hlx0yPJrsiYxd1DsHl+aFdTnQ3OkZVvxe2MfWi
yhTIWQoLfHrMwc2l1qjn2c3x4AIRsqLiLkMTfgrgUWC+T2IL1oO5jFBjbeV3ljOY
fOfkGmG6TCsdF38qkB/fl869jUGXIBAHjQ==
</value>
</param>
</provider>
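Note that the knox.token.verification.pem value above is the certificate body only, without the BEGIN CERTIFICATE and END CERTIFICATE lines. The following is a minimal sketch, not part of Knox, for extracting that body from a PEM file such as knox-pub.pem. The function name pem_body is hypothetical, and the sketch assumes the file contains a single standard X.509 certificate block.

```python
import sys


def pem_body(pem_text: str) -> str:
    """Return the base64 body of the first certificate in pem_text,
    without the BEGIN/END marker lines (hypothetical helper)."""
    body = []
    inside = False
    for line in pem_text.splitlines():
        line = line.strip()
        if line == "-----BEGIN CERTIFICATE-----":
            inside = True
        elif line == "-----END CERTIFICATE-----":
            break
        elif inside and line:
            body.append(line)
    if not body:
        raise ValueError("no certificate block found")
    return "\n".join(body)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (assumed file name): python pem_body.py knox-pub.pem
    with open(sys.argv[1]) as handle:
        print(pem_body(handle.read()))
```

The printed text can then be pasted between the value tags of the knox.token.verification.pem parameter.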
JWT Issuer
By default Knox only accepts JWTs with "iss": "KNOXSSO". Check whether your IDP, for example Keycloak, can customize the issuer. Alternatively, override the expected issuer in the Knox configuration by adding the following parameters in Ambari under Knox Custom gateway-site:
jwt.expected.issuer=CUSTOM_SSO
jwt.expected.sigalg=RS256
However, this feature is only available in HDP 3.x. Before upgrading to HDP 3, the JWT issuer has to be hard-coded as "KNOXSSO".
User Account ID
Knox can currently only read the user account from the JWT claim "sub", for example "sub": "guest". If the IDP, for example Ping Federation, uses "client_id" instead, this is a problem. The value should also be consistent with the account ID synchronized into Ranger; normally this is the uid in OpenLDAP.
JWT audiences
This is optional. If used, the audience in the JWT must match what is configured in knox.token.audiences.
Known Issues
Knox JSON parser issue
Before HDP 2.6.5, the Knox JSON parser does not support complex raw JSON with multiple keys. Knox treats it as invalid JSON and throws a NullPointerException.
Workaround for the JSON issue
For HDP 2.6.3, manually replace the dependent JSON jar files. SSH into every Knox node.
$ cd /usr/hdp/2.6.3.0-235/knox/dep
$ sudo wget http://repo1.maven.org/maven2/com/nimbusds/nimbus-jose-jwt/4.41.2/nimbus-jose-jwt-4.41.2.jar
$ sudo wget http://repo1.maven.org/maven2/com/jayway/jsonpath/json-path/2.4.0/json-path-2.4.0.jar
$ sudo wget http://repo1.maven.org/maven2/net/minidev/json-smart/2.3/json-smart-2.3.jar
$ sudo wget http://repo1.maven.org/maven2/net/minidev/asm/1.0.2/asm-1.0.2.jar
$ sudo mv json-path-0.9.1.jar json-path-0.9.1.jar.bak
$ sudo mv nimbus-jose-jwt-4.11.jar nimbus-jose-jwt-4.11.jar.bak
$ sudo mv json-smart-1.2.jar json-smart-1.2.jar.bak
Restart Knox from Ambari.
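The claim requirements from the JWT Issuer, User Account ID, and JWT audiences sections can be checked on the client side before a token is sent to Knox. The sketch below is only an illustration under stated assumptions: decode_payload and check_claims are hypothetical helper names, the signature is not verified, and the checks mirror the rules described in this article rather than Knox's actual implementation.

```python
import base64
import json
import time


def decode_payload(token: str) -> dict:
    """Decode the payload (second segment) of a JWT without verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    segment = parts[1]
    segment += "=" * (-len(segment) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(segment))


def check_claims(payload, expected_issuer="KNOXSSO",
                 expected_audience=None, now=None):
    """Return a list of problems; an empty list means the claims look fine."""
    problems = []
    if payload.get("iss") != expected_issuer:
        problems.append("iss is %r, expected %r"
                        % (payload.get("iss"), expected_issuer))
    if not payload.get("sub"):
        problems.append("sub claim is missing")
    if expected_audience is not None:
        aud = payload.get("aud")
        audiences = aud if isinstance(aud, list) else [aud]
        if expected_audience not in audiences:
            problems.append("aud %r does not include %r"
                            % (aud, expected_audience))
    exp = payload.get("exp")
    current = time.time() if now is None else now
    if exp is not None and exp <= current:
        problems.append("token is expired")
    return problems
```

For example, check_claims(decode_payload(token), expected_audience="tokenbased") returns an empty list for a fresh token issued by the knoxsso topology configured earlier, and reports an issuer mismatch for a token whose "iss" is not "KNOXSSO".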
08-03-2017
03:55 PM
Short Description
How to set up popular SQL development tools, such as DbVisualizer, to access HDP Hive in a secured network zone via Knox.
Article
Most customers set up their Hadoop/Hive cluster in a secured network zone, with no direct network connection between the office network and the Hadoop network. Knox is therefore typically set up to proxy Hive connections from the office network. This article walks through setting up DbVisualizer to connect to the Knox Hive URL, step by step.
Download Hive Driver for HDP2.6.1
Download hive-jdbc.jar from https://github.com/timveil/hive-jdbc-uber-jar/releases
The latest for HDP2.6.1 is https://github.com/timveil/hive-jdbc-uber-jar/releases/download/v1.6-2.6.1/hive-jdbc-uber-2.6.1.0-129.jar
Create a new Database Driver
Use the hive-jdbc.jar.
URL Format: jdbc:hive2://<server>:<port10000>/<database>
This URL format is for standard hive2 JDBC; it will be changed later in the actual connection for Knox.
Create a new Connection with this new Driver
Use Database URL, and edit the Database URL as
jdbc:hive2://<KNOX_NODE_FQDN>:<port>/;ssl=true;transportMode=http;httpPath=<KNOX_HIVE_HTTPPATH>
The default Knox Hive HTTP path is gateway/default/hive, but please double check with your system admin.
Add the Knox node certificate or CA certificate into the DbVisualizer JVM truststore; otherwise the DB connection fails with an SSL exception.
Check the Java Home of DbVisualizer. Then add the Knox node certificate or CA certificate into the truststore:
$ sudo keytool -import -alias knox -file wb-e.crt.pem -keystore /Library/InternetPlug-Ins/JavaAppletPlugin.plugin/Contents/Home/lib/security/cacerts
Enter keystore password:changeit
Enjoy DbVisualizer
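To avoid typos when filling in the Database URL, the Knox form of the URL can be assembled from its parts. This is a small illustrative sketch: knox_hive_jdbc_url is a hypothetical helper, the default port 8443 is an assumption (use the port your Knox gateway actually listens on), and the default HTTP path is the one mentioned above.

```python
def knox_hive_jdbc_url(host: str, port: int = 8443,
                       http_path: str = "gateway/default/hive") -> str:
    """Build a Hive JDBC URL that goes through Knox over HTTPS.

    host is the Knox node FQDN; port 8443 is an assumed default.
    """
    return (
        f"jdbc:hive2://{host}:{port}/"
        f";ssl=true;transportMode=http;httpPath={http_path.strip('/')}"
    )


if __name__ == "__main__":
    # knox.example.com is a placeholder host name
    print(knox_hive_jdbc_url("knox.example.com"))
```

The resulting string goes into the Database URL field of the DbVisualizer connection.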
07-31-2019
08:16 PM
@Wendell Bu thanks for your article. At the end you mentioned that you will discuss the detailed configuration in other articles. Can you please share the other articles?
03-11-2017
12:14 AM
When a NiFi flow runs in a mission-critical production environment, customers are concerned about how to change the flow logic without impacting the business. One solution that was implemented in a customer's production environment follows these steps:
1. Add an extra processor that carries no logic, for example "UpdateAttribute", between the ListenPort and the main ProcessGroup.
2. Add the updated NewProcessGroup to the canvas.
3. Stop the "UpdateAttribute" processor and make sure the RunningProcessGroup consumes all flowfiles in the previous queue. Client components can still send messages to the Listen Port.
4. After all queued flowfiles have been processed by the RunningProcessGroup, move the destination end of the queue connection to the NewProcessGroup.
5. Start the "UpdateAttribute" processor and the NewProcessGroup.
6. Finally, remove the old RunningProcessGroup.
With these steps, the production flow change is transparent to client components, and no data is lost.
02-21-2018
12:37 PM
@wbu Thank you for the post, but could you please help me understand how you created the HORTONWORKS.COM realm and the "hadoopadmin" principal on the Mac, for which you generated a ticket using the principal's password? I am using "kadmin -l" to init a new realm "EXAMPLE.COM" in line with the cluster realm, and also the username "hadoopadmin", but when I try adding a realm using "init -r <realm name>", I get:
kadmin: create_random_entry(krbtgt/EXAMPLE.COM@EXAMPLE.COM): randkey failed: Principal does not exist
init -r <realm name>
Or if I try adding a principal "add -r hadoopadmin@EXAMPLE.COM", I get:
kadmin: adding hadoopadmin@EXAMPLE.COM: Principal does not exist
vi /Library/Preferences/edu.mit.Kerberos OR vi /etc/krb5.conf
.example.com = "EXAMPLE.COM"
example.com = "EXAMPLE.COM"
[libdefaults]
default_realm = "EXAMPLE.COM"
dns_fallback = "yes"
noaddresses = "TRUE"
[realms]
EXAMPLE.COM = {
admin_server = "ad.example.com"
default_domain = "example.com"
kdc = "ad.example.com"
}
As far as I understand, the following steps must be performed on the Mac before doing the steps given above:
1. Create /etc/krb5.conf (vi /etc/krb5.conf)
2. Create a new realm "EXAMPLE.COM" (same as the Hadoop cluster Kerberos realm)
3. Create a new user principal "hadoopadmin" (same as the Hadoop cluster Kerberos principal used to access the services)
4. Only then can I create a ticket (kinit) with the same password used in step 3 while creating the user principal
Regards,
05-06-2019
10:05 PM
Hello @Wendell Bu, I am trying the same thing: sending events from NiFi to Splunk using the PutSplunk processor. I got stuck at the start and am not able to see events in Splunk. My AttributesToJSON processor (in data provenance I can see that the raw logs are converted to JSON format) is connected to the PutSplunk processor, which has hostname, port and message delimiter configured as in the screenshot below. On the Splunk side, the input port is defined. I am not sure if I am missing something. Can you please let me know if there are any other steps I need to follow? Appreciate your help in advance!
03-10-2017
10:52 PM
Found the problem. Change
ranger.usersync.ldap.user.searchfilter=(sAMAccountName={0})
to
ranger.usersync.ldap.user.searchfilter=(sAMAccountName=*)
and the problem is solved.
12-02-2016
11:52 PM
2 Kudos
Ambari doesn't actually ship with any bits for the HDP stack - we use repositories which you can specify in the installation wizard (or via a blueprint if you're doing a blueprint install). You just have to refer to the new repos for 2.5.3:
http://s3.amazonaws.com/dev.hortonworks.com/HDP/hdp_urlinfo.json
"2.5.3.0": {
"centos6": "http://s3.amazonaws.com/dev.hortonworks.com/HDP/centos6/2.x/BUILDS/2.5.3.0-38/HDP-2.5.3.0-38.xml",
"centos7": "http://s3.amazonaws.com/dev.hortonworks.com/HDP/centos7/2.x/BUILDS/2.5.3.0-38/HDP-2.5.3.0-38.xml",
"debian7": "http://s3.amazonaws.com/dev.hortonworks.com/HDP/debian7/2.x/BUILDS/2.5.3.0-38/HDP-2.5.3.0-38.xml",
"sles12": "http://s3.amazonaws.com/dev.hortonworks.com/HDP/sles12/2.x/BUILDS/2.5.3.0-38/HDP-2.5.3.0-38.xml",
"suse11": "http://s3.amazonaws.com/dev.hortonworks.com/HDP/suse11sp3/2.x/BUILDS/2.5.3.0-38/HDP-2.5.3.0-38.xml",
"ubuntu12": "http://s3.amazonaws.com/dev.hortonworks.com/HDP/ubuntu12/2.x/BUILDS/2.5.3.0-38/HDP-2.5.3.0-38.xml",
"ubuntu14": "http://s3.amazonaws.com/dev.hortonworks.com/HDP/ubuntu14/2.x/BUILDS/2.5.3.0-38/HDP-2.5.3.0-38.xml",
"ubuntu16": "http://s3.amazonaws.com/dev.hortonworks.com/HDP/ubuntu16/2.x/BUILDS/2.5.3.0-38/HDP-2.5.3.0-38.xml"
}
}
},
04-17-2017
10:22 AM
After stopping the DataNode, the HBase RegionServer still holds file descriptors on the mounts, so it also needs to be stopped before you can unmount the volumes.
11-22-2016
07:54 PM
1 Kudo
Background
When we used a NiFi flow to load Adobe ClickStream tsv files into Hive, we found that around 3% of rows were in the wrong format or missing.
Source Data Quality
$ awk -F "\t" '{print NF}' 01-weblive_20161014-150000.tsv | sort | uniq -c | sort
1 154
1 159
1 162
1 164
1 167
1 198
1 201
1 467
2 446
2 449
2 569
6 13
10 3
13 146
13 185
15 151
16 54
18 433
21 432
22 238
23 102
26 2
34 138
179 1
319412 670
After cleaning the tsv
$ awk -F "\t" 'NF == 670' 01-weblive_20161014-150000.tsv >> cleaned.tsv
$ awk -F "\t" '{print NF}' cleaned.tsv | sort | uniq -c | sort
319412 670
A few percent of rows were still missing.
Root Cause and Solution
We are using ConvertCSVToAvro and ConvertAvroToORC. The clickstream tsv files contain " characters, and the ConvertCSVToAvro processor uses " as the default value of its "CSV quote Character" configuration property. As a result, many tab-separated fields end up in the same record. We got good output by changing this property to another character that is not used anywhere in the input files. We used ¥.
So when using CSV-related processors, double check that the contents don't contain the quote character.
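The effect described above can be reproduced outside NiFi. The sketch below uses Python's csv module as a stand-in, which is an assumption for illustration only: it is not the parser used by ConvertCSVToAvro, but it shows the same class of problem. The sample data and the parse_tsv helper are made up for this example.

```python
import csv
import io


def parse_tsv(text, honor_quotes=True):
    """Parse tab-separated text and return the list of rows.

    honor_quotes=True treats " as a quote character (the csv default);
    honor_quotes=False treats " as ordinary data.
    """
    quoting = csv.QUOTE_MINIMAL if honor_quotes else csv.QUOTE_NONE
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter="\t", quoting=quoting)
    return list(reader)


# Two input lines of four fields each; a stray " opens on the first
# line and closes on the second.
SAMPLE = '1\t"foo\tbar\tx\n2\tbaz"\tqux\ty\n'

if __name__ == "__main__":
    for honor in (True, False):
        rows = parse_tsv(SAMPLE, honor_quotes=honor)
        print("honor_quotes=%s: %d row(s), field counts %s"
              % (honor, len(rows), [len(r) for r in rows]))
```

With quote handling on, everything between the two " characters, including the tabs and the line break, is swallowed into a single field, so two input lines collapse into one record. With quote handling off, both lines come through as separate four-field rows, which matches the fix of choosing a quote character that never occurs in the data.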