Member since: 05-09-2018 | Posts: 44 | Kudos Received: 3 | Solutions: 0
06-06-2021
05:24 PM
Hi All, We have Hue (version 4.2.0) in our prod environment and it is coupled with our LDAP. We currently onboard new users manually by going to "Manage Users" -> "Add/Sync Manage Users", where we add the LDAP ID and the user gets added. Once the user is added, we search for the newly added LDAP user again, click on it, and go to "Profiles and Group". Here, we select "temporary-access" so that the user is able to access the UI; without the temporary-access selection, the user will not be able to see the UI. We want to automate this by writing a Python program, and we are looking for the right REST API to add users to Hue and give them "temp-access" in profiles and group. The following link wasn't helpful - http://cloudera.github.io/hue/latest/developer/api/ Please assist us. I would highly appreciate any form of assistance or help. Regards, Shesh Kumar
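Since Hue 4.2 doesn't document a user-management REST API, one pragmatic workaround is to script the same form POSTs the UI makes. Below is a minimal sketch using Python requests; the endpoint paths, form fields, and credentials are assumptions (capture the real ones from your browser's dev-tools Network tab while performing the manual flow once), and the CSRF handling follows Hue's Django convention.

# Hedged sketch: replay the Hue UI's form POSTs to add/sync an LDAP user.
# The URL paths and form fields below are ASSUMPTIONS -- verify them in your
# browser's dev-tools Network tab while doing the flow manually in Hue 4.2.
import requests

HUE = "https://hue.example.com:8888"   # placeholder Hue base URL
session = requests.Session()
session.verify = False                 # only if Hue uses a self-signed cert

# 1. Log in. Hue is Django-based, so fetch the CSRF cookie first.
session.get(HUE + "/accounts/login/")
csrf = session.cookies.get("csrftoken")
session.post(HUE + "/accounts/login/",
             data={"username": "admin", "password": "secret",
                   "csrfmiddlewaretoken": csrf},
             headers={"Referer": HUE + "/accounts/login/"})

# 2. Add/sync the LDAP user (path and field names are assumptions).
csrf = session.cookies.get("csrftoken")
session.post(HUE + "/useradmin/users/add_ldap_users",
             data={"username_pattern": "new.ldap.user",
                   "csrfmiddlewaretoken": csrf},
             headers={"Referer": HUE})

# 3. Assign the "temporary-access" group the same way: replay the edit-user
#    form POST captured from the "Profiles and Group" page in dev tools.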
07-02-2019
03:59 PM
Hi Michael Bronson, To answer your question: JBOD is not recommended! The Kafka broker will crash and stop if any disk is corrupted or any I/O error is encountered, and re-assignment of data (replicas) between disks within a single broker won't work if you want to balance the data. Thanks, Shesh
05-27-2019
05:40 PM
@Geoffrey Shelton Okot: Thank you for the detailed explanation.
05-27-2019
05:24 PM
@Geoffrey Shelton Okot I do not see that option 😞
05-21-2019
08:19 PM
Hi, As per my knowledge (please correct me if I am wrong), the DataNodes send their block reports to both the Active and the Standby NameNode. The job of the Active NN is to write to the JournalNodes, and the job of the Standby NameNode is to read from the JournalNodes. Now, why does the Standby NameNode need to read from the JournalNodes when the DataNodes (slaves) are already sending block reports to it?
Labels: Apache Hadoop
05-08-2019
05:38 AM
My cluster was hung: I was unable to add hosts or perform any basic activities in Ambari, like restarting a service, and was constantly seeing this WARN snippet in the Ambari Server logs:

Unable to lookup the cluster by ID; assuming that there is no cluster and therefore no configs for this execution command: Cluster not found, clusterName=clusterID=-1

Here's a small hack to resolve the issue:

1. Check the cluster id in your backend Ambari DB (mine is MySQL):
select * from clusterstate;
2. The value found in step 1 should also be present in the stage table's "cluster_id" column:
select stage_id, request_id, cluster_id from stage;
3. If there are rows with the value -1, update them to the correct value found in step 1. Example:
UPDATE stage SET cluster_id='2' WHERE request_id IN (383,384,388,389);
4. Restart Ambari Server:
ambari-server restart
5. After this, verify by restarting Grafana or any small service that does not impact the Hadoop services. If it proceeds, the cluster is now stable and you will be able to add nodes.
6. If the issue persists, perform the following in your backend Ambari DB:
SELECT * FROM host_role_command WHERE status='PENDING';
7. If you get any output, update the status to "ABORTED":
UPDATE host_role_command SET status='ABORTED' WHERE status='PENDING';
8. Restart Ambari Server:
ambari-server restart

Validate the health of Ambari by restarting Grafana or any small service that does not impact the Hadoop services. If everything is good, proceed by adding the nodes.
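If you hit this often, the checks in steps 1-3 and 6-7 can be scripted. A minimal sketch, assuming a MySQL-backed Ambari DB reachable with the pymysql driver (host, credentials, and database name are placeholders):

# Hedged sketch: automate the Ambari DB fix-ups described above.
# Assumes a MySQL backend and the pymysql driver; credentials are placeholders.
import pymysql

conn = pymysql.connect(host="ambari-db.example.com", user="ambari",
                       password="secret", database="ambari")
with conn.cursor() as cur:
    # Step 1: read the real cluster id.
    cur.execute("SELECT cluster_id FROM clusterstate")
    cluster_id = cur.fetchone()[0]

    # Steps 2-3: fix any stage rows still pointing at cluster_id = -1.
    cur.execute("UPDATE stage SET cluster_id=%s WHERE cluster_id=-1", (cluster_id,))

    # Steps 6-7: abort stuck PENDING commands.
    cur.execute("UPDATE host_role_command SET status='ABORTED' WHERE status='PENDING'")
conn.commit()
conn.close()
# Then: ambari-server restart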
03-02-2019
12:48 PM
How do you find and view the actual 3 bad rows in the table? You can read the bad rows in the mapper logs, where they are marked as ERROR. So when you open the mapper logs, do a page search for "ERROR" and voila! You should be able to read the bad rows.
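If the job's logs are aggregated, you can also pull the ERROR lines out without clicking through the UI by fetching them with the yarn logs CLI and filtering. A small sketch (the application ID is a placeholder; find yours with yarn application -list):

# Fetch aggregated logs for the VerifyReplication job and print only the
# ERROR lines that mark the bad rows. The application ID is a placeholder.
import subprocess

app_id = "application_1234567890123_0042"   # placeholder
logs = subprocess.run(["yarn", "logs", "-applicationId", app_id],
                      capture_output=True, text=True).stdout
for line in logs.splitlines():
    if "ERROR" in line:
        print(line)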
03-02-2019
12:44 PM
Hi, No need to do any deletion. Just follow the steps here to resolve the issue without any data loss: https://community.hortonworks.com/questions/242343/hbase-table-is-stuck-in-disabling-state-neither-en.html Thanks, Shesh
03-02-2019
12:35 PM
This is just a knowledge-sharing article. I faced this issue in production and it took me a day to resolve. The workaround I'm sharing will get your table back online in the "Enabled" state without deleting the table's ZooKeeper znode or, for that matter, any data. Here are the steps to resolve it:

1. Run a "get" command against hbase:meta for the affected table:
hbase(main):003:0> get 'hbase:meta', '<AFFECTED_TABLE_NAME>', 'table:state'
COLUMN CELL
table:state timestamp=1551456805377, value=\x08\x02
2. Notice the "value" above. It is \x08\x02, which is wrong; the value should be either \x08\x00 (Enabled) or \x08\x01 (Disabled).
3. Edit the value manually:
hbase(main):003:0> put 'hbase:meta','<AFFECTED_TABLE_NAME>','table:state',"\b\0"
Click here for more information on "Control Characters"
4. Verify the same:
hbase(main):003:0> get 'hbase:meta', '<AFFECTED_TABLE_NAME>', 'table:state'
The "value" should now be \x08\x00.

After this, run disable <table_name> and then enable <table_name> in the hbase shell, just for the love of a sanity check, and you are done with the issue.
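Why "\b\0" works in step 3: in the string literal, \b is the backspace character (0x08) and \0 is NUL (0x00), so the put writes exactly the \x08\x00 bytes that encode the Enabled state. A quick sanity check, shown here in Python since the escapes behave the same way:

# "\b\0" and "\x08\x00" are the same two bytes: backspace (0x08), then NUL (0x00).
assert b"\b\0" == b"\x08\x00"
print(b"\b\0".hex())   # prints: 0800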
01-30-2019
09:14 AM
Hi @Josh Elser, The VerifyReplication tool will give me the number of bad rows, but how do I view the actual bad rows in the 2 clusters? Thanks, Shesh Kumar
01-29-2019
05:16 PM
Hi, I've set up HBase cross-cluster replication between 2 clusters. I then ran a stress test with the following command, which does random inserts:

sudo -su hbase hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 1

The row counts on both clusters' tables matched, i.e. 906856. However, we have to verify that the replication is consistent on both clusters. To do that, I followed Hortonworks' document and ran the command. The output is shown below:

ROWS_SCANNED=906856
RPC_CALLS=9070
RPC_RETRIES=0
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier$Counters
BADROWS=3
CONTENT_DIFFERENT_ROWS=3
GOODROWS=906853

The number of rows scanned, 906856, is correct; it is the total count of the table. The same result came back when the job was run on the other cluster as well. But there are 3 bad rows, so with this result I can say that the problem is with quality, not quantity. The main question now is: how do I find and view the actual 3 bad rows in the table? Regards, Shesh Kumar
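For reference, a typical invocation of the tool named in the counters above takes a peer id and a table name, and accepts a time window; bounding the scan with --starttime/--stoptime helps avoid in-flight replication showing up as false BADROWS. A hedged wrapper sketch (peer id, table name, and timestamps are placeholders):

# Run VerifyReplication for a peer/table and print its counters.
# Peer id "1", table name, and the millisecond timestamps are placeholders.
import subprocess

cmd = ["hbase", "org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication",
       "--starttime=1548720000000", "--stoptime=1548723600000",
       "1", "TestTable"]
result = subprocess.run(cmd, capture_output=True, text=True)
for line in (result.stdout + result.stderr).splitlines():
    if any(k in line for k in ("BADROWS", "GOODROWS", "CONTENT_DIFFERENT_ROWS")):
        print(line)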
12-21-2018
09:32 AM
Hi, I want to know if we can enable replication in HBase such that the HBase data in Cluster A is replicated to Cluster B, and the HBase data of Cluster B is replicated to Cluster A; in short, "cross-cluster replication". I want this feature for all the existing tables in HBase (on both Cluster A and Cluster B), and it should be active replication: for instance, if any column family is added or populated, or a new table is created in HBase, the same should immediately be replicated to the other cluster. Security: none! The 2 clusters are plain, vanilla clusters. Can this be configured in Ambari? If yes, kindly guide me; your valuable inputs will be highly appreciated. Thanks, Shesh Kumar
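For what it's worth, a hedged sketch of how this is usually wired up from the HBase shell (run the equivalent on BOTH clusters, each pointing at the other, to get master-master replication). The peer id, ZooKeeper quorum, znode path, and table name are placeholders, and on shells without enable_table_replication you would instead alter each table's column families with REPLICATION_SCOPE => 1:

# Drive the hbase shell from Python to add a peer and enable replication
# for a table. Placeholders: peer id '1', ZK quorum, znode path, table name.
import subprocess

commands = """
add_peer '1', 'zk1.other-cluster.example.com:2181:/hbase-unsecure'
enable_table_replication 'my_table'
list_peers
exit
"""
subprocess.run(["hbase", "shell"], input=commands, text=True)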
12-18-2018
12:53 PM
I'm getting the following errors after following "Option #2". My Ranger is not TLS/SSL enabled. Please help.

18 Dec 2018 12:17:23 INFO LdapPolicyMgrUserGroupBuilder [UnixUserSyncThread] - Using principal = rangerusersync/stg-agent001-stg-cloud009.XXXXX.nm2@XXXXXX.COM and keytab = /etc/security/keytabs/rangerusersync.service.keytab
18 Dec 2018 12:17:24 ERROR LdapPolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to build Group List : com.sun.jersey.api.client.UniformInterfaceException: POST http://stg-agent001-stg-cloud009.xxxxx.nm2:6080/service/xusers/groups/ returned a response status of 404 Not Found

and

18 Dec 2018 12:17:24 INFO LdapDeltaUserGroupBuilder [UnixUserSyncThread] - LdapDeltaUserGroupBuilder.getUsers() completed with user count: 0
18 Dec 2018 12:17:24 ERROR LdapPolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to add User : com.sun.jersey.api.client.UniformInterfaceException: POST http://stg-agent001-stg-cloud009.xxxxx.nm2:6080/service/xusers/ugsync/auditinfo/ returned a response status of 404 Not Found
12-12-2018
03:22 PM
Hi @Shahbaj Sayyad, We disabled SSL for Ranger (I edited the original description), but we still see the error. Even in our stage cluster (Kerberized), which has never had SSL enabled, we are not able to sync the Unix users. I have attached the logs from the stage cluster: usersync.txt. Please check and kindly guide me. Thanks, Shesh
12-12-2018
01:01 PM
Hi, I'm seeing these errors in the RegionServer logs. Can someone help me resolve the issue?

2018-12-12 18:28:54,407 INFO [B.defaultRpcServer.handler=12,queue=0,port=16020] shortcircuit.ShortCircuitCache: ShortCircuitCache(0x3d630f8a): could not load 1093156401_BP-853897652-10.84.192.246-1489729943941 due to InvalidToken exception.
org.apache.hadoop.security.token.SecretManager$InvalidToken: access control error while attempting to set up short-circuit access to /apps/hbase/data/data/default/cyclops-edges/865c0549943a300f09f5dfcd63fbaa67/s/1979e37ca9294d24bd16cd580fb663ab
2018-12-12 18:09:57,326 INFO [sync.2] wal.FSHLog: Slow sync cost: 105 ms, current pipeline: [DatanodeInfoWithStorage[10.84.197.254:50010,DS-159866b7-1b97-469f-9a22-4b03b3dbbe56,DISK], DatanodeInfoWithStorage[10.84.192.255:50010,DS-26463076-ed2d-4883-9013-2870ce87f281,DISK], DatanodeInfoWithStorage[10.84.192.76:50010,DS-b1b894fc-d1e2-4ccf-8cc7-7ccd649cd507,DISK]]
2018-12-12 18:09:57,391 WARN [B.defaultRpcServer.handler=21,queue=0,port=16020] hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.io.IOException: Got error, status message opReadBlock BP-853897652-10.84.192.246-1489729943941:blk_1093248459_19508530 received exception org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for BP-853897652-10.84.192.246-1489729943941:blk_1093248459_19508530, for OP_READ_BLOCK, self=/10.84.197.254:13978, remote=/10.84.192.246:50010, for file /apps/hbase/data/data/default/cyclops-audits-dedup/e44738e4889089bcecf58b66878a7501/l/457245b1ea724f7f80bab245ea6c0604, for pool BP-853897652-10.84.192.246-1489729943941 block 1093248459_19508530
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:456)
at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:424)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:816)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:695)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:355)
at org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1181)
at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:1118)
at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1478)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1441)
at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:92)
at org.apache.hadoop.hbase.io.hfile.HFileBlock.positionalReadWithExtra(HFileBlock.java:722)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1420)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1625)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1504)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:441)
at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:269)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:642)
at java.lang.Thread.run(Thread.java:745)
2018-12-12 18:09:57,391 WARN [B.defaultRpcServer.handler=21,queue=0,port=16020] hdfs.DFSClient: Connection failure: Failed to connect to /10.84.192.246:50010 for file /apps/hbase/data/data/default/cyclops-audits-dedup/e44738e4889089bcecf58b66878a7501/l/457245b1ea724f7f80bab245ea6c0604 for block BP-853897652-10.84.192.246-1489729943941:blk_1093248459_19508530:java.io.IOException: Got error, status message opReadBlock BP-853897652-10.84.192.246-1489729943941:blk_1093248459_19508530 received exception org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for BP-853897652-10.84.192.246-1489729943941:blk_1093248459_19508530, for OP_READ_BLOCK, self=/10.84.197.254:13978, remote=/10.84.192.246:50010, for file /apps/hbase/data/data/default/cyclops-audits-dedup/e44738e4889089bcecf58b66878a7501/l/457245b1ea724f7f80bab245ea6c0604, for pool BP-853897652-10.84.192.246-1489729943941 block 1093248459_19508530
java.io.IOException: Got error, status message opReadBlock BP-853897652-10.84.192.246-1489729943941:blk_1093248459_19508530 received exception org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for BP-853897652-10.84.192.246-1489729943941:blk_1093248459_19508530, for OP_READ_BLOCK, self=/10.84.197.254:13978, remote=/10.84.192.246:50010, for file /apps/hbase/data/data/default/cyclops-audits-dedup/e44738e4889089bcecf58b66878a7501/l/457245b1ea724f7f80bab245ea6c0604, for pool BP-853897652-10.84.192.246-1489729943941 block 1093248459_19508530
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)

The issue that we are facing is very slow HBase compaction.
HDP version: HDP-2.4.3.0-227
HBase: 1.1.2.2.4

Thanks, Shesh
12-12-2018
09:37 AM
Hi @aquilodran, Thanks for the suggestion. I've removed SSL for Ranger now, and it's still not working. Even in our stage cluster (Kerberized), which has never had SSL enabled, we are not able to sync the Unix users. The logs from the stage cluster are attached; please check and share your thoughts. Here's a small excerpt from the usersync log:

12 Dec 2018 09:32:55 INFO UnixAuthenticationService [main] - Starting User Sync Service!
12 Dec 2018 09:32:55 WARN UnixUserGroupBuilder [UnixUserSyncThread] - DEPRECATED: Unix backend is configured to use /etc/passwd and /etc/group files directly instead of standard system mechanisms.
12 Dec 2018 09:32:55 INFO UserGroupSync [UnixUserSyncThread] - initializing sink: org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder
12 Dec 2018 09:32:56 INFO PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Using principal = rangerusersync/stg-agent001-stg-cloud009.XXXXXX.nm2@XXXXXX.COM and keytab = /etc/security/keytabs/rangerusersync.service.keytab
12 Dec 2018 09:32:57 INFO PolicyMgrUserGroupBuilder [UnixUserSyncThread] - valid cookie saved
12 Dec 2018 09:32:58 WARN UnixUserGroupBuilder [UnixUserSyncThread] - DEPRECATED: Unix backend is configured to use /etc/passwd and /etc/group files directly instead of standard system mechanisms.
12 Dec 2018 09:32:58 INFO UserGroupSync [UnixUserSyncThread] - initializing source: org.apache.ranger.unixusersync.process.UnixUserGroupBuilder
12 Dec 2018 09:32:58 INFO UserGroupSync [UnixUserSyncThread] - Begin: initial load of user/group from source==>sink
12 Dec 2018 09:32:58 ERROR PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to add portal user
12 Dec 2018 09:32:58 ERROR UnixUserGroupBuilder [UnixUserSyncThread] - sink.addOrUpdateUser failed with exception: Failed to add portal user, for user: mahendra.aricent, groups: [mahendra.aricent, dev]
12 Dec 2018 09:32:58 ERROR PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to add portal user
12 Dec 2018 09:32:58 ERROR UnixUserGroupBuilder [UnixUserSyncThread] - sink.addOrUpdateUser failed with exception: Failed to add portal user, for user: jatin, groups: [jatin, dev]
12 Dec 2018 09:32:58 ERROR PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to add portal user
12 Dec 2018 09:32:58 ERROR UnixUserGroupBuilder [UnixUserSyncThread] - sink.addOrUpdateUser failed with exception: Failed to add portal user, for user: ankit, groups: [ankit, dev]
12 Dec 2018 09:32:58 ERROR PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to add portal user
12 Dec 2018 09:32:58 ERROR UnixUserGroupBuilder [UnixUserSyncThread] - sink.addOrUpdateUser failed with exception: Failed to add portal user, for user: jithin.jose, groups: [jithin.jose, dev]
12 Dec 2018 09:32:58 ERROR PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to add portal user

Full log: usersync.txt

Thanks, Shesh Kumar
12-07-2018
10:00 AM
Hi @Mykola Mykhalov, Thanks for the workaround. It worked for me and I was able to proceed. Do we have an option to change the DB to MySQL? Is it compatible with Kerberos?
12-06-2018
09:48 AM
Hi @Mykola Mykhalov, I'm getting an error when installing Apache Airflow with Ambari:

ambari-server install-mpack --mpack=airflow-service-mpack.tar.gz

Using python /usr/bin/python
Installing management pack
ERROR: Download airflow-service-mpack.tar.gz with python lib [urllib2] failed with error: (<type 'exceptions.ValueError'>, ValueError('unknown url type: airflow-service-mpack.tar.gz',), <traceback object at 0x7f270d9c03b0>)
Trying to download airflow-service-mpack.tar.gz to /var/lib/ambari-server/data/tmp/airflow-service-mpack.tar.gz with [curl] command.
ERROR: Download file airflow-service-mpack.tar.gz with [curl] command failed with error: % Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (6) Could not resolve host: airflow-service-mpack.tar.gz
ERROR: Unable to download file airflow-service-mpack.tar.gz!
ERROR: unable to donwload file airflow-service-mpack.tar.gz!
ERROR: Management pack could not be downloaded!
ERROR: Exiting with exit code -1.
REASON: Management pack could not be downloaded!

Please help here. Thanks!
12-04-2018
11:40 AM
Hi, I've set up an HDP 3.0.1 cluster which is Kerberized. However, user sync is not happening with the sync source set to "Unix" and the minimum user id set to "500". Below is the error observed in the logs:

04 Dec 2018 14:49:06 INFO UnixAuthenticationService [main] - Starting User Sync Service!
04 Dec 2018 14:49:06 WARN UnixUserGroupBuilder [UnixUserSyncThread] - DEPRECATED: Unix backend is configured to use /etc/passwd and /etc/group files directly instead of standard system mechanisms.
04 Dec 2018 14:49:06 INFO UserGroupSync [UnixUserSyncThread] - initializing sink: org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder
04 Dec 2018 14:49:06 INFO PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Using principal = rangerusersync/prd-lucy110.XXXXX.nm1@XXXXXX.COM and keytab = /etc/security/keytabs/rangerusersync.service.keytab
04 Dec 2018 14:49:07 ERROR PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to build Group List :
com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was STRING at line 1 column 1
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:176)
04 Dec 2018 14:49:07 ERROR PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to add portal user
04 Dec 2018 14:49:07 ERROR UnixUserGroupBuilder [UnixUserSyncThread] - sink.addOrUpdateUser failed with exception: Failed to add portal user, for user: jatin, groups: [jatin, dev]
04 Dec 2018 14:49:07 ERROR PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to add portal user
04 Dec 2018 14:49:07 ERROR UnixUserGroupBuilder [UnixUserSyncThread] - sink.addOrUpdateUser failed with exception: Failed to add portal user, for user: suraj.ghosh, groups: [suraj.ghosh, dev]

On the same machine I added a user "rangerusersync", ran the python script "updatepolicymgrpassword.py", and provided the same username and password, but it still fails! Please see the attached screenshot of the Ranger Audit UI (Usersync tab); is this normal? The full logs are also attached: usersync-log.txt. Note: the Ambari server is TLS (SSL) enabled, but Ranger is not. Can anyone please help me resolve this issue? It would be highly appreciated. Thanks, Shesh Kumar
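As a side note on that JsonSyntaxException ("Expected BEGIN_OBJECT but was STRING at line 1 column 1"): it usually means the Ranger Admin endpoint returned something other than JSON, e.g. an HTML error or login page. A quick way to see the raw response usersync is choking on is to hit the same REST path yourself; a minimal sketch with Python requests (host and credentials are placeholders):

# Probe a Ranger REST endpoint and print the raw body, to see whether it is
# JSON or an HTML error/login page. Host and credentials are placeholders.
import requests

url = "http://ranger-admin.example.com:6080/service/xusers/groups/"
resp = requests.get(url, auth=("admin", "admin"))
print(resp.status_code, resp.headers.get("Content-Type"))
print(resp.text[:500])   # a leading '{' means JSON; a leading '<' means HTML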
12-03-2018
12:39 PM
Hi, we are running HDP 3.0.1, which is Kerberized. We are not able to sync the Linux users to Ranger. Below is the error observed in the usersync log file:

03 Dec 2018 18:03:40 INFO UserGroupSync [UnixUserSyncThread] - Begin: update user/group from source==>sink
03 Dec 2018 18:03:40 ERROR PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to communicate Ranger Admin :
com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:570)
at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.tryUploadEntityWithCred(PolicyMgrUserGroupBuilder.java:895)
at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.cookieBasedUploadEntity(PolicyMgrUserGroupBuilder.java:1248)
at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.getUserGroupAuditInfo(PolicyMgrUserGroupBuilder.java:1688)
at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.access$1000(PolicyMgrUserGroupBuilder.java:79)
at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder$8.run(PolicyMgrUserGroupBuilder.java:1660)
at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder$8.run(PolicyMgrUserGroupBuilder.java:1656)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.addUserGroupAuditInfo(PolicyMgrUserGroupBuilder.java:1656)
at org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder.postUserGroupAuditInfo(PolicyMgrUserGroupBuilder.java:1615)
at org.apache.ranger.unixusersync.process.UnixUserGroupBuilder.updateSink(UnixUserGroupBuilder.java:186)
at org.apache.ranger.usergroupsync.UserGroupSync.syncUserGroup(UserGroupSync.java:107)
at org.apache.ranger.usergroupsync.UserGroupSync.run(UserGroupSync.java:85)
at java.lang.Thread.run(Thread.java:745)
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1509)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:914)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1316)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1291)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:250)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler$1$1.getOutputStream(URLConnectionClientHandler.java:238)
at com.sun.jersey.api.client.CommittingOutputStream.commitStream(CommittingOutputStream.java:117)
at com.sun.jersey.api.client.CommittingOutputStream.write(CommittingOutputStream.java:89)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at java.io.BufferedWriter.flush(BufferedWriter.java:254)
at com.sun.jersey.core.util.ReaderWriter.writeToAsString(ReaderWriter.java:191)
at com.sun.jersey.core.provider.AbstractMessageReaderWriterProvider.writeToAsString(AbstractMessageReaderWriterProvider.java:128)
at com.sun.jersey.core.impl.provider.entity.StringProvider.writeTo(StringProvider.java:88)
at com.sun.jersey.core.impl.provider.entity.StringProvider.writeTo(StringProvider.java:58)
at com.sun.jersey.api.client.RequestWriter.writeRequestEntity(RequestWriter.java:300)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:217)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
... 18 more
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387)
at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:292)
at sun.security.validator.Validator.validate(Validator.java:260)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:229)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1491)
... 46 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:382)
... 52 more
03 Dec 2018 18:03:40 INFO UserGroupSync [UnixUserSyncThread] - End: update user/group from source==>sink
03 Dec 2018 18:04:40 INFO UserGroupSync [UnixUserSyncThread] - Begin: update user/group from source==>sink
03 Dec 2018 18:04:40 ERROR PolicyMgrUserGroupBuilder [UnixUserSyncThread] - Failed to communicate Ranger Admin :
com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at com.sun.jersey.api.client.Client.handle(Client.java:652)

I've also followed the HW Support KB below, but still no luck:
https://community.hortonworks.com/content/supportkb/49025/ranger-usersync-fails-with-unable-to-find-valid-ce.html

Any technical help will be highly appreciated. Thanks,
12-01-2018
11:45 AM
Hi @aquilodran, I did try the steps you recommended, but unfortunately they did not work. To make it work, I edited a few properties in the HDFS service:

allow anonymous = true
http auth = simple (the previous value was 'kerberos')

(For reference, I believe these correspond to the core-site properties hadoop.http.authentication.simple.anonymous.allowed and hadoop.http.authentication.type.) Thank you!
11-30-2018
02:17 PM
Hi,
We are running an Ambari cluster with a self-signed certificate (HTTPS), and we have also enabled the cluster with FreeIPA+Kerberos. Ambari URL: https://xxxx.xxxx.nm1:8443 (it's not .com)
HDP: 3.0.1 (Latest)
After successfully integrating FreeIPA+Kerberos with the Ambari cluster, we are unable to access a few important GUIs, such as the NameNode UI, Resource Manager UI, and Oozie UI. The error we are getting is this:
HTTP ERROR 401
Problem accessing /index.html.
Reason: Authentication required
I've tried all possible ways to debug this error, like running the following commands in my Mac terminal, but it's of no use:
defaults write com.google.Chrome AuthServerWhitelist "*.REALM_NAME.COM"
defaults write com.google.Chrome AuthNegotiateDelegateWhitelist "*.REALM_NAME.COM"
I ran the same commands in the Google Chrome console (option+command+j on Mac) and got this error: Uncaught SyntaxError: Unexpected identifier

The following keytabs are present in /etc/security/keytabs: kerberos.service_check.113018.keytab, ambari.server.keytab, spnego.service.keytab, yarn-ats.hbase-regionserver.service.keytab, yarn-ats.hbase-master.service.keytab, smokeuser.headless.keytab, oozie.service.keytab, nn.service.keytab, hive.service.keytab, ams-monitor.keytab, nm.service.keytab, hive.llap.task.keytab, hbase.headless.keytab, spark.service.keytab, spark.headless.keytab, rm.service.keytab, hdfs.headless.keytab, ambari-infra-solr.service.keytab, zk.service.keytab, yarn.service.keytab, yarn-ats.hbase-client.headless.keytab, dn.service.keytab

There is a valid ticket for the HDFS user as well, but I am still unable to access the UI:

hdfs@xxxxxxx:/etc/security/keytabs$ klist
Ticket cache: FILE:/tmp/krb5cc_1213
Default principal: nn/xxxxxx.xxxxxx.nm1@REALM.COM
Valid starting Expires Service principal
11/30/18 16:13:31 12/01/18 16:13:31 krbtgt/REALM.COM@REALM.COM
renew until 12/07/18 16:13:31 I also tried using "spnego.service.keytab" but still no use: root@xxxxxxx102:/etc/security/keytabs# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: HTTP/xxxxxx102.xxxxx.nm1@REALM.COM
Valid starting Expires Service principal
11/30/18 17:23:38 12/01/18 17:23:38 krbtgt/REALM.COM@REALM.COM
renew until 12/07/18 17:23:38

Kindly provide your technical suggestions; they would be very helpful and highly appreciated. Should I disable the Kerberos HTTP authentication? If yes, please guide me on doing that for the NN, RM, and Oozie URLs. Thanks, Shesh Kumar
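As a debugging aside: before fighting the browser, it can help to confirm that SPNEGO works at all from the command line with a valid ticket. A hedged sketch that shells out to curl (assumes curl built with GSSAPI support; the NameNode URL is a placeholder):

# Verify SPNEGO against a Kerberos-protected web UI using curl --negotiate.
# Assumes a valid ticket (kinit) and curl with GSSAPI; the URL is a placeholder.
import subprocess

url = "https://namenode.xxxxxx.nm1:50470/index.html"   # placeholder NN UI URL
result = subprocess.run(
    ["curl", "-k", "--negotiate", "-u", ":", "-o", "/dev/null",
     "-w", "%{http_code}", url],
    capture_output=True, text=True)
print("HTTP status:", result.stdout)   # 200: SPNEGO works server-side;
                                       # 401: the ticket/principal side is the problem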
11-21-2018
05:30 PM
Thank you so much, Robert! I highly appreciate your views. I've one more doubt which I came across, about auto-renewal of the Kerberos ticket. As you know, we have successfully integrated FreeIPA with the Ambari cluster, which also has IPA replication. I noticed that a user's Kerberos ticket is not auto-renewing even though they have a valid ticket:

shesh.kumar@stg-ambarixenial001:~$ klist
Ticket cache: FILE:/tmp/krb5cc_1193
Default principal: shesh.kumar@EXAMPLE.COM
Valid starting Expires Service principal
11/18/18 18:15:37 11/19/18 18:15:34 krbtgt/EXAMPLE.COM@EXAMPLE.COM
renew until 11/25/18 18:15:34

As you can see above, the ticket is not auto-renewing. How can I make sure that the Kerberos ticket is auto-renewed once the user executes the "kinit" command? Let me show you what I have done from my side: I've added these 3 lines in the /etc/sssd/sssd.conf file on the FreeIPA server (which doesn't have the Hadoop client):

krb5_lifetime = 120s
krb5_renewable_lifetime = 150m
krb5_renew_interval = 10s

Will this work? Thanks, Shesh Kumar
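In the meantime, one workaround worth considering (a hedged sketch, not FreeIPA's own renewal mechanism): renew tickets explicitly with kinit -R before they expire, from cron or a small loop. This only works while the ticket is still inside its "renew until" window:

# Periodically renew the default credential cache with "kinit -R".
# Only succeeds while the ticket is within its "renew until" window.
import subprocess, time

while True:
    subprocess.run(["kinit", "-R"], check=False)
    time.sleep(6 * 3600)   # renew every 6 hours (placeholder interval)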
11-18-2018
07:55 PM
Thanks for your suggestion; I really appreciate it. I have one more doubt: if I have to remove/delete multiple users in IPA, say 50 users, will I also need to log in to the server as root, switch to each user, and fire "kdestroy" to remove the ticket cache? Won't this be too much manual effort? What is the best practice that you recommend?
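For what it's worth, since kdestroy ultimately just deletes the user's credential cache file (the caches are the FILE:/tmp/krb5cc_<uid> entries that klist shows), one hedged way to bulk-revoke is to remove those files as root. A minimal sketch, assuming default file-based caches; run it before deleting the users in IPA, while their uids still resolve:

# Bulk-remove ticket caches for users about to be deleted from FreeIPA.
# Assumes default file-based caches (FILE:/tmp/krb5cc_<uid>); names are placeholders.
import os
import pwd

deleted_users = ["user1", "user2"]   # placeholders for the users being removed
for name in deleted_users:
    try:
        uid = pwd.getpwnam(name).pw_uid
    except KeyError:
        continue                      # uid no longer resolves; cache path unknown
    ccache = "/tmp/krb5cc_%d" % uid
    if os.path.exists(ccache):
        os.remove(ccache)             # equivalent to running kdestroy as that user
        print("removed", ccache)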
11-18-2018
02:18 PM
We are trying out FreeIPA and have integrated it with our Ambari Hadoop cluster (HDP v3.0.1). We are able to add users and provide them access to Hadoop with the help of the kinit command. However, when deleting a user in the FreeIPA GUI, the principal gets deleted: the deleted user's principal is no longer listed when I do listprincs at the "kadmin" prompt. But the user will still have a valid ticket when he does "klist" and can access Hadoop even though the principal is removed. We cannot do "kdestroy" manually for everyone. Typically, when users are removed in FreeIPA, the same users should not be able to access Hadoop either. Can't FreeIPA handle kdestroy? Please provide your suggestions. Thanks, Shesh
Labels: Apache Hadoop
11-10-2018
01:51 PM
Oh! My bad. I was checking that option in Ranger service. Thank you for correcting me, Ariel 🙂
11-08-2018
05:53 PM
Hi, I'm trying to configure SSL for the HDFS Ranger plugin (self-signed certificate) by following this document: Configure the Ranger HDFS Plugin for SSL (HDP 3.0.1 version). Step 6 in the link says: "Select Advanced ranger-hdfs-policymgr-ssl and set the following properties". However, in my Ambari UI (HDP 3.0.1), I cannot see that option at all. Please see the screenshot (attached after my signature). I've enabled the HDFS plugin as well, but I have no idea how to proceed further. Any suggestions would be highly appreciated. Thanks, Shesh Kumar
Labels: Apache Hadoop, Apache Ranger
10-30-2018
07:47 AM
Thank you! Will surely check the recommendation next time.
10-26-2018
03:04 AM
Thank you so much for your suggestion. However, I just happened to resolve this issue myself; I've shared my resolution below. Please check and let me know what you think about it 🙂 If I face this situation again, I will try your suggestion next time.