Member since: 09-26-2016
Posts: 33
Kudos Received: 4
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 929 | 08-28-2018 07:37 PM
 | 10956 | 12-27-2017 02:48 PM
 | 2064 | 11-08-2016 06:45 PM
10-10-2018
06:44 PM
No missing blocks; an hdfs fsck comes up 100% clean.
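For reference, a check along these lines, run as the hdfs superuser (the summary at the end reports missing/corrupt blocks and the overall HEALTHY/CORRUPT status):

hdfs fsck /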
10-09-2018
06:46 PM
So as a limited test, I tried your plan with JUST the namenode... and it worked, this time; it auto-started just fine. But in a larger sense, when I use auto-start for entire nodes or the whole cluster, the startup of all services never finishes. Some of the nodes are fine, but I know if I auto-start everything on the namenode host it gets locked up. I assume that over time other people will start running into this, and then we'll have more info on what's failing.
10-09-2018
06:36 PM
The issue isn't that the namenode process isn't trying to start, or doesn't exist... it IS trying. The problem is that it does not FINISH starting, because safemode keeps it locked. So the deeper issue here is that HDFS does not properly clear safemode when the namenode process auto-starts.
10-08-2018
03:23 PM
SO AM I. I did a full Ambari 2.7 and HDP 3.0.1 upgrade, only to find that the Lucidworks HDP SOLR mpacks and such don't work, and I no longer see SOLR as a service I can add (even though I tried to reinstall the mpack, etc.).
10-08-2018
02:50 PM
While the upgrade I recently did on an 8-node cluster seems to have gone OK (Ambari 2.7.1.0, HDP 3.0.1), I have found that the auto-start of NAMENODE does not seem to be working properly anymore. A reboot of the namenode host leaves HDFS in a permanent "safemode on" state, and the startup does not work like it used to. The only workarounds I know of are: (A) stop HDFS/namenode before rebooting, or (B) after an unexpected reboot, turn safemode off manually (see below) or the namenode will not finish starting. **EDIT** I am also apparently having issues with AUTOSTART in general, even after a *clean* "shut down all services" first before rebooting a cluster. Things just don't come up properly. I have given up on this for now and have completely disabled the autostart mechanism. I don't believe it's ready for prime time with the latest Ambari and HDP... Would love to see if anyone else is having issues with this too...?
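For the record, the manual safemode toggle in workaround (B) is just the standard dfsadmin commands, run as the hdfs user:

hdfs dfsadmin -safemode get     # confirm it's stuck in "Safe mode is ON"
hdfs dfsadmin -safemode leave   # force it off so the namenode can finish coming up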
10-05-2018
03:58 PM
The u'solr' issue is somewhat embarrassing. While it's not clearly documented (I think it WAS at one time...?), the only real solution is that you must COMPLETELY delete and remove the SOLR service from Ambari (note: NOT "INFRA-SOLR", that one's OK). The SOLR service is, at this point, still an oddball "add-on", even though once set up it looks like it's perfectly integrated into HDP and Ambari. It only... "sort of" is. A sad thing; hopefully this gets truly integrated in the future. (Note: make sure to back up and save your collections first.) The SECOND error, the bit about <a service item> not found in the configurations dictionary, is actually a bug that was fixed in Ambari 2.7.1.0.
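If it helps anyone, the full removal can also be done via the Ambari REST API once the service is stopped; a rough sketch, where the cluster name and credentials are placeholders for your own:

curl -u admin:admin -H 'X-Requested-By: Ambari' -X DELETE http://localhost:8080/api/v1/clusters/MYCLUSTER/services/SOLR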
08-30-2018
08:14 PM
I got this exact issue upgrading from Ambari 2.6 to Ambari 2.7. The critical error seems to be caused by this, in setup_users:

groups = params.user_to_groups_dict[user]
KeyError: u'solr'

I had to reverse the upgrade, since when this is happening I cannot use Ambari to start ANY services.
08-29-2018
08:14 PM
I have the exact same issue (and actually, one more on top...). First, the same issue with the KeyError on u'solr'. This prevented me from starting ANY service. This is a deal-breaker. I had to completely revert back to Ambari 2.6.x. NOT good. ANYONE have an idea how to resolve this? This broke everything:

stderr:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 35, in <module>
    BeforeAnyHook().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 29, in hook
    setup_users()
  File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/shared_initialization.py", line 50, in setup_users
    groups = params.user_to_groups_dict[user],
KeyError: u'solr'
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-4042.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-4042.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-START/scripts/hook.py', 'START', '/var/lib/ambari-agent/data/command-4042.json', '/var/lib/ambari-agent/cache/stack-hooks/before-START', '/var/lib/ambari-agent/data/structured-out-4042.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']

(here is the associated stdout:)
2018-08-29 12:18:45,630 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=2.6.4.0-91 -> 2.6.4.0-91
2018-08-29 12:18:45,642 - Using hadoop conf dir: /usr/hdp/2.6.4.0-91/hadoop/conf
2018-08-29 12:18:45,768 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=2.6.4.0-91 -> 2.6.4.0-91
2018-08-29 12:18:45,772 - Using hadoop conf dir: /usr/hdp/2.6.4.0-91/hadoop/conf
2018-08-29 12:18:45,773 - Group['livy'] {}
2018-08-29 12:18:45,774 - Group['spark'] {}
2018-08-29 12:18:45,774 - Group['solr'] {}
2018-08-29 12:18:45,774 - Group['ranger'] {}
2018-08-29 12:18:45,774 - Group['hdfs'] {}
2018-08-29 12:18:45,775 - Group['hadoop'] {}
2018-08-29 12:18:45,775 - Group['users'] {}
2018-08-29 12:18:45,775 - Group['knox'] {}
2018-08-29 12:18:45,775 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-08-29 12:18:45,776 - User['infra-solr'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-08-29 12:18:45,777 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-08-29 12:18:45,778 - User['ranger'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['ranger', 'hadoop'], 'uid': None}
2018-08-29 12:18:45,778 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'users'], 'uid': None}
2018-08-29 12:18:45,779 - User['livy'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['livy', 'hadoop'], 'uid': None}
2018-08-29 12:18:45,780 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['spark', 'hadoop'], 'uid': None}
2018-08-29 12:18:45,780 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'users'], 'uid': None}
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-4042.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-4042.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-START/scripts/hook.py', 'START', '/var/lib/ambari-agent/data/command-4042.json', '/var/lib/ambari-agent/cache/stack-hooks/before-START', '/var/lib/ambari-agent/data/structured-out-4042.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']

And this one broke SOLR itself:

stderr:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/SOLR/6.6.2/package/scripts/solr.py", line 139, in <module>
    Solr().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/SOLR/6.6.2/package/scripts/solr.py", line 91, in stop
    import params
  File "/var/lib/ambari-agent/cache/common-services/SOLR/6.6.2/package/scripts/params.py", line 35, in <module>
    zookeeper_hosts = build_zookeeper_hosts()
  File "/var/lib/ambari-agent/cache/common-services/SOLR/6.6.2/package/scripts/params.py", line 20, in build_zookeeper_hosts
    zookeeper_hosts_length = len(zookeeper_hosts_list)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/config_dictionary.py", line 73, in __getattr__
    raise Fail("Configuration parameter '" + self.name + "' was not found in configurations dictionary!")
resource_management.core.exceptions.Fail: Configuration parameter 'zookeeper_hosts' was not found in configurations dictionary!
08-28-2018
07:37 PM
1 Kudo
After several hours of searching, I have come to the conclusion that there is no easy fix for this, specifically through Ambari API calls, which is a shame. The API calls in terms of services are limited to stopping, starting, adding, and deleting; there's no way to "fix" a broken state as per above. Apparently the only way to solve this was to hack the Ambari database and change the component state from "STOPPING" to "INSTALLED", and then I was fine. There really ought to be a better way...
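A rough sketch of the database hack I mean, assuming the default embedded Postgres that Ambari sets up and the hostcomponentstate table from my version (verify the table/column names against your own schema, stop ambari-server first, and back up the database before touching anything):

psql -U ambari ambari -c "UPDATE hostcomponentstate SET current_state = 'INSTALLED' WHERE current_state = 'STOPPING';"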
08-28-2018
02:23 PM
1 Kudo
Good morning! So, I just upgraded a small cluster to Version 3.0.0. The upgrade seemed to go well, but after a reboot I am stuck: I have two services that Ambari still thinks are "stopping..." (secondary namenode and also zookeeper). Because of this, I can't get the services running. The error I get is:

Error message: java.lang.IllegalArgumentException: Invalid transition for servicecomponenthost, clusterName=ontomatedev, clusterId=2, serviceName=HDFS, componentName=SECONDARY_NAMENODE, hostname=myhost.domain.com, currentState=STOPPING, newDesiredState=STARTED

As far as I can tell, I *should* hopefully be able to "reset" this somehow via a curl command, but I'm at a loss as to what. Something like this has no effect:

curl -s -u admin:admin -H 'X-Requested-By: Ambari' -X PUT -d '{"RequestInfo":{"context":"Stop Component"},"Body":{"HostRoles":{"state":"INSTALLED"}}}' http://localhost:8080/api/v1/clusters/ontomatedev/hosts/myhost.domain.com/host_components/SECONDARY_NAMENODE

If anyone can tell me how to reset this incorrect 'state' I'd be *most* grateful!!!
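(For completeness, the state can at least be read back with a GET against the same endpoint, which is how I can see the component sitting in STOPPING:

curl -s -u admin:admin -H 'X-Requested-By: Ambari' http://localhost:8080/api/v1/clusters/ontomatedev/hosts/myhost.domain.com/host_components/SECONDARY_NAMENODE

)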
Labels:
- Apache Ambari
- Apache Hadoop
08-22-2018
09:51 PM
This is just TLS, not full certificate-based SSL:

[root@garth01 ldaptool]# openssl s_client -connect myldapthing.company.com:389
CONNECTED(00000003)
140306741082000:error:140790E5:SSL routines:ssl23_write:ssl handshake failure:s23_lib.c:177:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 289 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : 0000
    Session-ID:
    Session-ID-ctx:
    Master-Key:
    Key-Arg   : None
    Krb5 Principal: None
    PSK identity: None
    PSK identity hint: None
    Start Time: 1534974457
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
---
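(Side note: a plain s_client against 389 fails like this because the socket starts out in cleartext. If your OpenSSL build is new enough to support it, the StartTLS negotiation itself can be tested with something like:

openssl s_client -connect myldapthing.company.com:389 -starttls ldap

Mine apparently isn't, hence testing with ldapsearch -ZZ instead.)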
08-22-2018
09:42 PM
Yes, Ranger is a "client". I haven't gotten to even trying Ranger yet because I can't even get the "ldaptool" to work. See the error above for what ldaptool tells me. I'm running that on the Ranger node.
08-22-2018
09:34 PM
Hello! So, our corporate folks are forcing us from the "direct to their Active Directory controller" setup to a new LDAP proxy setup that's based on OpenLDAP. Under the older Active Directory setup, I connect from Ranger to ldaps://domain.com:636 and all is good. It works. But under the new setup I need to get working, it's still "ldap" (not ldaps)... and port 389 (not 636). That's simple enough, BUT the connection requires TLS.

In an unrelated Apache server, the new LDAP bind setup was tweaked like this (the magic sauce is the "TLS" option at the end):

AuthLDAPURL "ldap://domain.company.com:389/dc=domain,dc=com?sAMAccountName?sub?(objectClass=*)" TLS

Similarly, the standard unix-based "ldapsearch" tool has the "-ZZ" option to force the use of TLS. But as for RANGER, I'm kinda stuck. Can anyone tell me how the heck I can get TLS negotiation working in RANGER? The "ldaptool" provided as a nifty gadget with Ranger errors out thus:

javax.naming.AuthenticationNotSupportedException: [LDAP: error code 13 - TLS confidentiality required]

This is an error I have seen again and again, until (for example in Apache, or ldapsearch...) I figured out how to enable TLS. I am clueless as to what "option" to define or enable to force TLS negotiation for RANGER. Any insight or ideas would be appreciated!
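For concreteness, the kind of ldapsearch test I mean; the bind DN, base DN, and filter here are placeholders, and -ZZ forces StartTLS and fails hard if the TLS negotiation doesn't complete:

ldapsearch -H ldap://domain.company.com:389 -ZZ -x -D "binduser@domain.com" -W -b "dc=domain,dc=com" "(sAMAccountName=someuser)"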
Labels:
- Apache Ranger
12-27-2017
02:48 PM
I deleted all the snapshots and data after getting a go-ahead from the developers...
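Roughly, via the hbase shell (snapshot names are placeholders); once the snapshots are gone, the HFile cleaner can actually reclaim the archived files:

echo "list_snapshots" | hbase shell
echo "delete_snapshot 'some_old_snapshot'" | hbase shell    # repeat per snapshot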
08-04-2017
07:39 PM
Yup yup yup. Found the snapshots... guessing THAT is the culprit. Time to have a conversation with the developers... there's... a lot.
08-04-2017
07:14 PM
As far as I can tell, the hbase.master.hfilecleaner.ttl value was not set at all (does that then mean... NO cleaning?). I set it to 900000 (15 minutes) and we'll see if anything happens.
08-04-2017
06:35 PM
Hi! So, I'm the sysadmin of a hadoop cluster. I am not a developer, nor do I "use" it. But... I make sure it's running and happy and secure and... so on. In reviewing HDFS disk use lately, I noticed our numbers are kinda high. After some digging, it appears all of the space is going into HBase. OK cool, that's what our developers are doing: stuffing things into HBase. But I appear to be losing a bunch of disk space to the HBase "archive" folder, which is something I assume HBase puts stuff into when tables are deleted, or...? I checked with one of our developers; he sees that the archive contains tables he deleted long ago. So... my simple question is: how do I clean out unneeded things from the HBase "archive"? I assume manually deleting stuff via HDFS is **not** the way to go.

hdfs dfs -du -s -h /apps/hbase/data/*
338.6 K  /apps/hbase/data/.hbase-snapshot
0        /apps/hbase/data/.tmp
20       /apps/hbase/data/MasterProcWALs
830      /apps/hbase/data/WALs
6.6 T    /apps/hbase/data/archive   <=== THIS
0        /apps/hbase/data/corrupt
4.1 T    /apps/hbase/data/data
42       /apps/hbase/data/hbase.id
7        /apps/hbase/data/hbase.version
30.7 K   /apps/hbase/data/oldWALs

ANY and all help for an HBase newbie would be really appreciated.
Labels:
- Apache HBase
08-04-2017
06:26 PM
$^(*!$!^&(!/. I had a huge response all typed up and this forum blew up the answer. Lost my submission. I will summarize: ENABLE DEBUGGING. It was not until I enabled debugging for Ranger that, when I got an error similar to yours, I uncovered that I needed to get my AD certificate into the truststore. Note: Ranger has TWO truststores, one for the user sync, the other for Ranger itself logging into the UI. Check these, and that your AD certificate is in the keystore mentioned:

ranger.usersync.truststore.file
ranger.https.attrib.keystore.file
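To double-check what's actually inside a given store, keytool can list it (the path here is the usersync default from my setup; use whatever your configs say):

keytool -list -v -keystore /usr/hdp/current/ranger-usersync/conf/mytruststore.jks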
04-26-2017
09:32 PM
Yeah, the decentralized nature of the keystores is... kind of a huge problem, and, like both you and I discovered, not inherently obvious. I still like the Horton setup for ease of management... I *really* do. (Tried a hadoop cluster once built from scratch manually... you have no idea how bad that was.) I'm discovering the "ease" of managing courtesy of Horton's setup really applies only about 90% of the time. The other 10% is a bitch. heh. With the Ambari UI, just about everything is there. Great! ALMOST.
03-23-2017
04:30 PM
Oh my gosh, I totally forgot that yes, this is front-ended by Apache. Setting AllowEncodedSlashes to "On" in my environment solved the issue. (I have Apache 2.2.x, which does not yet have the "NoDecode" value, but that's OK as "On" seems to work.)
thankyouthankyouthankyouthankyouthankyou.
03-23-2017
02:50 PM
@SBandaru I tried the change above and it has made no difference. I still have the same behavior.
03-22-2017
09:22 PM
Good day! My developer team is just starting to want to use Hive View (via ambari). I'm the sysadmin. The first report was "my query isn't working" (hangs, no result). After a bunch of fiddling around and simplifying the tests, I discovered that pretty much everything in Hive View is not functioning properly. I am desperate for help....
My simplest test: creating two empty databases. Boom.
Example: Log into ambari. go into Hive View. In the work window try a very simple operation: "create database test;"
As soon as I hit "execute", I get an error at the top of the window (see image). BUT... if I wait a few seconds... the error vanishes and, well, the database DOES get created. There is no explanatory text regarding that error.
Here is where everything goes (further) south. As soon as I try anything else, I don't get the same error above. Instead, after clicking "Execute", I never get a result back. The green button at the bottom stays in an executing state (the green button turns into an orange "Stop Execution" button)... and it never comes back. In like fashion, the database IS apparently created, which is a partial success. But the GUI gets locked up. Meanwhile, in the hive view log, I see a repeated series of messages over and over and over. Example:

22 Mar 2017 16:13:07,851 INFO [ambari-client-thread-11095] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] DDLDelegatorImpl:163 - Executing query: show databases like '*', for user: kcb
22 Mar 2017 16:13:07,852 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:86 - Statement executor is executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:13:07,873 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:88 - Finished executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:13:07,874 INFO [HiveViewActorSystem-akka.actor.jdbc-connector-dispatcher-69] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] JdbcConnector:281 - Finished processing SQL statements for Job id : SYNC JOB
22 Mar 2017 16:13:07,880 INFO [HiveViewActorSystem-akka.actor.default-dispatcher-34] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] OperationController:328 - About to free sync connector for user kcb
22 Mar 2017 16:13:19,271 INFO [HiveViewActorSystem-akka.actor.default-dispatcher-34] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] OperationController:319 - About to free connector for job 9 and user kcb
22 Mar 2017 16:13:22,868 INFO [ambari-client-thread-11096] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] DDLDelegatorImpl:163 - Executing query: show databases like '*', for user: kcb
22 Mar 2017 16:13:22,869 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:86 - Statement executor is executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:13:22,891 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:88 - Finished executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:13:22,891 INFO [HiveViewActorSystem-akka.actor.jdbc-connector-dispatcher-69] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] JdbcConnector:281 - Finished processing SQL statements for Job id : SYNC JOB
22 Mar 2017 16:13:22,897 INFO [HiveViewActorSystem-akka.actor.default-dispatcher-70] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] OperationController:328 - About to free sync connector for user kcb
22 Mar 2017 16:13:37,870 INFO [ambari-client-thread-11095] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] DDLDelegatorImpl:163 - Executing query: show databases like '*', for user: kcb
22 Mar 2017 16:13:37,871 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:86 - Statement executor is executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:13:37,894 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:88 - Finished executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:13:37,894 INFO [HiveViewActorSystem-akka.actor.jdbc-connector-dispatcher-69] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] JdbcConnector:281 - Finished processing SQL statements for Job id : SYNC JOB
22 Mar 2017 16:13:37,900 INFO [HiveViewActorSystem-akka.actor.default-dispatcher-70] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] OperationController:328 - About to free sync connector for user kcb
22 Mar 2017 16:13:52,873 INFO [ambari-client-thread-11095] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] DDLDelegatorImpl:163 - Executing query: show databases like '*', for user: kcb
22 Mar 2017 16:13:52,875 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:86 - Statement executor is executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:13:52,896 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:88 - Finished executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:13:52,897 INFO [HiveViewActorSystem-akka.actor.jdbc-connector-dispatcher-69] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] JdbcConnector:281 - Finished processing SQL statements for Job id : SYNC JOB
22 Mar 2017 16:13:52,903 INFO [HiveViewActorSystem-akka.actor.default-dispatcher-34] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] OperationController:328 - About to free sync connector for user kcb
22 Mar 2017 16:14:07,876 INFO [ambari-client-thread-11005] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] DDLDelegatorImpl:163 - Executing query: show databases like '*', for user: kcb
22 Mar 2017 16:14:07,877 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:86 - Statement executor is executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:14:07,897 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:88 - Finished executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:14:07,897 INFO [HiveViewActorSystem-akka.actor.jdbc-connector-dispatcher-69] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] JdbcConnector:281 - Finished processing SQL statements for Job id : SYNC JOB
22 Mar 2017 16:14:07,904 INFO [HiveViewActorSystem-akka.actor.default-dispatcher-70] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] OperationController:328 - About to free sync connector for user kcb
22 Mar 2017 16:14:22,897 INFO [ambari-client-thread-11095] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] DDLDelegatorImpl:163 - Executing query: show databases like '*', for user: kcb
22 Mar 2017 16:14:22,898 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:86 - Statement executor is executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:14:22,919 INFO [HiveViewActorSystem-akka.actor.result-dispatcher-50] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] StatementExecutor:88 - Finished executing statement: show databases like '*', Statement id: 0, JobId: SYNC JOB
22 Mar 2017 16:14:22,919 INFO [HiveViewActorSystem-akka.actor.jdbc-connector-dispatcher-69] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] JdbcConnector:281 - Finished processing SQL statements for Job id : SYNC JOB
22 Mar 2017 16:14:22,925 INFO [HiveViewActorSystem-akka.actor.default-dispatcher-34] [HIVE 1.5.0 AUTO_HIVE_INSTANCE] OperationController:328 - About to free sync connector for user kcb

Finally, hitting "Stop Execution" on the bottom gets me no further; that too hangs seemingly forever. So that's where I am. I have a relatively new install of HDP (latest, 2.5), and as I currently have it, the whole Hive View is unusable. I even tried uninstalling and reinstalling HIVE completely; no change. I am desperate for help here, all... I appreciate any and all support the community can give. I will monitor this closely and will provide any other config info anyone needs to know about. In related debugging, I have found that I have NO issues with the hive CLI. There I can do queries and all sorts of things, no issues. My problem seems limited to the Ambari Hive View front end.
Labels:
- Apache Ambari
- Apache Hive
11-08-2016
06:45 PM
1 Kudo
OK, so after pulling out my hair for a day, it seems that the ranger ADMIN portion of Ranger now either has its own truststore file, or the file vanished during the upgrade (I do not have an old 2.4 hadoop cluster to check). In any case, under the settings for "Advanced ranger-admin-site", near the bottom, is "ranger.truststore.file", defaulted to the value /etc/ranger/admin/conf/ranger-admin-keystore.jks. On my ranger server, that file was not even there, which is what leads me to believe this MIGHT be a new ranger parameter for ranger-admin? To solve my error above, I simply created this file with the java "keytool" and imported the certificate I've always had from our AD folks. In HDP 2.4, ranger admin needed the certificate in the default java keystore (usually something like /usr/java/latest/jre/lib/security/cacerts). It looks like ranger ADMIN has its own truststore now...? (Why everything is split between ranger admin and ranger usersync, lord only knows...) I know ranger.usersync.truststore.file was always there (defaults to /usr/hdp/current/ranger-usersync/conf/mytruststore.jks), but that's not the issue above.
Is the ranger ADMIN truststore file something new? If not, I swear I'm losing my mind. In any case, anyone using the ranger admin UI authenticated with LDAP/AD over SSL... look here. 🙂
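For anyone hitting the same thing, the fix was roughly this: import the AD cert into the file that ranger.truststore.file points to (keytool creates the keystore if it doesn't exist; the alias and cert path below are placeholders), then restart ranger admin:

keytool -import -trustcacerts -alias ad-ldaps -file /tmp/our-ad-root-ca.crt -keystore /etc/ranger/admin/conf/ranger-admin-keystore.jks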
11-07-2016
09:43 PM
OK, so after some effort I have the debugging working, and I'm confused. The error it's giving me is a 'trustanchors' error, which (after a wee bit of googling) usually indicates that it can't find or access the truststore (aka cacerts).
Normal LDAP works fine; secure (SSL) LDAP is now broken (it worked with HDP 2.4...).
In my case, the cacerts is the same one that was there before the upgrade... further, permissions are all good, and I have verified the key we use in there is still present. I only have one java on the server. Any hints would be appreciated. I'm missing something silly.

Caused by: org.springframework.ldap.CommunicationException: simple bind failed: mydomain.mycompany.net:636; nested exception is javax.naming.CommunicationException: simple bind failed: mydomain.mycompany.net:636 [Root exception is javax.net.ssl.SSLException: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty]
at org.springframework.ldap.support.LdapUtils.convertLdapException(LdapUtils.java:100)
at org.springframework.ldap.core.support.AbstractContextSource.createContext(AbstractContextSource.java:266)
at org.springframework.ldap.core.support.AbstractContextSource.getContext(AbstractContextSource.java:106)
at org.springframework.ldap.core.support.AbstractContextSource.getReadOnlyContext(AbstractContextSource.java:125)
at org.springframework.ldap.core.LdapTemplate.executeReadOnly(LdapTemplate.java:792)
at org.springframework.security.ldap.SpringSecurityLdapTemplate.searchForSingleEntry(SpringSecurityLdapTemplate.java:196)
at org.springframework.security.ldap.search.FilterBasedLdapUserSearch.searchForUser(FilterBasedLdapUserSearch.java:116)
at org.springframework.security.ldap.authentication.BindAuthenticator.authenticate(BindAuthenticator.java:90)
at org.springframework.security.ldap.authentication.LdapAuthenticationProvider.doAuthentication(LdapAuthenticationProvider.java:178)
... 32 more
Caused by: javax.naming.CommunicationException: simple bind failed: mydomain.mycompany.net:636 [Root exception is javax.net.ssl.SSLException: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty]
at com.sun.jndi.ldap.LdapClient.authenticate(LdapClient.java:219)
at com.sun.jndi.ldap.LdapCtx.connect(LdapCtx.java:2788)
at com.sun.jndi.ldap.LdapCtx.<init>(LdapCtx.java:319)
at com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(LdapCtxFactory.java:192)
at com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(LdapCtxFactory.java:210)
at com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxInstance(LdapCtxFactory.java:153)
at com.sun.jndi.ldap.LdapCtxFactory.getInitialContext(LdapCtxFactory.java:83)
at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:684)
at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:313)
at javax.naming.InitialContext.init(InitialContext.java:244)
at javax.naming.ldap.InitialLdapContext.<init>(InitialLdapContext.java:154)
at org.springframework.ldap.core.support.LdapContextSource.getDirContextInstance(LdapContextSource.java:43)
at org.springframework.ldap.core.support.AbstractContextSource.createContext(AbstractContextSource.java:254)
... 39 more
Caused by: javax.net.ssl.SSLException: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1906)
at sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1889)
at sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1815)
at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:128)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at com.sun.jndi.ldap.Connection.writeRequest(Connection.java:426)
at com.sun.jndi.ldap.Connection.writeRequest(Connection.java:399)
at com.sun.jndi.ldap.LdapClient.ldapBind(LdapClient.java:359)
at com.sun.jndi.ldap.LdapClient.authenticate(LdapClient.java:214)
11-04-2016
08:39 PM
So, with 2.4.x, I had it *all* working. Active Directory users could log in to the Ranger web UI, and I also had ranger usersync working against AD as well. Today I updated to HDP 2.5, and while the usersync seems to work (woohoo!), I cannot for the life of me successfully log into the Ranger web UI with an Active Directory user. Making matters worse, while I have tried to enable Ranger debugging, I get no additional information in the ranger xa log to help tell me what's going on. From /usr/hdp/current/ranger-admin/ews/webapp/WEB-INF/log4j.xml:

<category name="org.apache.ranger" additivity="false">
  <priority value="debug" />
  <appender-ref ref="xa_log_appender" />
</category>

Yet, even with debugging enabled, all I see from the logs is not super helpful (and yes, the id and pw are correct...):

2016-11-04 14:52:54,738 [http-bio-6080-exec-1] INFO org.apache.ranger.security.listener.SpringEventListener (SpringEventListener.java:87) - Login Unsuccessful:kbrodie | Ip Address:xxx.xxx.xxx.xxx | Bad Credentials

The last time I saw this type of error was when I was originally setting up Ranger some time ago (under 2.4) and was having SSL issues with my imported certificate. Enabling debugging helped figure that out. But now... I'm not getting any more useful info when debugging is enabled... I'm stuck. Any ideas? Thank you much in advance.
09-30-2016
04:46 PM
This is the most correct answer..! I *thought* I was using one java, but discovered Ranger was using another java that Ambari itself installed in a totally different location. However, I added a ton of commentary below for others to benefit from.
09-30-2016
04:45 PM
OK, I finally figured this out. The problem apparently is not so much Ranger itself; it's how the Horton Ambari fires off processes on the nodes via ITS OWN JAVA in most or many cases. When investigating, I had assumed all along that I had one java JDK on the servers. Apparently I did not notice that in choosing JDK 1.8 in the Ambari setup, it downloads and installs ITS OWN JDK and stuffs it deep under /usr/jdk64. I assumed incorrectly I was picking the option of "what java I am already using"... Here's the java I installed, which is set up to be the default java via /etc/alternatives:

/usr/java/jdk1.8.0_65/jre/bin/java
/usr/java/jdk1.8.0_65/bin/java

But here is the java that Ambari installed, and uses:

/usr/jdk64/jdk1.8.0_60/jre/bin/java
/usr/jdk64/jdk1.8.0_60/bin/java

Which explains why (at the command level) my LDAP SSL tests were working fine: they found and used the proper cacerts I had stuffed the certificate into, via the system-default JDK, and worked. But... Ranger did not, because Ranger uses the java that Ambari has, and NOT the system default. This is what took me a while to figure out. (All of this confusion could have been avoided if, when installing Ambari, I had picked a "custom" JDK... in that case each system would have and use only ONE java, period.)

Adding to my issue is that the ranger usersync and ranger admin both use different bits; I *think* my usersync was probably working (because I had added the cert into the ranger-specific truststore...), but I could not log into the ranger admin console because, well... that does NOT use the ranger-specific truststore, and defaults of course to the AMBARI-INSTALLED-JAVA version of cacerts, not the cacerts I *thought* it was using. (The ranger admin truststore issue is resolved by editing /usr/hdp/current/ranger-admin/ews/ranger-admin-services.sh and adding -Djavax.net.ssl.trustStore=<path to preferred cacerts location> as a java option.)

I think the only complaint I have in this whole deal is that in the ranger config screens, while there's a truststore option listed for the usersync part, there's NOT a truststore config option listed for the ranger ADMIN part, as they are different things.
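A quick way to catch this mismatch: compare the java the ranger process is actually running under against the system default. If they differ, you have two JDKs, and therefore two cacerts files, in play:

ps -ef | grep ranger-admin | grep java    # note the java path on the command line
readlink -f "$(which java)"               # vs. what /etc/alternatives points to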
09-28-2016
09:19 PM
1 Kudo
Hi all! I am HOPING this is a simple question. So, I initially got our ranger-to-active-directory user sync working. (Standard LDAP url, port 389). And now per our corporate IT guys, I need to move this to an SSL connection. OK, I'm game...!
I initially tested the LDAPS connection using the 'ldaptool' that's included with ranger for testing things. Quickly uncovered that I needed a certificate from the AD guys. Got it, and I easily got it installed into the system standard location for such things, the java "cacerts" keystore. (/usr/java/latest/jre/lib/security/cacerts). And... voila, the ldaptool worked. Woo! almost there.
So, in the ranger config, I changed the ldap url from LDAP://companyADSERVER.org:389 to LDAPS://companyADSERVER.org:636. This should work, because I already have the specific certificate already imported into the standard cacerts location on the server that hosts my ranger goodies.
Nope. I'm getting the following:

Caused by: javax.naming.CommunicationException: simple bind failed: mcwdc1.mcwcorp.net:636 [Root exception is javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target]
    at com.sun.jndi.ldap.LdapClient.authenticate(LdapClient.java:219)

Which is, of course, java's way of telling me that it can't find the certificate in whatever keystore it's using. So, I did some digging, and found this:

<property>
  <name>ranger.usersync.truststore.file</name>
  <value>/usr/hdp/current/ranger-usersync/conf/mytruststore.jks</value>
</property>

...and so I then took the certificate and added it into THAT keystore. But I am still getting the exact same error. I am pretty sure I'm THIS -> <- close. Any help would be much appreciated!
Labels:
- Apache Ranger
09-26-2016
06:15 PM
Awesome. OK, that's what I thought- I appreciate the quick reply (!). The documentation on all of this stuff out there isn't super clear for us newbies in this space 🙂