Member since
02-08-2016
793
Posts
669
Kudos Received
85
Solutions
12-23-2016
04:56 AM
5 Kudos
SYMPTOM: The user was not able to browse the Ambari UI after an Ambari server restart. Ambari version: 2.1.2.
ERROR: The following errors were seen in the logs:
06 Jul 2016 09:40:26,505 ERROR [Stack Version Loading Thread] LatestRepoCallable:93 - Could not load the URI for stack HDP-2.1 from http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json (connect timed out)
06 Jul 2016 09:40:26,506 INFO [Stack Version Loading Thread] LatestRepoCallable:74 - Loading latest URL info for stack HDP-2.2 from http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json
06 Jul 2016 09:40:28,508 ERROR [Stack Version Loading Thread] LatestRepoCallable:93 - Could not load the URI for stack HDP-2.2 from http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json (connect timed out)
06 Jul 2016 09:40:28,509 INFO [Stack Version Loading Thread] LatestRepoCallable:74 - Loading latest URL info for stack HDP-2.3 from http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json
06 Jul 2016 09:40:30,511 ERROR [Stack Version Loading Thread] LatestRepoCallable:93 - Could not load the URI for stack HDP-2.3 from http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json (connect timed out)
06 Jul 2016 09:40:30,511 INFO [Stack Version Loading Thread] LatestRepoCallable:74 - Loading latest URL info for stack HDP-2.0 from http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json
06 Jul 2016 09:40:32,514 ERROR [Stack Version Loading Thread] LatestRepoCallable:93 - Could not load the URI for stack HDP-2.0 from http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json (connect timed out)
06 Jul 2016 09:40:32,514 INFO [Stack Version Loading Thread] LatestRepoCallable:74 - Loading latest URL info for stack HDP-2.3.GlusterFS from http://s3.amazonaws.com/dev.hortonworks.com/HDP/hdp_urlinfo.json
06 Jul 2016 09:40:34,519 ERROR [Stack Version Loading Thread] LatestRepoCallable:93 - Could not load the URI for stack HDP-2.3.GlusterFS from http://s3.amazonaws.com/dev.hortonworks.com/HDP/hdp_urlinfo.json
ROOT CAUSE: This is a bug in Ambari 2.1.2, tracked in https://hortonworks.jira.com/browse/BUG-46081
RESOLUTION: Upgrading Ambari from 2.1.2 to 2.1.2.1 resolved the issue.
12-22-2016
07:26 PM
4 Kudos
SYMPTOM: The ResourceManager (RM) went down with the error below. We initially suspected the ulimit, but increasing it to 128K did not help.
ERROR:
2016-07-25 12:19:47,125 WARN security.DelegationTokenRenewer (DelegationTokenRenewer.java:handleDTRenewerAppSubmitEvent(873)) - Unable to add the application to the delegation token renewer.
java.lang.OutOfMemoryError: unable to create new native thread
Steps followed:
1. Checked the error and saw that the same issue had occurred previously, when increasing the ulimit resolved it.
2. Checked the ulimit and lsof output:
$ulimit -n
131072
$lsof |grep yarn |wc
1726 15553 242741
3. Checked the heap size for the YARN process, which was set to 8 GB and looked fine.
The following error was displayed in the RM out.log file:
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f89641cf000, 12288, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 12288 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/hs_err_pid56149.log
Java HotSpot(TM) 64-Bit Server VM warning: Attempt to deallocate stack guard pages failed.
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f89642d0000, 12288, 0) failed; error='Cannot allocate memory' (errno=12)
The log in "/tmp/hs_err_pid56149.log" is shown below. It points to a problem with memory allocation for threads at the OS level:
===
Stack: [0x00007f89641cf000,0x00007f89642d0000], sp=0x00007f89642ce900, free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x99eb8a] VMError::report_and_die()+0x2ea
V [libjvm.so+0x49721b] report_vm_out_of_memory(char const*, int, unsigned long, char const*)+0x9b
V [libjvm.so+0x81d9ae] os::Linux::commit_memory_impl(char*, unsigned long, bool)+0xfe
V [libjvm.so+0x81da6c] os::pd_commit_memory(char*, unsigned long, bool)+0xc
V [libjvm.so+0x8157fa] os::commit_memory(char*, unsigned long, bool)+0x2a
V [libjvm.so+0x81bf5d] os::pd_create_stack_guard_pages(char*, unsigned long)+0x6d
V [libjvm.so+0x95249e] JavaThread::create_stack_guard_pages()+0x5e
V [libjvm.so+0x958de4] JavaThread::run()+0x34
V [libjvm.so+0x81f988] java_start(Thread*)+0x108
===
The stack suggests that memory allocation (malloc) failed at the OS level. Check that enough physical memory is available on the host.
ROOT CAUSE: Collected jstack output for the process and found that the 'Truststore reloader thread' count keeps increasing, which is the same issue I mentioned earlier: https://issues.apache.org/jira/browse/YARN-5309
$grep 'Truststore reloader thread' threadDump|wc -l
14873
$ grep 'Truststore reloader thread' threadDump1|wc -l
14999
$grep 'Truststore reloader thread' threadDump2|wc -l
15063
$grep 'Truststore reloader thread' threadDump3|wc -l
15149
$grep 'Truststore reloader thread' threadDump4|wc -l
15230
$grep 'Truststore reloader thread' threadDump5|wc -l
15347
RESOLUTION: This is confirmed as a bug, and a patch has been provided to resolve the issue:
https://issues.apache.org/jira/browse/YARN-5309
https://hortonworks.jira.com/browse/BUG-63499
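The repeated grep checks above can also be scripted. The following is a minimal sketch in Python 3, not part of the original troubleshooting: the function names are hypothetical, and the file names simply mirror the thread dump names used above.

```python
from pathlib import Path


def count_threads(dump_text, thread_name="Truststore reloader thread"):
    """Count the lines of a jstack dump that mention thread_name (like grep | wc -l)."""
    return sum(1 for line in dump_text.splitlines() if thread_name in line)


def growth(counts):
    """Return the difference between each consecutive pair of counts."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


if __name__ == "__main__":
    # Hypothetical file names, mirroring the dumps collected above.
    names = ["threadDump", "threadDump1", "threadDump2"]
    counts = [
        count_threads(Path(name).read_text(errors="replace"))
        for name in names
        if Path(name).exists()
    ]
    print(counts, growth(counts))
```

A list of positive differences across successive dumps is the pattern seen in this case: the thread count only grows.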
12-22-2016
01:39 PM
4 Kudos
SYMPTOM: The user has the latest HDP integrated with Kerberos. While starting the DataNode, the user gets the message:
Login failure for dn/host1@EXAMPLE.NET from keytab /etc/security/keytabs/dn.service.keytab
However, the principal is dn/host1.bc@EXAMPLE.NET, where host1 is the hostname of the DataNode host and EXAMPLE.NET is the realm name.
ERROR: The output of the klist command is as below -
$klist -kt /etc/security/keytabs/dn.service.keytab
Keytab name: FILE:/etc/security/keytabs/dn.service.keytab
KVNO Timestamp Principal
---- ------------------- ------------------------------------------------------
0 12/21/2016 10:38:13 dn/host1.bc@EXAMPLE.NET
The logs show dn/host1@EXAMPLE.NET, whereas they should show dn/host1.bc@EXAMPLE.NET.
ROOT CAUSE: This is an issue with the entries in the /etc/hosts file.
RESOLUTION: The user had the below entry in the /etc/hosts file -
<ipaddress> <hostname> <FQDN> <FQDN>
The order was changed to
<ipaddress> <FQDN> <hostname> <FQDN>
which resolved the issue.
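The ordering check can be sketched in Python 3 as below. This is only an illustration: the function name is hypothetical, the IP address is a documentation placeholder, and "contains a dot" is an assumed heuristic for recognising an FQDN.

```python
def first_name_is_fqdn(hosts_line):
    """Return True if the first name after the IP address looks like an FQDN.

    Assumption: a name is treated as an FQDN when it contains a dot.
    """
    # Drop any trailing comment, then split into whitespace-separated fields.
    fields = hosts_line.split("#", 1)[0].split()
    if len(fields) < 2:
        return False  # blank line, comment, or an IP with no names
    return "." in fields[1]


# Placeholder entries in the two orders discussed above.
print(first_name_is_fqdn("192.0.2.10 host1 host1.bc"))
print(first_name_is_fqdn("192.0.2.10 host1.bc host1"))
```

The first call reflects the problematic order (short hostname first), the second the corrected one.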
12-22-2016
12:13 PM
Done. Thanks
12-22-2016
05:38 AM
5 Kudos
1. Create the self-signed certificate and add it to a keystore file using:
keytool -genkey -alias example.com -keyalg RSA -keystore keystore.jks -keysize 2048
2. List the keystore entries to verify that the certificate was added. Note that a keystore can contain multiple such certificates:
keytool -list -keystore keystore.jks
3. Export this certificate from keystore.jks to a certificate file:
keytool -export -alias example.com -file example.com.crt -keystore keystore.jks
4. Add this certificate to the client's truststore to establish trust:
keytool -import -trustcacerts -alias example.com -file example.com.crt -keystore truststore.jks
5. Verify that the certificate exists in truststore.jks:
keytool -list -keystore truststore.jks
6. Set hive.server2.thrift.sasl.qop=auth in the HS2 configs.
Then start HiveServer2, log in as the user, run kinit, start beeline, and try to connect using:
!connect jdbc:hive2://<hs2_hostname>:10001/default;principal=<hive_principal>;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=<truststore_file_path>;trustStorePassword=<truststore_password>
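Since the connection string has several placeholders, it may help to assemble it programmatically. Below is a minimal Python 3 sketch; the function name and the sample values passed to it are hypothetical, while the URL layout is copied from the connect string above.

```python
def hs2_jdbc_url(host, principal, truststore_path, truststore_password, port=10001):
    """Fill in the placeholders of the HTTP-mode, SSL-enabled HS2 connect string."""
    return (
        f"jdbc:hive2://{host}:{port}/default;principal={principal};"
        "transportMode=http;httpPath=cliservice;ssl=true;"
        f"sslTrustStore={truststore_path};trustStorePassword={truststore_password}"
    )


# Hypothetical values, for illustration only.
url = hs2_jdbc_url(
    host="hs2.example.com",
    principal="hive/hs2.example.com@EXAMPLE.COM",
    truststore_path="/tmp/truststore.jks",
    truststore_password="changeit",
)
print("!connect " + url)
```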
12-22-2016
05:15 AM
5 Kudos
SYMPTOM: Created a user in Ranger. The user is visible in the Ranger DB but not in the Ranger UI.
ERROR: Logged into the MySQL DB and executed the below commands -
SELECT * FROM ranger.x_user where user_name in ('userA');
==> This shows the user exists in the x_user table.
SELECT * FROM ranger.x_portal_user where user_name in ('userA');
==> The user is also present in x_portal_user.
ROOT CAUSE: Suspected occasional corruption in the Ranger DB.
RESOLUTION: Executing the below command resolved the issue -
>INSERT INTO x_portal_user_role VALUES(NULL,'2016-09-09 00:00:00','2016-09-09 00:00:00',1,1,(SELECT id FROM x_portal_user WHERE login_id='XXXX'),'ROLE_USER',1);
Replace XXXX with the login_id of the user ('userA'). You can replace 'ROLE_USER' with 'ROLE_SYS_ADMIN' if you want the user to be an admin.
12-21-2016
07:11 PM
4 Kudos
SYMPTOM: The HDP upgrade failed on HDFS startup. The NameNode was not able to start, and below were the log messages -
ERROR:
From the detailed logs we see the below error -
ROOT CAUSE: The above log clearly indicates a "ClassNotFound" error. The customer had integrated a custom jar in Hadoop, which was causing the issue.
RESOLUTION: A custom jar was already in place with the previous HDP version [located in path - /usr/hdp/2.4.3.0-37/hadoop/sas*.jar]. Adding the jar from the earlier version to the upgraded version path [i.e. /usr/hdp/2.5.3.0-37/hadoop/] resolved the issue.
[Note: There was a custom implementation of SAS with Hadoop for this setup, and hence the custom jars were present in the path mentioned above, i.e. /usr/hdp/2.4.3.0-37/hadoop/sas*.jar. A default setup never includes any custom app/jar implementation with Hadoop. It usually refers only to org.apache.hadoop classes.]
12-22-2016
02:12 AM
@Jay SenSharma Although the above-mentioned steps could be simplified (there is no need to create another user), they should do what is needed to unblock the user from this issue.
12-20-2016
01:59 PM
5 Kudos
SYMPTOM: Access Audit logs show times 6 hours behind Central Timezone. This is related to https://issues.apache.org/jira/browse/RANGER-336
ERROR: Ranger Access Audit logs show times 6 hours behind Central Timezone.
ROOT CAUSE: This is a bug - https://issues.apache.org/jira/browse/RANGER-336
RESOLUTION: Create a file named "ranger-admin-env-javaopts.sh" in the path "/usr/hdp/current/ranger-admin/conf/" with the below entry -
export JAVA_OPTS=" ${JAVA_OPTS} -Duser.timezone=UTC"
Save the file and restart the Ranger Admin service.
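To illustrate how a six-hour lag of this kind can appear, here is a small Python 3 sketch. It assumes a wall-clock time recorded in Central Standard Time is read as if it were UTC and then converted back to Central. Whether this is the exact mechanism behind RANGER-336 is an assumption; the fixed UTC-6 offset (daylight saving ignored) and the function name are also illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Central Standard Time as a fixed UTC-6 offset (daylight saving ignored).
CENTRAL_STANDARD = timezone(timedelta(hours=-6), "CST")


def misread_as_utc(local_wall_clock):
    """Treat a naive Central wall-clock time as UTC, then display it in Central."""
    return local_wall_clock.replace(tzinfo=timezone.utc).astimezone(CENTRAL_STANDARD)


recorded = datetime(2016, 12, 20, 13, 59)  # naive wall-clock time of an event
shown = misread_as_utc(recorded)
print(recorded.isoformat(), "is shown as", shown.replace(tzinfo=None).isoformat())
```

Setting -Duser.timezone=UTC, as in the resolution above, makes the JVM's default time zone UTC.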
12-20-2016
01:59 PM
5 Kudos
SYMPTOM: Ambari agent not able to register with Ambari server.
ERROR: Below are the error logs -
ERROR 2016-12-19 10:17:54,387 Controller.py:194 - Unable to connect to: https://oser402529.host.com:8441/agent/v1/register/oser402566.host.com
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 137, in registerWithServer
data = json.dumps(self.register.build(self.version))
File "/usr/lib/python2.6/site-packages/ambari_simplejson/__init__.py", line 230, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 200, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 260, in iterencode
return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xac in position 928: invalid start byte
ERROR 2016-12-19 10:17:54,388 Controller.py:195 - Error:'utf8' codec can't decode byte 0xac in position 928: invalid start byte
WARNING 2016-12-19 10:17:54,388 Controller.py:196 - Sleeping for 11 seconds and then trying again
ERROR 2016-12-19 10:18:05,686 Controller.py:194 - Unable to connect to: https://oser402529.host.com:8441/agent/v1/register/oser402566.host.com
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 137, in registerWithServer
data = json.dumps(self.register.build(self.version))
File "/usr/lib/python2.6/site-packages/ambari_simplejson/__init__.py", line 230, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 200, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python2.6/site-packages/ambari_simplejson/encoder.py", line 260, in iterencode
return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xac in position 928: invalid start byte
ROOT CAUSE: This is a bug - https://hortonworks.jira.com/browse/BUG-52919
RESOLUTION: Create the file /usr/lib/python2.6/site-packages/sitecustomize.py with the below content, then restart ambari-agent.
import sys
sys.setdefaultencoding('utf-8')
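To see why byte 0xac triggers the "invalid start byte" error, and to locate such a byte in arbitrary data, a minimal sketch follows. It is written for Python 3 (the agent itself runs on Python 2.6), and the function name and sample bytes are hypothetical.

```python
def first_invalid_utf8(data):
    """Return (position, byte value) of the first byte that fails UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


# Hypothetical sample: 0xac is a continuation byte, so it cannot start a UTF-8 sequence.
sample = b"host\xacname"
print(first_invalid_utf8(sample))
```

Running a check like this over the data the agent collects for registration could help identify which field carries the non-UTF-8 byte.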