Member since 01-07-2016
33 Posts
15 Kudos Received
3 Solutions
12-31-2016
11:00 PM
1 Kudo
Issue: In a heavily utilized Kafka cluster, AMS keeps crashing with the error:
ERROR org.apache.hadoop.hbase.client.AsyncProcess: Cannot get replica 0 location for {"totalColumns":5,"row":"kafka.server.FetcherLagMetrics."
Solution:
1. Run the following command to gauge the number of metrics being collected:
curl http://<Ambari-metrics-collector-host>:6188/ws/v1/timeline/metrics/metadata
2. From the Ambari UI, go to Kafka -> Configs and filter for: "external.kafka.metrics.exclude.prefix"
3. Add the following at the end of the value: kafka.log.Log
4. Restart Kafka.
This excludes the additional metrics from being captured and improves the stability of AMS.
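As a rough way to gauge the metric volume from step 1, the entries in the metadata JSON can be counted. This is only a sketch: the "metricname" field is an assumption about the AMS metadata endpoint's output and should be verified against your AMS version; the host in the usage comment is a placeholder.

```shell
# Count "metricname" occurrences in AMS metadata JSON read from stdin.
# The field name is assumed; check a sample of your endpoint's output first.
count_metric_entries() {
  grep -o '"metricname"' | wc -l
}
# Usage (placeholder host):
# curl -s "http://<Ambari-metrics-collector-host>:6188/ws/v1/timeline/metrics/metadata" | count_metric_entries
```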
12-31-2016
09:21 PM
Error:
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '115' for key 'PRIMARY'
ROOT CAUSE: A corrupted Ambari database; the error above is seen in the Hive View.
SOLUTION: Truncate the following tables in the Ambari database:
DS_FILERESOURCEITEM_1
DS_JOBIMPL_3
DS_SAVEDQUERY_6
DS_STOREDOPERATIONHANDLE_5
DS_TESTBEAN_4
DS_UDF_2
Restart the Ambari server; the views are accessible again and queries run.
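The truncation can be scripted as below. This is a sketch only: the mysql connection details in the usage comment are placeholders, and the Ambari database should be backed up before truncating anything.

```shell
# Build the TRUNCATE statements for a list of tables (read-only helper;
# it only constructs the SQL string, it does not touch the database).
build_truncate_sql() {   # build_truncate_sql TABLE...
  sql=""
  for t in "$@"; do sql="${sql}TRUNCATE TABLE ${t}; "; done
  printf '%s' "$sql"
}
# Usage (placeholder credentials/database name):
# mysql -u ambari -p ambari -e "$(build_truncate_sql DS_FILERESOURCEITEM_1 DS_JOBIMPL_3 DS_SAVEDQUERY_6 DS_STOREDOPERATIONHANDLE_5 DS_TESTBEAN_4 DS_UDF_2)"
```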
12-31-2016
08:21 PM
1 Kudo
Issue: The Solr service check fails from Ambari when NameNode HA is enabled, with the error:
Unable to create core [collection1_shard2_replica1] Caused by: Connection refused
Solution 1: Add the line SOLR_HDFS_CONFIG=/etc/hadoop/conf at the end of the solr-config-env content, then restart Solr; the service checks should now pass.
Solution 2: Another workaround is to edit the following file on your Ambari server: /var/lib/ambari-server/resources/mpacks/solr-ambari-mpack-5.5.2.2.5/common-services/SOLR/5.5.2.2.5/package/scripts/solr.py, applying this change: https://github.com/lucidworks/solr-stack/commit/7b79894b37b862b86d80c64b34230bc9fed6e54a. Then restart the Ambari server. With this change in place, the Solr instances start and work correctly with NameNode HA. This will be fixed in HDP 2.6.
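Solution 1 amounts to a one-line append. The post makes this edit through the Ambari UI (solr-config-env); the direct file edit sketched here is illustrative only, and the helper is written to be idempotent so repeated runs do not duplicate the line.

```shell
# Append SOLR_HDFS_CONFIG to a solr-config-env file only if it is absent.
ensure_solr_hdfs_config() {   # ensure_solr_hdfs_config FILE
  grep -q '^SOLR_HDFS_CONFIG=' "$1" || echo 'SOLR_HDFS_CONFIG=/etc/hadoop/conf' >> "$1"
}
```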
12-31-2016
08:14 PM
1 Kudo
Issue: When a custom PID location is configured for services, along with a non-standard service account other than user "ranger", Ambari shows the Ranger service as stopped before the upgrade is finalized.
Solution: First confirm that the Ranger process is actually running:
ps -ef | grep ranger
The pidf and chown values are hard-coded in the /usr/bin/ranger* scripts (for example /usr/bin/ranger-admin), from which start and stop are invoked when triggered from the Ambari UI. After changing those values to the custom parameters, Ambari reports the service as running.
Previous value:
cd /usr/bin
cat ranger-admin | grep -i pid
pidf=/var/run/ranger/rangeradmin.pid
New value:
vi ranger-admin
pidf=<custom location>
Save and quit.
After changing these values, restart the Ranger service from Ambari.
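The manual vi edit above can be sketched as a sed one-liner. Treat it as a sketch: apply it to a copy of /usr/bin/ranger-admin first, and substitute your real custom PID path (the one in the usage comment is hypothetical).

```shell
# Rewrite the hard-coded pidf line in a Ranger start script.
set_ranger_pidf() {   # set_ranger_pidf SCRIPT NEW_PIDF
  sed -i "s|^pidf=.*|pidf=$2|" "$1"
}
# Usage (hypothetical custom location):
# set_ranger_pidf /usr/bin/ranger-admin /var/run/custom-ranger/rangeradmin.pid
```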
12-31-2016
07:34 PM
2 Kudos
Error: ERROR [2016-12-13 00:48:04,166] ({pool-2-thread-2} Job.java[run]:189) - Job failed
java.lang.NoClassDefFoundError: org/apache/hadoop/security/UserGroupInformation$AuthenticationMethod
at org.apache.zeppelin.jdbc.security.JDBCSecurityImpl.getAuthtype(JDBCSecurityImpl.java:66)
....
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.security.UserGroupInformation$AuthenticationMethod
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 15 more
Solution: On HDP cluster nodes without internet access, the local-repo folder that holds the jar files needed by the Zeppelin interpreters does not get created when the interpreters run. To work around this, copy the required jars to a local location and point the interpreter settings in the Zeppelin UI at them. Create the directory /usr/hdp/current/zeppelin-server/jarss/ and copy all of the required jars into it:
/usr/hdp/current/zeppelin-server/jarss/hive-jdbc-2.0.1-standalone.jar
/usr/hdp/current/zeppelin-server/jarss/hadoop-common-2.7.3.2.5.0.0-1245.jar
/usr/hdp/current/zeppelin-server/jarss/hive-shims-0.23-2.1.0.2.5.0.0-1245.jar
/usr/hdp/current/zeppelin-server/jarss/commons-configuration-1.10.jar
/usr/hdp/current/zeppelin-server/jarss/hadoop-auth-2.7.3.2.5.0.0-1245.jar
/usr/hdp/current/zeppelin-server/jarss/curator-client-2.7.1.jar
/usr/hdp/current/zeppelin-server/jarss/curator-framework-2.7.1.jar
/usr/hdp/current/zeppelin-server/jarss/zookeeper-3.4.6.2.5.0.0-1245.jar
/usr/hdp/current/zeppelin-server/jarss/commons-lang3-3.3.2.jar
Specify the complete path to each jar under Zeppelin UI --> Interpreter --> Jdbc.
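The staging step above can be sketched as a small helper. Source locations for the jars vary by cluster, so gather the files listed above first; the paths in the usage comment are examples only.

```shell
# Create the staging folder and copy the given jars into it.
stage_jars() {   # stage_jars DEST_DIR JAR...
  dest=$1; shift
  mkdir -p "$dest"
  cp "$@" "$dest"/
}
# Usage (example source path):
# stage_jars /usr/hdp/current/zeppelin-server/jarss /tmp/staged-jars/*.jar
```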
12-31-2016
07:17 PM
Change livy.spark.master to yarn-cluster and add the following environment variables to zeppelin-env from Ambari:
export PYSPARK_DRIVER_PYTHON=path_to_python2.7
export PYSPARK_PYTHON=path_to_python2.7
After this and a restart, the Livy Spark interpreter started to work.
12-31-2016
07:05 PM
1 Kudo
Error: %jdbc (Hive)
java.util.ServiceConfigurationError: javax.xml.parsers.DocumentBuilderFactory: Provider org.apache.xerces.jaxp.DocumentBuilderFactoryImpl not found
at java.util.ServiceLoader.fail(ServiceLoader.java:239)
at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:372)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at javax.xml.parsers.FactoryFinder$1.run(FactoryFinder.java:294)
at java.security.AccessController.doPrivileged(Native Method)
at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:289)
at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:267)
at javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:120)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2549)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2526)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2418)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1143)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1115)
SOLUTION: Copy the xercesImpl*.jar and xml-apis* jars into the jdbc interpreter directory. They can be found by running locate xercesImpl* on any node in the cluster. For example, copy:
/usr/hdp/<hadoop-version>/hadoop/client/xercesImpl-2.9.1.jar
to:
/usr/hdp/current/zeppelin-server/interpreter/jdbc/
Then restart the interpreter.
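The find-and-copy step can be sketched as below. The search root and destination in the usage comment come from the post's example paths; adjust both per cluster (the post uses locate, find is used here as a self-contained equivalent).

```shell
# Find xerces jars under a directory tree and copy them into a destination.
copy_xerces_jars() {   # copy_xerces_jars SEARCH_ROOT DEST_DIR
  mkdir -p "$2"
  find "$1" -name 'xercesImpl*.jar' -o -name 'xml-apis*.jar' | while read -r j; do
    cp "$j" "$2"/
  done
}
# Usage (paths from the post's example):
# copy_xerces_jars /usr/hdp /usr/hdp/current/zeppelin-server/interpreter/jdbc
```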
12-31-2016
06:46 PM
1 Kudo
Error: at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:302)
at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:120)
at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:693)
... 43 more
Caused by: KrbException: Identifier doesn't match expected value (906)
at sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
at sun.security.krb5.internal.TGSRep.init(TGSRep.java:65)
at sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:60)
at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:55)
... 49 more
16/11/29 13:13:12 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
Cause: The cause for this issue was that there were multiple accounts in the Active Directory that had a servicePrincipalName value containing the Zookeeper principal names - "zookeeper/<hostname>". This was found by issuing an ldapsearch like: ldapsearch -h <host> -D <user principal> -W -b "<bind dn - something high in the tree>" '(servicePrincipalName=zookeeper/<zk server hostname>)' dn
This request found 2 accounts containing the requested SPN. One way to detect this issue: after authenticating (kinit-ing) as any valid user, issue a kvno command such as kvno zookeeper/abc.ambari.apache.org. If this fails while a different service principal (such as nn/abc.ambari.apache.org) succeeds, the cause above may be the problem. Solution: Find all duplicated SPN values and remove the non-Ambari-managed entries from the Active Directory. Then restart all of the services. Optionally, regenerate all of the keytab files to make sure everything is in a good state.
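The duplicate check from the ldapsearch above can be reduced to counting the returned DNs; more than one entry for the same SPN indicates the problem. The ldapsearch arguments in the usage comment are the post's placeholders.

```shell
# Count "dn:" lines in ldapsearch output read from stdin; a result greater
# than 1 means the SPN exists on multiple Active Directory accounts.
count_spn_entries() {
  grep -c '^dn:'
}
# Usage (placeholders as in the post):
# ldapsearch -h <host> -D <user principal> -W -b "<bind dn>" \
#   '(servicePrincipalName=zookeeper/<zk server hostname>)' dn | count_spn_entries
```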