Member since
12-02-2015
42
Posts
28
Kudos Received
3
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1531 | 04-03-2018 03:30 AM
 | 938 | 04-25-2017 09:27 PM
 | 6606 | 03-22-2017 04:45 PM
06-27-2018
11:54 PM
1 Kudo
The current version of Ambari does not monitor the number of HiveServer2 connections. We often see HiveServer2 slowness under heavy load, partly driven by an increase in the number of connections to HiveServer2. Setting up an alert on HiveServer2 established connections helps us take the required actions, such as adding additional HiveServer2 instances, balancing load properly, or rescheduling jobs. NOTE: Please go through this article https://github.com/apache/ambari/blob/2.6.2-maint/ambari-server/docs/api/v1/alert-definitions.md to understand Ambari alert definitions. Please find the Python script and .json file used below in the attachments.
alert_hiveserver_num_connection.py - the Python script that finds the current established connections for each HiveServer2 instance and, based on the number of connections, returns a 'CRITICAL', 'WARN', or 'OK' alert.
alerths.json - the Ambari alert definition.
Below are the steps to set up the Ambari alert on HiveServer2 established connections.
Step 1 - Place the file "alert_hiveserver_num_connection.py" in the following path on the Ambari server: "/var/lib/ambari-server/resources/common-services/HIVE/0.12.0.2.0/package/alerts/"
[root@vb-atlas-ambari tmp]# cp alert_hiveserver_num_connection.py /var/lib/ambari-server/resources/common-services/HIVE/0.12.0.2.0/package/alerts/
Step 2 - Restart the Ambari server to force the Ambari agents to pull the alert_hiveserver_num_connection.py script to every host.
ambari-server restart
Once the Ambari server is restarted, we can verify that alert_hiveserver_num_connection.py is available in "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/alerts/" on the HiveServer2 host.
Note: Sometimes it takes longer for the Ambari agent to pull the script from the Ambari server.
[root@vb-atlas-node1 ~]# ll /var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/alerts/
total 116
-rw-r--r--. 1 root root 9740 Jun 27 17:01 alert_hive_interactive_thrift_port.py
-rw-r--r--. 1 root root 7893 Jun 27 17:01 alert_hive_interactive_thrift_port.pyo
-rw-r--r--. 1 root root 9988 Jun 27 17:01 alert_hive_metastore.py
-rw-r--r--. 1 root root 9069 Jun 27 17:01 alert_hive_metastore.pyo
-rw-r--r--. 1 root root 1888 Jun 27 17:01 alert_hiveserver_num_connection.py
-rw-r--r--. 1 root root 11459 Jun 27 17:01 alert_hive_thrift_port.py
-rw-r--r--. 1 root root 9362 Jun 27 17:01 alert_hive_thrift_port.pyo
-rw-r--r--. 1 root root 11946 Jun 27 17:01 alert_llap_app_status.py
-rw-r--r--. 1 root root 9339 Jun 27 17:01 alert_llap_app_status.pyo
-rw-r--r--. 1 root root 8886 Jun 27 17:01 alert_webhcat_server.py
-rw-r--r--. 1 root root 6563 Jun 27 17:01 alert_webhcat_server.pyo
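For context, the script's check is conceptually similar to counting established TCP connections on the HiveServer2 port. The one-liner below is only an illustration and assumes the default HiveServer2 port 10000; the attached script's actual logic and thresholds may differ.
# Count established connections to the (assumed) default HiveServer2 port 10000
netstat -ant | grep ':10000' | grep -c ESTABLISHED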
Step 3 - Post the alert definition (alerths.json) to Ambari using curl:
curl -u <Ambari_admin_username>:<Ambari_admin_password> -i -H 'X-Requested-By:ambari' -X POST -d @alerths.json http://<AMBARI_HOST>:<AMBARI_PORT>/api/v1/clusters/<CLUSTER_NAME>/alert_definitions
Example:
[root@vb-atlas-ambari tmp]# curl -u admin:admin -i -H 'X-Requested-By:ambari' -X POST -d @alerths.json http://172.26.108.142:8080/api/v1/clusters/vinod/alert_definitions
HTTP/1.1 201 Created
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Cache-Control: no-store
Pragma: no-cache
Set-Cookie: AMBARISESSIONID=10f33laf224yy1834ygq9cekbo;Path=/;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
User: admin
Content-Type: text/plain
Content-Length: 0
We should now be able to see the alert under Ambari -> Alerts (HiveServer2 Established Connections). Alternatively, we can also see "HiveServer2 Established Connections" listed in the alert definitions at "http://<AMBARI_HOST>:<AMBARI_PORT>/api/v1/clusters/<CLUSTER_NAME>/alert_definitions".
Step 4 - Per the alert definition (alerths.json), the CRITICAL alert defaults to 50 connections and WARNING to 30 connections. You can update these values directly from Ambari by editing the alert.
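To confirm the definition from the command line, you can query the same REST endpoint and filter on the alert label (the credentials below are placeholders, and the exact label text comes from the attached alerths.json):
# List alert definition labels and look for the new alert
curl -u admin:admin -H 'X-Requested-By:ambari' "http://<AMBARI_HOST>:<AMBARI_PORT>/api/v1/clusters/<CLUSTER_NAME>/alert_definitions?fields=AlertDefinition/label" | grep "HiveServer2 Established Connections"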
06-14-2018
05:35 PM
2 Kudos
In DPS 1.1.0 we can't remove a cluster from the DPS UI, but we can use curl to remove it. Note: a user with the DataPlane Admin role can perform the steps below. screen-shot-2018-06-14-at-102253-am.png
To delete the smayani-hdp cluster:
Step 1 - Find the ID of the cluster you want to remove. You can use the browser developer tools to find the cluster ID. screen-shot-2018-06-14-at-101629-am.png
From the above example, the smayani-hdp cluster ID is 3 (https://172.26.125.109/api/lakes/3/servicesDetails).
Step 2 - From the console, use the curl command below to remove the cluster.
curl -k -u <username>:<password> -X DELETE https://<DPS_HOST>/api/lakes/<cluster_ID>
Example: curl -k -u admin:kjncsadasdcsdc -X DELETE https://172.26.125.109/api/lakes/3
Once the above is executed, you should no longer see the cluster in the UI. screen-shot-2018-06-14-at-102509-am.png
Alternatively, you can also use rm_dp_cluster.sh in /usr/dp/current/core/bin on the server where DPS is installed.
Usage: ./rm_dp_cluster.sh DP_JWT HADOOP_JWT DP_HOST_NAME CLUSTER_NAME DATA_CENTER_NAME
DP_JWT: value of the dp_jwt cookie from a valid user's browser session
HADOOP_JWT: value of the hadoop-jwt cookie from a valid user's browser session
DP_HOST_NAME: hostname or IP address of the DataPlane server
CLUSTER_NAME: name of the cluster to delete
DATA_CENTER_NAME: name of the data center the cluster belongs to
You can use the browser developer tools to find the cookies (DP_JWT, HADOOP_JWT).
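For illustration, a sample invocation of the script using the cluster from this example (the JWT values and data center name are placeholders; substitute the real cookie values from your browser session):
# All bracketed arguments are placeholders
./rm_dp_cluster.sh <dp_jwt_cookie_value> <hadoop_jwt_cookie_value> 172.26.125.109 smayani-hdp <DATA_CENTER_NAME>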
06-06-2018
09:35 PM
OBJECTIVE: Update the log configuration of the DPS app. For example, the default log file is set to logs/application.log, which can be changed, or the log level can be raised to DEBUG for troubleshooting. Since the DP app runs in Docker, we can use docker commands to update the configuration. STEPS:
1. Find the Docker container running the DP app on the host running DPS, using "docker ps".
[root@dps-node ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
abd412417907 hortonworks/dlm-app:1.1.0.0-41 "runsvdir /etc/sv" 28 hours ago Up 2 hours 9011/tcp dlm-app
62620e578e31 hortonworks/dp-app:1.1.0.0-390 "/bootstrap.sh" 2 days ago Up 16 minutes 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp, 9000/tcp dp-app
38dda17dfdf4 hortonworks/dp-cluster-service:1.1.0.0-390 "./docker_service_st…" 2 days ago Up 2 days 9009-9010/tcp
Copy the container ID; in the above example it is "62620e578e31".
2. Get the current logback.xml file:
[root@dps-node ~]# docker exec -it 62620e578e31 /bin/cat /usr/dp-app/conf/logback.xml > logback.xml
3. Update the configuration in the local logback.xml we redirected to in the command above. Below, I have updated the location from the default logs/application.log to /usr/dp-app/logs/.
<configuration>
<appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>/usr/dp-app/logs/application.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
<!-- Daily rollover with compression -->
.
.
<appender name="AKKA" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>/usr/dp-app/logs/akka.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
.
.
.
</encoder>
</appender>
<appender name="ACCESS_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>/usr/dp-app/logs/access.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
.
.
We can also update the log level:
<root level="DEBUG">
<appender-ref ref="FILE"/>
</root>
4. If needed, make a backup of the original logback.xml file, then copy in the updated logback.xml:
[root@dps-node ~]# docker exec -it 62620e578e31 /bin/cp /usr/dp-app/conf/logback.xml /usr/dp-app/conf/logback.xml.bck
[root@dps-node ~]# docker exec -i 62620e578e31 tee /usr/dp-app/conf/logback.xml < logback.xml
5. A restart of the Docker container is required to make the changes effective:
[root@dps-node ~]# docker restart 62620e578e31
6. Verify that the changes have taken effect:
[root@dps-node ~]# docker exec -it 62620e578e31 /bin/ls -lrt /usr/dp-app/logs
total 64
-rw-r--r-- 1 root root 0 Jun 6 20:50 access.log
-rw-r--r-- 1 root root 62790 Jun 6 21:27 application.log
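Optionally, you can follow the relocated log to confirm entries are being written at the new location and level (this assumes tail is available inside the container image):
# Follow the application log inside the dp-app container
docker exec -it 62620e578e31 tail -f /usr/dp-app/logs/application.log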
06-04-2018
05:52 PM
1 Kudo
Short Description: Describes how to manually regenerate keytabs for services through the Ambari REST API. Article: Make sure the KDC credentials are added to the Ambari credential store; you can follow this article to do that. Once the KDC credentials are added, you can use the Ambari REST API below to regenerate keytabs.
curl -H "X-Requested-By:ambari" -u <Ambari_Admin_username>:<Ambari_Admin_password> -X PUT -d '{ "Clusters": { "security_type" : "KERBEROS" } }' "http://<Ambari_HOST>:8080/api/v1/clusters/<Cluster_Name>/?regenerate_keytabs=all"
Example: curl -H "X-Requested-By:ambari" -u admin:admin -X PUT -d '{ "Clusters": { "security_type" : "KERBEROS" } }' "http://172.26.108.142:8080/api/v1/clusters/vinod/?regenerate_keytabs=all&ignore_config_updates=true"
(Note that the URL is quoted so the shell does not interpret the & in the query string.)
Once the keytabs are regenerated, a service restart is required to pick up the newly generated keytabs.
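The PUT above kicks off a background operation; you can watch its progress in the Ambari UI under background operations, or query it over the REST API (a sketch, assuming the standard requests endpoint and fields):
# Check status and progress of recent background operations (keytab regeneration appears here)
curl -u admin:admin -H "X-Requested-By:ambari" "http://<Ambari_HOST>:8080/api/v1/clusters/<Cluster_Name>/requests?fields=Requests/request_context,Requests/request_status,Requests/progress_percent"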
04-23-2018
10:19 PM
Problem Description: Atlas uses Solr to store lineage metadata and uses ZooKeeper for coordination and for storing/maintaining configuration. Due to heavy load on ZooKeeper on larger clusters, we need to increase the ZooKeeper session timeout for some services from the default. One such configuration is the Ambari Infra (Solr) ZooKeeper timeout on the Atlas side.
ERROR:
java.util.concurrent.TimeoutException: Could not connect to ZooKeeper vb-hortonwork.com:2181/infra-solr within 15000 ms
java.util.concurrent.TimeoutException: Could not connect to ZooKeeper vb-hortonwork.com:2181/infra-solr within 15000 ms
RESOLUTION: We can increase the session timeout from the default 15000 ms by adding the properties below under custom application-properties in Atlas -> Configs:
atlas.graph.index.search.solr.zookeeper-connect-timeout=60000
atlas.graph.index.search.solr.zookeeper-session-timeout=60000
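It can also help to confirm that the infra-solr znode is reachable from the Atlas host before or after raising the timeouts. A quick check, assuming the ZooKeeper client is installed at the usual HDP location (the hostname below is the one from the error message):
# List the infra-solr znode to verify basic ZooKeeper connectivity
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server vb-hortonwork.com:2181 ls /infra-solr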
04-03-2018
03:30 AM
1 Kudo
@Saikiran Parepally It's been fixed in HDF 3.1. Please use the nifi.web.proxy.host property to add the hosts.
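For example, the property takes a comma-separated list of host (and optionally host:port) values; the entries below are placeholders for your proxy or NiFi node hostnames:
# nifi.properties (managed via Ambari in HDF) - hostnames/ports are placeholders
nifi.web.proxy.host=proxy1.example.com:9091,proxy2.example.com:9091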
01-26-2018
06:05 PM
2 Kudos
Generally we use chown/chmod to change ownership and permissions. When we run chown/chmod on a directory that contains several million objects, it can take very long, sometimes even days. To reduce the time and make the changes in one command instead of two (chown and chmod), you can use DistCh, which is faster than regular chown and chmod. Below is the command to invoke DistCh, followed by its usage (a sample invocation is shown after the option list):
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-extras.jar org.apache.hadoop.tools.DistCh
java org.apache.hadoop.tools.DistCh [OPTIONS] <path:owner:group:permission>
The values of owner, group and permission can be empty.
Permission is an octal number.
OPTIONS:
-f <urilist_uri> Use list at <urilist_uri> as src list
-i Ignore failures
-log <logdir> Write logs to <logdir>
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
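As an example (the HDFS path, owner, group, and mode below are made up for illustration), a single DistCh job like the following would set owner hive, group hadoop, and permissions 750 on everything under /data/projects:
# path:owner:group:permission - owner, group, or permission may be left empty to keep its current value
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-extras.jar org.apache.hadoop.tools.DistCh /data/projects:hive:hadoop:750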
11-08-2017
08:41 PM
4 Kudos
Short Description: The Spark HBase Connector (SHC) is currently hosted in the Hortonworks repo and published as a Spark package. Below is a simple example of how to access an HBase table in the Spark shell and load the data into a DataFrame. Once the data is in a DataFrame, we can use SQLContext to run queries on it.
Article: The documentation here leaves out a few pieces needed to access HBase tables using SHC with the Spark shell. Here is an example of accessing the HBase "emp" table in the Spark shell.
HBase shell: Create a simple "emp" HBase table using the HBase shell and insert sample data:
create 'emp', 'personal data', 'professional data'
put 'emp','1','personal data:name','raju'
put 'emp','1','personal data:city','hyderabad'
put 'emp','1','professional data:designation','manager'
put 'emp','1','professional data:salary','50000'
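Optionally, you can double-check the inserted rows from the OS shell by piping a scan into a non-interactive HBase shell:
# Scan the emp table to verify the sample rows
echo "scan 'emp'" | hbase shell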
Once created, exit the HBase shell and run the Spark shell, providing the SHC package and hbase-site.xml:
/usr/hdp/current/spark-client/bin/spark-shell --packages zhzhan:shc:0.0.11-1.6.1-s_2.10 --files /etc/hbase/conf/hbase-site.xml
Import the required classes:
scala> import org.apache.spark.sql.{SQLContext, _}
import org.apache.spark.sql.{SQLContext, _}
scala> import org.apache.spark.sql.execution.datasources.hbase._
import org.apache.spark.sql.execution.datasources.hbase._
scala> import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.{SparkConf, SparkContext}
Define the HBase catalog for mapping the table; the rowkey is also defined as a column (empNumber) with a specific column family (rowkey).
scala> def empcatalog = s"""{
"table":{"namespace":"default", "name":"emp"},
"rowkey":"key",
"columns":{
"empNumber":{"cf":"rowkey", "col":"key", "type":"string"},
"city":{"cf":"personal data", "col":"city", "type":"string"},
"empName":{"cf":"personal data", "col":"name", "type":"string"},
"jobDesignation":{"cf":"professional data", "col":"designation", "type":"string"},
"salary":{"cf":"professional data", "col":"salary", "type":"string"}
}
}""".stripMargin
Perform DataFrame operations on top of the HBase table. First we define a helper, then load the data into a DataFrame.
scala> def withCatalog(empcatalog: String): DataFrame = {
sqlContext
.read
.options(Map(HBaseTableCatalog.tableCatalog->empcatalog))
.format("org.apache.spark.sql.execution.datasources.hbase")
.load()
}
withCatalog: (empcatalog: String)org.apache.spark.sql.DataFrame
scala> val df = withCatalog(empcatalog)
df: org.apache.spark.sql.DataFrame = [city: string, empName: string, jobDesignation: string, salary: string, empNumber: string]
scala> df.show
17/11/08 18:04:22 INFO RecoverableZooKeeper: Process identifier=hconnection-0x55a690be connecting to ZooKeeper ensemble=vb-atlas-node1.hortonworks.com:2181,vb-atlas-node2.hortonworks.com:2181,vb-atlas-ambari.hortonworks.com:2181
17/11/08 18:04:22 INFO ZooKeeper: Client environment:zookeeper.version=3.4.6-8--1, built on 04/01/201
.
.
.
17/11/08 18:04:24 INFO DAGScheduler: ResultStage 0 (show at <console>:39) finished in 1.011 s
17/11/08 18:04:24 INFO DAGScheduler: Job 0 finished: show at <console>:39, took 1.230151 s
+---------+-------+--------------+------+---------+
| city|empName|jobDesignation|salary|empNumber|
+---------+-------+--------------+------+---------+
| chennai| ravi| manager| 50000| 1|
|hyderabad| raju| engineer| null| 2|
| delhi| rajesh| jrenginner| null| 3|
+---------+-------+--------------+------+---------+
We can query the DataFrame using sqlContext.
scala> df.registerTempTable("table")
scala> sqlContext.sql("select empNumber,jobDesignation from table").show
+---------+--------------+
|empNumber|jobDesignation|
+---------+--------------+
| 1| manager|
| 2| engineer|
| 3| jrenginner|
+---------+--------------+
Reference : https://hortonworks.com/blog/spark-hbase-dataframe-based-hbase-connector/ https://github.com/hortonworks-spark/shc/blob/master/examples/src/main/scala/org/apache/spark/sql/execution/datasources/hbase/HBaseSource.scala
09-06-2017
03:25 PM
@Nick Price Which version?
05-18-2017
03:02 PM
@Juan Manuel Nieto Yes, if it's a Kerberized environment you need to provide the keytab to authenticate. Since you are using a shell action, you can use kinit too.
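For instance, a kinit at the start of the shell action script might look like this (the keytab path, principal, and realm are placeholders):
# Authenticate before running the rest of the script; keytab and principal are placeholders
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM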