Member since: 04-25-2016
579 Posts · 609 Kudos Received · 111 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3329 | 02-12-2020 03:17 PM |
12-21-2016
05:07 PM
1 Kudo
SYMPTOM: Beeline connections to HiveServer2 fail intermittently with the following exception: "The initCause method cannot be used. To set the cause of this exception, use a constructor with a Throwable[] argument." Looking further at the HiveServer2 logs, we saw a stack trace like this:
java.sql.SQLException: Could not retrieve transation read-only status server
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
at org.datanucleus.api.jdo.JDOPersistenceManager.getDataStoreConnection(JDOPersistenceManager.java:2259)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getProductName(MetaStoreDirectSql.java:171)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.determineDbType(MetaStoreDirectSql.java:152)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:122)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:300)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:263)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
at com.sun.proxy.$Proxy7.setConf(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.setConf(HiveMetaStore.java:523)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:56)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5798)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
... 35 more
Caused by: java.sql.SQLException: Could not retrieve transation read-only status server
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1086)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:989)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:975)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:920)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:951)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:941)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3972)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3943)
at com.jolbox.bonecp.ConnectionHandle.isReadOnly(ConnectionHandle.java:867)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:422)
at org.datanucleus.store.rdbms.RDBMSStoreManager.getNucleusConnection(RDBMSStoreManager.java:1382)
at org.datanucleus.api.jdo.JDOPersistenceManager.getDataStoreConnection(JDOPersistenceManager.java:2245)
... 51 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 381,109 milliseconds ago. The last packet sent successfully to the server was
381,109 milliseconds ago.
at sun.reflect.GeneratedConstructorAccessor55.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1129)
at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3988)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2598)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2778)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2828)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2777)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1651)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3966)
... 56 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3969)
ROOT CAUSE: After analyzing a TCP dump, we found that the MySQL server was dropping network packets very frequently. WORKAROUND: NA RESOLUTION: Check and fix the network issue between the HiveServer2 host and the MySQL host.
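Before (or alongside) a full packet capture, a crude connectivity probe run from the HiveServer2 host toward the MySQL port can confirm where the drops are happening. This is a minimal sketch; the hostname in the comment is a hypothetical placeholder, not taken from the article:

```python
import socket
import time

def check_mysql_reachability(host, port=3306, attempts=5, timeout=3.0):
    """Open a TCP connection to the MySQL port several times and count
    failures; frequent resets or timeouts here point at the network path
    (or the MySQL server) rather than at HiveServer2 itself."""
    failures = 0
    for _ in range(attempts):
        start = time.time()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                print("connect ok in %.1f ms" % ((time.time() - start) * 1000))
        except OSError as exc:
            failures += 1
            print("connect failed: %s" % exc)
    return failures

# Hypothetical host; substitute the real MySQL server used by the metastore.
# check_mysql_reachability("mysql-host.example.com")
```

A non-zero failure count from this probe, matching the "Connection reset" in the stack trace above, supports the TCP-dump finding that the problem sits between the two hosts.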
12-21-2016
04:17 PM
2 Kudos
SYMPTOM: HiveServer2 remains in a hung state; jstack reveals the following trace:
"HiveServer2-Handler-Pool: Thread-139105" prio=10 tid=0x00007ff34e080800 nid=0x3d43e in Object.wait() [0x00007ff30974e000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at org.apache.hadoop.ipc.Client.call(Client.java:1417)
- locked <0x00000003e1c5f298> (a org.apache.hadoop.ipc.Client$Call)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy23.checkAccess(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.checkAccess(ClientNamenodeProtocolTranslatorPB.java:1469)
at sun.reflect.GeneratedMethodAccessor93.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy24.checkAccess(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.checkAccess(DFSClient.java:3472)
at org.apache.hadoop.hdfs.DistributedFileSystem$53.doCall(DistributedFileSystem.java:2270)
at org.apache.hadoop.hdfs.DistributedFileSystem$53.doCall(DistributedFileSystem.java:2267)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.access(DistributedFileSystem.java:2267)
at sun.reflect.GeneratedMethodAccessor92.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.shims.Hadoop23Shims.checkFileAccess(Hadoop23Shims.java:1006)
at org.apache.hadoop.hive.common.FileUtils.checkFileAccessWithImpersonation(FileUtils.java:378)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:417)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.isURIAccessAllowed(RangerHiveAuthorizer.java:752)
at org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.checkPrivileges(RangerHiveAuthorizer.java:252)
at org.apache.hadoop.hive.ql.Driver.doAuthorizationV2(Driver.java:837)
at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:628)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:504)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:316)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1189)
- locked <0x0000000433e8e4b8> (a java.lang.Object)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1183)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:419)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:400)
at sun.reflect.GeneratedMethodAccessor148.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy37.executeStatement(Unknown Source)
at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:263)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1317)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1302)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
ROOT CAUSE: hive.security.authorization.enabled is true and the user is running a CREATE EXTERNAL TABLE command pointing at a non-existent directory, so the authorizer checks the parent directory and all files/directories inside it recursively. The issue is reported in https://issues.apache.org/jira/browse/HIVE-10022. WORKAROUND: Restart HiveServer2 and create the table in a directory that has few files under it. RESOLUTION: The fix for this is available as HOTFIX-332; if you are using Ranger-based authorization, please get the fix for RANGER-1126.
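To see why compilation can hang for a long time, consider how the per-file access check multiplies over the hierarchy. Below is a minimal Python sketch (not Hive code) of the kind of recursive walk FileUtils.isActionPermittedForFileHierarchy performs; every file and directory costs one check, so a large parent directory means a correspondingly large number of NameNode round trips during query compilation:

```python
import os
import tempfile

def count_access_checks(path):
    """Mimic the recursive walk in FileUtils.isActionPermittedForFileHierarchy:
    one access check for the path itself, then one per entry, recursing into
    every subdirectory."""
    checks = 1
    if os.path.isdir(path):
        for entry in os.listdir(path):
            checks += count_access_checks(os.path.join(path, entry))
    return checks

# Build a small tree: 3 subdirectories with 100 empty files each.
root = tempfile.mkdtemp()
for d in range(3):
    sub = os.path.join(root, "dir%d" % d)
    os.mkdir(sub)
    for f in range(100):
        open(os.path.join(sub, "f%d" % f), "w").close()

print(count_access_checks(root))  # 1 root + 3 dirs + 300 files = 304
```

Scale the same arithmetic up to a directory with millions of entries and it becomes clear why the compile thread in the jstack above sits blocked in checkAccess.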
12-21-2016
02:21 PM
3 Kudos
ENV: HDP-2.5, Java: openjdk version "1.8.0_111". The following Storm topology consists of a KafkaSpout and a SinkTypeBolt.
Step 1: Create pom.xml with the following dependencies:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>hadoop</groupId>
<artifactId>KafkaSpoutStorm</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>stormkafka</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<repositories>
<repository>
<id>HDPReleases</id>
<name>HDP Releases</name>
<url>http://repo.hortonworks.com/content/repositories/public</url>
<layout>default</layout>
</repository>
<repository>
<id>HDPJetty</id>
<name>Hadoop Jetty</name>
<url>http://repo.hortonworks.com/content/repositories/jetty-hadoop/</url>
<layout>default</layout>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.11</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-core</artifactId>
<version>1.0.1.2.5.3.0-37</version>
<scope>provided</scope>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.10</artifactId>
<version>0.10.0.2.5.3.0-37</version>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-kafka</artifactId>
<version>1.0.1.2.5.3.0-37</version>
</dependency>
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-hdfs</artifactId>
<version>1.0.1.2.5.3.0-37</version>
</dependency>
<dependency>
<groupId>com.googlecode.json-simple</groupId>
<artifactId>json-simple</artifactId>
<version>1.1</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>1.4</version>
<configuration>
<createDependencyReducedPom>true</createDependencyReducedPom>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<transformers>
<transformer
implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
<transformer
implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>com.rajkrrsingh.storm.Topology</mainClass>
</transformer>
</transformers>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-resources-plugin</artifactId>
<version>2.4</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<executions>
<execution>
<id>attach-sources</id>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
<resources>
<resource>
<directory>src/main/java</directory>
<includes>
<include>**/*.properties</include>
</includes>
</resource>
</resources>
</build>
</project>
Step 2: Clone the git repo to get the complete code: git clone https://github.com/rajkrrsingh/KafkaSpoutStorm.git
Step 3: Modify default_config.properties according to your cluster.
Step 4: Build using Maven; this will create a fat jar in the target folder: mvn clean package
Step 5: Now run it on the Storm cluster: storm jar KafkaSpoutStorm-0.0.1-SNAPSHOT.jar com.rajkrrsingh.storm.Topology
12-20-2016
02:19 PM
2 Kudos
Running a Sqoop command from a Java program with the -verbose option can result in a race condition while obtaining a lock on the console appender. We can work around this with the help of the SSHXCUTE framework, which keeps the Java program and the Sqoop command in separate execution contexts. ENV: HDP 2.4, Java version: JDK 8.
Step 1: Download the sshxcute jar from https://sourceforge.net/projects/sshxcute/
Step 2: Create RunSqoopCommand.java
import net.neoremind.sshxcute.core.SSHExec;
import net.neoremind.sshxcute.core.ConnBean;
import net.neoremind.sshxcute.task.CustomTask;
import net.neoremind.sshxcute.task.impl.ExecCommand;
public class RunSqoopCommand {
public static void main(String args[]) throws Exception{
ConnBean cb = new ConnBean("localhost", "root","hadoop");
SSHExec ssh = SSHExec.getInstance(cb);
ssh.connect();
CustomTask sqoopCommand = new ExecCommand("sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true"
        + " -Dmapred.job.name=test --connect jdbc:oracle:thin:@10.0.2.12:1521:XE"
        + " --table TEST_INCREMENTAL -m 1 --username system"
        + " --password oracle --target-dir /tmp/test26"
        + " --verbose");
ssh.exec(sqoopCommand);
ssh.disconnect();
}
}
Step 3: Compile the program: javac -cp sshxcute-1.0.jar RunSqoopCommand.java
Step 4: Run the program: java -cp .:sshxcute-1.0.jar RunSqoopCommand
12-20-2016
01:59 PM
3 Kudos
These are the steps to build and run a Spark Streaming application; it was built and tested on an HDP-2.5 setup. ENV: HDP 2.5, Scala: 2.10.4, sbt: 0.13.11
mkdir spark-streaming-example
cd spark-streaming-example/
mkdir -p src/main/scala
cd src/main/scala
Sample code:
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{SQLContext, SaveMode}
import org.apache.spark.streaming.{Time, Seconds, StreamingContext}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel
object SqlNetworkWordCount {
def main(args: Array[String]) {
val sparkConf = new SparkConf().setAppName("SqlNetworkWordCount")
val ssc = new StreamingContext(sparkConf, Seconds(2))
val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
val words = lines.flatMap(_.split(" "))
// Convert RDDs of the words DStream to DataFrame and run SQL query
words.foreachRDD((rdd: RDD[String], time: Time) => {
val sqlContext = SQLContextSingleton.getInstance(rdd.sparkContext)
import sqlContext.implicits._
val wordsDataFrame = rdd.map(w => Record(w)).toDF()
wordsDataFrame.write.mode(SaveMode.Append).parquet("/tmp/parquet");
})
ssc.start()
ssc.awaitTermination()
}
}
case class Record(word: String)
object SQLContextSingleton {
@transient private var instance: SQLContext = _
def getInstance(sparkContext: SparkContext): SQLContext = {
if (instance == null) {
instance = new SQLContext(sparkContext)
}
instance
}
}

cd -
vim build.sbt

name := "Spark Streaming Example"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies ++= Seq("org.apache.spark" %% "spark-core" % "1.4.1", "org.apache.spark" %% "spark-streaming" % "1.4.1")

Now run sbt package from the project home; it will build a jar at target/scala-2.10/spark-streaming-example_2.10-1.0.jar.
Run this jar using spark-submit (the main class is the SqlNetworkWordCount object defined above):
bin/spark-submit --class SqlNetworkWordCount target/scala-2.10/spark-streaming-example_2.10-1.0.jar <hostname> 6002
To test the program, open a different terminal, run nc -lk `hostname` 6002, hit enter, and type anything on the console; the streaming job will pick up the words and append them to /tmp/parquet.
12-19-2016
12:40 PM
4 Kudos
While debugging a problem with topic deletion, I dug into the Kafka code to learn how the delete command works. This is the sequence of events during command execution:
1. TopicCommand issues the topic deletion: /usr/hdp/current/kafka-broker/bin/kafka-run-class.sh kafka.admin.TopicCommand --zookeeper rkk3.hdp.local:2181 --delete --topic sample
2. This creates a new admin path /admin/delete_topics/<topic>.
3. The controller listens for child changes on /admin/delete_topics and starts topic deletion for the respective topics.
4. The controller has a background thread that handles topic deletion. A topic's deletion can be started only by the onPartitionDeletion callback on the controller.
5. Once the DeleteTopicsThread is invoked, it looks at topicsToBeDeleted and, for each topic, deregisters the partition change listener on the deleted topic. This prevents the partition change listener from firing before the new topic listener when a deleted topic gets auto-created.
6. The controller removes the replicas from the state machine as well as from its partition assignment cache.
7. It deletes all topic state from the controllerContext as well as from ZooKeeper, and finally deletes the /brokers/topics/<topic> path.
8. onTopicDeletion is called back by the DeleteTopicsThread; this lets each broker know that the topic is being deleted and can be removed from their caches.
12-18-2016
06:03 PM
2 Kudos
ENV: HDP 2.4.2
STEP 1: Set up MySQL SSL
# Create clean environment
shell> rm -rf newcerts
shell> mkdir newcerts && cd newcerts
# Create CA certificate
shell> openssl genrsa 2048 > ca-key.pem
shell> openssl req -new -x509 -nodes -days 3600 \
-key ca-key.pem -out ca.pem
# Create server certificate, remove passphrase, and sign it
# server-cert.pem = public key, server-key.pem = private key
shell> openssl req -newkey rsa:2048 -days 3600 \
-nodes -keyout server-key.pem -out server-req.pem
shell> openssl rsa -in server-key.pem -out server-key.pem
shell> openssl x509 -req -in server-req.pem -days 3600 \
-CA ca.pem -CAkey ca-key.pem -set_serial 01 -out server-cert.pem
# Create client certificate, remove passphrase, and sign it
# client-cert.pem = public key, client-key.pem = private key
shell> openssl req -newkey rsa:2048 -days 3600 \
-nodes -keyout client-key.pem -out client-req.pem
shell> openssl rsa -in client-key.pem -out client-key.pem
shell> openssl x509 -req -in client-req.pem -days 3600 \
-CA ca.pem -CAkey ca-key.pem -set_serial 01 -out client-cert.pem

STEP 2: Update my.cnf as follows and restart MySQL:
[mysqld]
ssl-ca=/home/hive/ca-cert.pem
ssl-cert=/home/hive/server-cert.pem
ssl-key=/home/hive/server-key.pem

STEP 3: Grant privileges to the hive user:
mysql> GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%' IDENTIFIED BY 'hive' REQUIRE SSL;
mysql> FLUSH PRIVILEGES;

Import the client cert and key into a keystore. As there is no direct way to do it, I took help from this guide: http://www.agentbob.info/agentbob/79-AB.html. Convert the cert and PEM key into DER format and import them using the Java program provided at the link.

STEP 4: Edit hive-env.sh:
# specify truststore location and password in the hive client opts
if [ "$SERVICE" = "hiveserver2" ]; then
export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Djavax.net.ssl.trustStore=/home/hive/keystore.ImportKey -Djavax.net.ssl.trustStorePassword=importkey"
fi

STEP 5: Update hive-site.xml:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://sandbox.hortonworks.com/hive?createDatabaseIfNotExist=true&amp;useSSL=true&amp;verifyServerCertificate=false</value>
</property>

STEP 6: Restart HiveServer2, which is now able to connect to MySQL over SSL.
12-18-2016
12:29 PM
3 Kudos
ENV: HDP 2.5. Please follow these steps to configure Knox HA for HiveServer2:
1. In Ambari, go to Knox -> Advanced topologies.
2. Add the following snippet to the advanced topology:
<provider>
<role>ha</role>
<name>HaProvider</name>
<enabled>true</enabled>
<param>
<name>HIVE</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true;zookeeperEnsemble=rkk3.hdp.local:2181,rkk2.hdp.local:2181,rkk1.hdp.local:2181;zookeeperNamespace=hiveserver2</value>
</param>
</provider>
Comment out the url from the HIVE service tag:
<service>
<role>HIVE</role>
<!-- <url>http://{{hive_server_host}}:{{hive_http_port}}/{{hive_http_path}}</url> -->
</service>

3. Restart Knox.
4. Open Beeline and connect to HiveServer2:
beeline
Beeline version 1.2.1000.2.5.0.0-1133 by Apache Hive
beeline> !connect jdbc:hive2://rkk1.hdp.local:8443/;ssl=true;sslTrustStore=/var/lib/knox/data-2.5.0.0-1133/security/keystores/gateway.jks;trustStorePassword=knox?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive
Connecting to jdbc:hive2://rkk1.hdp.local:8443/;ssl=true;sslTrustStore=/var/lib/knox/data-2.5.0.0-1133/security/keystores/gateway.jks;trustStorePassword=knox?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive
Enter username for jdbc:hive2://rkk1.hdp.local:8443/;ssl=true;sslTrustStore=/var/lib/knox/data-2.5.0.0-1133/security/keystores/gateway.jks;trustStorePassword=knox?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive: guest
Enter password for jdbc:hive2://rkk1.hdp.local:8443/;ssl=true;sslTrustStore=/var/lib/knox/data-2.5.0.0-1133/security/keystores/gateway.jks;trustStorePassword=knox?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive: **************
16/11/26 19:58:04 [main]: WARN jdbc.Utils: ***** JDBC param deprecation *****
16/11/26 19:58:04 [main]: WARN jdbc.Utils: The use of hive.server2.transport.mode is deprecated.
16/11/26 19:58:04 [main]: WARN jdbc.Utils: Please use transportMode like so: jdbc:hive2://<host>:<port>/dbName;transportMode=<transport_mode_value>
16/11/26 19:58:04 [main]: WARN jdbc.Utils: ***** JDBC param deprecation *****
16/11/26 19:58:04 [main]: WARN jdbc.Utils: The use of hive.server2.thrift.http.path is deprecated.
16/11/26 19:58:04 [main]: WARN jdbc.Utils: Please use httpPath like so: jdbc:hive2://<host>:<port>/dbName;httpPath=<http_path_value>
Connected to: Apache Hive (version 1.2.1000.2.5.0.0-1133)
Driver: Hive JDBC (version 1.2.1000.2.5.0.0-1133)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://rkk1.hdp.local:8443/> show tables;
+-----------+--+
| tab_name |
+-----------+--+
| xx |
+-----------+--+
1 row selected (0.201 seconds)
12-18-2016
12:29 PM
3 Kudos
While debugging Kafka producer slowness, I observed the following steps a Kafka producer takes while producing a single record to a Kafka broker. ENV: HDP 2.5, a 2-node Kafka cluster, with a single producer producing a record to 'testtopic', which has 2 partitions with 2 replicas.
1. The Kafka producer starts with the configured settings and begins adding metrics sensors.
2. It updates the cluster metadata version, which includes cluster information such as broker nodes and partitions, and assigns a version id to this cluster metadata:
Updated cluster metadata version 1 to Cluster(nodes = [Node(-2, rkk2, 6667), Node(-1, rkk1, 6667)], partitions = [])
3. It sets up and starts the Kafka producer I/O thread, aka the Sender thread.
4. It requests a metadata update for topic testtopic.
5. The producer's NetworkClient sends a metadata request, consisting of api_key, api_version, correlation_id, and client_id, to one of the brokers:
Sending metadata request ClientRequest(expectResponse=true, callback=null, request=RequestSend(header={api_key=3,api_version=0,correlation_id=0,client_id=producer-1}, body={topics=[testtopic]}), isInitiatedByNetworkClient, createdTimeMs=1482047450018, sendTimeMs=0) to node -1
6. In the response it gets the metadata from the cluster and updates its own copy; the response includes broker information along with the topic partitions, their leaders, and ISRs:
Updated cluster metadata version 2 to Cluster(nodes = [Node(1002, rkk2.hdp.local, 6667), Node(1001, rkk1.hdp.local, 6667)], partitions = [Partition(topic = testtopic, partition = 1, leader = 1002, replicas = [1002,1001,], isr = [1002,1001,], Partition(topic = testtopic, partition = 0, leader = 1001, replicas = [1002,1001,], isr = [1001,1002,]])
7. The producer serializes the key and value and sends the produce record to the leader of the partition; the partition is decided by the default partitioner scheme if none is configured. The default partitioning strategy follows this flow when deciding the partition: if a partition is specified in the record, use it; if no partition is specified but a key is present, choose a partition based on a hash of the key; if neither a partition nor a key is present, choose a partition in a round-robin fashion.
8. The producer allocates a memory buffer for the topic, sized by batch.size.
9. The producer wakes up the Sender thread once the buffer is full, linger.ms is reached, or a new batch is started.
10. The Sender thread creates a produce request, with a correlation_id, to the leader of the partition, like this:
Created 1 produce requests: [ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.producer.internals.Sender$1@11b2c43e, request=RequestSend(header={api_key=0,api_version=1,correlation_id=1,client_id=producer-1}, body={acks=1,timeout=30000,topic_data=[{topic=testtopic,data=[{partition=1,record_set=java.nio.HeapByteBuffer[pos=0 lim=76 cap=100000]}]}]}), createdTimeMs=1482047460410, sendTimeMs=0)]
11. Once the record is written successfully to the brokers according to the acks setting, the Sender thread gets the response back for the correlation_id and the callback is invoked:
Received produce response from node 1002 with correlation id 1
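The default partitioning decision described above can be sketched in a few lines of Python. This is a simplified illustration only: real Kafka hashes the serialized key bytes with murmur2, while this sketch uses a plain byte sum just to stay self-contained.

```python
import itertools

_round_robin = itertools.count()

def choose_partition(num_partitions, partition=None, key=None):
    """Simplified sketch of the default partitioner's decision flow.
    NOTE: real Kafka hashes the serialized key bytes with murmur2; a plain
    byte sum is used here only to keep the sketch self-contained."""
    if partition is not None:                   # 1. explicit partition wins
        return partition
    if key is not None:                         # 2. hash of the key
        return sum(key.encode()) % num_partitions
    return next(_round_robin) % num_partitions  # 3. round-robin otherwise

print(choose_partition(2, partition=1))    # explicit partition -> 1
print(choose_partition(2, key="user-42"))  # keyed -> always the same partition
```

The practical consequence is the same as in Kafka: records with the same key always land on the same partition, which is what preserves per-key ordering.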
12-17-2016
06:54 PM
5 Kudos
Env : HDP 2.5 2 node kafka cluster having topic name 'testtopic' with partition set as 2 and replication set as 2. I am running two consumer with consumer id 'test'. 1. what happen when Consumer start fresh Consumer NetworkClient will request metadata <- return cluster information 2016-12-17 23:21:05 DEBUG clients.NetworkClient:619 - Sending metadata request ClientRequest(expectResponse=true, callback=null, request=RequestSend(header={api_key=3,api_version=0,correlation_id=1,client_id=consumer-1}, body={topics=[testtopic]}), isInitiatedByNetworkClient, createdTimeMs=1481997065894, sendTimeMs=0) to node -2 2016-12-17 23:21:06 DEBUG clients.Metadata:172 - Updated cluster metadata version 2 to Cluster(nodes = [Node(1002, rkk2.hdp.local, 6667), Node(1001, rkk1.hdp.local, 6667)], partitions = [Partition(topic = testtopic, partition = 1, leader = 1002, replicas = [1002,1001,], isr = [1002,1001,], Partition(topic = testtopic, partition = 0, leader = 1001, replicas = [1002,1001,], isr = [1001,1002,]]) Sends a GroupMetadata request to one of the brokers 2016-12-17 23:21:06 DEBUG internals.AbstractCoordinator:471 - Issuing group metadata request to broker 1001 as a response get the current coordinator 2016-12-17 23:21:11 DEBUG internals.AbstractCoordinator:484 - Group metadata response ClientResponse(receivedTimeMs=1481997071648, disconnected=false, request=ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@31df58ee, request=RequestSend(header={api_key=10,api_version=0,correlation_id=2,client_id=consumer-1}, body={group_id=test}), createdTimeMs=1481997066150, sendTimeMs=1481997071389), responseBody={error_code=0,coordinator={node_id=1002,host=rkk2.hdp.local,port=6667}}) node_id=1002 is designated as a coordinator now start sending JOIN_GROUP request to coordinator 2016-12-17 23:21:16 DEBUG internals.AbstractCoordinator:324 - Issuing request (JOIN_GROUP: 
{group_id=test,session_timeout=10000,member_id=,protocol_type=consumer,group_protocols=[{protocol_name=range,protocol_metadata=java.nio.HeapByteBuffer[pos=0 lim=21 cap=21]}]}) to coordinator 2147482645 2016-12-17 23:21:19 DEBUG internals.AbstractCoordinator:342 - Joined group: {error_code=0,generation_id=1,group_protocol=range,leader_id=consumer-1-b53995e2-ac0b-43dd-a9a3-1f19f03d679c,member_id=consumer-1-b53995e2-ac0b-43dd-a9a3-1f19f03d679c,members=[{member_id=consumer-1-b53995e2-ac0b-43dd-a9a3-1f19f03d679c,member_metadata=java.nio.HeapByteBuffer[pos=0 lim=21 cap=21]}]} the first consume who join the group will become a group leader with some leader id, in our case this is leader_id=consumer-1-b53995e2-ac0b-43dd-a9a3-1f19f03d679c, thing to notice here is that leader id and and member id is same here because this is the only consumer at this point. the leader knows about all the consumer through group coordinator(group coordinator will know all the consumer through the heartbeat mechanism of consumer handled in consumer.poll), after getting the list of all the consumer leader start partition assignment based on the pre configured policy which is by default Range partitioning(refer kafka.consumer.RangeAssignor to understand how it do assignment) leader consumer do partition assignment 2016-12-17 23:21:19 DEBUG internals.ConsumerCoordinator:219 - Performing range assignment for subscriptions {consumer-1-b53995e2-ac0b-43dd-a9a3-1f19f03d679c=org.apache.kafka.clients.consumer.internals.PartitionAssignor$Subscription@75bbb4f4} 2016-12-17 23:21:19 DEBUG internals.ConsumerCoordinator:223 - Finished assignment: {consumer-1-b53995e2-ac0b-43dd-a9a3-1f19f03d679c=org.apache.kafka.clients.consumer.internals.PartitionAssignor$Assignment@2919c2af} after assignment it sends the assignmet back to the coordinator which will send the respective partition to the other consumer in the group. 
Each consumer can see only the partitions assigned to it.

2016-12-17 23:21:19 DEBUG internals.AbstractCoordinator:403 - Issuing leader SyncGroup (SYNC_GROUP: {group_id=test,generation_id=1,member_id=consumer-1-b53995e2-ac0b-43dd-a9a3-1f19f03d679c,group_assignment=[{member_id=consumer-1-b53995e2-ac0b-43dd-a9a3-1f19f03d679c,member_assignment=java.nio.HeapByteBuffer[pos=0 lim=33 cap=33]}]}) to coordinator 2147482645
2016-12-17 23:21:20 DEBUG internals.AbstractCoordinator:429 - Received successful sync group response for group test: {error_code=0,member_assignment=java.nio.HeapByteBuffer[pos=0 lim=33 cap=33]}

Now the consumer starts fetching from its assigned partitions and carries on the normal heartbeat process. Since this is the first (and only) consumer, all partitions of the topic are assigned to it (testtopic-1, testtopic-0):

2016-12-17 23:21:20 DEBUG internals.ConsumerCoordinator:185 - Setting newly assigned partitions [testtopic-1, testtopic-0]
2016-12-17 23:21:20 DEBUG internals.ConsumerCoordinator:575 - Fetching committed offsets for partitions: [testtopic-1, testtopic-0]

2. Now let's start the second consumer and see how it behaves

It sends a metadata request:

2016-12-17 23:22:10 DEBUG clients.NetworkClient:619 - Sending metadata request ClientRequest(expectResponse=true, callback=null, request=RequestSend(header={api_key=3,api_version=0,correlation_id=1,client_id=consumer-1}, body={topics=[testtopic]}), isInitiatedByNetworkClient, createdTimeMs=1481997130251, sendTimeMs=0) to node -1
2016-12-17 23:22:10 DEBUG clients.Metadata:172 - Updated cluster metadata version 2 to Cluster(nodes = [Node(1002, rkk2.hdp.local, 6667), Node(1001, rkk1.hdp.local, 6667)], partitions = [Partition(topic = testtopic, partition = 1, leader = 1002, replicas = [1002,1001,], isr = [1002,1001,], Partition(topic = testtopic, partition = 0, leader = 1001, replicas = [1002,1001,], isr = [1001,1002,]])
2016-12-17 23:22:10 DEBUG internals.AbstractCoordinator:471 - Issuing group metadata request to broker 1001

It learns about and connects to the coordinator:

2016-12-17 23:22:16 DEBUG internals.AbstractCoordinator:484 - Group metadata response ClientResponse(receivedTimeMs=1481997135999, disconnected=false, request=ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@31df58ee, request=RequestSend(header={api_key=10,api_version=0,correlation_id=2,client_id=consumer-1}, body={group_id=test}), createdTimeMs=1481997130517, sendTimeMs=1481997135760), responseBody={error_code=0,coordinator={node_id=1002,host=rkk2.hdp.local,port=6667}})
2016-12-17 23:22:16 DEBUG clients.NetworkClient:487 - Initiating connection to node 2147482645 at rkk2.hdp.local:6667.
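A side note on the node id 2147482645 that keeps appearing next to the coordinator: to my understanding, this Java client keeps the coordinator connection separate from the ordinary broker connection by giving it a synthetic node id, computed as Integer.MAX_VALUE minus the broker id. That is why the coordinator on broker 1002 shows up as 2147482645 in the logs:

```java
public class CoordinatorNodeId {
    public static void main(String[] args) {
        int brokerId = 1002; // coordinator broker from the group metadata response
        // Synthetic id the client uses for its dedicated coordinator connection
        int coordinatorConnectionId = Integer.MAX_VALUE - brokerId;
        System.out.println(coordinatorConnectionId); // prints 2147482645
    }
}
```

This also explains why the JOIN_GROUP and SYNC_GROUP requests above are reported as going "to coordinator 2147482645" rather than to broker 1002 directly.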
It revokes its previously assigned partitions, which in this case is none:

2016-12-17 23:22:21 DEBUG internals.ConsumerCoordinator:241 - Revoking previously assigned partitions []

It issues a join group request:

2016-12-17 23:22:21 DEBUG internals.AbstractCoordinator:324 - Issuing request (JOIN_GROUP: {group_id=test,session_timeout=10000,member_id=,protocol_type=consumer,group_protocols=[{protocol_name=range,protocol_metadata=java.nio.HeapByteBuffer[pos=0 lim=21 cap=21]}]}) to coordinator 2147482645

It joins the group as a follower (notice the follower SyncGroup below: this consumer gets its own member_id, while the leader_id is still that of the first consumer):

2016-12-17 23:22:23 DEBUG internals.AbstractCoordinator:342 - Joined group: {error_code=0,generation_id=2,group_protocol=range,leader_id=consumer-1-b53995e2-ac0b-43dd-a9a3-1f19f03d679c,member_id=consumer-1-fd31194d-469c-4d9d-a66e-09ee2db44645,members=[]}
2016-12-17 23:22:23 DEBUG internals.AbstractCoordinator:392 - Issuing follower SyncGroup (SYNC_GROUP: {group_id=test,generation_id=2,member_id=consumer-1-fd31194d-469c-4d9d-a66e-09ee2db44645,group_assignment=[]}) to coordinator 2147482645
2016-12-17 23:22:24 DEBUG internals.AbstractCoordinator:429 - Received successful sync group response for group test: {error_code=0,member_assignment=java.nio.HeapByteBuffer[pos=0 lim=29 cap=29]}

After the sync, a single partition (testtopic-1) is assigned to this consumer:

2016-12-17 23:22:24 DEBUG internals.ConsumerCoordinator:185 - Setting newly assigned partitions [testtopic-1]
2016-12-17 23:22:24 DEBUG internals.ConsumerCoordinator:575 - Fetching committed offsets for partitions: [testtopic-1]
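For completeness, here is a sketch of the consumer configuration these logs imply. The group.id and session.timeout.ms values match the group_id=test and session_timeout=10000 fields in the JOIN_GROUP requests; the bootstrap server list is an assumption based on this cluster's two brokers, and the serializer choices are illustrative.

```java
import java.util.Properties;

public class ConsumerConfigSketch {

    static Properties consumerConfig() {
        Properties props = new Properties();
        // Assumed broker list for this 2-node cluster
        props.put("bootstrap.servers", "rkk1.hdp.local:6667,rkk2.hdp.local:6667");
        props.put("group.id", "test");            // matches group_id=test in the logs
        props.put("session.timeout.ms", "10000"); // matches session_timeout=10000
        // Range assignment is the default strategy, as seen in protocol_name=range
        props.put("partition.assignment.strategy",
                  "org.apache.kafka.clients.consumer.RangeAssignor");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Both consumers were started with this same group.id, so they join
        // the same group and share the partitions of 'testtopic'.
        System.out.println(consumerConfig().getProperty("group.id")); // prints test
    }
}
```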