Member since: 04-25-2016
Posts: 579
Kudos Received: 609
Solutions: 111

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1120 | 02-12-2020 03:17 PM |
| | 883 | 08-10-2017 09:42 AM |
| | 7153 | 07-28-2017 03:57 AM |
| | 1322 | 07-19-2017 02:43 AM |
| | 1023 | 07-13-2017 11:42 AM |
06-07-2016
06:29 AM
5 Kudos
@nyadav No, Hive cannot query another data source unless a storage handler is defined for it. Hive has the concept of native and non-native tables: it knows how to manage native tables on its own, but it cannot work with a non-native table unless a storage handler is provided. To learn more about storage handlers, refer to this doc: https://cwiki.apache.org/confluence/display/Hive/StorageHandlers
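For illustration, a minimal sketch of a non-native table that uses the HBase storage handler (the table name, column family, and column mapping below are made up, and it assumes the Hive-HBase handler jars are available on Hive's classpath):
  -- Non-native table: Hive delegates storage to HBase via the storage handler
  CREATE TABLE hbase_backed_example (key INT, value STRING)
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
  TBLPROPERTIES ("hbase.table.name" = "example_table");
Hive then routes reads and writes for this table through the handler instead of managing the storage itself.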
06-07-2016
05:38 AM
@Roberto Sancho Yes, please change the table DDL and use INSERT OVERWRITE to reload the data.
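As a rough sketch of the reload step (table names are placeholders; the exact DDL change depends on your case):
  -- Rewrite the table's contents from the existing/staging copy of the data
  INSERT OVERWRITE TABLE target_table
  SELECT * FROM source_table;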
06-07-2016
05:35 AM
@Roberto Sancho The configuration looks good. If you want to cache Hive metadata for a longer period of time, you can increase the hive.metastore.cache-ttl-seconds value.
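As a sketch, these cache settings go under configProps in the Hive storage plugin; the values below are only examples and the exact placement may depend on your Drill version:
  {
    "type": "hive",
    "enabled": true,
    "configProps": {
      "hive.metastore.uris": "thrift://hostname:9083",
      "hive.metastore.cache-ttl-seconds": "300",
      "hive.metastore.cache-expire-after": "access"
    }
  }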
06-06-2016
02:40 PM
1 Kudo
@Hemalatha Panneerselvam Is this what you are looking for?
SELECT count(col_name)
FROM table_name
WHERE col_name LIKE '%http%'
HAVING count(col_name) >= 3
06-06-2016
10:44 AM
3 Kudos
If your external table points to a location in HDFS and you put more CSV files into that table location with the same schema as the defined table, Hive will pick up the new data automatically.
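For illustration, a minimal sketch (the location, columns, and file name are placeholders):
  -- External table pointing at an HDFS directory; any matching file dropped
  -- into /data/events becomes visible to queries immediately
  CREATE EXTERNAL TABLE events (id INT, msg STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  STORED AS TEXTFILE
  LOCATION '/data/events';
After that, simply adding a file with hdfs dfs -put new_events.csv /data/events/ is enough; no ALTER TABLE or reload is needed.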
06-06-2016
07:24 AM
2 Kudos
I would suggest trying NiFi's PutHDFS processor; you can find more on this here: https://community.hortonworks.com/articles/7999/apache-nifi-part-1-introduction.html
06-05-2016
09:29 AM
@Roberto Sancho I could not follow your question here; could you please elaborate on what you are asking?
06-05-2016
06:26 AM
2 Kudos
It looks like the Atlas hook is enabled in your sandbox. If you don't want to use the Atlas hook, please check your hive-site.xml and change hive.exec.post.hooks from:
<property>
  <name>hive.exec.post.hooks</name>
  <value>org.apache.hadoop.hive.ql.hooks.ATSHook, org.apache.atlas.hive.hook.HiveHook</value>
</property>
to:
<property>
  <name>hive.exec.post.hooks</name>
  <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
</property>
06-04-2016
04:19 PM
Drill uses both Java direct memory and Java heap memory for computation. If you have a Hive ORC table, Drill will do the computation in Drill's Java heap memory, not in Drill's direct memory. Depending on the Hive storage plugin configuration (shown below), during the query planning phase Drill queries your metastore service, identified by the property 'hive.metastore.uris', to learn the schema and other required information and prepare the query plan. For better performance, Drill also supports caching Hive metadata in the Drill cache, which is controlled by "hive.metastore.cache-ttl-seconds" and "hive.metastore.cache-expire-after". The cache-ttl-seconds value can be any non-negative value, including 0, which turns caching off. The cache-expire-after value can be "access" or "write": access indicates expiry after a read or write operation, and write indicates expiry after a write operation only.
{
  "type": "hive",
  "enabled": false,
  "configProps": {
    "hive.metastore.uris": "thrift://hostname:9083",
    "hive.metastore.sasl.enabled": "false",
    "fs.default.name": "hdfs://nmhostname/"
  }
}
06-04-2016
03:32 PM
4 Kudos
@Roberto Sancho The metastore comes into the picture when Drill queries data stored in Hive tables, and it is merely used to learn the schema of the Hive table; for many other datastores Drill can evaluate the schema on the fly. For most datastores Drill uses direct memory for all computation, but for Hive tables stored as ORC or Parquet it leverages the Hive ORC or Parquet reader to query the data, which reads the data into the Java heap. Does Drill keep the metastore in memory? No, Drill does not keep the complete metastore in memory; during the query parsing and planning phases it queries the Hive metastore service for the schema so that it can validate the query and plan accordingly.
06-03-2016
04:34 PM
3 Kudos
This is due to the memory required by the ORC writer while writing ORC files. You can limit the memory use by tweaking the value of orc.compress.size, which is 256KB by default. I am not sure about your heap size; start testing with an 8KB buffer using
  alter table table_name set tblproperties("orc.compress.size"="8192");
and see if it helps.
06-03-2016
03:54 PM
4 Kudos
No extra configuration is required; just run the Spark Thrift Server as the spark user using the following command:
  ./sbin/start-thriftserver.sh --master yarn-client --executor-memory 512m --hiveconf hive.server2.thrift.port=10015
06-03-2016
01:17 PM
1 Kudo
@Kumar Sanyam
Could you please check whether the ANTLR runtime is available on the Pig classpath, using:
  pig -printCmdDebug | grep --color antlr
06-03-2016
09:47 AM
2 Kudos
You cannot turn on ACID for an existing table; you need to specify TBLPROPERTIES ('transactional'='true') at table creation time (in the DDL). For more on this, you can follow the Apache documentation: https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions
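As a sketch of such a DDL (table name, columns, and bucket count are placeholders), note that an ACID table also needs to be bucketed and stored as ORC:
  CREATE TABLE acid_example (id INT, value STRING)
  CLUSTERED BY (id) INTO 4 BUCKETS
  STORED AS ORC
  TBLPROPERTIES ('transactional'='true');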
06-02-2016
03:48 PM
3 Kudos
From the provided logs it seems there is some problem with Nimbus or ZooKeeper; please check the Nimbus and ZooKeeper logs to identify the problem.
06-02-2016
12:38 PM
1 Kudo
It looks like you do not have enough resources available with the ResourceManager, and you are also unable to check the available resources because access to the RM UI is denied. Try accessing the UI by replacing http://sandbox.hortonworks.com:8042 with http://<IP_OF_SANDBOX>:8042 and ensure that the IP is reachable from your host. As this is a single-node setup, you can also run yarn node -list to get the node id and then yarn node -status <node-id>.
06-02-2016
12:29 PM
Good to see that worked. Due to a limitation with transactional tables, you cannot use SORTED BY; it may be supported in a future version.
06-02-2016
11:41 AM
1 Kudo
Do you have multiple versions of Python installed on your machine, or are you working in a Python test/virtual environment? What is your PYTHONPATH?
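A quick way to check, as a sketch (run these in the same shell/environment you launch the job from):
  # list every python on PATH and see which one runs by default
  which -a python
  python -c "import sys; print(sys.executable); print(sys.path)"
  # any explicit override of the module search path?
  echo "$PYTHONPATH"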
06-02-2016
11:09 AM
1 Kudo
numpy is missing here; install it using pip install numpy.
06-01-2016
04:29 PM
Is there any ZooKeeper client running on your local machine that is trying to connect to the ZooKeeper server running on the VM?
06-01-2016
04:19 PM
3 Kudos
All of these values are picked up from your environment; see http://grepcode.com/file/repo1.maven.org/maven2/org.apache.zookeeper/zookeeper/3.3.1/org/apache/zookeeper/Environment.java#Environment. Please check your environment variables.
06-01-2016
12:17 PM
3 Kudos
If you commit the last consumed offset, you can resume consuming from Kafka at the next batch cycle, like this (kafka-python; topic, partition, and end are placeholders for your values):
  from kafka import KafkaConsumer, TopicPartition

  # commit the last consumed offset
  consumer = KafkaConsumer(bootstrap_servers='localhost:9092')
  tp = TopicPartition(topic, partition)
  consumer.assign([tp])      # seek/commit need an assigned partition
  consumer.seek(tp, end)     # end = offset you finished consuming at
  consumer.commit()

  # now start consuming from Kafka when the job restarts at the next batch cycle
  consumer.assign([tp])
  start = consumer.committed(tp)
  consumer.seek(tp, start)
06-01-2016
11:06 AM
with "transactional"="true" you are not able to compile this DDL statement, transactional table wont allow sorted column, are you able to successfully execute this statement?
06-01-2016
10:50 AM
5 Kudos
@Sanjeev Verma You can use the following approaches to get external configuration inside the topology:
1: Pass the arguments on the command line, like this:
storm jar storm-jar topology-name -c sKey=sValue -c key1=value1 -c key2=value2 >/tmp/storm.txt
2: Create a simple Java resource file (a properties file) and pass its path as an argument to your topology's main class; in the main method, read the properties from the file and build the Storm configuration object using conf.put() (see the sketch below).
3: Create a separate YAML file and read it through the Utils methods provided by the Storm API, e.g. Utils.findAndReadConfigFile(); for more documentation see https://nathanmarz.github.io/storm/doc/backtype/storm/utils/Utils.html
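For option 2, a minimal sketch in Java, assuming the older backtype.storm package (it is org.apache.storm in Storm 1.x+); the file path, property keys, and topology wiring are placeholders:
  import java.io.FileInputStream;
  import java.util.Properties;
  import backtype.storm.Config;

  public class TopologyMain {
      public static void main(String[] args) throws Exception {
          // args[0] is the path to the properties file passed on the command line
          Properties props = new Properties();
          try (FileInputStream in = new FileInputStream(args[0])) {
              props.load(in);
          }

          // copy the external properties into the Storm configuration object
          Config conf = new Config();
          for (String name : props.stringPropertyNames()) {
              conf.put(name, props.getProperty(name));
          }
          // ... build the TopologyBuilder and submit the topology with conf ...
      }
  }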
06-01-2016
10:09 AM
4 Kudos
As you are using a transactional table, you cannot take advantage of SORTED BY on the fechaoprcnf column. Apart from partitioning, try creating a storage index on the table using
  tblproperties ("orc.create.index"="true", "orc.compress"="ZLIB", "orc.stripe.size"="268435456", "orc.row.index.stride"="10000")
The ORC stripe and index stride values here are the defaults; try tuning these values and compare the performance results.
06-01-2016
09:44 AM
2 Kudos
Considering you are using an ORC table: if you are not using an ACID table, it would be good to modify the table DDL to add clustered by (codnrbeenf) sorted by (fechaoprcnf). Further to this, you can create a storage-based index on the ORC table by specifying orc.create.index=true.
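A minimal sketch of such a DDL, assuming a non-ACID ORC table (column types, bucket count, and the remaining columns are assumptions):
  CREATE TABLE my_table (
    codnrbeenf STRING,
    fechaoprcnf STRING
    -- ... other columns ...
  )
  CLUSTERED BY (codnrbeenf) SORTED BY (fechaoprcnf) INTO 8 BUCKETS
  STORED AS ORC
  TBLPROPERTIES ("orc.create.index"="true");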
06-01-2016
05:16 AM
4 Kudos
@akeezhadath It seems you are not calling an action, so the job is never triggered. Spark transformations are lazily evaluated; can you run a terminal operation such as count or collect on filterwords and see whether you then observe the incremented accumulator value?
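For illustration, a minimal PySpark sketch (the data is made up and the name filterwords is reused from the discussion) showing that the accumulator only moves once an action runs:
  from pyspark import SparkContext

  sc = SparkContext("local[2]", "accumulator-demo")
  acc = sc.accumulator(0)
  words = sc.parallelize(["spark", "hive", "spark", "storm"])

  def keep_spark(w):
      acc.add(1)            # counts every record the filter sees
      return w == "spark"

  filterwords = words.filter(keep_spark)   # transformation only, nothing runs yet
  print(acc.value)                         # 0 - no job has been triggered
  print(filterwords.count())               # action: triggers the job, prints 2
  print(acc.value)                         # 4 - incremented while the job ran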
05-31-2016
05:01 PM
1 Kudo
Looking at this exception:
java.lang.NoSuchMethodError: org.apache.hadoop.hive.shims.HadoopShims.setHadoopSessionContext(Ljava/lang/String;)V
it seems the wrong version of the HadoopShims jar is on your classpath, one that either does not implement setHadoopSessionContext or has a different method signature. To troubleshoot this problem:
1: List the jars loaded by the HiveServer2 process:
lsof -p <HS2 process id> | grep -i jar | awk '{ print $9 }' > class-jar.txt
2: Find which of those jars contain the class (there could be multiple shim jars):
for jar in `cat class-jar.txt` ; do echo "$jar" ; jar -tvf "$jar" | grep --color 'org.apache.hadoop.hive.shims.HadoopShims' ; done
3: For each jar that contains HadoopShims, extract the class and inspect it:
jar xvf <jar> org/apache/hadoop/hive/shims/HadoopShims.class
javap org.apache.hadoop.hive.shims.HadoopShims
to verify that setHadoopSessionContext is available and has the expected signature.
05-31-2016
01:35 PM
2 Kudos
@yong yang Apart from adding the jars at each session level as suggested by @Jitendra Yadav, you can add them permanently using hive.aux.jars.path:
<property>
  <name>hive.aux.jars.path</name>
  <value>/var/lib/hive</value>
</property>
05-31-2016
07:11 AM
4 Kudos
Can you check the NiFi bootstrap logs to see if there is any port conflict?