Member since: 02-24-2016
Posts: 175
Kudos Received: 56
Solutions: 3
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
| | 1278 | 06-16-2017 10:40 AM |
| | 10978 | 05-27-2016 04:06 PM |
| | 1281 | 03-17-2016 01:29 PM |
09-15-2016
11:23 AM
1 Kudo
Guys, we have set up a Kerberized cluster (HDP 2.4.x) with the Kafka broker (0.9.x) configured for SASL (Kerberos). What steps are required for a third-party tool (producer/publisher) to connect to Kafka? Going through this link: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_secure-kafka-ambari/content/ch_secure-kafka-config-options.html What I understand is that the tool needs access to a JAAS conf file. For now I've copied /usr/hdp/current/kafka-broker/config/kafka_client_jaas.conf, shared it with the third-party tool, and kept it on the classpath. Do we need anything else in place? Regards, SS
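For reference, a minimal client-side JAAS file for a SASL/Kerberos Kafka 0.9 client typically looks like the sketch below. The keytab path and principal are placeholders, not values from this cluster; besides the JAAS file, the client host generally also needs a working krb5.conf, a keytab (or a valid ticket cache), and the client itself must be told to use the SASL security protocol.

```shell
# Sketch of a client JAAS file for a Kerberized Kafka client.
# keyTab and principal below are placeholder values -- substitute the
# third-party tool's own credentials.
cat > /tmp/kafka_client_jaas.conf <<'EOF'
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytabs/client.keytab"
  principal="client@EXAMPLE.COM"
  serviceName="kafka";
};
EOF

# The client JVM is then pointed at this file via a system property:
#   -Djava.security.auth.login.config=/tmp/kafka_client_jaas.conf
grep -c 'KafkaClient' /tmp/kafka_client_jaas.conf
```

A ticket-cache variant (`useTicketCache=true` instead of the keytab lines) is also common when the tool runs under a user that has already done `kinit`.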
Labels:
- Apache Kafka
09-08-2016
02:14 PM
Thank you for the nice cheat sheet. I configured according to it on a secure HDP 2.4.2 + Ambari 2.2 cluster. I could send messages using the console producer:

<Broker_home>/bin/kafka-console-producer.sh --broker-list <KAFKA_BROKER>:6667 --topic test --security-protocol SASL_PLAINTEXT

When I try to consume the messages (on the same machine) I get an error. I start the consumer like this:

/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --zookeeper ZK1:2181,ZK-2:2181,ZK-3:2181 --topic test --from-beginning --security-protocol SASL_PLAINTEXT

The error stack is:

[2016-09-08 13:59:26,167] WARN [console-consumer-39119_HOST_NAME-1473343165849-9f1b8f0d-leader-finder-thread], Failed to find leader for Set([test,0], [test,1]) (kafka.consumer.ConsumerFetcherManager$LeaderFinderThread)
kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not found for broker 0
at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:141)
at kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:180)
at kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:180)
What do you think has gone wrong? Regards, SS PS: Do we have another wiki page for best practices around Kafka? @Sriharsha Chintalapani, @Andrew Grande, @Vadim Vaks, @Predrag Minovic, @
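One thing worth checking (an assumption on my part, since the HDP build differs from Apache Kafka here): HDP's Kafka 0.9 registers its SASL endpoint in ZooKeeper as PLAINTEXTSASL rather than SASL_PLAINTEXT, and the old ZooKeeper-based console consumer may fall back to looking up a PLAINTEXT endpoint when the protocol name doesn't match, which would explain "End point PLAINTEXT not found for broker 0". A sketch of the adjusted command:

```shell
# Hypothetical fix: use HDP's protocol name PLAINTEXTSASL with the
# ZooKeeper-based console consumer. Verify the actual protocol name
# against the listeners setting in the broker's server.properties.
/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh \
  --zookeeper ZK1:2181,ZK-2:2181,ZK-3:2181 \
  --topic test --from-beginning \
  --security-protocol PLAINTEXTSASL
```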
09-07-2016
06:45 PM
1 Kudo
Hi there, my question is quite similar to https://community.hortonworks.com/questions/40457/kafka-producer-giving-error-when-running-from-a-di.html and https://community.hortonworks.com/questions/23775/unable-to-produce-message.html, but this fails at the very first step: trying to send messages to a Kafka topic. I am using the command below to start the producer:

bin/kafka-console-producer.sh --broker-list <HOST_FQDN>:6667 --topic test

This FQDN is copied from `hostname -f`, and I verified that <HOST_FQDN> matches the one in <KAFKA_BROKER_HOME>/config/server.properties:

advertised.listeners=PLAINTEXTSASL://<HOST_FQDN_SAME_AS_HOSTNAME_F>:6667

Note: test is a valid topic created previously. Now when I start the producer:

bin/kafka-console-producer.sh --broker-list <BROKER_FQDN>:6667 --topic test

I see:
[2016-09-07 18:10:15,713] WARN Fetching topic metadata with correlation id 0 for topics [Set(test)] from broker [BrokerEndPoint(0,<BROKER_FQDN>,6667)] failed (kafka.client.ClientUtils$)
java.io.EOFException
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:83)
at kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:140)
at kafka.network.BlockingChannel.receive(BlockingChannel.scala:131)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:79)
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:76)
at kafka.producer.SyncProducer.send(SyncProducer.scala:121)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
at kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
at kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:68)
at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:79)
at kafka.utils.Logging$class.swallowError(Logging.scala:106)
at kafka.utils.CoreUtils$.swallowError(CoreUtils.scala:51)
at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:68)
at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
at scala.collection.immutable.Stream.foreach(Stream.scala:547)
at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
[2016-09-07 18:10:15,716] ERROR fetching topic metadata for topics [Set(test)] from broker [ArrayBuffer(BrokerEndPoint(0,<BROKER_FQDN>,6667))] failed (kafka.utils.CoreUtils$)

I also checked the kafka.out and server.log files; they do not show any errors/exceptions. It would be really helpful if someone could help me understand the missing bit. Thanks, SS
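A possible cause, offered as an assumption: an EOFException during the metadata fetch often means the producer spoke plain PLAINTEXT to a listener that expected something else, and since advertised.listeners here is PLAINTEXTSASL, the broker would drop the unauthenticated connection without any broker-side error. A sketch of the command with the protocol passed explicitly:

```shell
# Sketch (assumption): match the broker's PLAINTEXTSASL listener by
# passing the security protocol explicitly. The console producer's
# default is PLAINTEXT, which a SASL-only listener will disconnect,
# surfacing on the client as java.io.EOFException.
bin/kafka-console-producer.sh \
  --broker-list <BROKER_FQDN>:6667 \
  --topic test \
  --security-protocol PLAINTEXTSASL
```

This also requires a valid Kerberos ticket (`kinit`) and the client JAAS file on the producer side.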
Labels:
- Apache Kafka
08-23-2016
01:31 PM
Guys, I have a few questions related to the Spark cache and would like your input on them.

1) How much cache memory is available to each of the executor nodes? Is there a way to control it?
2) We want to restrict developers from persisting any data to disk. Is there any configuration we can change to disable non-memory caching? This is to make sure no secure data is spilled to disk by mistake.
3) If point #2 cannot be achieved, is there a way to make sure that spillage (in case developers use the MEMORY_AND_DISK option) happens only to a secure directory and the data is encrypted?
4) For streaming data processed with Spark, how secure is it? Can encryption be applied to data in flight?
5) If developers decide to cache streaming RDDs, how secure is that? Same concern as point #2 above.

Thanks, SS
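On points #1 and #3, as I understand Spark 1.6's unified memory manager: storage memory per executor is governed by `spark.memory.fraction` and `spark.memory.storageFraction`, and all shuffle/spill files land under `spark.local.dir`. I'm not aware of a cluster-level switch that forbids MEMORY_AND_DISK (the StorageLevel is chosen in application code), but spills can at least be directed to a locked-down, OS-encrypted directory. A hedged spark-defaults.conf sketch, with illustrative placeholder values:

```
# spark-defaults.conf sketch -- all values are illustrative placeholders
spark.memory.fraction                    0.6     # heap share for execution + storage
spark.memory.storageFraction             0.5     # portion of that protected for cached blocks
spark.local.dir                          /secure/encrypted/spark-tmp  # spill/shuffle dir
spark.authenticate                       true
spark.authenticate.enableSaslEncryption  true    # SASL encryption for shuffle transfers in flight
```

Whether these options behave exactly this way on your HDP build is worth verifying against the Spark 1.6 configuration reference.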
Labels:
- Apache Spark
08-18-2016
12:37 PM
1 Kudo
Guys, I was going through articles on Spark ML and found references suggesting that netlib-java is needed when setting up Spark MLlib for ML applications in Java/Scala. Other posts/articles suggest installing the Anaconda libraries for using Spark with Python. I ran simple programs and used Spark SQL without Anaconda, so I was wondering: do we really need the Anaconda packages to use MLlib from Python? It would be great if someone could comment on the netlib-java and Anaconda dependencies with respect to Spark and Spark MLlib use cases. Thanks, SS
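My understanding, for what it's worth: netlib-java affects speed, not functionality; without native BLAS, MLlib falls back to a pure-JVM implementation and merely logs a warning. On the Python side, the MLlib guide lists NumPy as the actual prerequisite; Anaconda is just a convenient way to get it. A quick check that NumPy alone (no Anaconda) is importable:

```shell
# NumPy, not the full Anaconda distribution, is what PySpark MLlib
# requires on the Python side (per the MLlib dependency notes).
# This confirms a standalone NumPy install is usable:
python3 -c 'import numpy; print("numpy OK")'
```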
Labels:
- Apache Spark
08-17-2016
04:00 PM
Thanks @Michael Young. By any chance do you know the timeline for the security components integration, or when it is on the roadmap? BTW, I was checking the tech preview of HDP 2.5, and I heard it is due some time in late August or early September. I'd like to know if we have a list of features/fixes coming in HDP 2.5 for Zeppelin. Thanks again. SS
08-17-2016
09:46 AM
2 Kudos
Hi guys, we are planning to set up Zeppelin for interactive usage with Spark. I see that we can configure it as an Ambari service (described at http://hortonworks.com/hadoop-tutorial/apache-zeppelin). However, I wonder whether the integration is mature enough to be used in production in a Kerberized environment, AD-integrated, with all the other HDP components in place. Or is this integration still in tech preview (as mentioned in the Hortonworks blog)? Thanks, SS
Labels:
- Apache Ambari
- Apache Zeppelin
08-16-2016
03:06 PM
1 Kudo
Hi all, we have an HDP 2.4.2 cluster configured with Spark. I ran smoke tests (Spark Pi, shell, Spark SQL) for various components. I am now looking for a few smoke tests to prove that Spark has been configured with the ML libraries. Moreover, how can we make sure the Spark ML configuration is optimized? I was planning to run a couple of samples from https://spark.apache.org/docs/1.6.1/mllib-guide.html to make sure the ML libs are configured. Is that enough? Thanks, SS
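Running a guide sample should be sufficient as a smoke test. One low-effort variant, sketched below with paths assumed from the usual HDP layout (verify them on your cluster): pipe a tiny KMeans job into spark-shell, which exercises MLlib's clustering and linear algebra end to end.

```shell
# Sketch: train a tiny KMeans model in the Spark shell to confirm the
# MLlib classes load and training runs. The spark-client path is an
# assumption -- adjust for your install.
echo 'import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors
val data = sc.parallelize(Seq(
  Vectors.dense(0.0, 0.0), Vectors.dense(1.0, 1.0),
  Vectors.dense(9.0, 8.0), Vectors.dense(8.0, 9.0)))
val model = KMeans.train(data, 2, 10)
println("centers: " + model.clusterCenters.mkString(", "))' \
  | /usr/hdp/current/spark-client/bin/spark-shell --master yarn-client
```

If the two printed centers land near (0.5, 0.5) and (8.5, 8.5), MLlib is working; any netlib-java BLAS warning in the output only indicates a slower JVM fallback, not a misconfiguration.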
08-15-2016
08:37 AM
Thanks for getting back, @Alex Miller. Here is the curl command used to connect to the Knox server:

curl -i -k -u admin:P@ssword 'https://<Knox_SERVER_Hostname>:<KNOX_PORT>/gateway/default/templeton/v1/status'

RHEL: Oracle Linux Server release 6.7
curl version: 7.19.7
JDK: openjdk version "1.8.0_71", OpenJDK Runtime Environment (build 1.8.0_71-b15)
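In case it helps narrow things down (a debugging sketch, not a known fix): curl 7.19.7 on RHEL 6 is old enough that TLS negotiation with Knox can itself be the problem, so separating the SSL handshake from the gateway/auth layer is a reasonable first step.

```shell
# Check the TLS handshake directly, independent of Knox routing/auth:
openssl s_client -connect <Knox_SERVER_Hostname>:<KNOX_PORT> </dev/null

# Then a verbose curl shows exactly where the request dies
# (handshake, redirect, or 401 from the gateway):
curl -ikv -u admin:P@ssword \
  'https://<Knox_SERVER_Hostname>:<KNOX_PORT>/gateway/default/templeton/v1/status'
```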