How to read from a Kafka topic using Spark (streaming) in a secure Cluster?
- Labels: Apache Kafka, Apache Spark
Created ‎12-14-2015 10:54 AM
Specifications:
- HDP 2.3.2
- Kerberos enabled
- Kafka topic exists and user <username> has read access
- Kafka topic is readable/writable using the Kafka command line tools with specified user
- We already have a Spark Streaming application that works fine in an unsecured cluster, reading from a Kafka topic.
What would be a working example of a Spark Streaming job that reads input from Kafka in a secure cluster under the above conditions?
We made a Spark streaming job work that reads/writes into secure HBase and thought it couldn't be that different to do it with Kafka.
Created ‎12-15-2015 07:13 PM
Spark Streaming has not yet enabled security for its Kafka connector.
Created ‎12-18-2015 01:18 PM
@schintalapani - Do you have more details? For example, are there JIRA issues or dev email threads tracking this?
Created ‎04-28-2016 12:33 PM
It is not enabled for the direct API connector, but it is enabled with the old one.
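For readers unfamiliar with the two connectors: in Spark 1.x the older, receiver-based connector is reached through `KafkaUtils.createStream` (it finds the brokers via ZooKeeper), while the direct API is `KafkaUtils.createDirectStream` (it talks to the brokers listed in `metadata.broker.list`). Assuming "the old one" refers to the receiver-based connector, the difference between the two calls can be sketched roughly as follows. The helper names, host names, and topic are hypothetical, and the sketch says nothing about which build actually supports Kerberos.

```python
def receiver_args(zk_quorum, group_id, topic, threads=1):
    """Keyword arguments for the receiver-based connector (KafkaUtils.createStream)."""
    return {"zkQuorum": zk_quorum, "groupId": group_id, "topics": {topic: threads}}


def direct_args(brokers, topic):
    """Keyword arguments for the direct connector (KafkaUtils.createDirectStream)."""
    return {"topics": [topic], "kafkaParams": {"metadata.broker.list": brokers}}


def open_stream(ssc, use_direct, args):
    """Create the DStream; needs a Spark 1.x install with the Kafka jar on the classpath."""
    # Imported lazily so the helpers above can be inspected without Spark installed.
    from pyspark.streaming.kafka import KafkaUtils
    if use_direct:
        return KafkaUtils.createDirectStream(ssc, **args)
    return KafkaUtils.createStream(ssc, **args)


if __name__ == "__main__":
    # Hypothetical hosts and topic, for illustration only.
    print(receiver_args("zk1.example.com:2181", "demo-group", "events"))
    print(direct_args("broker1.example.com:6667", "events"))
```

Switching connectors therefore means switching both the call and the connection details (ZooKeeper quorum and consumer group versus a broker list).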
Created ‎01-23-2016 05:39 AM
Can you shed some light on how you use Spark Streaming to read from and write to HBase? I am having a hard time using Spark/Scala to access HBase. Thanks!
Created ‎01-26-2016 01:30 PM
I have made a small change to the Spark Streaming code here: https://github.com/davidtambjss/spark-release/tree/HDP-2.3.0.0-KERBEROS_KAFKA_STREAMING and was able to get Spark Streaming to work with Kafka with Kerberos.
All you need to do is rebuild spark-streaming-kafka_2.10-1.3.1.2.3.0.0-2557.jar from there and use that jar.
Created ‎01-05-2017 03:25 PM
Hi David,
Can you please provide the running code?
In pom.xml I'm getting the compilation error below:
Project build error: Non-resolvable parent POM for com.accenture.ngap:spark-test:[unknown-version]: Failure to transfer
Thanks,
Krishna
Created ‎04-20-2016 05:50 PM
I checked with engineering today, and the feature will officially land in the upcoming HDP 2.4.2 patch. The documentation will note that users need to use the HDP Spark Streaming Kafka jars (instead of the vanilla Apache ones).
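As a rough illustration of what submitting such a job on a Kerberized cluster tends to involve, the Python sketch below renders a `KafkaClient` JAAS section that logs in from a keytab and builds a `spark-submit` argument list that ships the JAAS file and keytab with the job and points the driver and executor JVMs at it through `java.security.auth.login.config`. All file names, the principal, and the helper names are hypothetical. The `security.protocol` value `PLAINTEXTSASL` is an assumption about the HDP Kafka build and should be checked against the HDP documentation for your release.

```python
import os


def jaas_config(principal, keytab, service_name="kafka"):
    """Render a KafkaClient JAAS section that logs in from a keytab."""
    return (
        "KafkaClient {\n"
        "  com.sun.security.auth.module.Krb5LoginModule required\n"
        "  useKeyTab=true\n"
        '  keyTab="%s"\n'
        "  storeKey=true\n"
        "  useTicketCache=false\n"
        '  serviceName="%s"\n'
        '  principal="%s";\n'
        "};\n" % (keytab, service_name, principal)
    )


def kafka_params(brokers, security_protocol):
    """Kafka parameters to hand to the connector; the protocol name is build-specific."""
    return {"metadata.broker.list": brokers, "security.protocol": security_protocol}


def spark_submit_args(app, kafka_jar, jaas_file, keytab):
    """Build a spark-submit argument list that ships the JAAS file and keytab."""
    # Files passed via --files are referenced by their base name in the working directory.
    login_opt = "-Djava.security.auth.login.config=./" + os.path.basename(jaas_file)
    return [
        "spark-submit",
        "--jars", kafka_jar,                      # the HDP-built Kafka connector jar
        "--files", jaas_file + "," + keytab,
        "--driver-java-options", login_opt,
        "--conf", "spark.executor.extraJavaOptions=" + login_opt,
        app,
    ]


if __name__ == "__main__":
    # Hypothetical principal, files, and broker; "PLAINTEXTSASL" is an assumption.
    print(jaas_config("user@EXAMPLE.COM", "./user.keytab"))
    print(kafka_params("broker1.example.com:6667", "PLAINTEXTSASL"))
    print(" ".join(spark_submit_args("app.py", "hdp-kafka.jar", "key.conf", "user.keytab")))
```

The keytab path inside the JAAS file is written relative to the working directory so that the same file can be used wherever `--files` places it.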
Created ‎10-03-2016 01:50 PM
@Ali Bajwa We have HDP 2.4.2, and when we try to consume messages from the secured Kafka topics using Spark Streaming (Spark 1.6.1), we can't consume any messages.
I followed the documentation on https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/spark-streaming-kafk...
Was this patch released after 2.4.2, or am I missing something?
Thanks.
