I want to read HBase tables, perform transformations over the data, and store the final result into a Kafka topic with the help of a Spark streaming job. I am following the procedure below to achieve this.
I am using newAPIHadoopRDD to read the HBase table from the Spark streaming job and run transformations over the data. In this step I load the data into an RDD, but I also want to register the schema with Kafka, as my final destination is inserting the records into Hive.
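A minimal sketch of the read step is below, assuming Scala; the table name "my_table" and the app name are placeholders for my actual setup:

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}

// Create (or reuse) the SparkContext of the streaming job.
val sc = new SparkContext(new SparkConf().setAppName("hbase-to-kafka"))

// Point the input format at the HBase table ("my_table" is a placeholder).
val hbaseConf = HBaseConfiguration.create()
hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table")

// newAPIHadoopRDD returns (row key, Result) pairs, one per HBase row.
val hbaseRDD = sc.newAPIHadoopRDD(
  hbaseConf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])
```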
Basically I am following the steps below:
1. Read the HBase tables and load the data into a Spark RDD.
2. Perform the transformations (sketched after this list).
3. Load the transformed data into a Kafka topic.
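For step 2, a transformation over the `hbaseRDD` from the sketch above might look like the following; the column family "cf" and the qualifiers "id" and "value" are hypothetical and stand in for my real HBase schema:

```scala
import org.apache.hadoop.hbase.util.Bytes

// Pull two hypothetical columns out of each HBase Result and build a
// small JSON string; "cf", "id" and "value" stand in for the real schema.
val transformed = hbaseRDD.map { case (_, result) =>
  val id    = Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("id")))
  val value = Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("value")))
  s"""{"id":"$id","value":"$value"}"""
}
```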
I am running all of the above steps via a Spark streaming job. For the first step, newAPIHadoopRDD helps read the data; for the transformations I have built-in Spark functions as well as custom functions; and for the final load the Spark job acts as a Kafka producer, using the Kafka APIs to achieve this. But I am not sure how to register the HBase schema with Kafka?
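For context, step 3 currently looks roughly like the sketch below; the broker address "broker1:9092" and the topic "my_topic" are placeholders:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Open one producer per partition and send every transformed record
// to the topic; broker and topic names are placeholders.
transformed.foreachPartition { partition =>
  val props = new Properties()
  props.put("bootstrap.servers", "broker1:9092")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

  val producer = new KafkaProducer[String, String](props)
  partition.foreach(record => producer.send(new ProducerRecord[String, String]("my_topic", record)))
  producer.close()
}
```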