06-13-2017
08:06 PM
Hi, I have to design a Spark Streaming application for the use case below and am looking for the best possible approach.
I have an application that pushes data into 1000+ different topics, each with a different purpose. Spark Streaming will receive data from each topic and, after processing, write it back to a corresponding output topic.
Ex.
Input Type 1 Topic --> Spark Streaming --> Output Type 1 Topic
Input Type 2 Topic --> Spark Streaming --> Output Type 2 Topic
Input Type 3 Topic --> Spark Streaming --> Output Type 3 Topic
...
Input Type N Topic --> Spark Streaming --> Output Type N Topic
and so on. I need answers to the following questions:
1. Is it a good idea to launch 1000+ Spark Streaming applications, one per topic? Or should I have one streaming application for all topics, since the processing logic is the same?
2. If there is one streaming context, how will I determine which RDD belongs to which Kafka topic, so that after processing I can write it back to its corresponding output topic?
3. The client may add/delete topics in Kafka; how do I handle this dynamically in Spark Streaming?
4. How do I restart the job automatically on failure?
Are there any other issues you see here?
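A minimal sketch of the routing idea in question 2, assuming a single application and a naming convention that maps each input topic to its output topic (the `to_output_topic` helper and the `input_`/`output_` prefixes are hypothetical, not from the original post). With the Kafka direct stream, each consumer record carries its source topic, so records can be grouped per destination and written back in one pass; plain tuples stand in for records here:

```python
from collections import defaultdict

def to_output_topic(input_topic: str) -> str:
    """Derive the output topic name from an input topic name.

    Hypothetical convention: 'input_type_1' -> 'output_type_1'.
    """
    if input_topic.startswith("input_"):
        return "output_" + input_topic[len("input_"):]
    raise ValueError("unexpected topic name: " + input_topic)

def route(records):
    """Group (topic, value) pairs by their destination topic.

    In Spark, the source topic would come from each ConsumerRecord;
    here a plain (topic, value) tuple stands in for a record.
    """
    out = defaultdict(list)
    for topic, value in records:
        out[to_output_topic(topic)].append(value)
    return dict(out)

routed = route([("input_type_1", "a"), ("input_type_2", "b"), ("input_type_1", "c")])
# routed == {"output_type_1": ["a", "c"], "output_type_2": ["b"]}
```

The same grouping would be applied inside each batch before producing to Kafka, so one streaming application can serve all topic pairs.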
Labels:
- Apache Spark
06-12-2017
08:29 AM
Thanks, Laurent. I agree with that. I am trying to get a detailed understanding of the communication between the HDF and HDP clusters. When HDF NiFi connects to the HDP cluster (via an HDFS processor, a Hive connection, or Spark), does it write anything to the local disks of the HDP data nodes?
06-11-2017
09:24 PM
Hi, I have an HDF cluster with 3 NiFi instances that launches jobs (Hive/Spark) on an HDP cluster. Usually NiFi writes all information to the different repositories on its local machine. My question is: does NiFi write any data or provenance information, or do any spilling, on HDP nodes (e.g. the data nodes in the HDP cluster) while accessing the HDFS, Hive, or Spark services? Thanks
01-24-2017
12:24 AM
Thanks... It worked...
01-23-2017
10:44 AM
I am trying to parse my JSON using the NiFi Expression Language jsonPath function (https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#jsonpath), which uses '.' for node traversal. My JSON has a node name with a '.' in it. Below is a sample JSON:
{"feedName":"trigger_category.childfeed123",
"feedId":"eff68e0b-a9e6-4c11-b74f-53f161a47faf",
"dependentFeedNames":["trigger_category.test_shashi"],
"feedJobExecutionContexts":{"trigger_category.test_shashi":[{"jobExecutionId":23946,
"startTime":1485145059971,
"endTime":1485145111733,
"executionContext":{"feedts":"1485145061170"}}]},
"latestFeedJobExecutionContext":{"trigger_category.test_shashi":{"jobExecutionId":23946,
"startTime":1485145059971,
"endTime":1485145111733,
"executionContext":{"feedts":"1485145061170"}}}}
I am trying to read feedts, but its parent node 'trigger_category.test_shashi' has a dot ('.') in it. How do I escape the '.' character?
Labels:
- Apache NiFi
11-23-2016
07:05 AM
Nice article!
11-11-2016
09:02 AM
Is it possible to follow the above approach in a Kerberos environment? I tried the above steps to run a job as a proxy user, but it failed with a GSS initialization exception. Any pointers?
09-19-2016
04:58 PM
Did you find a solution for this? I am facing the exact same issue.
09-07-2016
07:42 AM
@Michael Young It worked.