Member since: 09-08-2016
Posts: 9
Kudos Received: 4
Solutions: 0
05-23-2017
01:16 AM
I am using a custom scheme which outputs the values (key, timestamp, x, y, z):
public class KeyScheme implements Scheme {
    ...
    return new Values(key, timestamp, x, y, z);
}
Topology:
kafkaConf.scheme = new SchemeAsMultiScheme(new KeyScheme());
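For reference, a minimal sketch of what the complete scheme might look like, assuming Storm 1.x (the ByteBuffer-based Scheme interface) and json-simple for the JSON parsing; everything beyond the five field names above is illustrative:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;

import org.apache.storm.spout.Scheme;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

import org.json.simple.JSONObject;
import org.json.simple.JSONValue;

public class KeyScheme implements Scheme {

    @Override
    public List<Object> deserialize(ByteBuffer ser) {
        // Decode the raw Kafka message and parse it as JSON (json-simple assumed).
        String json = new String(Utils.toByteArray(ser), StandardCharsets.UTF_8);
        JSONObject event = (JSONObject) JSONValue.parse(json);
        // Emit all five fields so the downstream bolt can group on "key".
        return new Values(
                event.get("key"),
                event.get("timestamp"),
                event.get("x"),
                event.get("y"),
                event.get("z"));
    }

    @Override
    public Fields getOutputFields() {
        // These names must match the fieldsGrouping in the topology.
        return new Fields("key", "timestamp", "x", "y", "z");
    }
}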
05-22-2017
08:16 AM
The input from the Kafka queue:
{'key':'deviceid1_siteid1_ipaddress1','timestamp':12345678.123,'x':12,'y':13,'z':14}
{'key':'deviceid2_siteid1_ipaddress2','timestamp':12345678.123,'x':13,'y':13,'z':14}
{'key':'deviceid1_siteid1_ipaddress3','timestamp':12345678.123,'x':12,'y':13,'z':14}
{'key':'deviceid1_siteid2_ipaddress1','timestamp':12345678.123,'x':15,'y':13,'z':14}
{'key':'deviceid3_siteid3_ipaddress1','timestamp':12345678.123,'x':12,'y':14,'z':14}
{'key':'deviceid3_siteid2_ipaddress2','timestamp':12345678.123,'x':13,'y':13,'z':14}
{'key':'deviceid2_siteid1_ipaddress3','timestamp':12345678.123,'x':16,'y':13,'z':6}
I need to filter the events by key, sort them by timestamp, and calculate the moving average over the last 3 events. Topology code:
builder.setSpout("kafkaSpout", kafkaSpout, 1);
builder.setBolt("movingSpeedBolt", movingSpeedBolt, 1).fieldsGrouping("kafkaSpout", new Fields("key"));
The number of distinct deviceid/siteid/ip combinations would be 800.
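One thing to keep in mind: fieldsGrouping only guarantees that all tuples for a given key reach the same bolt task, not that a task sees a single key, so a windowed bolt has to bucket the window's tuples by key itself. A sketch under that assumption (the class name matches the topology above; averaging x and printing the result are illustrative):

import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.storm.topology.base.BaseWindowedBolt;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.windowing.TupleWindow;

public class MovingSpeedBolt extends BaseWindowedBolt {

    @Override
    public void execute(TupleWindow window) {
        // The task may own many keys even with fieldsGrouping,
        // so bucket the window's tuples by key first.
        Map<String, List<Tuple>> byKey = new HashMap<>();
        for (Tuple t : window.get()) {
            byKey.computeIfAbsent(t.getStringByField("key"), k -> new ArrayList<>()).add(t);
        }

        for (Map.Entry<String, List<Tuple>> entry : byKey.entrySet()) {
            List<Tuple> events = entry.getValue();
            // Sort by timestamp and average x over the last (up to) 3 events.
            events.sort(Comparator.comparingDouble(
                    t -> ((Number) t.getValueByField("timestamp")).doubleValue()));
            int n = Math.min(3, events.size());
            double sum = 0;
            for (Tuple t : events.subList(events.size() - n, events.size())) {
                sum += ((Number) t.getValueByField("x")).doubleValue();
            }
            // Placeholder output; a real bolt would emit (key, avg) instead.
            System.out.println(entry.getKey() + " -> moving avg x = " + (sum / n));
        }
    }
}

Note that a count window counts tuples across all keys owned by the task, so with ~800 keys a window length of 3 would rarely hold 3 events of the same key; a longer or time-based window, or the per-key buffer sketched under the original question below, sidesteps this.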
05-21-2017
08:18 AM
1 Kudo
I have a requirement to consume Kafka events and calculate a moving average. The structure of a Kafka event looks like:
{'deviceid':'a','siteid':'s1','ip':'i1','timestamp':123,'val':1}
The same timestamp can appear for different deviceid, siteid, and ip values. I need to calculate the moving average per deviceid/siteid/ip; the events are pushed to the Kafka queue from different devices/IPs. I tried fieldsGrouping on the bolt and the withWindow function to calculate the average over the last 3 events, but when I tested, the events were not grouped by those fields: I get values for different keys in my bolt's TupleWindow, so the events are not being filtered. Appreciate your help! @Bryan Bende
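One way to get a per-key moving average without windowing at all is to keep a small per-key buffer inside an ordinary bolt; since fieldsGrouping routes every tuple for a given key to the same task, a plain per-task map is safe. A sketch, assuming the spout emits a combined "key" field (as in the follow-up posts above) alongside "val"; the class and output field names are illustrative:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class MovingAverageBolt extends BaseBasicBolt {

    // One buffer of the last 3 values per deviceid/siteid/ip key.
    // No locking needed: a bolt task executes tuples on a single thread.
    private final Map<String, Deque<Double>> lastValues = new HashMap<>();

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String key = input.getStringByField("key");
        double val = ((Number) input.getValueByField("val")).doubleValue();

        Deque<Double> window = lastValues.computeIfAbsent(key, k -> new ArrayDeque<>(3));
        if (window.size() == 3) {
            window.removeFirst();   // drop the oldest of the 3
        }
        window.addLast(val);

        double sum = 0;
        for (double v : window) {
            sum += v;
        }
        collector.emit(new Values(key, sum / window.size()));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("key", "movingAvg"));
    }
}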
Labels:
- Apache Kafka
- Apache Storm
01-24-2017
12:31 AM
@Bikas, is it correct to create the LivyClient object during application start-up and use that object for all API requests that submit the jar?
01-23-2017
12:58 AM
Hi, I have a requirement to train a predictive machine learning model from a web application. The user can change tuning parameters/algorithms from the user interface; these go to a Spark job, which returns a regression metric. Right now I have a web application REST API built with Jersey and Jetty, and a Spark machine learning application that implements a Livy Job; the web application references this jar. The flow is as follows: the user changes a tuning parameter/algorithm in the UI -> the Jersey API calls the Livy client with the parameters -> the Livy job runs in YARN and returns the metric results -> Jersey returns the metric results to the UI. Am I doing this correctly? Is there any other way to interact with Spark from a web app? I am creating a singleton LivyClient object and using it to submit jobs throughout the application. The code was working fine, but now I am getting the exception below. Do I have to create the LivyClient object and upload the jar (the Spark machine learning application with Livy) every time?
javax.servlet.ServletException: java.util.concurrent.ExecutionException: java.io.IOException: Not Found: "Session '339' not found."
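On the "Session not found" error: a singleton LivyClient is bound to one Livy session, and if that session times out or is killed on the server, every later submit fails exactly like this; the client then has to be rebuilt (and the jar re-uploaded once), not recreated per request. A minimal sketch of that pattern — the URI, jar path, and reset-on-failure idea are assumptions, and the org.apache.livy package was com.cloudera.livy in older Livy releases:

import java.io.File;
import java.net.URI;

import org.apache.livy.LivyClient;
import org.apache.livy.LivyClientBuilder;

public class LivyClientHolder {

    private static LivyClient client;

    // Lazily build one client (one Livy session) and upload the ML jar once.
    public static synchronized LivyClient get() throws Exception {
        if (client == null) {
            client = new LivyClientBuilder()
                    .setURI(new URI("http://livy-host:8998"))   // placeholder URL
                    .build();
            // Upload the Spark ML jar into the session once, not per request.
            client.uploadJar(new File("/path/to/ml-livy-jobs.jar")).get();
        }
        return client;
    }

    // Call this when a submit fails with "Session not found":
    // drop the dead session and let the next get() create a fresh one.
    public static synchronized void reset() {
        if (client != null) {
            client.stop(true);
            client = null;
        }
    }
}

With this, the Jersey resource calls LivyClientHolder.get().submit(...) and, on a "Session not found" failure, calls reset() and retries once.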
Labels:
- Apache Spark
- Apache YARN
09-12-2016
11:58 PM
Yes. That is exactly what I wanted. Thank you!
09-09-2016
03:50 AM
2 Kudos
I need to read a huge CSV file from the user, in which historian sensor data is stored. I can't use HTTP to upload the CSV into my JEE web application, as the CSV could be up to 200 GB. Format:
sensor_name,timestamp,value
sensor1,timestamp1,value1
sensor1,timestamp2,value2
sensor2,timestamp1,value1
Once the user uploads the CSV, I need to display the unique values from the first column so that the user can map an existing sensor (keyspace.table.pk1) to a sensor from the CSV (sensor1). I then need to import the timestamp and value rows for sensor1 into keyspace.table.pk1. I tried using NiFi but got stuck. How can I notify the user that the reading is done, so that the user can start mapping? How can I implement this feature? Shall I use Spark to calculate the unique values (see the sketch below)? Where can I write the output? How do I notify the user? How do I trigger the Spark job every time the user uploads a file? How do I transfer the file from the client app? What happens when there are failures (do we retry, etc.)? How often will my jobs run (triggered on every upload, or as a cron job)?
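If Spark ends up doing the distinct-values step, the job itself is small. A sketch in Java, with placeholder paths and app name (the CSV layout follows the format above):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DistinctSensors {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("distinct-sensors")
                .getOrCreate();

        // CSV laid out as: sensor_name,timestamp,value (with a header row).
        Dataset<Row> readings = spark.read()
                .option("header", "true")
                .csv("hdfs:///uploads/sensors.csv");   // placeholder path

        // Write the unique sensor names somewhere the web app can read
        // them back and present them to the user for mapping.
        readings.select("sensor_name")
                .distinct()
                .coalesce(1)
                .write()
                .mode("overwrite")
                .csv("hdfs:///uploads/sensors-distinct");   // placeholder path

        spark.stop();
    }
}

The web app could poll the output directory for Hadoop's _SUCCESS marker to learn that the job has finished, and the job could be triggered per upload through Livy, as in the Livy posts above.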
Labels:
- Apache Kafka
- Apache NiFi
- Apache Spark
09-09-2016
01:33 AM
Yes, I am using PutCassandraQL to write into Cassandra, converting each CSV row into a CQL statement. If I can't do it with NiFi, can I use Spark, Kafka, or Storm to implement my requirement?
09-08-2016
05:28 AM
1 Kudo
I have a requirement to read a huge CSV file via a Kafka topic into Cassandra. I configured Apache NiFi to achieve this. Flow: the user has no control over the NiFi setup; he only specifies the URL where the CSV is located. The web application writes the URL to a Kafka topic, and NiFi fetches the file and inserts the rows into Cassandra. How will I know that NiFi has inserted all the rows from the CSV file into Cassandra? I need to let the user know that the inserting is done and display a page where he can see the unique values from the CSV. Any help would be appreciated.
Labels:
- Apache Kafka
- Apache NiFi