Member since: 09-08-2016
Posts: 9
Kudos Received: 4
Solutions: 0
05-23-2017
01:16 AM
I am using a custom scheme which outputs the values (key, timestamp, x, y, z):
public class KeyScheme implements Scheme {
    ...
    return new Values(key, timestamp, x, y, z);
}
Topology:
kafkaConf.scheme = new SchemeAsMultiScheme(new KeyScheme());
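For reference, a minimal sketch of what the complete scheme might look like, assuming Storm 1.x (the ByteBuffer-based Scheme interface) and json-simple for the JSON parsing; everything beyond the five field names above is illustrative:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;

import org.apache.storm.spout.Scheme;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

import org.json.simple.JSONObject;
import org.json.simple.JSONValue;

public class KeyScheme implements Scheme {

    @Override
    public List<Object> deserialize(ByteBuffer ser) {
        // Decode the raw Kafka message and parse it as JSON (json-simple assumed).
        String json = new String(Utils.toByteArray(ser), StandardCharsets.UTF_8);
        JSONObject event = (JSONObject) JSONValue.parse(json);
        // Emit all five fields so the downstream bolt can group on "key".
        return new Values(
                event.get("key"),
                event.get("timestamp"),
                event.get("x"),
                event.get("y"),
                event.get("z"));
    }

    @Override
    public Fields getOutputFields() {
        // These names must match the fieldsGrouping in the topology.
        return new Fields("key", "timestamp", "x", "y", "z");
    }
}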
05-22-2017
08:16 AM
The input from the Kafka queue:
{'key':'deviceid1_siteid1_ipaddress1','timestamp':12345678.123,'x':12,'y':13,'z':14}
{'key':'deviceid2_siteid1_ipaddress2','timestamp':12345678.123,'x':13,'y':13,'z':14}
{'key':'deviceid1_siteid1_ipaddress3','timestamp':12345678.123,'x':12,'y':13,'z':14}
{'key':'deviceid1_siteid2_ipaddress1','timestamp':12345678.123,'x':15,'y':13,'z':14}
{'key':'deviceid3_siteid3_ipaddress1','timestamp':12345678.123,'x':12,'y':14,'z':14}
{'key':'deviceid3_siteid2_ipaddress2','timestamp':12345678.123,'x':13,'y':13,'z':14}
{'key':'deviceid2_siteid1_ipaddress3','timestamp':12345678.123,'x':16,'y':13,'z':6}
I need to filter the events by key, sort them by timestamp, and calculate the moving average over the last 3 events. Topology code:
builder.setSpout("kafkaSpout", kafkaSpout, 1);
builder.setBolt("movingSpeedBolt", movingSpeedBolt, 1).fieldsGrouping("kafkaSpout", new Fields("key"));
The number of distinct deviceid/siteid/ip combinations would be 800.
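One thing to keep in mind: fieldsGrouping only guarantees that all tuples for a given key reach the same bolt task, not that a task sees a single key, so a windowed bolt has to bucket the window's tuples by key itself. A sketch under that assumption (the class name matches the topology above; averaging x and printing the result are illustrative):

import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.storm.topology.base.BaseWindowedBolt;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.windowing.TupleWindow;

public class MovingSpeedBolt extends BaseWindowedBolt {

    @Override
    public void execute(TupleWindow window) {
        // The task may own many keys even with fieldsGrouping,
        // so bucket the window's tuples by key first.
        Map<String, List<Tuple>> byKey = new HashMap<>();
        for (Tuple t : window.get()) {
            byKey.computeIfAbsent(t.getStringByField("key"), k -> new ArrayList<>()).add(t);
        }

        for (Map.Entry<String, List<Tuple>> entry : byKey.entrySet()) {
            List<Tuple> events = entry.getValue();
            // Sort by timestamp and average x over the last (up to) 3 events.
            events.sort(Comparator.comparingDouble(
                    t -> ((Number) t.getValueByField("timestamp")).doubleValue()));
            int n = Math.min(3, events.size());
            double sum = 0;
            for (Tuple t : events.subList(events.size() - n, events.size())) {
                sum += ((Number) t.getValueByField("x")).doubleValue();
            }
            // Placeholder output; a real bolt would emit (key, avg) instead.
            System.out.println(entry.getKey() + " -> moving avg x = " + (sum / n));
        }
    }
}

Note that a count window counts tuples across all keys owned by the task, so with ~800 keys a window length of 3 would rarely hold 3 events of the same key; a longer or time-based window, or the per-key buffer sketched under the original question below, sidesteps this.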
05-21-2017
08:18 AM
1 Kudo
I have a requirement to consume Kafka events and calculate a moving average. The structure of a Kafka event looks like:
{'deviceid':'a','siteid':'s1','ip':'i1','timestamp':123,'val':1}
The same timestamp can appear for different deviceid, siteid, and ip values. I need to calculate the moving average per deviceid/siteid/ip; the events are pushed to the Kafka queue from different devices/IPs. I tried fieldsGrouping on the bolt and the withWindow function to calculate the average over the last 3 events, but when I tested, the events were not grouped by those fields: I get values for different keys in my bolt's TupleWindow, so the events are not being filtered. Appreciate your help! @Bryan Bende
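One way to get a per-key moving average without windowing at all is to keep a small per-key buffer inside an ordinary bolt; since fieldsGrouping routes every tuple for a given key to the same task, a plain per-task map is safe. A sketch, assuming the spout emits a combined "key" field (as in the follow-up posts above) alongside "val"; the class and output field names are illustrative:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class MovingAverageBolt extends BaseBasicBolt {

    // One buffer of the last 3 values per deviceid/siteid/ip key.
    // No locking needed: a bolt task executes tuples on a single thread.
    private final Map<String, Deque<Double>> lastValues = new HashMap<>();

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String key = input.getStringByField("key");
        double val = ((Number) input.getValueByField("val")).doubleValue();

        Deque<Double> window = lastValues.computeIfAbsent(key, k -> new ArrayDeque<>(3));
        if (window.size() == 3) {
            window.removeFirst();   // drop the oldest of the 3
        }
        window.addLast(val);

        double sum = 0;
        for (double v : window) {
            sum += v;
        }
        collector.emit(new Values(key, sum / window.size()));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("key", "movingAvg"));
    }
}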
Labels:
- Apache Kafka
- Apache Storm
01-24-2017
12:31 AM
@Bikas, is it correct to create the LivyClient object during application start-up and use that object for all API requests that submit the jar?
01-23-2017
12:58 AM
Hi, I have a requirement to train a predictive machine learning model from a web application. The user can change tuning parameters/algorithms from the user interface; these go to a Spark job, which returns a regression metric. Right now I have a web application REST API built with Jersey and Jetty, and a Spark machine learning application that implements a Livy Job; the web application references this jar. The flow is as follows: the user changes a tuning parameter/algorithm in the UI -> the Jersey API calls the Livy client with the parameters -> the Livy job runs in YARN and returns the metric results -> Jersey returns the metric results to the UI. Am I doing this correctly? Is there any other way to interact with Spark from a web app? I am creating a singleton LivyClient object and using it to submit jobs throughout the application. The code was working fine, but now I am getting the exception below. Do I have to create the LivyClient object and upload the jar (the Spark machine learning application with Livy) every time?
javax.servlet.ServletException: java.util.concurrent.ExecutionException: java.io.IOException: Not Found: "Session '339' not found."
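On the "Session not found" error: a singleton LivyClient is bound to one Livy session, and if that session times out or is killed on the server, every later submit fails exactly like this; the client then has to be rebuilt (and the jar re-uploaded once), not recreated per request. A minimal sketch of that pattern — the URI, jar path, and reset-on-failure idea are assumptions, and the org.apache.livy package was com.cloudera.livy in older Livy releases:

import java.io.File;
import java.net.URI;

import org.apache.livy.LivyClient;
import org.apache.livy.LivyClientBuilder;

public class LivyClientHolder {

    private static LivyClient client;

    // Lazily build one client (one Livy session) and upload the ML jar once.
    public static synchronized LivyClient get() throws Exception {
        if (client == null) {
            client = new LivyClientBuilder()
                    .setURI(new URI("http://livy-host:8998"))   // placeholder URL
                    .build();
            // Upload the Spark ML jar into the session once, not per request.
            client.uploadJar(new File("/path/to/ml-livy-jobs.jar")).get();
        }
        return client;
    }

    // Call this when a submit fails with "Session not found":
    // drop the dead session and let the next get() create a fresh one.
    public static synchronized void reset() {
        if (client != null) {
            client.stop(true);
            client = null;
        }
    }
}

With this, the Jersey resource calls LivyClientHolder.get().submit(...) and, on a "Session not found" failure, calls reset() and retries once.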
Labels:
- Apache Spark
- Apache YARN
09-12-2016
11:58 PM
Yes. That is exactly what I wanted. Thank you!
09-09-2016
03:50 AM
2 Kudos
I need to read a huge CSV file from the user, in which historian sensor data is stored. I can't use HTTP to upload the CSV into my JEE web application, as the CSV could be up to 200 GB. Format:
sensor_name,timestamp,value
sensor1,timestamp1,value1
sensor1,timestamp2,value2
sensor2,timestamp1,value1
Once the user uploads the CSV, I need to display the unique values from the first column so that the user can map an existing sensor (keyspace.table.pk1) to a sensor from the CSV (sensor1). I then need to import the timestamp and value rows for sensor1 into keyspace.table.pk1. I tried using NiFi but got stuck. How can I notify the user that the reading is done, so that the user can start mapping? How can I implement this feature? Shall I use Spark to calculate the unique values (see the sketch below)? Where can I write the output? How do I notify the user? How do I trigger the Spark job every time the user uploads a file? How do I transfer the file from the client app? What happens when there are failures (do we retry, etc.)? How often will my jobs run (triggered on every upload, or as a cron job)?
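If Spark ends up doing the distinct-values step, the job itself is small. A sketch in Java, with placeholder paths and app name (the CSV layout follows the format above):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DistinctSensors {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("distinct-sensors")
                .getOrCreate();

        // CSV laid out as: sensor_name,timestamp,value (with a header row).
        Dataset<Row> readings = spark.read()
                .option("header", "true")
                .csv("hdfs:///uploads/sensors.csv");   // placeholder path

        // Write the unique sensor names somewhere the web app can read
        // them back and present them to the user for mapping.
        readings.select("sensor_name")
                .distinct()
                .coalesce(1)
                .write()
                .mode("overwrite")
                .csv("hdfs:///uploads/sensors-distinct");   // placeholder path

        spark.stop();
    }
}

The web app could poll the output directory for Hadoop's _SUCCESS marker to learn that the job has finished, and the job could be triggered per upload through Livy, as in the Livy posts above.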
Labels:
- Apache Kafka
- Apache NiFi
- Apache Spark
09-09-2016
01:33 AM
Yes, I am using PutCassandraQL to write into Cassandra, converting each CSV row into a CQL statement. If I can't do it with NiFi, can I use Spark, Kafka, or Storm to implement my requirement?
09-08-2016
05:28 AM
1 Kudo
I have a requirement to read a huge CSV file via a Kafka topic into Cassandra. I configured Apache NiFi to achieve this. Flow: the user has no control over the NiFi setup; he only specifies the URL where the CSV is located. The web application writes the URL to a Kafka topic, and NiFi fetches the file and inserts the rows into Cassandra. How will I know that NiFi has inserted all the rows from the CSV file into Cassandra? I need to let the user know that the inserting is done and display a page where he can see the unique values from the CSV. Any help would be appreciated.
Labels:
- Apache Kafka
- Apache NiFi