
Huge CSV import to Cassandra


I need to read a huge CSV file from a user containing historian sensor data. I can't upload the CSV over HTTP into my JEE web application because the file can be up to 200 GB.

Format:

sensor_name,timestamp,value
sensor1,timestamp1,value1
sensor1,timestamp2,value2
sensor2,timestamp1,value1

Once the user uploads the CSV, I need to display the unique values from the first column so that the user can map an existing sensor (keyspace.table.pk1) to a sensor from the CSV (e.g. sensor1). Then I need to import the timestamp and value rows for sensor1 into keyspace.table.pk1.
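Collecting the distinct first-column values for the mapping step does not require loading the file into memory: a single streaming pass keeps only the set of sensor names, which stays small even if the file is 200 GB. A minimal plain-Java sketch (the class name and sample data are hypothetical; in practice this could equally be a Spark `distinct()` job):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;

public class SensorNames {

    // Stream the CSV once and collect the distinct values of the first
    // column. Only the set of sensor names is held in memory, so the
    // file itself can be arbitrarily large.
    static Set<String> uniqueSensors(Path csv) throws IOException {
        Set<String> names = new TreeSet<>();
        try (BufferedReader in = Files.newBufferedReader(csv)) {
            String line = in.readLine();            // skip the header row
            while ((line = in.readLine()) != null) {
                int comma = line.indexOf(',');
                if (comma > 0) {
                    names.add(line.substring(0, comma));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample mirroring the format described above.
        Path tmp = Files.createTempFile("sensors", ".csv");
        Files.writeString(tmp,
            "sensor_name,timestamp,value\n" +
            "sensor1,timestamp1,value1\n" +
            "sensor1,timestamp2,value2\n" +
            "sensor2,timestamp1,value1\n");
        System.out.println(uniqueSensors(tmp)); // prints [sensor1, sensor2]
        Files.deleteIfExists(tmp);
    }
}
```

The resulting set is exactly what the mapping UI needs to show, and it can be persisted (e.g. to a small table or file) so the web application can render it once the scan finishes.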

I tried using NiFi but got stuck. How can I notify the user that the read is done, so that they can start mapping?

How can I implement this feature? Specifically:

- Shall I use Spark to calculate the unique values? Where should I write the output?
- How do I notify the user?
- How do I trigger the Spark job: every time the user uploads a file, or as a cron job?
- How do I transfer the file from the client app?
- What happens when there are failures (do we retry, etc.)?
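Whatever orchestration is chosen, the import itself reduces to filtering rows by the mapped sensor name and extracting (timestamp, value) pairs for the target Cassandra row. A sketch of that per-sensor extraction step, again in plain Java (class and record names are hypothetical; the actual Cassandra write via the driver is only indicated as a comment):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class SensorImport {

    // One (timestamp, value) pair destined for the mapped Cassandra row.
    record Reading(String timestamp, String value) {}

    // Stream the CSV and keep only the rows for the sensor the user
    // mapped. In a real import each matching row would be written in
    // batches through the Cassandra driver instead of collected here,
    // e.g. session.execute(insert.bind(pk1, cols[1], cols[2]));
    static List<Reading> readingsFor(Path csv, String sensor) throws IOException {
        List<Reading> out = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(csv)) {
            String line = in.readLine();                 // skip header
            while ((line = in.readLine()) != null) {
                String[] cols = line.split(",", 3);
                if (cols.length == 3 && cols[0].equals(sensor)) {
                    out.add(new Reading(cols[1], cols[2]));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample in the question's format.
        Path tmp = Files.createTempFile("hist", ".csv");
        Files.writeString(tmp,
            "sensor_name,timestamp,value\n" +
            "sensor1,timestamp1,value1\n" +
            "sensor1,timestamp2,value2\n" +
            "sensor2,timestamp1,value1\n");
        System.out.println(readingsFor(tmp, "sensor1").size()); // prints 2
        Files.deleteIfExists(tmp);
    }
}
```

Because the pass is a pure stream, it restarts cleanly after a failure (re-run from the beginning, or checkpoint a byte offset), which also addresses the retry question for this step.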

1 ACCEPTED SOLUTION

Super Collaborator

This problem has been solved!

To see the detailed solution, log in or register on the community.