Hello everyone,
I'm streaming data from Twitter in JSON format using Apache Flume and storing it in HDFS. I then load the JSON files from HDFS into an external Hive table. The problem is that the tweet texts aren't classified: I need to extract only the profiles that are against the issue I'm working on. How can I use Apache Spark to extract this data from my Hive table and then apply sentiment analysis algorithms to the extracted tweet texts?
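To make the question concrete, here is a rough sketch of the kind of pipeline I have in mind. It is not a working solution. Everything specific in it is an assumption: the table name `tweets`, the columns `text` and `user.screen_name` (borrowed from the Twitter JSON layout), and the tiny word lists, which are placeholders for a real lexicon or a trained classifier. The scoring is a naive word count, used here only to show where a sentiment step would plug in.

```python
import re

# Placeholder word lists, for illustration only; a real lexicon or a trained
# model would replace these.
NEGATIVE_WORDS = {"against", "bad", "hate", "oppose", "terrible", "wrong"}
POSITIVE_WORDS = {"agree", "good", "great", "love", "support"}


def sentiment_score(text):
    """Return (positive hits - negative hits); a value below 0 reads as negative."""
    if not text:
        return 0
    tokens = re.findall(r"[a-z']+", text.lower())
    positive = sum(1 for token in tokens if token in POSITIVE_WORDS)
    negative = sum(1 for token in tokens if token in NEGATIVE_WORDS)
    return positive - negative


def negative_profiles(spark, table="tweets"):
    """Return the distinct screen names that have at least one negative tweet.

    `spark` is a SparkSession created with Hive support; `table` and the
    column names are assumed, so adjust them to the actual Hive schema.
    """
    # Imported here so the scoring function above can be tried without Spark.
    from pyspark.sql import functions as F
    from pyspark.sql.types import IntegerType

    score_udf = F.udf(sentiment_score, IntegerType())
    tweets = spark.table(table).select(
        F.col("user.screen_name").alias("screen_name"),
        F.col("text"),
    )
    return (
        tweets.withColumn("score", score_udf(F.col("text")))
        .filter(F.col("score") < 0)
        .select("screen_name")
        .distinct()
    )
```

I would expect to call it with a session built as `SparkSession.builder.enableHiveSupport().getOrCreate()` and then run `negative_profiles(spark).show()`, but I don't know whether a Python UDF like this is a reasonable approach or whether something like Spark MLlib is the better fit.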
Thanks.