05-01-2017
05:22 PM
Hive is a good place for the data. Hivemall can run machine learning in SQL, and Spark SQL + Spark MLlib works great on Hive data. I have used Apache NiFi to load from SQL Server tables and land the data in ORC files with Hive tables on top. From there you can easily run your ML in Python as the data comes in. I even run TensorFlow on data while it is in motion (https://community.hortonworks.com/articles/58265/analyzing-images-in-hdf-20-using-tensorflow.html). Denormalizing into one wide table often works, because you can never have too many rows in Hive.
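To make the "one wide table" idea concrete, here is a minimal sketch of denormalizing two source tables into a single wide table. It uses Python's built-in sqlite3 purely as a stand-in for Hive so it runs anywhere; the tables `customers`, `orders`, `orders_wide` and all of their columns are made up for illustration and are not from any real schema.

```python
import sqlite3

# sqlite3 is only a stand-in for Hive here; every table and column name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'east'), (2, 'west');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);

-- Denormalize: one wide row per order, with the customer attributes repeated on each row.
CREATE TABLE orders_wide AS
SELECT o.order_id, o.customer_id, o.amount, c.region
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id;
""")

# The wide table can now be read directly as a feature table, with no joins at training time.
wide_rows = conn.execute(
    "SELECT order_id, customer_id, amount, region FROM orders_wide ORDER BY order_id"
).fetchall()
```

In Hive the same shape would probably be a CREATE TABLE ... STORED AS ORC AS SELECT over the tables NiFi landed, but treat that as a rough pointer and not a tested recipe for your cluster.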