Created 06-07-2016 06:23 AM
I was reading about Presto, where a single query can process data from multiple sources, e.g. HDFS, MySQL, Cassandra, or even Kafka. In Presto you can define objects called 'catalogs' that point to remote data sources.
Do we have such a mechanism in Hive to process data from multiple sources?
Also, can we access another Hive table (from a remote source) in the same Beeline connection?
Created 06-07-2016 06:29 AM
No, Hive cannot query another data source unless a storage handler is defined for it. Hive has the concept of native and non-native tables: it knows how to manage native tables itself, but it can only work with a non-native table through a storage handler. To learn more about storage handlers, you can refer to this doc: https://cwiki.apache.org/confluence/display/Hive/StorageHandlers
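To make that concrete, here is a rough sketch of how a non-native table is declared with the HBase storage handler that ships with Hive. The table, column, and column family names (hbase_table_1, cf1:val, xyz) are placeholders, and native_table with its id and col columns is a hypothetical native Hive table used only for illustration; adjust them to your own setup.

```sql
-- Sketch only: table, column, and column-family names are placeholders.
-- Assumes the Hive HBase handler jar and its HBase dependencies are on the classpath.
CREATE TABLE hbase_table_1 (key INT, value STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");

-- Once declared, the non-native table can be queried like any other Hive table,
-- including in a join with a native table (native_table is hypothetical).
SELECT h.key, h.value, n.col
FROM hbase_table_1 h
JOIN native_table n ON h.key = n.id;
```

So a single Hive query can combine data from more than one source, but only for sources that have a storage handler and a table defined on top of them.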
Created 06-07-2016 06:47 AM
Good info, @Rajkumar Singh. HBase provides a storage handler for Hive. Which other storage handlers are available for Hive? As per the doc: Cassandra, JDBC, MongoDB, and Google Spreadsheets.
Created 06-07-2016 07:06 AM
They are not very popular, but Hive has storage handlers for MongoDB and Cassandra:
https://github.com/mongodb/mongo-hadoop/wiki/Hive-Usage
https://github.com/tuplejump/cash/tree/master/cassandra-handler