Is there any way in Hadoop, Spark, or any other component to enable dynamic querying on big data? For example, I have 3 TB of data in HDFS. I want to build an application that lets users choose their own filters or predicates, build a query, and get the result in real time or near real time.
Example in detail:
I have 3 TB of employee data in HDFS, and I have created external partitioned Hive tables on top of it that point to the HDFS files. The goal is to let users pick the data they need by filtering, selecting, or ordering on the columns they care about. The requirements vary by user: one user might be interested in only 50 columns, while another might want only 10.
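To make the requirement concrete, here is a rough sketch of the kind of query building I have in mind. It is only an illustration: the table name `employee`, the column names, and the `build_query` helper are all hypothetical, and the `?` placeholder is an assumption, since the actual placeholder style depends on whichever client library runs the query.

```python
# Hypothetical table and column names, used only for illustration.
TABLE = "employee"
ALLOWED_COLUMNS = {"emp_id", "name", "department", "salary", "join_year"}
ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}


def _check_column(col):
    """Reject any column the fixed schema does not contain."""
    if col not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {col}")
    return col


def build_query(columns, filters=(), order_by=None, limit=None):
    """Build a SELECT statement from user choices.

    columns:  list of column names the user wants back
    filters:  iterable of (column, operator, value) tuples, ANDed together
    order_by: optional column name to sort on
    limit:    optional maximum number of rows

    Returns (sql, params). Values are never pasted into the SQL text;
    they are returned separately so the client can bind them.
    """
    if not columns:
        raise ValueError("at least one column must be selected")
    select_list = ", ".join(_check_column(c) for c in columns)
    sql = f"SELECT {select_list} FROM {TABLE}"

    params = []
    conditions = []
    for col, op, value in filters:
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        # "?" is an assumed placeholder style; adjust for the real client.
        conditions.append(f"{_check_column(col)} {op} ?")
        params.append(value)
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)

    if order_by is not None:
        sql += f" ORDER BY {_check_column(order_by)}"
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql, params


if __name__ == "__main__":
    sql, params = build_query(
        ["name", "salary"],
        filters=[("department", "=", "Sales"), ("salary", ">", 50000)],
        order_by="salary",
        limit=100,
    )
    print(sql)
    print(params)
```

Each user would end up with a different column list and a different set of predicates, so the generated query changes from request to request while the underlying table stays the same.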
Also, please note that the schema of the source data won't change.