08-06-2019
06:05 AM
Thanks for the reply! The views are already built by joining many underlying tables, so joining the views again for data aggregation will cause performance issues. Here are the two approaches I came up with.
Approach 1:
1. Extract the data from the Hive views into files.
2. Create intermediate Hive tables and load the data extracted from the views into them.
3. Join the new Hive tables to generate the final file.
Approach 2: Use PySpark to read the data from the views directly, aggregate and transform it, and generate the final output file.
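For reference, here is a minimal sketch of what the PySpark route could look like. It is only an illustration under assumptions: the view names, column names, and output path are hypothetical placeholders, the helper names `build_aggregate_query` and `export_aggregate` are made up for this example, and it assumes Spark is configured against the same Hive metastore and can parse the view definitions. Note that Spark still has to execute the joins inside each view, so this may not remove the cost by itself.

```python
def build_aggregate_query(view_a, view_b, join_key, group_col, amount_col):
    """Return a Spark SQL query that joins two views and sums one column per group."""
    return (
        f"SELECT a.{group_col} AS {group_col}, "
        f"SUM(b.{amount_col}) AS total_{amount_col} "
        f"FROM {view_a} a "
        f"JOIN {view_b} b ON a.{join_key} = b.{join_key} "
        f"GROUP BY a.{group_col}"
    )


def export_aggregate(output_path, **query_args):
    """Run the query through Spark's Hive support and write the result as CSV."""
    # Imported lazily so the query builder can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("hive-view-aggregation")
        .enableHiveSupport()  # lets Spark resolve Hive views by name
        .getOrCreate()
    )
    result = spark.sql(build_aggregate_query(**query_args))
    # coalesce(1) produces a single part file; drop it if the output is large.
    (
        result.coalesce(1)
        .write.mode("overwrite")
        .option("header", True)
        .csv(output_path)
    )


# Example call (all names below are hypothetical):
# export_aggregate(
#     "/tmp/final_output",
#     view_a="sales_db.v_customers",
#     view_b="sales_db.v_orders",
#     join_key="customer_id",
#     group_col="region",
#     amount_col="order_amount",
# )
```

If the same view is needed by more than one aggregation, reading it once with `spark.table(...)` and calling `.cache()` on the resulting DataFrame is one way to avoid recomputing the underlying joins each time.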