We are planning a project on Azure in which the data will initially be stored in Azure Data Lake Store (ADLS). Later we will deploy HDP, with ADLS acting as the extended datanode. We want to expose the data in ADLS to Tableau for building dashboards. The initial plan was to have Tableau connect to the data through Hive, but we expect performance problems because:
1. More than 100 users will access the data through Tableau.
2. We will also have to expose the data to a separate portal through API calls.
This means many connections will hit Hive at the same time. My questions are:
1. Can Hive serve this workload with acceptably low response times?
2. How can I measure the performance?
3. I don't want my users to run a query in Tableau and then sit waiting a long time for the dashboard to load. How can I avoid that?
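To make question 2 concrete, the kind of measurement I have in mind could be sketched roughly as below. This is only a minimal sketch in Python under stated assumptions, not a real benchmark: `run_query` is a hypothetical placeholder that just sleeps and would need to be replaced by a real call to Hive (or whichever engine is being compared), and the user count, query text, and function names are illustrative.

```python
import math
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def run_query(sql):
    """Hypothetical placeholder: replace with a real call to Hive or another engine."""
    time.sleep(0.01)  # stands in for the query execution time
    return []


def timed_call(query_fn, sql):
    """Return the wall-clock seconds taken by one query."""
    start = time.perf_counter()
    query_fn(sql)
    return time.perf_counter() - start


def measure_concurrent_latency(query_fn, sql, users=100, queries_per_user=3):
    """Issue the query from `users` concurrent threads and summarise the latencies."""
    total = users * queries_per_user
    with ThreadPoolExecutor(max_workers=users) as pool:
        latencies = list(pool.map(lambda _: timed_call(query_fn, sql), range(total)))
    latencies.sort()
    # Nearest-rank 95th percentile: the value below which about 95% of requests fall.
    p95_index = max(0, math.ceil(0.95 * total) - 1)
    return {
        "count": total,
        "median_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
        "max_s": latencies[-1],
    }


if __name__ == "__main__":
    # Simulate 100 simultaneous dashboard users, each sending 3 queries.
    print(measure_concurrent_latency(run_query, "SELECT 1", users=100, queries_per_user=3))
```

The idea would be to run the same harness against each candidate engine with a representative dashboard query and compare the median and 95th percentile latencies as the number of simulated users grows.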
Would you please share your experience with this kind of design? Should we use Hive, or some other tool that performs better with Tableau and HDFS storage? Someone suggested that I use Azure SQL Server and connect Tableau to it. But that is the old-fashioned approach again, and cost is also a concern, since the price is tied to the execution of each query.
If you have experience with a better solution, please share it. It would be greatly appreciated.