I am running a Hive query in the PROD environment on Spark, using HiveContext/SparkSession like this: sparksession.sql("query")
It takes approximately 10-15 minutes to parse the query before the query actually runs. This looks abnormal to me, because when I run the same query in the UAT environment, parsing takes less than 1 minute.
This really looks very abnormal. When I watch hive.log in UAT while executing that Spark SQL, I cannot see the log moving.
But when I watch hive.log in PROD, the log is moving and I see entries like this:
initialize called using direct sql underlying db oracle.... I see this kind of statement in the log for all the tables involved in the query.
Now this is really strange. If PROD is loading metadata from the Hive metastore, which will be used in query parsing, then why is the same not happening in UAT?
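Since the two environments behave differently for the same query, one thing that could be compared is the effective Spark configuration on each side. Below is a minimal sketch, assuming PySpark; diff_conf is a hypothetical helper name, and the sample values are made up purely for illustration. In a real session the two dictionaries would come from dict(spark.sparkContext.getConf().getAll()) run in each environment.

```python
def diff_conf(uat, prod):
    """Return {key: (uat_value, prod_value)} for every key whose value differs.

    A key missing on one side is reported with None for that side.
    """
    keys = sorted(set(uat) | set(prod))
    return {k: (uat.get(k), prod.get(k)) for k in keys if uat.get(k) != prod.get(k)}


# Illustrative, made-up settings; real ones would be collected in each
# environment with: dict(spark.sparkContext.getConf().getAll())
uat_conf = {
    "spark.master": "yarn",
    "spark.sql.catalogImplementation": "hive",
}
prod_conf = {
    "spark.master": "yarn",
    "spark.sql.catalogImplementation": "hive",
    "spark.sql.hive.convertMetastoreParquet": "false",
}

for key, (uat_value, prod_value) in diff_conf(uat_conf, prod_conf).items():
    print(f"{key}: UAT={uat_value} PROD={prod_value}")
```

This only covers settings visible to the Spark session. Differences on the metastore side (for example in hive-site.xml) would have to be compared separately.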
Please help, this is a very serious issue.
UAT has the same number of tables and columns as PROD, but UAT has more data than PROD. The problem is in the query parsing and optimizer phase. Once parsing is done there is no problem with execution, so I have not discussed the driver or executors.
I executed the same query in both UAT and PROD and monitored the logs. In UAT, parsing finished within a minute, and in PROD it took 20 minutes. I was also watching hive.log in both UAT and PROD: the UAT log was not moving, but the PROD log was. PROD was fetching table metadata from the Hive metastore, but UAT was not.
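To put numbers on the phases described above, the time spent before execution can be measured separately from the execution itself. This is a rough sketch, assuming PySpark; timed and profile_query are hypothetical helper names, and count() is only a stand-in for whatever action the real job performs. As far as I understand, spark.sql() parses and analyzes the query eagerly (table resolution, and so the metastore lookups, happens there), explain() forces optimization and physical planning without running the job, and count() triggers the actual execution.

```python
import time


def timed(label, fn):
    """Run fn(), print the elapsed wall-clock time, return (result, seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f}s")
    return result, elapsed


def profile_query(spark, query):
    """Time the pre-execution phases of a Spark SQL query separately.

    `spark` is expected to be a SparkSession (or anything exposing sql()).
    """
    # Parsing and analysis: tables are resolved against the catalog here.
    df, analyze_s = timed("parse + analyze", lambda: spark.sql(query))
    # Optimization and physical planning, without running the job.
    _, plan_s = timed("optimize + plan", lambda: df.explain())
    # Execution; count() is a stand-in action for this sketch.
    _, exec_s = timed("execute", lambda: df.count())
    return {
        "parse_analyze_s": analyze_s,
        "optimize_plan_s": plan_s,
        "execute_s": exec_s,
    }
```

Calling profile_query(sparksession, "query") in both UAT and PROD would show which of the phases accounts for the extra minutes in PROD.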