Could you share some details on how to analyse the quality of data that has been loaded into Hive?
I have a text file of around 250 million records, which I have loaded into Hive and stored as Parquet.
My next task is to analyse the quality of this data. Since I am not from an ETL background, this is new to me.
Could you share some techniques that can be applied to Hive tables? I would prefer to use Spark or Pig.