I am new to the Hadoop ecosystem. For data ingestion into Hive, we use Sqoop import commands, which populate staging tables. We now need to clean up the data and insert it into the production Hive tables. I have written a Hive UDF to simulate an auto-increment feature, and it works fine in the Hive shell. However, the Hive query takes a very long time to clean up the data and generate the auto-incremented number, while Impala queries perform well. I am wondering whether I can use the same Hive UDF in Impala.
Is there any way to use the Hive UDF in the Impala shell to generate the auto-incremented number?
The answer is yes: Link
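For readers who have not seen this kind of UDF, here is a rough sketch of what a stateful auto-increment function might look like. The class name AutoIncrementSketch is hypothetical and this is not the original poster's code; a real Hive UDF would extend org.apache.hadoop.hive.ql.exec.UDF and ship in a JAR, which is left out here so the example runs on its own.

```java
// Hypothetical sketch: the class name and shape are assumptions for illustration.
// A real Hive UDF would extend org.apache.hadoop.hive.ql.exec.UDF.
public class AutoIncrementSketch {
    // Counter state lives in the instance, so each instance numbers rows independently.
    private long counter = 0L;

    // Hive looks up UDF logic through a method named evaluate().
    public long evaluate() {
        counter += 1;
        return counter;
    }

    public static void main(String[] args) {
        AutoIncrementSketch udf = new AutoIncrementSketch();
        // Simulate the function being called once per row for three rows.
        for (int i = 0; i < 3; i++) {
            System.out.println(udf.evaluate());
        }
    }
}
```

Impala can register Java-based Hive UDFs with a CREATE FUNCTION statement that names the JAR in its LOCATION clause and the class in its SYMBOL clause. Since the counter is per instance, it is worth checking whether the generated numbers stay unique when the query runs in parallel across several nodes; that behaviour is an assumption to verify, not something this sketch guarantees.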
We don't support Impala, so I suggest asking this question in the CDH community forum.
I highly recommend sticking with Hive and Tez, and buckling up for LLAP :)