I have a zip file with 10k meta files and 10k data files. What is the best way to ingest this into Hive? The meta and data tables are separate.
You should experiment with several methods of Data -> HDFS -> Hive.
In its simplest form, if your data is already clean and needs no preparation, you can upload it to HDFS and create an external Hive table, giving the CREATE EXTERNAL TABLE statement the configuration it needs to understand your data (columns, row format, storage format, and location).
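As a rough sketch of that simple path, the Python below splits the zip into one staging directory per table and builds a CREATE EXTERNAL TABLE statement for each. Everything specific here is an assumption for illustration, not something from the question: the rule that meta files have "meta" in their name, the comma delimiter, the column lists, and the HDFS paths. Adjust them to your actual files.

```python
import zipfile
from pathlib import Path


def split_zip(zip_path, out_dir, is_meta=lambda name: "meta" in name.lower()):
    """Extract a zip into out_dir/meta and out_dir/data, one directory per Hive table.

    is_meta is a hypothetical naming rule; replace it with whatever
    distinguishes your meta files from your data files.
    Files are flattened by base name, so duplicate names would overwrite.
    """
    out = Path(out_dir)
    counts = {"meta": 0, "data": 0}
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            name = Path(info.filename).name
            kind = "meta" if is_meta(name) else "data"
            target = out / kind / name
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(zf.read(info))
            counts[kind] += 1
    return counts


def external_table_ddl(table, columns, hdfs_location, delimiter=","):
    """Build a CREATE EXTERNAL TABLE statement for delimited text files."""
    cols = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{hdfs_location}';"
    )


if __name__ == "__main__":
    # Hypothetical table name, columns, and HDFS location.
    print(external_table_ddl(
        "meta_files",
        [("file_id", "STRING"), ("created_at", "STRING")],
        "/landing/meta",
    ))
```

After running split_zip, each staging directory can be copied to its table location with something like `hdfs dfs -put staging/meta/* /landing/meta/`, and the generated statement can be run through beeline. Keeping meta and data in separate HDFS directories matters because an external table reads every file under its LOCATION.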
If your data needs processing and preparation, I recommend NiFi. I use NiFi to do this (more than 50 million records) in several different ways. You will need to inspect the NiFi Hive processors and decide which one best fits your use case.
If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further questions on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.
Steven @ DFHZ
Here are a few options
Tools you can use for ingest