Move files into HDFS directory
You can copy all the files from the local file system into an HDFS directory:
[bash $] hadoop fs -put <local-path> <hdfs-directory>
Write a shell script that copies all the required files into the HDFS directory.
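A minimal sketch of such a script, assuming the files to upload are the *.txt files in a single local directory. The function name put_all and the example paths are hypothetical; the only Hadoop commands used are hadoop fs -mkdir -p and hadoop fs -put.

```shell
#!/usr/bin/env bash
# Sketch: copy every *.txt file from a local directory into one HDFS directory.
# put_all is a hypothetical helper name; the *.txt pattern is an assumption.
put_all() {
  local src_dir="$1" hdfs_dir="$2" f

  # Create the target directory (and any parents) if it does not exist yet.
  hadoop fs -mkdir -p "$hdfs_dir" || return 1

  for f in "$src_dir"/*.txt; do
    # Skip the unexpanded pattern when no file matches.
    [ -f "$f" ] || continue
    hadoop fs -put "$f" "$hdfs_dir"/ || return 1
  done
}

# Example call (paths are placeholders):
#   put_all /data/incoming /user/hive/staging
```

Stopping at the first failed upload keeps a partially copied directory from going unnoticed; whether that is the right behaviour depends on your pipeline.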
Then create a Hive table (assume it is the staging table) on top of the HDFS directory where you put the files.
With this method we do not load the data file by file (using load data local inpath ...); instead we copy the files into the HDFS directory and create the table on top of the copied files.
Even so, the small files behind the staging table will cause performance issues, so create another table (the final table) and populate it from the staging table:
insert overwrite table finaltable select * from stagingtable order by <field>;
Because order by runs through a single reducer, this writes only one file into the final table. If you have millions of records, use a clause other than order by (for example distribute by or sort by) so that more than one reducer is started.
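The staging and final table steps could look roughly like the sketch below, which writes the HiveQL to a file so it can be run with hive -f. The table names staging_table and final_table, the two columns, the comma delimiter, and the helper name write_compaction_hql are all assumptions for illustration; adjust them to your data.

```shell
#!/usr/bin/env bash
# Sketch: generate a HiveQL file for the staging and final tables.
# Table names, columns, delimiter and the HDFS path are all assumptions.
write_compaction_hql() {
  local hql_file="$1" hdfs_dir="$2"
  cat > "$hql_file" <<EOF
-- Staging table: external, so Hive reads the files already in the directory.
CREATE EXTERNAL TABLE IF NOT EXISTS staging_table (id INT, payload STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '${hdfs_dir}';

-- Final table: managed by Hive, same columns.
CREATE TABLE IF NOT EXISTS final_table (id INT, payload STRING);

-- ORDER BY uses a single reducer, so one output file is written.
INSERT OVERWRITE TABLE final_table
SELECT * FROM staging_table ORDER BY id;
EOF
}

# Example (placeholder paths); run the generated file with the Hive CLI:
#   write_compaction_hql compact.hql /user/hive/staging
#   hive -f compact.hql
```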
Merge small files locally
Merging files into one big file
[bash $] cat file-name1 file-name2 file-name3 > merge.txt # wildcards in the file names work too
This creates a single merge.txt file containing the contents of all the files.
Once the files are merged, either move the merged file into an HDFS directory and create the table on top of that directory, or load it with load data local inpath <merged-file-path> into table <table-name>;
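As a small sketch of the wildcard variant, the function below concatenates every *.txt file in a directory into one output file. The name merge_small_files and the *.txt pattern are assumptions; the one detail worth noting is that the output file is skipped if it lives in the same directory, so cat never reads its own output.

```shell
#!/usr/bin/env bash
# Sketch: merge all *.txt files in a directory into a single file.
# merge_small_files is a hypothetical helper name.
merge_small_files() {
  local src_dir="$1" out_file="$2" f

  # Start from an empty output file.
  : > "$out_file"

  for f in "$src_dir"/*.txt; do
    [ -f "$f" ] || continue
    # Do not append the output file to itself if it matches the pattern.
    [ "$f" -ef "$out_file" ] && continue
    cat "$f" >> "$out_file"
  done
}

# Example call (paths are placeholders):
#   merge_small_files /data/incoming /data/incoming/merge.txt
```

The merged file can then be uploaded with hadoop fs -put <merged-file-path> <hdfs-directory>, or loaded with the load data local inpath statement.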
By Using NiFi
Use the ListFile and FetchFile processors (or the GetFile processor) to bring the local files into NiFi, then use the MergeContent processor to merge the small files into one big file up to your required maximum size, and store the result in HDFS with the PutHDFS processor.
In addition, you can use record-oriented processors to read the incoming data and change the output flowfile format, create ORC files inside NiFi, and then store those files in HDFS.
References regarding the NiFi MergeContent processor
References regarding record-oriented processors