Hive export

I'm exporting several tables and I see that some files are copied to the target path using DistCp (slow), while other files are copied by some other (fast) means.

There is no indication of the rationale behind the choice. The other odd thing is that even when a table is made up of multiple files, Hive starts one DistCp job for each file instead of passing the entire directory.

Is there any option to drive the behaviour?

1 ACCEPTED SOLUTION

Self resolved

The switch between a direct single-threaded copy and DistCp depends on file size: DistCp is used when a file is larger than hive.exec.copyfile.maxsize.

The default value is 32 MB.
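
For anyone landing here, a minimal sketch of how the threshold might be raised for one session before running an export, so that larger files still take the direct copy path. The table name, target path, and the 1 GB value are placeholders, not taken from the original setup. The property value is given in bytes, and depending on the environment the property may need to be set in hive-site.xml or allowed by the administrator rather than changed with SET.

```sql
-- Assumption: the session is allowed to override this property.
-- 1073741824 bytes = 1 GB (the default is 33554432 bytes = 32 MB).
SET hive.exec.copyfile.maxsize=1073741824;

-- Hypothetical table and target path, for illustration only.
EXPORT TABLE my_table TO '/tmp/export/my_table';
```

Files at or below the threshold should then be copied directly, and only larger files should go through DistCp.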

 

 

View solution in original post
