Support Questions

pvillard · ‎06-09-2016

If I want to export data from HDFS to an external destination (Teradata, for example) using Sqoop, is there a recommended format for the input files?

AFAIK, the supported formats are:

Delimited text files
Sequence files
ORC files

Are there performance differences between these input formats?

Thanks

ssubhas · ‎06-09-2016

@Pierre Villard

Sqoop internally uses YARN jobs to extract data from HDFS. ORC is regarded as giving better read performance, even with Hive. You can refer to the link below for details:

http://www.slideshare.net/StampedeCon/choosing-an-hdfs-data-storage-format-avro-vs-parquet-and-more-...
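
As a rough illustration of how the input format changes the export invocation, here is a minimal sketch that only builds the `sqoop export` argument list; it does not run Sqoop or measure anything. It assumes delimited text is read with `--export-dir`, and that ORC data is exposed as a Hive/HCatalog table and read with `--hcatalog-table`. The helper name `build_sqoop_export_args`, the JDBC URL, the table names, and the paths are all hypothetical placeholders.

```python
from typing import List, Optional


def build_sqoop_export_args(
    connect_url: str,
    table: str,
    export_dir: Optional[str] = None,
    field_delimiter: str = ",",
    hcatalog_table: Optional[str] = None,
    hcatalog_database: str = "default",
    num_mappers: int = 4,
) -> List[str]:
    """Build an argument list for `sqoop export` (hypothetical helper).

    Pass `export_dir` for delimited text files in HDFS, or
    `hcatalog_table` for data stored as ORC behind a Hive/HCatalog table.
    """
    args = [
        "sqoop", "export",
        "--connect", connect_url,
        "--table", table,
        "--num-mappers", str(num_mappers),
    ]
    if hcatalog_table is not None:
        # Assumption: ORC is exported through Sqoop's HCatalog integration,
        # which does not take --export-dir.
        args += [
            "--hcatalog-database", hcatalog_database,
            "--hcatalog-table", hcatalog_table,
        ]
    elif export_dir is not None:
        # Delimited text: point at the HDFS directory and name the delimiter.
        args += [
            "--export-dir", export_dir,
            "--input-fields-terminated-by", field_delimiter,
        ]
    else:
        raise ValueError("Provide either export_dir or hcatalog_table")
    return args


if __name__ == "__main__":
    # Placeholder connection string and table names, for illustration only.
    url = "jdbc:teradata://td-host/DATABASE=sales"
    print(" ".join(build_sqoop_export_args(url, "ORDERS", export_dir="/data/orders_text")))
    print(" ".join(build_sqoop_export_args(url, "ORDERS", hcatalog_table="orders_orc")))
```

Running the same export once per format with the same number of mappers and comparing the job times would be one way to check the performance difference on your own data.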

Hope this helps.

Thanks and Regards,

Sindhu
