Sqoop performance regarding input format
Labels: Apache Sqoop
Created 06-09-2016 11:24 AM
If I want to export data with Sqoop from HDFS to an external destination (Teradata, for example), is there a recommended format for the input files?
AFAIK, the supported formats are:
- Delimited text files
- Sequence files
- ORC files
Do we observe performance differences between input formats?
Thanks
Created 06-09-2016 11:40 AM
Sqoop internally uses YARN jobs to extract the data from HDFS. ORC is regarded as giving better read performance, even with Hive. You can refer to the link below for details:
Hope this helps.
Thanks and Regards,
Sindhu
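One rough way to check the performance question on your own cluster is to time the same export from each input format. The following is only a minimal sketch under stated assumptions: the Teradata host, user, password file, table names, and HDFS paths are hypothetical placeholders, a Teradata JDBC driver or connector is assumed to be installed for Sqoop, and the ORC data is assumed to be exposed as a Hive table read through Sqoop's HCatalog options rather than through `--export-dir`.

```python
import shlex
import shutil
import subprocess
import time

# All values below are hypothetical placeholders, not taken from the thread.
JDBC_URL = "jdbc:teradata://td-host.example.com/DATABASE=sales"
COMMON = [
    "sqoop", "export",
    "--connect", JDBC_URL,
    "--username", "etl_user",
    "--password-file", "/user/etl/.td_password",
    "--table", "ORDERS",
    "--num-mappers", "4",
]


def text_export_cmd(export_dir, delimiter=","):
    """Build an export that reads delimited text files from an HDFS directory."""
    return COMMON + [
        "--export-dir", export_dir,
        "--input-fields-terminated-by", delimiter,
    ]


def orc_export_cmd(hive_db, hive_table):
    """Build an export that reads an ORC-backed Hive table via HCatalog."""
    return COMMON + [
        "--hcatalog-database", hive_db,
        "--hcatalog-table", hive_table,
    ]


def timed_run(cmd):
    """Run one export and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    result = subprocess.run(cmd)
    return result.returncode, time.monotonic() - start


if __name__ == "__main__":
    candidates = {
        "text": text_export_cmd("/data/orders_text"),
        "orc": orc_export_cmd("default", "orders_orc"),
    }
    for name, cmd in candidates.items():
        print(name, "->", shlex.join(cmd))
        if shutil.which("sqoop"):  # only execute where Sqoop is installed
            code, seconds = timed_run(cmd)
            print(f"{name}: exit={code}, elapsed={seconds:.1f}s")
```

On a machine without Sqoop the script only prints the two command lines. For a fair comparison, both runs should export the same rows with the same mapper count, and the target table should be emptied (or a separate target table used) between runs so the second export does not load duplicates.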
