I want to export data from an HDFS directory that backs a Hive table to MySQL and DB2 using Sqoop. The data uses \000 as the field delimiter. Hive accepts \000 as a delimiter, but Sqoop fails. Sqoop works with other control characters; I tried \001 and \020.
Could you suggest how to run Sqoop with \000 as the delimiter? This will be used for both MySQL and DB2. I couldn't find this delimiter in the .java file that Sqoop generates, so I'm not sure how it is handled. Could someone explain this, please?
My guess is that this happens because MySQL and DB2 treat \000 as NULL, while Hive treats \N as NULL.
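To make the setup concrete, here is a minimal sketch of the kind of export that works with a control-character delimiter such as \001. The JDBC URL, user, table name, and export directory are placeholders, not values from this post, and RUN_EXPORT is a hypothetical switch so that the script only prints the command unless you opt in. The --input-null-string and --input-null-non-string options tell Sqoop to read Hive's \N as NULL.

```shell
#!/bin/sh
# Sketch only: all values below are placeholders, not taken from the post.
JDBC_URL="jdbc:mysql://dbhost:3306/targetdb"
DB_USER="dbuser"
TARGET_TABLE="target_table"
EXPORT_DIR="/user/hive/warehouse/mydb.db/mytable"

# Build the argument list. '\001' is passed literally; Sqoop parses the
# octal escape itself. '\\N' makes Sqoop treat Hive's \N marker as NULL.
set -- sqoop export \
  --connect "$JDBC_URL" \
  --username "$DB_USER" \
  -P \
  --table "$TARGET_TABLE" \
  --export-dir "$EXPORT_DIR" \
  --input-fields-terminated-by '\001' \
  --input-null-string '\\N' \
  --input-null-non-string '\\N'

# Keep a printable copy of the command and show it (dry run by default).
CMD_LINE="$*"
printf '%s\n' "$CMD_LINE"

# Only run the export when explicitly requested (hypothetical switch).
if [ "${RUN_EXPORT:-0}" = "1" ]; then
  "$@"
fi
```

Replacing '\001' with '\000' in this command is the case that fails.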
The Sqoop user guide says:
Delimiters may be specified as:
a character (
an escape character (--fields-terminated-by \t). Supported escape characters are:
\0 (NUL) - This will insert NUL characters between fields or lines, or will disable enclosing/escaping if used for one of the
Yes, you can handle this scenario with the Sqoop-HCatalog integration, where Sqoop grabs the table metadata from HCatalog and processes the data accordingly.
This is a great way to export the data because you can leave the storage format alone (no need to worry about the SerDe, field separators, etc.).
Example export command:
sqoop export --connect <jdbc_driver_detailsWithHostAndPort> --username <mysqlUser> --password <mysqlPassword> --table <target_tablename(mysql/postgres)> --hcatalog-table <hiveTable> --hcatalog-database <HiveDatabase>
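The same command can be wrapped in a small script so the placeholders are easier to fill in. This is only a sketch: the connection string, user, database, and table names are made-up values, and RUN_EXPORT is a hypothetical switch that keeps the script in dry-run mode unless you set it. It uses -P so Sqoop prompts for the password instead of taking it on the command line. Note that no delimiter option appears, since the field separator comes from the table definition in HCatalog.

```shell
#!/bin/sh
# Sketch only: every value here is a placeholder to replace with your own.
JDBC_URL="jdbc:mysql://dbhost:3306/targetdb"
DB_USER="mysql_user"
TARGET_TABLE="target_table"
HIVE_DB="hive_database"
HIVE_TABLE="hive_table"

# Build the HCatalog-based export. No --export-dir or delimiter options:
# Sqoop reads the storage format from the HCatalog table metadata.
set -- sqoop export \
  --connect "$JDBC_URL" \
  --username "$DB_USER" \
  -P \
  --table "$TARGET_TABLE" \
  --hcatalog-database "$HIVE_DB" \
  --hcatalog-table "$HIVE_TABLE"

# Keep a printable copy of the command and show it (dry run by default).
CMD_LINE="$*"
printf '%s\n' "$CMD_LINE"

# Only run the export when explicitly requested (hypothetical switch).
if [ "${RUN_EXPORT:-0}" = "1" ]; then
  "$@"
fi
```

Run it once as is to check the printed command, then again with RUN_EXPORT=1 to perform the export.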
More on this can be found here.
Hope this helps!