Support Questions

Find answers, ask questions, and share your expertise

Importing data into Hive from a location

Expert Contributor

I created the schema for my table through a Sqoop job:

sqoop job --meta-connect jdbc:hsqldb:hsql://IP:16000/sqoop --create sales_flat_order_initialjob -- import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://IP/db?zeroDateTimeBehavior=convertToNull --username root --password 'perconaMYSQL' --table sales_flat_order --incremental lastmodified --check-column wh_update --last-value 0 --merge-key entity_id --split-by entity_id --hive-import --hive-database Magento --hive-drop-import-delims

After executing this, I now have the schema and data.

For incremental imports:

  1. I replaced --hive-import with --target-dir:
sqoop job --meta-connect jdbc:hsqldb:hsql://warehouse.swtched.com:16000/sqoop --create sales_flat_order_initialjob -- import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://ip/db?zeroDateTimeBehavior=convertToNull --username root --password 'perconaMYSQL' --table sales_flat_order --incremental lastmodified --check-column wh_update --last-value 0 --merge-key entity_id --split-by entity_id --target-dir /location --hive-database Magento --hive-drop-import-delims

Now, I changed this table to an external table and changed its location to that of the target directory.

After this, I get NULLs in all columns for all rows.

What could be the issue?
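For reference, the external-table conversion described above can be sketched in HiveQL roughly as follows (database, table, and path are the names used in this thread; adjust to your environment):

```sql
-- Sketch only: convert the Sqoop-created managed table to an external
-- table and point it at the incremental import's target directory.
ALTER TABLE Magento.sales_flat_order SET TBLPROPERTIES ('EXTERNAL' = 'TRUE');
ALTER TABLE Magento.sales_flat_order SET LOCATION 'hdfs:///location';
```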

1 ACCEPTED SOLUTION

Super Guru

ALTER TABLE MAGENTO.SALES_FLAT_ORDER SET SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'

assuming you have Hive 0.14 or later.
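If the CSV defaults ever need adjusting, the same ALTER TABLE statement accepts SERDEPROPERTIES (a hedged sketch; OpenCSVSerde defaults to a comma separator and double-quote quoting, so for plain comma-delimited Sqoop output the bare statement above is usually enough):

```sql
ALTER TABLE Magento.sales_flat_order
  SET SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
  WITH SERDEPROPERTIES ('separatorChar' = ',', 'quoteChar' = '"');
```

One caveat worth knowing: OpenCSVSerde exposes every column as STRING, so numeric or date columns may need casting in queries.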


5 REPLIES

Super Guru

It is likely a field-delimiter issue. The default delimiter in Hive is ^A (Ctrl-A). You should specify what your fields are delimited by.
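To see why a delimiter mismatch produces all NULLs, here is a small shell illustration (the sample values are made up): a comma-separated line split on Hive's default ^A delimiter comes back as a single field, so the whole line lands in the first column (where it typically fails the type cast) and every remaining column has nothing to bind to.

```shell
# Split a sample comma-separated record the way Hive's default SerDe
# would (on \001, i.e. ^A) versus on a comma.
printf '100,2019-01-01,42.50\n' | awk -F'\001' '{print NF}'   # 1 field
printf '100,2019-01-01,42.50\n' | awk -F','    '{print NF}'   # 3 fields
```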

--fields-terminated-by

You might want to use --lines-terminated-by as well.
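For example, the incremental job could be re-created with explicit delimiter flags (a sketch, not the full command; the elided options are the same as in the job shown earlier in the thread, and both flags are standard Sqoop output-formatting options):

```bash
# Sketch: re-create the incremental Sqoop job with explicit delimiters
# so the text files match what the Hive table expects.
sqoop job --create sales_flat_order_initialjob -- import ... \
  --target-dir /location \
  --fields-terminated-by ',' \
  --lines-terminated-by '\n'
```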

Expert Contributor

I understood that, but I'm not sure how to specify the delimiters, or what the delimiters should be. Could you please be more specific? @mqureshi

Super Guru

@Simran Kaur check your data by doing a "cat" to see what it looks like. Are the fields separated by a space, a comma, or something else? Alternatively, you can create the table yourself, specify in the CREATE TABLE statement what your fields should be terminated by, and then do the import using Sqoop.
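One way to do that inspection, assuming the Sqoop output landed in /location (the part-file name below is typical Sqoop output, not confirmed from the thread); `cat -v` renders control characters such as ^A visibly:

```bash
# Show the first couple of raw records, making control characters
# (e.g. ^A) visible.
hdfs dfs -cat /location/part-m-00000 | head -n 2 | cat -v
```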

Expert Contributor

@mqureshi: The fields seem to be separated by commas. How do I update this for an existing table? I don't want to create the table manually, since there are a lot of columns.
