Importing data in Hive from a location
Labels: Apache Hive
Created ‎06-23-2016 04:52 PM
I created the schema for my table through a Sqoop job:
sqoop job --meta-connect jdbc:hsqldb:hsql://IP:16000/sqoop --create sales_flat_order_initialjob -- import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://IP/db?zeroDateTimeBehavior=convertToNull --username root --password 'perconaMYSQL' --table sales_flat_order --incremental lastmodified --check-column wh_update --last-value 0 --merge-key entity_id --split-by entity_id --hive-import --hive-database Magento --hive-drop-import-delims
After executing this, I now have the schema and data.
For incremental imports:
- I replaced --hive-import with --target-dir
sqoop job --meta-connect jdbc:hsqldb:hsql://warehouse.swtched.com:16000/sqoop --create sales_flat_order_initialjob -- import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://ip/db?zeroDateTimeBehavior=convertToNull --username root --password 'perconaMYSQL' --table sales_flat_order --incremental lastmodified --check-column wh_update --last-value 0 --merge-key entity_id --split-by entity_id --target-dir /location --hive-database Magento --hive-drop-import-delims
Then I changed this table to an external table and pointed its location at the target directory. Now I get NULL in every column for all rows. What could be the issue?
Created ‎06-23-2016 05:13 PM
It is likely an issue with the field delimiter. The default in Hive is ^A (Ctrl-A), so you should tell Sqoop what your fields are delimited by using --fields-terminated-by. You might want to specify --lines-terminated-by as well.
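For example, the delimiter flags would slot into the incremental job from the question roughly like this. This is only a sketch: the job name is hypothetical, and the comma/newline delimiters are assumptions until you have confirmed them against the actual data.

```sh
# Sketch of the incremental job with explicit delimiters added.
# The comma and newline are assumptions -- verify them in your data first.
sqoop job --meta-connect jdbc:hsqldb:hsql://IP:16000/sqoop \
  --create sales_flat_order_incrjob -- import \
  --driver com.mysql.jdbc.Driver \
  --connect "jdbc:mysql://IP/db?zeroDateTimeBehavior=convertToNull" \
  --username root --password 'perconaMYSQL' \
  --table sales_flat_order \
  --incremental lastmodified --check-column wh_update --last-value 0 \
  --merge-key entity_id --split-by entity_id \
  --target-dir /location \
  --fields-terminated-by ',' \
  --lines-terminated-by '\n' \
  --hive-drop-import-delims
```

The key point is that the delimiters written by Sqoop must match what the Hive table's SerDe expects; if they differ, Hive parses each row as a single unmatchable field and returns NULL for every column.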
Created ‎06-23-2016 05:42 PM
I understood that, but I am not sure how to specify the delimiters, or what the delimiters should be. Could you please be more specific? @mqureshi
Created ‎06-23-2016 05:47 PM
@Simran Kaur check your data by doing a "cat" to see what the data looks like: are the fields separated by a space, a comma, or something else? Alternatively, you can create the table yourself, specify in the CREATE TABLE statement what you want your fields to be terminated by, and then do the import using Sqoop.
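One quick way to do that check on the imported files (the path here assumes the /location target directory from the job above, and the part-file name is just the usual Sqoop output naming):

```sh
# Print the first few rows of the imported data to reveal the delimiter.
hdfs dfs -cat /location/part-m-00000 | head -5
# Visible commas between values suggest a comma delimiter;
# unprintable ^A characters would mean the Hive default delimiter.
```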
Created ‎06-23-2016 05:53 PM
@mqureshi : The fields seem to be separated by commas. How do I update this for the existing table? I don't want to create the table manually since there are a lot of columns.
Created ‎06-23-2016 06:17 PM
ALTER TABLE MAGENTO.SALES_FLAT_ORDER SET SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde';
assuming you have Hive 0.14 or later.
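Spelled out with the delimiter made explicit, the statement might look like the sketch below. The separatorChar property is optional here, since a comma is already the OpenCSVSerde default, but including it documents the assumption:

```sql
-- Sketch: switch the existing external table to the CSV SerDe.
-- Database/table names follow this thread; adjust to your environment.
ALTER TABLE MAGENTO.SALES_FLAT_ORDER
  SET SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
  WITH SERDEPROPERTIES ("separatorChar" = ",");
```

One caveat: OpenCSVSerde exposes every column as STRING, so numeric or date columns will need casting in queries after this change.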
