Created 02-07-2017 10:45 AM
Hi.
Some background: we imported data from Teradata to HDFS with Sqoop. I found many examples on the internet showing how to do it. It works, and everything seems to be OK. I used the --as-avrodatafile option and also know where to find the .avsc file.
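For context, the import we ran looked roughly like this (host, credentials, table names, and paths below are placeholders, not our real values):

# Rough shape of the Teradata-to-HDFS import (placeholders throughout)
sqoop import \
  --connect jdbc:teradata://td-host/DATABASE=mydb \
  --driver com.teradata.jdbc.TeraDriver \
  --username myuser \
  --password-file /user/me/td.pass \
  --table MYTABLE \
  --target-dir /data/mytable_avro \
  --as-avrodatafile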
But now I am trying to export the same data back to my Teradata table. I found many examples of how to use Sqoop export with .csv files, and I was able to export those to Teradata successfully. But I couldn't find anywhere how to do it when you have Avro files.
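For comparison, the CSV export that worked for us was roughly this (again with placeholder connection details and paths):

# Rough shape of the CSV export that worked (placeholders throughout)
sqoop export \
  --connect jdbc:teradata://td-host/DATABASE=mydb \
  --driver com.teradata.jdbc.TeraDriver \
  --username myuser \
  --password-file /user/me/td.pass \
  --table MYTABLE \
  --export-dir /data/mytable_csv \
  --input-fields-terminated-by ','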
Can anyone give me some help, for example a link to a tutorial or a code example?
Created 02-08-2017 02:52 AM
Interesting question; I should try it out myself. That said, if you can't figure it out on your own or don't get other responses, try creating a Hive schema on top of your Avro files, then run the following statement in Hive to create a new table in ORC or text format from your original Avro table. From there you should be able to find examples of exporting text or ORC data with Sqoop.
-- CTAS: creates a text-format table by default; add STORED AS ORC for ORC
CREATE TABLE newtable AS SELECT * FROM avrotable;
-- sample Avro-backed Hive table
CREATE DATABASE DEMO;
USE DEMO;
CREATE EXTERNAL TABLE mailinglist
COMMENT "just drop the schema right into the HQL"
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/tmp/rdbms'
TBLPROPERTIES ('avro.schema.literal'='{
  "type": "record",
  "name": "mailinglist",
  "namespace": "any.data",
  "fields": [
    { "name": "id", "type": [ "null", "int" ] },
    { "name": "first_name", "type": [ "null", "string" ] },
    { "name": "last_name", "type": [ "null", "string" ] },
    { "name": "email", "type": [ "null", "string" ] },
    { "name": "gender", "type": [ "null", "string" ] }
  ]
}');
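Putting the pieces together, here is a rough sketch of the full round trip. I haven't tested this against Teradata, and the connection details, table names, and warehouse path are placeholders (the /apps/hive/warehouse path is just the common HDP default):

-- Hive: copy the Avro-backed table into a delimited text table
CREATE TABLE mailinglist_txt
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
AS SELECT * FROM mailinglist;

# Shell: export the text table's HDFS directory back to Teradata
sqoop export \
  --connect jdbc:teradata://td-host/DATABASE=mydb \
  --driver com.teradata.jdbc.TeraDriver \
  --username myuser \
  --password-file /user/me/td.pass \
  --table MAILINGLIST \
  --export-dir /apps/hive/warehouse/demo.db/mailinglist_txt \
  --input-fields-terminated-by ','

The delimiter in the CTAS and the --input-fields-terminated-by flag have to match, since Sqoop parses the text files by that delimiter.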
Created 02-08-2017 08:45 AM
Thank you Artem!
I think this is a good way to do it if there is no other option. We use the Avro files as a backup before we update the data in Teradata. If we just need some information, we can get it through the Hive schema on top of the Avro files.
Usually we don't need to write the data back to Teradata, but a backup without the ability to restore the data would be useless.