Is there a way to create a Hive table directly from Avro data?
Labels: Apache Hive
Created 03-09-2016 03:42 PM
I have a dataset of almost 600 GB in Avro format in HDFS. What is the most efficient way to create a Hive table directly on this dataset?
For smaller datasets, I can copy the data to local disk, use avro-tools to extract the schema, upload the schema to HDFS, and create a Hive table based on that schema. Is there a way to extract the Avro schema directly from a dataset in HDFS without writing Java code?
Created 03-09-2016 04:00 PM
You can try the following: cat your large file, grab the first few lines, and write them to a new file on the local filesystem. Then use avro-tools to extract the schema from that file. I'd be curious to know whether that works with Avro serialization.
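The idea above can be sketched roughly as follows. Avro container files keep the writer schema in the file header, so a prefix of the file should be enough for getschema; because the format is binary, this sketch cuts by bytes with head -c rather than by lines. The function names, the 1 MiB default, and the AVRO_TOOLS_JAR variable are assumptions for illustration, not something from this thread, and a very large schema may need a bigger prefix.

```shell
#!/usr/bin/env bash
# Assumption: avro-tools jar sits in the current directory unless overridden.
AVRO_TOOLS_JAR="${AVRO_TOOLS_JAR:-avro-tools-1.8.2.jar}"

# Copy only the first N bytes of an HDFS file to local disk.
# $1 = HDFS path, $2 = local output file, $3 = byte count (default 1 MiB)
# Once head has read enough it closes the pipe, so hdfs may complain
# on stderr that it could not finish writing; the sample is still written.
sample_avro_head() {
  hdfs dfs -cat "$1" | head -c "${3:-1048576}" > "$2"
}

# Print the schema stored in the header of a local Avro file or sample.
# $1 = local file
local_avro_schema() {
  java -jar "$AVRO_TOOLS_JAR" getschema "$1"
}
```

Possible usage, with a hypothetical HDFS path: sample_avro_head /data/events/part-00000.avro sample.avro, then local_avro_schema sample.avro > events.avsc.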
Created 09-02-2017 05:07 PM
Instead of java -jar, you can run avro-tools directly against a file in HDFS by using hadoop jar:
hadoop jar avro-tools-1.8.2.jar getschema hdfsPathTOAvroFile.avro
For example, this would also have done the job:
hadoop jar avro-tools-1.8.2.jar getschema hdfs_archive/mydoc.avro
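To tie this back to the original question, here is a rough sketch of the remaining steps: store the extracted schema in HDFS and point an external Hive table at the existing Avro directory through avro.schema.url, so the 600 GB of data never has to move. It assumes a Hive version that supports STORED AS AVRO (0.14 or later) and that the hive CLI is on the path. The function names, table name, and paths are hypothetical, and the schema location should be a full URL such as hdfs:///schemas/events.avsc.

```shell
#!/usr/bin/env bash
# Assumption: avro-tools jar sits in the current directory unless overridden.
AVRO_TOOLS_JAR="${AVRO_TOOLS_JAR:-avro-tools-1.8.2.jar}"

# Print DDL for an external Avro table; columns come from the schema file.
# $1 = table name, $2 = HDFS directory with the .avro files, $3 = schema URL
avro_table_ddl() {
  cat <<EOF
CREATE EXTERNAL TABLE IF NOT EXISTS $1
STORED AS AVRO
LOCATION '$2'
TBLPROPERTIES ('avro.schema.url'='$3');
EOF
}

# Extract the schema from one Avro file in HDFS, upload it, create the table.
# $1 = one Avro file in HDFS, $2 = schema URL in HDFS, $3 = table, $4 = data dir
create_avro_table() {
  local tmp status
  tmp="$(mktemp)" || return 1
  # getschema only prints the schema from the file header to stdout.
  hadoop jar "$AVRO_TOOLS_JAR" getschema "$1" > "$tmp" || { rm -f "$tmp"; return 1; }
  hdfs dfs -put -f "$tmp" "$2" || { rm -f "$tmp"; return 1; }
  hive -e "$(avro_table_ddl "$3" "$4" "$2")"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Possible usage, again with hypothetical names: create_avro_table /data/events/part-00000.avro hdfs:///schemas/events.avsc events /data/events. Because the table is external, dropping it later leaves the Avro files in place.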