02-24-2014 06:16 PM
I have a use case where a Pig UDF processes some data and generates output of varying types (string, int, etc.). I need to store that output in a Hive table.
Currently I have to create the Hive table in advance with the matching schema, and then use HCatStorer() to store the data in Hive. How can I change my Pig script so that it creates the Hive table for me on the fly?
Please note that I do not readily know the schema in the Pig script, so I cannot use HiveColumnarLoader either.
Example: the Pig script generates output with the fields id (int), terms (a list of strings such as a,b,c) and some key-value pairs. These need to go into Hive as int, array<string> and map<string,string>.
This may change in the future: there can be more columns or different data types. How can I make this dynamic within the Pig script?
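To make the desired mapping concrete, here is a minimal Python sketch of the kind of step I am hoping to automate: building a CREATE TABLE statement from a simplified description of the Pig output schema, so the table can be created before HCatStorer() runs. Everything in it is an assumption for illustration only: the type table PIG_TO_HIVE, the tuple notation for bags and maps, the helper names, and the table and column names (udf_output, attrs) are all hypothetical, and it is not an existing Pig or HCatalog feature.

```python
# Hypothetical mapping from Pig scalar type names to Hive type names.
PIG_TO_HIVE = {
    "int": "int",
    "long": "bigint",
    "float": "float",
    "double": "double",
    "chararray": "string",
    "boolean": "boolean",
}


def hive_type(pig_type):
    """Translate a simplified Pig type description into a Hive type string.

    Scalars are plain strings ("int"); complex types are (kind, inner)
    tuples, e.g. ("bag", "chararray") or ("map", "chararray").
    This notation is an assumption made for this sketch.
    """
    if isinstance(pig_type, str):
        return PIG_TO_HIVE[pig_type]
    kind, inner = pig_type
    if kind == "bag":
        return "array<%s>" % hive_type(inner)
    if kind == "map":
        # Pig map keys are always chararray, hence the fixed string key.
        return "map<string,%s>" % hive_type(inner)
    raise ValueError("unsupported Pig type: %r" % (pig_type,))


def create_table_ddl(table, columns):
    """Build a CREATE TABLE statement from (column name, Pig type) pairs."""
    cols = ", ".join("%s %s" % (name, hive_type(t)) for name, t in columns)
    return "CREATE TABLE IF NOT EXISTS %s (%s)" % (table, cols)


# The example from this post: id, terms and a key-value column.
ddl = create_table_ddl(
    "udf_output",
    [("id", "int"), ("terms", ("bag", "chararray")), ("attrs", ("map", "chararray"))],
)
print(ddl)
```

The generated statement would still have to be run against Hive before the STORE step, which is exactly the manual step I would like the Pig script to handle by itself.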
07-20-2014 06:56 AM