Member since: 01-14-2015
Posts: 5
Kudos Received: 1
Solutions: 0
08-26-2016
08:57 AM
David, thanks a lot for the information; I really appreciate it. I will see whether our team wants to invest the time and resources to create a UDF in Hive. How long does it normally take to create a UDF for a case like mine?
08-26-2016
08:36 AM
In each entry of the CSV file, the last column is the HDFS path (directory path plus blob file name) of a blob file, so each row in the CSV file corresponds to exactly one blob file. As for the first concern, I simply cannot load the blob files into the same Hive table based on the HDFS path.
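To make the row-to-blob relationship concrete, here is a minimal sketch in Python (not a Hive UDF) that pairs each CSV row with the bytes of the file named in its last column. The names `attach_blobs` and `read_local` are hypothetical, the CSV is assumed to have no header row, and the local filesystem stands in for HDFS; a real job would swap in an HDFS client for the reader.

```python
import csv
import io
from pathlib import Path


def attach_blobs(csv_text, read_blob):
    """Yield each CSV row with the blob bytes appended as a new last column.

    csv_text  : CSV content as a string (assumed: no header row).
    read_blob : callable taking a path string and returning bytes.
    """
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        blob_path = row[-1]  # the last column holds the blob file path
        yield row + [read_blob(blob_path)]


def read_local(path):
    # Stand-in for an HDFS read (assumption); reads from the local filesystem.
    return Path(path).read_bytes()
```

Because the path column already acts as a key, no separate row number is needed to pair a row with its blob in this sketch.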
08-26-2016
08:28 AM
Hi David, thanks a lot for the response. The blob table has no row numbers, since we load the blob files into the Hive table from an HDFS directory, so I cannot use a JOIN to combine the two tables. And yes, there is a 1:1 relationship between the blobs and the CSV entries. Your solution uses a JOIN to combine the two tables on a shared rowid, but the problem is how to get a rowid for the blob table (the raw blob data files are all in one HDFS directory).
08-17-2016
09:16 AM
1 Kudo
We have a use case that requires the binary data type in a Hive table:
1. In an HDFS directory (e.g. /data/work/hive/test/), we have several blob files that we want to store in a Hive table (table1) as the binary data type.
2. We have another Hive table (table2) that stores regular CSV data; its number of rows equals the number of blob files above.
3. How can we combine these two tables into a new table (table3) that has the columns and rows of both?
Labels:
- Apache Hive
- HDFS