Member since
12-16-2015
6
Posts
1
Kudos Received
3
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
4502 | 11-13-2017 02:56 PM | |
9482 | 11-10-2017 01:34 PM | |
2638 | 11-08-2017 11:47 AM |
11-13-2017
08:36 PM
@Bala Vignesh N V So you've renamed the partition in Hive and can see the new name there but when you look on HDFS it still has the original partition name, correct? In my example in the previous post I originally had 2 partitions (part=a and part=b) in Hive and I renamed part=a to part=z. On HDFS, part=a never changed but the PART_NAME column in the metastore database was updated to part=z. In Hive, I can only see part=z and part=b and if I do a SELECT for the data in part=z then it will lookup the LOCATION column from the metastore database for part=z, which still points to the part=a directory on HDFS, and read data for part=z from there. So this way for external tables, you can rename the partitions in Hive to whatever you like without affecting the underlying data on HDFS.
... View more
11-13-2017
02:56 PM
@Miguel Rodriguez The JSON you are using isn't valid, you need to remove the double quotes around the map column value. So your JSON would be: {"a":"a1","b":"b1","d":{"key1":"pair1","key2":"pair2"}}
... View more
11-10-2017
01:34 PM
@Bala Vignesh N V This is normal behaviour for EXTERNAL tables since Hive is not managing the underlying data. You can see in the metastore database that Hive keeps a mapping of the partition name to the location on HDFS. Your ALTER command will update this mapping and change the PART_NAME value. hive=> SELECT t."TBL_NAME", s."LOCATION", p."PART_NAME" FROM "SDS" s, "TBLS" t, "PARTITIONS" p WHERE t."TBL_ID"=p."TBL_ID" AND s."SD_ID"=p."SD_ID";
TBL_NAME | LOCATION | PART_NAME
----------+----------------------------------------------+-----------
mytable | hdfs://my.example.com:8020/tmp/mytable/part=a | part=z
mytable | hdfs://my.example.com:8020/tmp/mytable/part=b | part=b If you want to rename the partition name on HDFS and have it reflected in the data then you do the rename on HDFS and then DROP the partition in the external table and ADD the renamed one.
... View more
11-09-2017
01:45 PM
@Ravi teja You can get this information from the metastore database: hive=> SELECT "TBL_NAME", "COLUMN_NAME", "TYPE_NAME" FROM "TBLS" , "COLUMNS_V2" WHERE "TBL_ID"="CD_ID" UNION SELECT "TBL_NAME", "PKEY_NAME", "PKEY_TYPE" FROM "PARTITION_KEYS" p , "TBLS" t WHERE p."TBL_ID"=t."TBL_ID" ORDER BY "TBL_NAME";
TBL_NAME | COLUMN_NAME | TYPE_NAME
------------+-------------+-----------
tableA | aaa | string
tableA | bbb | string
tableB | foo | string
tableB | bar | int
tableC | cola | string
tableD | colb | string
(6 rows)
... View more
11-08-2017
11:47 AM
1 Kudo
@Arjan Onwezen the hcatalog jsonSerde doesn't look like it uses the SERDEPROPERTIES mappings and only maps exact fieldnames to column names or JSON field names matching Hive internal column names, i.e _col0, _col1. I think the JsonSerde you are referring to is the rcongui Hive-JSON-Serde which should work as you've outlined.
... View more
11-08-2017
09:43 AM
The import/export can be time consuming, you could try distcp'ing the non-transactional partitions over to the DR non-transactional and using MSCK REPAIR TABLE to pick them up? You'd still need to run the copy from non-tranasctional to transactional again.
... View more