Member since: 02-08-2016
Posts: 39
Kudos Received: 29
Solutions: 5
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1486 | 06-22-2017 05:05 PM |
 | 2050 | 03-26-2017 11:55 PM |
 | 2472 | 07-18-2016 03:15 PM |
 | 17893 | 06-29-2016 07:43 PM |
 | 1450 | 06-20-2016 06:11 PM |
03-27-2017
02:25 AM
One option is to delete the existing external table and create a new table that includes the new column. Since this is a Hive metadata-only operation, your data files won't be touched. The downside is that you will have to run ALTER TABLE ... ADD PARTITION on the new table to redefine its partitions.
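As a sketch of that approach (the table name, columns, location, and partition values below are hypothetical):

```sql
-- Dropping an EXTERNAL table removes only the metadata; the data files stay in place.
DROP TABLE my_ext_table;

-- Recreate the table with the new column added at the end.
CREATE EXTERNAL TABLE my_ext_table (
  id BIGINT,
  name STRING,
  new_col STRING   -- newly added column
)
PARTITIONED BY (dt STRING)
LOCATION '/data/my_ext_table';

-- Re-register the existing partitions on the new table,
-- either one at a time:
ALTER TABLE my_ext_table ADD PARTITION (dt='2017-03-26');
-- or all at once by scanning the table location:
MSCK REPAIR TABLE my_ext_table;
```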
08-09-2016
07:35 AM
Hi Pierre, we would need to look at the code. Can you do a persist just before stage 63, and before stage 65 check the Spark UI Storage and Executors tabs for data skew? If there is data skew, you will need to add a salt to your key. You could also look at creating a DataFrame from the RDD (rdd.toDF()) and applying a UDF on it; DataFrames manage memory more efficiently. Best, Amit
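The salting idea can be sketched in plain Python, without Spark, to show the mechanics: a hot key is split across several sub-keys so no single reducer receives all of its records, then the partial results are recombined. All names and the skewed data below are made up for illustration:

```python
import random

def add_salt(pairs, num_salts=4):
    """Spread each key across num_salts sub-keys by appending a random salt."""
    return [((key, random.randrange(num_salts)), value) for key, value in pairs]

def partial_reduce(salted_pairs):
    """First-stage aggregation on the salted keys (what each partition would do)."""
    acc = {}
    for salted_key, value in salted_pairs:
        acc[salted_key] = acc.get(salted_key, 0) + value
    return acc

def final_reduce(partials):
    """Second-stage aggregation: strip the salt and combine the partial sums."""
    acc = {}
    for (key, _salt), value in partials.items():
        acc[key] = acc.get(key, 0) + value
    return acc

# A skewed dataset: one key dominates.
data = [("hot", 1)] * 1000 + [("cold", 1)] * 10
totals = final_reduce(partial_reduce(add_salt(data)))
```

In Spark the same two-step reduce would be expressed as a `reduceByKey` on the salted key followed by a second `reduceByKey` on the original key.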
07-18-2016
03:15 PM
2 Kudos
We explicitly listed the FQDNs of all hosts in both clusters under the [domain_realm] section of the krb5.conf file. We have to update this file every time we add a node to our clusters. Our clusters are currently under 100 nodes, so this solution is manageable, but for large clusters it may be a challenge.
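For reference, such a [domain_realm] section would look like the following (all hostnames and the realm name are hypothetical):

```ini
[domain_realm]
  # One explicit entry per host in each cluster; must be updated whenever a node is added.
  node1.cluster-a.example.com = EXAMPLE.COM
  node2.cluster-a.example.com = EXAMPLE.COM
  node1.cluster-b.example.com = EXAMPLE.COM
  # When all hosts share a domain suffix, a wildcard mapping (leading dot)
  # avoids per-host entries entirely:
  # .cluster-a.example.com = EXAMPLE.COM
```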
05-25-2018
02:17 AM
This seems like a better solution to me: https://community.hortonworks.com/questions/8010/hives-alter-table-partition-concatenate-not-workin.html