About ChineduLB

ChineduLB · ‎05-10-2019

We do not want to keep the old partitions. We just want to re-partition the data using the timestamp values. The data currently exists only as partitioned by the string value.

ChineduLB · ‎05-09-2019

Hi EricL, ColumnA is of a different data type than ColumnB. ColumnA contains department names (string) and ColumnB contains timestamps (date-time). The table is already partitioned by the department names, which are strings; now we want to change that and partition by the timestamp column (date-time). Could you explain your process a little more?

ChineduLB · ‎05-08-2019

I was considering: 1. Create a new external table with the new partitioning. 2. INSERT INTO newtable SELECT ... FROM oldtable ..., writing to a new HDFS location. 3. Drop the old table and delete its HDFS folders. The problem here is that at some point both tables will have to exist.
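A rough HiveQL sketch of those three steps, under stated assumptions: the table names `old_table` and `new_table`, the column `other_col`, the derived partition column `event_date`, and the path `/data/new_table` are all hypothetical. Partitioning by calendar date rather than by the raw timestamp is a choice made here for illustration, since a partition per distinct timestamp value would likely produce far too many partitions.

```sql
-- All table, column, and path names below are hypothetical placeholders.

-- 1. New external table, partitioned by a date derived from the timestamp column.
CREATE EXTERNAL TABLE new_table (
  other_col STRING,
  columnA   STRING,      -- former partition column, now an ordinary column
  columnB   TIMESTAMP
)
PARTITIONED BY (event_date STRING)
STORED AS PARQUET
LOCATION '/data/new_table';

-- 2. Copy the rows, letting Hive create the partitions dynamically.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT INTO TABLE new_table PARTITION (event_date)
SELECT other_col,
       columnA,
       columnB,
       to_date(columnB) AS event_date   -- partition column must come last
FROM old_table;

-- 3. After verifying the copy (for example by comparing row counts),
--    drop the old table. For an external table this removes only the
--    metadata; the old HDFS directories have to be deleted separately.
DROP TABLE old_table;
```

Both tables still coexist between steps 2 and 3 in this sketch, so it does not remove the overlap mentioned above; it only keeps that window as short as the copy and verification take.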

ChineduLB · ‎05-03-2019

Hi, we have an existing external Hive table containing millions of rows, partitioned by ColumnA of type string. We want to change this to ColumnB of type timestamp. What's the most efficient way to go about this, considering we have all these rows of data already stored in the existing partition structure?

ChineduLB · ‎04-16-2019

Hi, can I create a parameterized view in Impala, something like the pseudocode below: CREATE VIEW MyView AS SELECT col1, col2, col3 FROM table_one WHERE startdate = ${date1} AND enddate = ${date2} ...

ChineduLB · ‎04-07-2019

We need to partition our Hive table based on date (Date/Month/Year). Is it better to use INT or STRING for the partition column types? Ex: CREATE EXTERNAL TABLE partition (id string, event timestamp and so on) PARTITIONED BY (year INT, month INT, day INT) STORED AS PARQUET vs CREATE EXTERNAL TABLE partition (id string, event timestamp and so on) PARTITIONED BY (year STRING, month STRING, day STRING) STORED AS PARQUET. We noticed that we couldn't do queries like: ... where day > 10 with the string option.
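For illustration, a minimal HiveQL sketch of the range filter in question. It assumes a hypothetical table `events` declared with the INT partition columns and a hypothetical table `events_str` declared with the STRING ones. The explicit CAST is one possible workaround for string partitions, not a recommendation drawn from this thread, and how the string values are stored ('4' versus '04') is also an assumption.

```sql
-- Hypothetical table with INT partition columns:
-- the range predicate is a plain numeric comparison.
SELECT id, event
FROM events
WHERE year = 2019 AND month = 4 AND day > 10;

-- Hypothetical table with STRING partition columns:
-- cast explicitly so the comparison is numeric instead of
-- relying on the engine's implicit conversion rules.
SELECT id, event
FROM events_str
WHERE year = '2019' AND month = '4' AND CAST(day AS INT) > 10;
```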

ChineduLB · ‎04-02-2019

Hi all. We need to generate unique IDs in our Hadoop cluster during data ingestion. We have parallel processes ingesting data from different sources into Hive tables, and we'd like a unique ID for each data row inserted. I understand ZooKeeper offers unique ID generation for distributed scenarios. Please help with how to do this; I can't find a sample or documentation. Also, please let me know if there is a better distributed unique ID generator in the Cloudera environment. Thanks

ChineduLB · ‎04-01-2019

Thanks

ChineduLB · ‎03-27-2019

Can we take advantage of Hive table partitions when querying with Impala? Are there any issues or problems we might run into given this scenario? We currently have partitioned Hive tables... will we be missing anything if we don't convert to Impala tables?

ChineduLB · ‎03-25-2019

Thanks

Last Visited	‎05-21-2024 09:00 AM

Member Since	‎02-11-2019 07:55 AM
Posts	81
Kudos received	3

Cloudera Community

Re: How to change partition on existing Hive Table...

Re: How to change partition on existing Hive Table...

Re: How to change partition on existing Hive Table...

How to change partition on existing Hive Table wit...

Create Parameterized view Impala

Best Data Type for Hive Date Partition

Generating Unique ID using Zookeeper

Re: Impala querying Hive partitions

Impala querying Hive partitions

Re: Get Last Insert in Impala partition