Member since 02-11-2019
81 Posts
3 Kudos Received
0 Solutions
05-10-2019
06:51 AM
We do not want to keep the old partitions. We just want to repartition the data by the timestamp value. The data currently exists only as partitioned by the string value.
05-09-2019
09:01 AM
Hi EricL, ColumnA is of a different data type than ColumnB. ColumnA contains department names (string) and ColumnB contains timestamps (date-time). The table is already partitioned by the department names, which are strings. Now we want to change that and partition by the timestamp column (date-time). Could you explain your process a little more?
05-08-2019
06:52 AM
I was considering:
1. Create a new external table with the new partition.
2. insert into newtable select ... from oldtable ... to a new HDFS location.
3. Drop the old table and delete its HDFS folders.
The problem here is that at some point both tables will have to exist.
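The three steps above could be sketched in HiveQL roughly as follows. This is only a sketch under assumptions: every table name, column name, and the LOCATION path is hypothetical, Parquet storage is assumed, and the new partition key is a day string derived from the timestamp rather than the raw timestamp (partitioning on the raw value would create one partition per distinct timestamp).

```sql
-- Hypothetical names: old_table, new_table, id, column_a, column_b, event_day, /data/new_table.
-- Let Hive create partitions from values produced by the SELECT.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Step 1: new external table, partitioned by a day derived from the timestamp.
CREATE EXTERNAL TABLE new_table (
  id       STRING,
  column_a STRING,     -- department name, now an ordinary column
  column_b TIMESTAMP
)
PARTITIONED BY (event_day STRING)
STORED AS PARQUET
LOCATION '/data/new_table';

-- Step 2: copy the rows; the dynamic partition column must come last in the SELECT.
INSERT OVERWRITE TABLE new_table PARTITION (event_day)
SELECT id,
       column_a,
       column_b,
       CAST(to_date(column_b) AS STRING) AS event_day
FROM old_table;

-- Step 3: after verifying the copy, drop the old table.
-- For an external table this removes only the metadata; the old HDFS
-- folders still have to be deleted separately.
DROP TABLE old_table;
```

Both tables do coexist while step 2 runs, so the cluster needs enough free HDFS space for a second copy of the data until step 3 completes.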
05-03-2019
01:02 PM
Hi, We have an existing external Hive table containing millions of rows, partitioned by ColumnA of type string. We want to change this to ColumnB of type timestamp. What's the most efficient way to go about this, considering we have all these rows of data already stored in the existing partition structure?
Labels: Apache Hive
04-16-2019
01:37 PM
Hi, Can I create a parameterized view in Impala, something like the pseudocode below?
CREATE VIEW MyView AS SELECT col1, col2, col3 FROM table_one WHERE startdate = ${date1} AND enddate = ${date2} ...
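As far as I know, Impala views do not accept parameters. One possible workaround, shown here only as a sketch, is variable substitution in impala-shell, which replaces `${var:name}` placeholders with values passed through `--var`. The file name and the date values are hypothetical, and the quoting assumes `startdate` and `enddate` hold string-like dates.

```sql
-- Save as query.sql (hypothetical file name) and run, for example:
--   impala-shell -f query.sql --var=date1=2019-04-01 --var=date2=2019-04-16
-- impala-shell substitutes ${var:name} as plain text before sending the statement.
SELECT col1, col2, col3
FROM table_one
WHERE startdate = '${var:date1}'
  AND enddate = '${var:date2}';
```

Because the substitution is textual, the values should come from a trusted source.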
Labels: Apache Impala
04-07-2019
04:41 AM
We need to partition our Hive table based on date (day/month/year). Is it better to use int or string for the partition column types? Example:
CREATE EXTERNAL TABLE partition (id string, event timestamp and so on) PARTITIONED BY (year INT, month INT, day INT) STORED AS PARQUET
vs.
CREATE EXTERNAL TABLE partition (id string, event timestamp and so on) PARTITIONED BY (year STRING, month STRING, day STRING) STORED AS PARQUET
We noticed that we couldn't run queries like "... where day > 10" with the string option.
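A minimal sketch of the difference, assuming hypothetical table names `events` and `events_str` and only the two columns named in the question: with INT partition columns a range filter compares numerically, while with STRING columns an explicit cast is one way to get the same comparison.

```sql
-- Hypothetical table; the full column list is elided in the question.
CREATE EXTERNAL TABLE events (id STRING, event TIMESTAMP)
PARTITIONED BY (year INT, month INT, day INT)
STORED AS PARQUET;

-- INT partition columns compare numerically, so a range filter works directly.
SELECT COUNT(*) FROM events
WHERE year = 2019 AND month = 4 AND day > 10;

-- With STRING partition columns (hypothetical table events_str),
-- an explicit cast makes the comparison numeric.
SELECT COUNT(*) FROM events_str
WHERE CAST(day AS INT) > 10;
```

Whether partition pruning still applies to the cast expression depends on the engine and version, so it is worth checking the query plan before relying on the string variant.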
Labels: Apache Hive
04-02-2019
06:20 AM
Hi all. We need to generate unique IDs in our Hadoop cluster during data ingestion. We have parallel processes ingesting data from different sources into Hive tables, and we'd like a unique ID for each data row inserted. I understand ZooKeeper offers unique ID generation for distributed scenarios. Please help with how to do this; we can't find a sample or documentation. Also, please let me know if there is a better distributed unique ID generator in the Cloudera environment. Thanks
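One coordination-free alternative to a ZooKeeper-based generator is a random UUID per row. This is a different technique from the one asked about, sketched here with Hive's `reflect` function calling `java.util.UUID.randomUUID()`. The table names are hypothetical, and the target table is assumed to have a leading `row_id STRING` column followed by the same columns as the source.

```sql
-- Hypothetical tables: staging_rows (source) and target_rows (destination).
-- Each parallel ingest job can run this independently; random UUIDs need
-- no shared counter, so no coordination service is involved.
INSERT INTO TABLE target_rows
SELECT reflect('java.util.UUID', 'randomUUID') AS row_id,
       s.*
FROM staging_rows s;
```

Random UUIDs are unique only with overwhelming probability and carry no ordering, so this fits when the ID just has to distinguish rows rather than reflect insertion order.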
Labels: Apache Hive, Apache Zookeeper
03-27-2019
07:36 AM
Can we take advantage of Hive table partitions when querying with Impala? Are there any issues or problems we might run into in this scenario? We currently have partitioned Hive tables. Will we be missing anything if we don't convert them to Impala tables?
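Impala reads table definitions from the same metastore as Hive, so a partitioned Hive table can generally be queried as it is. The usual catch is that Impala caches metadata and has to be told about changes made through Hive. A sketch, assuming a hypothetical table `my_partitioned_table` partitioned by `year` and `month`:

```sql
-- Run in impala-shell after the table is created or changed through Hive.
INVALIDATE METADATA my_partitioned_table;  -- pick up a table newly created in Hive
REFRESH my_partitioned_table;              -- pick up data files or partitions added later
SHOW PARTITIONS my_partitioned_table;      -- confirm Impala sees the Hive partitions

-- A filter on the partition columns lets Impala skip non-matching partitions.
SELECT COUNT(*) FROM my_partitioned_table
WHERE year = 2019 AND month = 3;
```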
Labels: Apache Hive, Apache Impala