09-24-2019
01:46 AM
1 Kudo
@eMazarakis, later releases do not support the asterisk either; it is treated as a literal. The expressions that are available can be found here, in the chapter 'To drop or alter multiple partitions'. Previously, I was referring to the intention behind "part_col='201801*'": it suggests that the desired outcome is to remove all data from January 2018 in one operation. However, since that is not possible in CDH 5.9, I was proposing a different partition strategy if multiple partitions have to be dropped frequently and the size of the data allows it. For example, if only 1 analytic query is executed on the data after ingestion, then with daily partitions the days have to be dropped one by one, which makes 32 operations in total (1 query plus 31 drops). With a table partitioned by month, the number of operations is reduced to 2 (1 query plus 1 drop).
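To make the difference concrete, here is a minimal dry-run sketch that only prints the statements and executes nothing. The table names `my_table` and `my_table_by_month` and the month-level column `part_month` are hypothetical placeholders; only `part_col` comes from the discussion above.

```shell
#!/usr/bin/env bash
# Dry run: print one DROP PARTITION statement per day of a month.
# Table and column names are hypothetical placeholders.
daily_drop_statements() {
  local table="$1" yyyymm="$2" days="$3" d
  for d in $(seq 1 "$days"); do
    printf "ALTER TABLE %s DROP IF EXISTS PARTITION (part_col='%s%02d');\n" \
      "$table" "$yyyymm" "$d"
  done
}

# With a month-level partition column the same cleanup is a single statement.
monthly_drop_statement() {
  local table="$1" yyyymm="$2"
  printf "ALTER TABLE %s DROP IF EXISTS PARTITION (part_month='%s');\n" \
    "$table" "$yyyymm"
}

daily_drop_statements my_table 201801 31          # 31 statements
monthly_drop_statement my_table_by_month 201801   # 1 statement
```

The output can be saved to a file and run with `impala-shell -f`, once the generated statements have been reviewed.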
08-14-2019
05:27 PM
For number 2: for ANY change made outside of Impala you will need INVALIDATE METADATA; if only new data was added, then REFRESH will do. Work is underway to improve this: https://issues.apache.org/jira/browse/IMPALA-3124
Cheers
Eric
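As a rough illustration of that rule, the sketch below prints the statement that matches the kind of external change. It is a dry run only; the change label `new_data` and the table name `my_db.my_table` are made up for this example.

```shell
#!/usr/bin/env bash
# Dry run: print the Impala statement that matches the kind of external change.
#   new_data      -> data was added to an existing table outside Impala
#   anything else -> some other change was made outside Impala
# The labels and the table name are hypothetical.
metadata_statement() {
  local change="$1" table="$2"
  if [ "$change" = "new_data" ]; then
    printf 'REFRESH %s;\n' "$table"
  else
    printf 'INVALIDATE METADATA %s;\n' "$table"
  fi
}

metadata_statement new_data my_db.my_table
metadata_statement other my_db.my_table
```

To actually run one of them, something like `impala-shell -q "$(metadata_statement new_data my_db.my_table)"` should work, assuming impala-shell can reach an Impala daemon.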
06-13-2019
12:19 AM
@Consult I found the solution. The sqoop command creates a YARN application of type MAPREDUCE. So if we only kill the processes through the Unix shell, this YARN application will continue to run in the background. So from Cloudera Manager, we go to YARN --> Applications and kill the YARN application there.
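The same can probably be done from the command line instead of Cloudera Manager. This is only a sketch of that alternative, not the steps described above; the helper names `app_ids` and `kill_running_mapreduce_apps` are made up, and it assumes the `yarn` CLI is on the PATH of a user who is allowed to kill the applications.

```shell
#!/usr/bin/env bash
# Read `yarn application -list` output on stdin and print only the
# application IDs (header and summary lines are skipped).
app_ids() {
  awk '$1 ~ /^application_/ { print $1 }'
}

# Kill every RUNNING application of type MAPREDUCE, which includes jobs
# started by Sqoop. Deliberately not called here: check the list first,
# since this would also kill unrelated MapReduce jobs.
kill_running_mapreduce_apps() {
  yarn application -list -appTypes MAPREDUCE -appStates RUNNING 2>/dev/null |
    app_ids |
    while read -r id; do
      yarn application -kill "$id"
    done
}
```

Running `yarn application -list -appTypes MAPREDUCE -appStates RUNNING` on its own first shows which applications would be affected; a single one can then be stopped with `yarn application -kill` followed by its ID.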