Member since: 02-08-2019
Posts: 28
Kudos Received: 2
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 3133 | 06-13-2019 12:19 AM |
09-24-2019
01:46 AM
1 Kudo
@eMazarakis, later releases do not support the asterisk either; it is treated as a literal. The expressions that are available can be found here, in the chapter 'To drop or alter multiple partitions'. Previously I was referring to the intention behind "part_col='201801*'": it suggests that the desired outcome of the expression is to remove all data for January 2018 in one operation. However, since that is not possible in CDH 5.9, I was proposing a different partition strategy if multiple partitions have to be dropped frequently and the size of the data allows it. For example, if only one analytic query is executed on the data after ingestion, the days have to be dropped one by one, which is 32 operations. With the same data partitioned by month instead, the number of operations could be reduced to 2.
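Here is a rough sketch of what that could look like on a later release; the table names logs and logs_by_month and the columns part_col/month_col are made up for illustration, and the BETWEEN form of DROP PARTITION assumes Impala 2.8 / CDH 5.10 or newer.

# Drop every daily partition of January 2018 in a single statement (later releases only):
impala-shell -d db_name -q "ALTER TABLE logs DROP IF EXISTS PARTITION (part_col BETWEEN '20180101' AND '20180131')"

# With a table partitioned by month instead, the same cleanup touches a single partition:
impala-shell -d db_name -q "ALTER TABLE logs_by_month DROP IF EXISTS PARTITION (month_col = '201801')"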
09-20-2019
10:11 AM
"So it looks like column specific is only on a table without partitions (non-incremental)" - @hores, that's incorrect: non-incremental COMPUTE STATS works on partitioned tables and is generally the preferred method for collecting stats on them. We've generally tried to steer people away from incremental stats because of the size issues on large tables. Column-specific incremental stats would also be error-prone to use correctly and complex to implement: what happens if you compute incremental stats with different subsets of the columns? You can end up with different column subsets on different partitions, and then you have to somehow reconcile it all each time.
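To make the distinction concrete, a small sketch; the table name sales and the column list are hypothetical, and the column-list form of COMPUTE STATS assumes a reasonably recent Impala release.

# Non-incremental stats on a partitioned table, limited to the columns that matter for queries:
impala-shell -d db_name -q "COMPUTE STATS sales (customer_id, amount)"

# Incremental stats take no column list; per-partition stats can grow large in the catalog on big tables:
impala-shell -d db_name -q "COMPUTE INCREMENTAL STATS sales"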
08-14-2019
05:27 PM
For number 2: for ANY changes made outside of Impala you will need INVALIDATE METADATA, or, if only new data was added, a REFRESH will do. Work is underway to improve this: https://issues.apache.org/jira/browse/IMPALA-3124 Cheers, Eric
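As a quick sketch (db_name and tablename are placeholders):

# The table was created or its schema was changed outside of Impala, e.g. in Hive:
impala-shell -d db_name -q "INVALIDATE METADATA tablename"

# Only new data files were added to an existing table, so the cheaper REFRESH is enough:
impala-shell -d db_name -q "REFRESH tablename"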
08-09-2019
07:09 PM
1 Kudo
Hi, "HiveServer2 Enable Impersonation is setting to TRUE" is probably the reason. When Impersonation is true, it means Hive will impersonate as the end user who runs the query to submit jobs. Your ACL output showed that the directory is owned by "hive:hive" and as @Tomas79 found out, you have sticky bit set, so if hive needs to impersonate as the end user, the end user who runs the query will not be able to delete the path as he/she is not the owner. If impersonation is OFF, then HS2 will run query as "hive" user (the user that runs HS2 process), then you should not see such issue. I assume you have no sentry? As sentry will require Impersonation to be OFF on HS2 side, so that all queries will be running under "hive" user. To test the theory, try to remove the sticky bit on this path and drop again in Hive. Cheers Eric
06-13-2019
12:19 AM
@Consult I found the solution. The sqoop command creates a YARN application of type MAPREDUCE. If we only kill the local process from the Unix shell, that YARN application keeps running in the background. So from Cloudera Manager we go to YARN --> Applications and kill the YARN application there.
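The same thing can be done from the command line instead of Cloudera Manager; a minimal sketch, with a made-up application id:

# List the running MAPREDUCE applications to find the one sqoop submitted:
yarn application -list -appTypes MAPREDUCE

# Kill it by its application id:
yarn application -kill application_1234567890123_0042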
03-15-2019
04:56 AM
Dear @AnisurRehman, you can import data from an RDBMS to HDFS with Sqoop. If you then want to work with this table through impala-shell, you only need to run the following command from a machine where Impala is installed: impala-shell -d db_name -q "INVALIDATE METADATA tablename"; You have to run INVALIDATE METADATA because the table is new to the Impala daemons' metadata. If you later append new data files to the existing table, a refresh is enough: impala-shell -d db_name -q "REFRESH tablename"; REFRESH is sufficient because you do not need to reload the whole metadata for the table, only the block locations of the new data files. After that you can query the table through impala-shell and the Impala query editor.
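Putting the whole flow together as a sketch; the JDBC URL, credentials, database and table names are placeholders, not taken from this thread.

# 1. Import the RDBMS table into the Hive warehouse with Sqoop:
sqoop import --connect jdbc:mysql://dbhost/source_db --username user -P \
  --table tablename --hive-import --hive-database db_name

# 2. Make the new table visible to Impala:
impala-shell -d db_name -q "INVALIDATE METADATA tablename"

# 3. After appending new data files later, the lighter REFRESH is enough:
impala-shell -d db_name -q "REFRESH tablename"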