Member since: 06-13-2023
Posts: 9
Kudos Received: 2
Solutions: 0
01-28-2024
05:13 AM
2 Kudos
Hi @yagoaparecidoti, thanks for the help. I've resolved the problem.
01-24-2024
05:45 AM
Hi everyone, I'm trying to figure out what causes this Impala warning, which is generated when external tables are created:

Impala does not have READ_WRITE access to path 'hdfs://xxxxxxxxxxx'

At first, we thought it was a permissions issue with Ranger, but after numerous attempts we still get the same warning. We also checked the permissions on the HDFS paths, but there are folders with permissions of 775, 750, and 755, so there doesn't seem to be a correlation between the warning and the POSIX permissions.

Could it be an issue with the user and/or group? In many paths, the owner and group of the folder are hdfs:hdfs. Should the owner be changed to impala? Unfortunately, I haven't found any helpful documentation on this topic that would allow me to eliminate the warning. Many users think that the tables are actually not being created and that there is no way to write to them.
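In my experience this warning is advisory: the table is still created, and the message only means the impala service user could not verify read/write access to the location at the HDFS level. A minimal sketch to confirm this (the database, table, and LOCATION names here are hypothetical, not from any real environment):

```sql
-- Hypothetical names throughout; adjust database, table, and LOCATION.
CREATE EXTERNAL TABLE sandbox.t_example (id INT, val STRING)
STORED AS PARQUET
LOCATION '/data/external/t_example';

-- Confirm the table was created despite the warning, and find the exact
-- path the warning refers to:
SHOW TABLES IN sandbox LIKE 't_example';
DESCRIBE FORMATTED sandbox.t_example;  -- the Location row shows the HDFS path
```

From a shell, `hdfs dfs -getfacl <path>` shows whether the impala user is covered by the POSIX bits or an HDFS ACL on that path; changing the directory owner to impala should not be necessary if access is granted through an ACL or a Ranger HDFS policy.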
Labels:
- Apache Impala
12-13-2023
02:04 AM
Hello @James G, the only problem is that the tables are not displayed. Unfortunately, I have no way to execute any commands on Impala.
12-12-2023
12:22 PM
Hello everyone, recently on Ranger we noticed that when users are enabled in a security zone, they are unable to see the tables in a database through Impala. When connecting via JDBC through the command line or directly from HUE, they can view the tables and their respective data, but if all of this is done through Impala, they have no visibility at all.

We have 4 clusters, all with the same type of configuration and permissions, but we encounter the issue in 3 of them. All configurations in Ranger have been checked and appear to be correct in all environments. At this point, I have no idea where to begin analyzing the problem. Also, there are no access issues reported from Ranger for tables in the security zone, and there are no error logs or access-denied logs.

I realize that without providing an error log it's difficult to conduct an analysis, but perhaps some of you have suggestions on where to start investigating the problem. Have any of you ever encountered a similar case? Thanks!!!
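One possible starting point, sketched below under the assumption that stale metadata or stale authorization data in the Impala daemons is causing the asymmetry between the JDBC/HUE path and Impala (the database name secure_db is hypothetical):

```sql
-- Run in impala-shell as one of the affected users; names are hypothetical.
REFRESH AUTHORIZATION;        -- re-fetch Ranger policies into the catalog
INVALIDATE METADATA;          -- drop cached metadata; reloaded on next access
SHOW DATABASES;
SHOW TABLES IN secure_db;
```

REFRESH AUTHORIZATION may require admin privileges; if visibility returns after running it, the problem is likely policy-refresh timing rather than the policies themselves.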
Labels:
- Apache Impala
- Apache Ranger
11-13-2023
02:13 PM
@DianaTorres I'm sorry, but unfortunately the problem still persists, even after trying the suggestions in the previous posts.
11-09-2023
12:51 AM
Hello Miklos, unfortunately what you suggested had no effect. We continue to have the same problem: a single Parquet file is created.
11-07-2023
12:23 PM
Hello everyone, my team, using Tez (in particular Hive), has noticed that during an insert with a very simple select, a single Parquet file of 1.5 GB per partition is generated in the output table. To try to remedy the problem, a number of settings were applied at the session level, but they had no effect. Below are the SETs used at the session level:

SET hive.execution.engine=tez;
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.optimize.sort.dynamic.partition.threshold=0;
--SET tez.grouping.max-size=268435456;
--SET hive.exec.reducers.bytes.per.reducer=536870912;
--SET tez.grouping.split-count=18;
SET hive.vectorized.execution.reduce.enabled=true;
SET hive.vectorized.execution.reduce.groupby.enabled=true;
--SET hive.tez.auto.reducer.parallelism=false;
--SET mapred.reduce.tasks=12;
--SET hive.tez.partition.size=104857600;
--SET hive.tez.partition.num=10;
SET hive.parquet.output.block.size=104857600;

I would like to ask if there is a way to always produce Parquet files, but broken up into smaller files as shown in the image below. We cannot understand what the cause might be. Files structured in this way do not guarantee sufficient parallelism for other jobs (such as Sqoop).
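One workaround that may apply here, as a sketch rather than a definitive fix: when each dynamic partition is funneled through a single reducer, that reducer writes one large file. Adding a DISTRIBUTE BY with a random bucket spreads each partition's rows over several reducers, and each reducer writes its own Parquet file. The tables t_in and t_out and the partition column dt below are hypothetical:

```sql
SET hive.execution.engine=tez;
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- 8 buckets -> up to 8 Parquet files per partition; tune the constant
-- so each file lands near the desired size (e.g. 1.5 GB / 8 ≈ 190 MB).
INSERT OVERWRITE TABLE t_out PARTITION (dt)
SELECT col1, col2, dt
FROM t_in
DISTRIBUTE BY dt, FLOOR(RAND() * 8);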
Labels:
06-14-2023
02:01 PM
Hi Yuexin, you have been very helpful. Unfortunately, if I wanted to use "Dynamic Queue Scheduling" in CDP 7.1.7 at the moment, I would have no guarantee of resolving any problems through Cloudera support; in fact, it is not recommended for use in production. Thank you very much
06-13-2023
02:18 PM
Hi, I'm using the Cloudera CDP 7.1.7 Private Cloud Base version and would like confirmation of whether it is possible to set time-based rules in the YARN capacity scheduler, as I did previously in the old CDH. It would be interesting to be able to use that functionality in CDP as well, but the Cloudera documentation I've found on the web makes no mention of it. Could you give me some hints as to whether this feature is still present in CDP? Thanks
Labels:
- Apache YARN