Member since
12-29-2019
11
Posts
0
Kudos Received
0
Solutions
04-21-2021
12:17 AM
Hello, Impala doesn't support parameterized views. Some workarounds are discussed here: https://stackoverflow.com/questions/52063217/create-parameterized-view-in-impala
04-14-2020
08:44 AM
Hey @AndyTech, thanks for reaching out to the Cloudera community. The commit ID mentioned here is unrelated to Kafka usage terms such as 'commit offsets'. It identifies the Kafka source commit from which this build was made. It is an info message, not an error, and it doesn't affect the Kafka client's functionality in any way. Let me know if this helps. Cheers,
04-14-2020
05:45 AM
Hey @AndyTech, thanks for reaching out to the Cloudera community. This issue is caused by the "kafka-python" module missing from your Python installation. You have to install it manually, using the command below, on the edge node and on every host where the Spark job executes:
$ pip install kafka-python
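To confirm whether the module is visible to a given interpreter before re-running the job, a quick check like the one below may help. It is only a sketch: the helper name `module_available` is made up for illustration, and it relies on the fact that kafka-python installs a top-level package named `kafka`. Run it with the same Python that the Spark job uses on each host.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found
    by the interpreter running this script."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # kafka-python is imported as "kafka", not "kafka-python".
    if module_available("kafka"):
        print("kafka-python is importable")
    else:
        print("kafka-python is missing; run: pip install kafka-python")
```

If the check fails on only some hosts, the install was probably done for a different Python than the one the executors use.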
02-12-2020
05:43 AM
Thanks @stevenmatison. I am using the Parquet format. I tried ORC and saw no significant difference. I am not using partitions yet. I don't know much about the following settings, but based on my research I changed them as follows:

set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;
set hive.vectorized.execution=true;
set hive.vectorized.execution.enabled=true;

I also changed the execution engine:

set hive.execution.engine=spark;

I think changing the engine to Spark made a lot of difference: the query now runs in 15 sec, down from 2.48 min. I am quite satisfied with the current performance, but I would appreciate any other advice, for me and for the community. Thanks, I appreciate your response. Andy
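For anyone who wants to keep these session settings in one place, here is a minimal sketch that generates the `set` statements from a Python dictionary. The function name `render_hive_settings` and the `SETTINGS` dictionary are made up for illustration, not part of Hive. The sketch lists only `hive.vectorized.execution.enabled` for vectorization, which is the documented property name. Using Python booleans for on/off values avoids misspelling `true` by hand.

```python
def render_hive_settings(settings: dict) -> list:
    """Turn a {property: value} mapping into Hive 'set' statements.

    Booleans are written as lowercase true/false; other values are
    inserted as-is.
    """
    statements = []
    for key, value in settings.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        statements.append(f"set {key}={value};")
    return statements


# Hypothetical container for the settings mentioned in the post above.
SETTINGS = {
    "hive.cbo.enable": True,
    "hive.compute.query.using.stats": True,
    "hive.stats.fetch.column.stats": True,
    "hive.stats.fetch.partition.stats": True,
    "hive.vectorized.execution.enabled": True,
    "hive.execution.engine": "spark",
}

if __name__ == "__main__":
    # Print one statement per line, ready to paste into a Hive session.
    print("\n".join(render_hive_settings(SETTINGS)))
```

The printed lines can be pasted at the top of a query script. Whether each setting actually helps depends on the table statistics and the cluster, so it is worth timing the query with and without them.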