Member since: 12-30-2015
Posts: 73
Kudos Received: 3
Solutions: 4
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 931 | 02-14-2020 11:38 PM |
 | 819 | 02-13-2020 02:08 PM |
 | 1605 | 02-04-2020 10:14 PM |
 | 1511 | 01-26-2017 10:38 AM |
02-14-2020
11:38 PM
Wire compatibility
❏ Preserves compatibility with Hadoop 2 clients
❏ Distcp/WebHDFS compatibility preserved
02-13-2020
02:08 PM
@attilabukor Hi, thank you for your comment. Yesterday I got a comment from HaoHao about this issue. The failure to create a Kudu table on CDH 6.3.2 is related to the remote HMS configuration in hive-site.xml. KuduTable.java (on CDH 6.3.2) has logic that validates whether `hmsuris` is null or empty; if it is, it raises an exception and the statement fails. This has been fixed on the master branch, but I'm not sure whether the fix will be delivered with CDH 6.3.3. Can you confirm whether the fix ships with CDH 6.3.3? Here is the bug report ticket: https://issues.apache.org/jira/browse/IMPALA-8974
Gatsby
02-11-2020
11:32 PM
My current environment is CDH 6.3.2
Impala v3.2.0-cdh6.3.2
kudu 1.10.0-cdh6.3.2
Creating a table with Kudu storage fails with an IllegalArgumentException. The same statement worked with kudu 1.7.0-cdh5.16.2.
CREATE TABLE test_mlee (
  id BIGINT,
  name STRING,
  PRIMARY KEY(id)
)
PARTITION BY HASH PARTITIONS 16
STORED AS KUDU
ERROR: IllegalArgumentException: null
Any comment is appreciated. Thank you in advance.
Labels: Apache Impala, Apache Kudu
02-04-2020
10:14 PM
1 Kudo
Have you checked the list of tested and supported OpenJDK versions? https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_java_requirements.html#concept_hzw_zyl_rcb
JDK 8u40, 8u45, 8u60, and 8u242 are not supported due to JDK issues impacting CDH functionality.
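A quick way to compare an installed JDK against the updates named above is sketched here. This is only an illustration: the helper names `jdk8_update` and `is_listed_unsupported` are hypothetical, the set contains just the four updates quoted in this post, and the linked requirements page remains the authoritative list.

```python
import re

# JDK 8 update numbers quoted in the post as not supported.
# Assumption: this set mirrors only the sentence above, not the full linked page.
UNSUPPORTED_JDK8_UPDATES = {40, 45, 60, 242}


def jdk8_update(version: str):
    """Extract the update number from strings such as '1.8.0_242' or '8u242'."""
    match = re.fullmatch(r"(?:1\.8\.0_|8u)(\d+)", version.strip())
    return int(match.group(1)) if match else None


def is_listed_unsupported(version: str) -> bool:
    """Return True if the version is one of the JDK 8 updates listed above."""
    return jdk8_update(version) in UNSUPPORTED_JDK8_UPDATES


if __name__ == "__main__":
    for candidate in ("1.8.0_242", "8u232", "8u40"):
        print(candidate, is_listed_unsupported(candidate))
```

The version string itself would come from `java -version` on each host; parsing that output is left out of the sketch.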
Tags: jdk
02-04-2020
04:23 PM
Hello,
Thank you for reading this question.
Recently, one of our Hadoop clusters was upgraded to Hadoop 3.0 (CDH 6.3.2).
I'm curious whether data copied from Hadoop 3.0 (by distcp) can be used by a Hadoop 2.6 cluster.
I was able to distcp from the Hadoop 3.0 cluster to Hadoop 2.6, but then I found this document:
https://hadoop.apache.org/docs/r3.0.3/hadoop-distcp/DistCp.html#Copying_Between_Versions_of_HDFS
Can you give me some comments on this?
Thank you very much in advance.
Labels: Apache Hadoop
09-07-2017
01:36 PM
Hello, have you encountered this problem?
$ beeline -u jdbc:hive2://
Could not find valid SPARK_HOME while searching ['/home', '/usr/bin']
/usr/bin/beeline: line 32: /bin/spark-class: No such file or directory
/usr/bin/beeline: line 32: exec: /bin/spark-class: cannot execute: No such file or directory
Labels: Apache Impala
04-28-2017
04:22 PM
Alon, I'm sorry that this is not an answer to your question; actually, I have a question for you. To use the Python package you're using, do I have to run Cloudera Manager on my cluster? Thank you
Gatsby
04-28-2017
04:20 PM
Joe, ah, I see. So at any given time a file handle is used by only one thread; that is, a file handle is never used by multiple threads at the same time. I'm really wondering how you know all this so well 🙂 Thank you very much
Gatsby
04-28-2017
02:55 PM
Joe, by the way, do you think one cached file handle can be used by multiple threads?
Gatsby
04-28-2017
11:04 AM
Joe, thank you for your comment. It really helped me confirm and understand how the file handle cache works. Like you said, over time the hit count keeps going up and the miss count is becoming stable 🙂
04-24-2017
03:29 PM
Yep, you're right.
04-24-2017
02:16 PM
Hello, I have a question about these values in the impala-server metrics. I set `max_cached_file_handles` to 10,000:
max_cached_file_handles (uint64): Maximum number of HDFS file handles that will be cached. Disabled if set to 0. (default: 0, current: 10000)
However, misses are still being counted in `impala-server.io.mgr.cached-file-handles-miss-count`:
Metric | Value | Description |
---|---|---|
impala-server.io.mgr.cached-file-handles-hit-count | 668 | Number of cache hits for cached HDFS file handles |
impala-server.io.mgr.num-cached-file-handles | 253 | Number of currently cached HDFS file handles in the IO manager. |
impala-server.io.mgr.cached-file-handles-miss-count | 1467 | Number of cache misses for cached HDFS file handles |
Can you tell me how to reduce the miss count? Thank you
Gatsby
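As a rough way to read the counters quoted above, the hit ratio is hits / (hits + misses). The sketch below is only illustrative: the function name `cache_hit_ratio` is hypothetical, and the numbers are the ones quoted in this post. With 253 handles cached against a limit of 10,000 the cache is far from full, so the misses are more likely first-time opens than evictions; that reading is an assumption, not something these metrics prove.

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Return the fraction of file-handle lookups that were served from the cache."""
    total = hits + misses
    if total == 0:
        return 0.0  # no lookups yet, so there is no meaningful ratio
    return hits / total


# Values copied from the metrics quoted in the post.
hits = 668             # cached-file-handles-hit-count
misses = 1467          # cached-file-handles-miss-count
cached_handles = 253   # num-cached-file-handles
max_cached = 10_000    # max_cached_file_handles

print(f"hit ratio:  {cache_hit_ratio(hits, misses):.1%}")   # hit ratio:  31.3%
print(f"cache fill: {cached_handles / max_cached:.1%}")      # cache fill: 2.5%
```

Sampling the same counters again after the workload has run for a while shows whether the miss count has levelled off while the hit count keeps growing.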
Labels: Apache Impala
03-21-2017
12:08 AM
Alex, thank you again. The subquery approach has been recommended to our team as a long-term solution. However, as a short-term solution that avoids regression impact, we chose to use a view with limited partitions. If I remember correctly, in MySQL the `table A` data can be limited by the `ON clause` before joining, so that the candidates for the join are reduced. Thank you for your valuable comment.
Gatsby
03-20-2017
11:09 PM
Alex, first of all, thank you very much for your explanation. You're right: the second query selects the partition in table A. And I'm fully aware of the difference between the first one (using the ON clause) and the second one (using the WHERE clause), as you explained. The reason I tried the different variants was to find a way to limit the table A data before joining the two tables. (Yes, the second query doesn't work this way.) In MySQL, table A could be limited by the `ON clause`, but with Impala I don't know how to do it. Do you think using a subquery is the best way? Thank you
Gatsby
03-20-2017
04:18 PM
1 Kudo
Hi Henry, I have a question for you about `partition pruning`. Let's say there are two tables, A and B, and each table is partitioned by yearweek. Here is the query I'd like to run. (Yes, I need a left join to get the result I want.)
SELECT * FROM A
LEFT OUTER JOIN B
ON A.account_id = B.account_id
AND A.yearweek = 201710 AND B.yearweek = 201710
Even this doesn't select the specific partition in `table A`:
SELECT * FROM A
LEFT OUTER JOIN B
ON A.account_id = B.account_id
AND B.yearweek = 201710
WHERE A.yearweek = 201710
Like you said, `A.yearweek = 201710` in the `ON clause` couldn't select partition yearweek=201710. This might be because the filter is applied from left to right. To select a specific partition for `table A`, I used `dynamic partition` and updated the query like this:
SELECT * FROM (SELECT * FROM A WHERE yearweek = 201710) a
LEFT OUTER JOIN B b
ON a.account_id = b.account_id
AND b.yearweek = 201710
Do you think this is the best I can do? Or is there a way to limit the data for `table A` using the `ON clause`? And is there any reference you would recommend to help me understand how JOIN works in Impala and Hive? Thank you very much in advance.
Gatsby
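The row-level difference between the three queries can be reproduced on a tiny data set. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine, and the two rows per table are made up for illustration. It shows only standard LEFT OUTER JOIN semantics; it says nothing about how Impala's planner prunes partitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (account_id INTEGER, yearweek INTEGER);
    CREATE TABLE B (account_id INTEGER, yearweek INTEGER);
    -- Hypothetical sample rows: one A row inside yearweek 201710, one outside.
    INSERT INTO A VALUES (1, 201710), (2, 201709);
    INSERT INTO B VALUES (1, 201710), (2, 201710);
""")


def run(sql):
    """Execute a query and return its rows in a stable order."""
    return sorted(conn.execute(sql).fetchall())


# 1. Left-table predicate in ON: every A row is kept; unmatched rows get NULLs.
on_filter = run("""
    SELECT A.account_id, A.yearweek, B.account_id
    FROM A LEFT OUTER JOIN B
      ON A.account_id = B.account_id
     AND A.yearweek = 201710 AND B.yearweek = 201710
""")

# 2. Left-table predicate in WHERE: A rows outside 201710 are removed.
where_filter = run("""
    SELECT A.account_id, A.yearweek, B.account_id
    FROM A LEFT OUTER JOIN B
      ON A.account_id = B.account_id
     AND B.yearweek = 201710
    WHERE A.yearweek = 201710
""")

# 3. Left table restricted in a subquery before the join.
subquery_filter = run("""
    SELECT a.account_id, a.yearweek, b.account_id
    FROM (SELECT * FROM A WHERE yearweek = 201710) a
    LEFT OUTER JOIN B b
      ON a.account_id = b.account_id
     AND b.yearweek = 201710
""")

print(on_filter)        # [(1, 201710, 1), (2, 201709, None)]
print(where_filter)     # [(1, 201710, 1)]
print(subquery_filter)  # [(1, 201710, 1)]
```

Because a LEFT OUTER JOIN must return every left-side row, a predicate on A placed in the ON clause cannot remove A rows; it only decides which rows find a match. The WHERE and subquery forms return the same rows here.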
02-02-2017
09:27 AM
From the log history, I found out that someone keeps running 'invalidate metadata' without a table name. 😞 Thank you for your comment.
Gatsby
02-02-2017
09:26 AM
Yep, you're right. I will take a look at the log. Thank you
Gatsby
02-01-2017
08:17 PM
I found this open ticket: https://issues.cloudera.org/browse/IMPALA-1575
The query you saw is probably from Hue.
Gatsby
02-01-2017
02:09 PM
@gaurang Today I had an issue with slow queries, and it was related to the metadata that the Catalog Daemon caches. How often do you query that TABLE/VIEW? (I don't think your issue is related to the VIEW.) In my case, the metadata for the TABLE was reloaded very often because the Catalog Daemon flushed it out. Take a look at your catalog daemon and check whether the TABLE metadata is cached.
Gatsby
02-01-2017
02:04 PM
Hello, I have a question about the metadata loaded in the Impala catalog daemon. If I understand correctly, the catalog daemon reads TABLE metadata from the Hive metastore and caches it in memory. My question is what triggers the flushing of this cached TABLE metadata. I ask because I noticed that TABLE metadata is flushed out of the catalog daemon after some time, and since the TABLE metadata no longer exists, the catalog daemon has to load it again from the metastore. Is there some kind of configuration I can set to control the TABLE metadata lifecycle (or memory)? Thank you
Gatsby
Labels: Apache Impala
02-01-2017
12:44 PM
@Lars Volker I have a question for you. How long does the metadata loaded from the Hive metastore by the Impala Catalog Daemon stay in memory? I'm using Impala 2.7 (KUDU). It seems the metadata is flushed more often than before. Is there any configuration for the lifecycle of metadata in the catalog daemon? @gaurang I'm asking this question here because I guess @Lars Volker's answer can help resolve your issue. Thank you
Gatsby
02-01-2017
09:20 AM
@dsss Yep, I agree with you. We can't just kill those queries manually. I think most of the time the queries in exception status and the in-flight queries are from Hue. @Lars Volker Is there a way to log slow queries, like the MySQL slow query log? Is there a way to set a timeout for long-running queries so that they can be killed? Thank you
Gatsby
01-31-2017
05:45 PM
@Tim Armstrong Thank you very much for your explanation. 🙂 Gatsby
01-31-2017
04:36 PM
FYI, `COMPUTE STATS` can run with first level partition. https://issues.cloudera.org/browse/IMPALA-1570
01-31-2017
04:11 PM
@Lars Volker I thought that way too. Thank you for confirming it 🙂 Gatsby
01-31-2017
03:52 PM
You might need to click the `cancel` link next to the query. Although a query is in exception status, it seems to still be running and using resources.
Gatsby
01-31-2017
03:46 PM
@Lars Volker I have a question for you. Does a VIEW still need to load metadata separately even if the metadata of the TABLE behind the VIEW is already loaded? @gaurang Which CDH/Impala version are you using? Thank you
Gatsby
01-31-2017
02:48 PM
Thank you. Then why does this page say Impala 2.8/CDH 5.10? http://www.cloudera.com/documentation/enterprise/release-notes/topics/impala_new_features.html#new_features_280
01-31-2017
01:42 PM
Hi, I noticed that CDH 5.10 has been released and includes Impala 2.8. However, there don't seem to be Impala 2.8 RPM packages for RedHat 6. Can you tell me where I can find them? Thank you
Gatsby
Labels: Apache Impala