Member since
12-30-2015
73
Posts
3
Kudos Received
4
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1502 | 02-14-2020 11:38 PM |
 | 1535 | 02-13-2020 02:08 PM |
 | 2812 | 02-04-2020 10:14 PM |
 | 2238 | 01-26-2017 10:38 AM |
02-14-2020
11:38 PM
Wire compatibility
❏ Preserves compatibility with Hadoop 2 clients
❏ Distcp/WebHDFS compatibility preserved
02-13-2020
02:08 PM
@attilabukor Hi,
Thank you for your comment. Yesterday, I got a comment from HaoHao about this issue.
The failure to create a Kudu table on CDH 6.3.2 is related to the remote HMS configuration in hive-site.xml. KuduTable.java (on CDH 6.3.2) has logic that validates whether `hmsuris` is null or empty. If `hmsuris` is null or empty, it raises an exception and the statement fails.
This has been fixed on the master branch, but I'm not sure whether the fix will be delivered with CDH 6.3.3. Can you confirm whether the fix ships with CDH 6.3.3?
Here is the bug report ticket:
https://issues.apache.org/jira/browse/IMPALA-8974
Gatsby
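For anyone hitting the same error, here is a minimal Python sketch for checking whether a remote HMS is configured in hive-site.xml. It assumes that `hmsuris` corresponds to the standard Hive property `hive.metastore.uris`; that mapping is my assumption, not something confirmed in the ticket. The helper name `metastore_uris` and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Assumption: the `hmsuris` value mentioned above comes from this Hive property.
HMS_URIS_PROPERTY = "hive.metastore.uris"

def metastore_uris(hive_site_xml: str) -> List[str]:
    """Return the URIs configured under hive.metastore.uris, or [] if unset or empty."""
    root = ET.fromstring(hive_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name", default="") or "").strip()
        if name == HMS_URIS_PROPERTY:
            value = prop.findtext("value", default="") or ""
            # The property may hold a comma-separated list of thrift:// URIs.
            return [u.strip() for u in value.split(",") if u.strip()]
    return []

# Hypothetical hive-site.xml content with an empty value, for illustration only.
SAMPLE = """<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value></value>
  </property>
</configuration>"""

if __name__ == "__main__":
    uris = metastore_uris(SAMPLE)
    print("remote HMS configured:" if uris else "hive.metastore.uris is empty or missing", uris)
```

To check a real cluster, read the contents of the hive-site.xml that Impala uses and pass it to `metastore_uris`.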
02-11-2020
11:32 PM
My current environment is CDH 6.3.2
Impala v3.2.0-cdh6.3.2
kudu 1.10.0-cdh6.3.2
Somehow, creating a table with Kudu storage throws an IllegalArgumentException. The same statement worked with kudu 1.7.0-cdh5.16.2.
CREATE TABLE test_mlee (
  id BIGINT,
  name STRING,
  PRIMARY KEY(id)
)
PARTITION BY HASH PARTITIONS 16
STORED AS KUDU
ERROR: IllegalArgumentException: null
Any comment is appreciated.
Thank you in advance.
Labels: Apache Impala, Apache Kudu
02-04-2020
10:14 PM
1 Kudo
Have you checked the tested/supported OpenJDK list?
https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_java_requirements.html#concept_hzw_zyl_rcb
JDK 8u40, 8u45, 8u60, and 8u242 are not supported due to JDK issues impacting CDH functionality:
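A quick way to compare an installed JDK against that list is sketched below in Python. The update numbers are only the ones quoted in this post, so the linked release notes remain the authority. The function names are hypothetical, and the sketch assumes a `1.8.0_NNN` style version string as printed by `java -version` on JDK 8.

```python
import re
from typing import Optional

# Update numbers quoted in the post above; consult the release notes for the current list.
UNSUPPORTED_JDK8_UPDATES = {40, 45, 60, 242}

def jdk8_update(version: str) -> Optional[int]:
    """Extract NNN from a '1.8.0_NNN' (optionally '1.8.0_NNN-suffix') string, else None."""
    m = re.fullmatch(r"1\.8\.0_(\d+)(?:-.*)?", version.strip())
    return int(m.group(1)) if m else None

def is_unsupported(version: str) -> bool:
    """True only for JDK 8 updates that appear in the quoted unsupported list."""
    return jdk8_update(version) in UNSUPPORTED_JDK8_UPDATES

if __name__ == "__main__":
    for v in ("1.8.0_242", "1.8.0_232", "11.0.6"):
        print(v, "unsupported" if is_unsupported(v) else "not in the quoted list")
```

A version that is not in the quoted list is not automatically supported; the sketch only flags the four updates named here.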
02-04-2020
04:23 PM
Hello,
Thank you for reading this question.
Recently, one of our Hadoop clusters was upgraded to Hadoop 3.0 (CDH 6.3.2).
I'd like to know whether data copied from the Hadoop 3.0 cluster (by distcp) can be used by a Hadoop 2.6 cluster.
I was able to distcp from the Hadoop 3.0 cluster to the Hadoop 2.6 cluster, but then I found this document:
https://hadoop.apache.org/docs/r3.0.3/hadoop-distcp/DistCp.html#Copying_Between_Versions_of_HDFS
Could you comment on this?
Thank you very much in advance.
Labels: Apache Hadoop
09-07-2017
01:36 PM
Hello,
Have you encountered this problem?
$ beeline -u jdbc:hive2://
Could not find valid SPARK_HOME while searching ['/home', '/usr/bin']
/usr/bin/beeline: line 32: /bin/spark-class: No such file or directory
/usr/bin/beeline: line 32: exec: /bin/spark-class: cannot execute: No such file or directory
Labels: Apache Impala
04-24-2017
03:29 PM
Yep, you're right.
03-21-2017
12:08 AM
Alex,
Thank you again. The subquery approach has been recommended to our team as a long-term solution. However, as a short-term solution that avoids regression impact, we chose a view with limited partitions.
If I remember correctly, in MySQL the `table A` data can be limited by the `ON clause` before joining, so that the candidates for the join are reduced.
Thank you for your valuable comment.
Gatsby
03-20-2017
11:09 PM
Alex,
First of all, thank you very much for your explanation. You're right: the second query does select the partition in table A. I'm also fully aware of the difference between the first query (using the ON clause) and the second one (using the WHERE clause), as you explained.
The reason I tried different variants was to find a way to limit the table A data before joining the two tables. (Yes, the second query doesn't work this way.)
In MySQL, table A could be limited by the `ON clause`, but with Impala I don't know how to do it. Do you think using a subquery is the best way?
Thank you
Gatsby
03-20-2017
04:18 PM
1 Kudo
Hi Henry,
I have a question for you about `partition pruning`.
Let's say there are two tables, A and B, and each table is partitioned by yearweek. Here is the query I'd like to run. (Yes, I need to use a left join to get the result I want.)
SELECT * FROM A
LEFT OUTER JOIN B
ON A.account_id = B.account_id
AND A.yearweek = 201710 AND B.yearweek = 201710
Even this doesn't select a specific partition in `table A`:
SELECT * FROM A
LEFT OUTER JOIN B
ON A.account_id = B.account_id
AND B.yearweek = 201710
WHERE A.yearweek = 201710
Like you said, `A.yearweek = 201710` in the `ON clause` couldn't select the partition yearweek=201710. This might be because filters are applied from `Left to Right`.
In order to select a specific partition for `table A`, I used `dynamic partition` and updated the query like this:
SELECT * FROM (SELECT * FROM A WHERE yearweek = 201710) a
LEFT OUTER JOIN B b
ON a.account_id = b.account_id
AND b.yearweek = 201710
Do you think this is the best I can do? Or is there a way to limit the data for `table A` by using the `ON clause`?
Also, is there any reference you would recommend for understanding how JOIN works in Impala and Hive?
Thank you very much in advance.
Gatsby
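To make the difference between the three queries concrete, here is a small Python sketch that runs them against an in-memory SQLite database. SQLite has no partitions, so this says nothing about Impala's planner or partition pruning itself; it only illustrates the standard LEFT OUTER JOIN row semantics, which is why a predicate on the left table inside the ON clause cannot act as a filter on that table. The two-row tables are made-up test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (account_id INTEGER, yearweek INTEGER);
CREATE TABLE B (account_id INTEGER, yearweek INTEGER);
INSERT INTO A VALUES (1, 201710), (2, 201709);
INSERT INTO B VALUES (1, 201710), (2, 201710);
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# 1) Predicate on A inside ON: every A row is kept; non-matching rows get NULL for B.
on_filter = run("""
SELECT A.account_id, A.yearweek, B.account_id FROM A
LEFT OUTER JOIN B
  ON A.account_id = B.account_id
 AND A.yearweek = 201710 AND B.yearweek = 201710
ORDER BY A.account_id""")

# 2) Predicate on A in WHERE: rows of A outside yearweek 201710 are removed.
where_filter = run("""
SELECT A.account_id, A.yearweek, B.account_id FROM A
LEFT OUTER JOIN B
  ON A.account_id = B.account_id
 AND B.yearweek = 201710
WHERE A.yearweek = 201710
ORDER BY A.account_id""")

# 3) Subquery on A: A is restricted before the join; same rows as query 2.
subquery = run("""
SELECT a.account_id, a.yearweek, b.account_id
FROM (SELECT * FROM A WHERE yearweek = 201710) a
LEFT OUTER JOIN B b
  ON a.account_id = b.account_id
 AND b.yearweek = 201710
ORDER BY a.account_id""")

print(on_filter)     # [(1, 201710, 1), (2, 201709, None)]
print(where_filter)  # [(1, 201710, 1)]
print(subquery)      # [(1, 201710, 1)]
```

Query 1 still returns the row from yearweek 201709, so an engine has to read all of A to answer it correctly. Queries 2 and 3 return the same rows, which is consistent with both the WHERE form and the subquery form being candidates for restricting A to one partition.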