Member since 09-24-2015
816 Posts
488 Kudos Received
189 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3124 | 12-25-2018 10:42 PM |
| | 14040 | 10-09-2018 03:52 AM |
| | 4701 | 02-23-2018 11:46 PM |
| | 2420 | 09-02-2017 01:49 AM |
| | 2838 | 06-21-2017 12:06 AM |
09-13-2016
10:26 PM
2 Kudos
You are right: because Ambari supports only one ZK quorum, if you have 6 ZKs in your blueprint you will end up with a single 6-node ZK quorum, and you cannot change that using zkcli. Instead you can try:

- 1 cluster (a): Install the cluster from your blueprint without Kafka and with only 3 ZK nodes. When the cluster is up and running, install the second, Kafka-only ZK quorum manually; you can find instructions here. Finally, add Kafka using Ambari and set its ZK quorum to the one you installed manually (see the config sketch below).
- 1 cluster (b): Install the cluster from the blueprint including Kafka and 3 ZK nodes. Then install another ZK manually using the above link, and change the Kafka settings to use the new ZooKeeper. I'd avoid this solution because the first ZK quorum will be "polluted" by Kafka's ZK directories.
- 1 cluster (c), your solution with 6 ZK nodes: Remove 3 ZK nodes using Ambari, then install another ZK manually and change the Kafka settings as in 1(b).
- 2 clusters: Install one cluster with all services but without Kafka and its ZooKeeper, plus a second, Kafka-only cluster with just Kafka and its ZooKeeper. This is the easiest solution because you can automate cluster deployment and monitor all your components using Ambari, but you will have 2 clusters.
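Regarding "set its ZK quorum": on the Kafka side this is the broker's zookeeper.connect property. A minimal sketch, assuming hypothetical hostnames zk-kafka1..3 for the dedicated quorum and the default client port:

# Kafka broker setting (server.properties), pointing the broker at the
# dedicated 3-node ZooKeeper quorum installed manually;
# zk-kafka1..3 are placeholders for your own hosts
zookeeper.connect=zk-kafka1:2181,zk-kafka2:2181,zk-kafka3:2181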
09-13-2016
09:37 PM
Hi @Wing Lo, if facet pivot is exactly what you need, how about accepting and upvoting Matt's answer? Thanks!
09-08-2016
09:21 AM
The column list specification "INSERT INTO wkf107422_12_1_0 ( spersid )" is available starting with Hive 1.2, but you are most likely using an older version of Hive that doesn't support this feature (added by HIVE-9481). In older versions your SELECT statement has to provide all schema columns, in your case all 4. Regarding your "complementary information": it works, but the resulting table wkf107422_12_1_1 contains only one column, corresponding to spersid. To confirm, try "describe wkf107422_12_1_1".
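For illustration, a minimal sketch of both variants; the source table some_source and the string types of the padded columns are hypothetical:
hive> -- Hive 1.2+ (HIVE-9481): a column list is allowed, unlisted columns are filled with NULL
hive> insert into wkf107422_12_1_0 (spersid) select spersid from some_source;
hive> -- Older Hive: the select must cover all 4 target columns, e.g. padding with typed NULLs
hive> insert into table wkf107422_12_1_0 select spersid, cast(null as string), cast(null as string), cast(null as string) from some_source;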
09-08-2016
06:30 AM
The more nodes in a ZK ensemble (quorum), the faster the reads but the slower the writes. That's because a read can be served by any node, but a write does not complete until a majority of the nodes have acknowledged it. On top of that, early versions of Kafka (0.8.2 and older) keep Kafka offsets on ZK. Therefore, as already suggested by @mqureshi, it's best to start by creating a dedicated ZK for Kafka, I'd go for 3 nodes, and keep the 5-node ZK for everything else. Beefing up the number of ZK nodes to 7 or more is a resounding no. Regarding the installation and management of the new Kafka ZK, it's pretty straightforward to install it manually: just follow the steps in one of the "Non-Ambari cluster installation guides" like this one (see the zoo.cfg sketch below). You can also try to create a cluster composed of only Kafka and ZK and manage it with its own Ambari instance.
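For reference, a minimal zoo.cfg sketch for the dedicated 3-node Kafka ensemble; the hostnames kafka-zk1..3 and the dataDir path are placeholders for your own values:

# zoo.cfg for a dedicated 3-node Kafka ZooKeeper ensemble
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper    # each node also needs a myid file (1, 2 or 3) in this directory
clientPort=2181
server.1=kafka-zk1:2888:3888
server.2=kafka-zk2:2888:3888
server.3=kafka-zk3:2888:3888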
09-07-2016
11:36 AM
1 Kudo
You need an additional, temporary table to read your input file, and then some date conversion:
hive> create table tmp(a string, b string) row format delimited fields terminated by ',';
hive> load data local inpath 'a.txt' overwrite into table tmp;
hive> create table mytime(a string, b timestamp);
hive> insert into table mytime select a, from_unixtime(unix_timestamp(b, 'dd-MM-yyyy HH:mm')) from tmp;
hive> select * from mytime;
a 2015-11-20 22:07:00
b 2015-08-17 09:45:00
09-07-2016
06:13 AM
Interesting, so the JIRA removed the "empty regions are not merged away" clause. If so, I'd not enable normalization of pre-split tables.
09-06-2016
09:46 AM
scala> val a = sc.textFile("/user/.../path/to/your/file").map(x => x.split("\t")).filter(x => x(0) != x(1))
scala> a.take(4)
res2: Array[Array[String]] = Array(Array(1, 4), Array(2, 5), Array(1, 5))
Try the snippet above: it splits each line on tabs and keeps only the rows where the first and second fields differ; just insert the path to your file on HDFS.
09-06-2016
09:21 AM
In your example, 2 zero-size regions have been merged, while the logic page says: "empty" regions (less than 1MB, with the previous note) are not merged away. This is by design to prevent normalization from undoing the pre-splitting of a table. Can you kindly explain why?
09-04-2016
04:50 AM
It doesn't work because, to quote the related wiki page: "When using group by clause, the select statement can only include columns included in the group by clause, and aggregate functions on other columns." So your query will work if you remove the "group by" and the "min".
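A minimal sketch of the rule, using a hypothetical table t(a, b, c):
hive> -- fails: c is neither in the group by clause nor aggregated
hive> select a, c, min(b) from t group by a;
hive> -- works: every selected column is either grouped or aggregated
hive> select a, min(b), min(c) from t group by a;
hive> -- or, as suggested above, drop both the group by and the min
hive> select a, b, c from t;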
09-02-2016
09:06 AM
Hi @swathi thukkaraju, I see that you are using this solution in another question, so I guess it worked. If so, can you please accept & up-vote my answer to help us manage resolved questions? Thanks!