Member since 02-23-2017 | 34 Posts | 15 Kudos Received | 9 Solutions
01-08-2019
04:48 PM
1 Kudo
That's correct, getting back up to three Masters is a manual process, though the remaining two-Master deployment will still be usable until then. KUDU-2181 would improve this, but no one is working on it right now, AFAIK.
01-08-2019
10:46 AM
1 Kudo
The Master role generally takes few resources, and it isn't uncommon to colocate a Tablet Server and a Master. Additionally, it is strongly encouraged to have at least four Tablet Servers in a minimal deployment for higher availability in the case of failures (see the note about KUDU-1097 here). That said, with five nodes, you could have three Masters and five Tablet Servers. Even so, five Tablet Servers isn't a large cluster, so please take a look at the Scaling Limits and Scaling Guide to make sure you provision your cluster sufficiently.

If a node with both roles fails, the following will happen:

On the Master side, if the failed Master was the leader Master, a new leader will be elected between the remaining two Masters and business will continue as usual. No automatic recovery will happen to bring up a new Master, and these steps should be followed to recover the failed Master when convenient.

On the Tablet Server side, the tablet replicas that existed on the failed Tablet Server will automatically be re-replicated to the remaining four Tablet Servers.

If you went with three Tablet Servers, this re-replication would not happen, since the remaining two Tablet Servers would already have replicas of every tablet, and the cluster would be stuck with every tablet having two copies instead of three. The service would still function, but another failure would render every tablet unavailable.
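To make the three versus five Tablet Server case above concrete, here is a toy model in Python. It is only a sketch and not Kudu's actual placement logic: it assumes a replication factor of three, a single tablet, and the simple rule that a lost replica can only be re-created on a live server that doesn't already hold a copy. The function name is made up for illustration.

```python
# Toy model, NOT Kudu's real placement code. Assumptions: replication
# factor 3, one tablet, and a lost replica may only move to a live server
# that does not already hold a copy of that tablet.
REPLICATION_FACTOR = 3


def replicas_after_failure(num_tablet_servers, failed_server=0):
    """Return how many replicas one tablet has after a single server fails."""
    servers = list(range(num_tablet_servers))
    # Place the tablet on the first REPLICATION_FACTOR servers.
    holders = set(servers[:REPLICATION_FACTOR])
    # Simulate the failure of one server.
    holders.discard(failed_server)
    live = [s for s in servers if s != failed_server]
    # Live servers without a copy are the only valid re-replication targets.
    candidates = [s for s in live if s not in holders]
    while len(holders) < REPLICATION_FACTOR and candidates:
        holders.add(candidates.pop(0))
    return len(holders)


for n in (3, 5):
    print(n, "tablet servers ->", replicas_after_failure(n), "replicas after one failure")
```

With three servers the candidate list is empty, so the tablet stays at two replicas. With five servers there is a spare server to take the third copy.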
12-10-2018
10:35 AM
This has come up a few times on the mailing lists and on the Apache Kudu Slack, so I'll post it here too: if you want a single-partition table, you can omit the PARTITION BY clause entirely. See the "Note" here: https://www.cloudera.com/documentation/enterprise/latest/topics/kudu_impala.html#concept_g51_5vk_ft
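As a rough illustration, the small Python helper below builds the Impala DDL both ways. The helper, table name, and columns are hypothetical; the only point is that the single-partition statement simply has no PARTITION BY clause.

```python
def kudu_create_table_ddl(table, columns, primary_key, hash_partitions=None):
    """Build an Impala CREATE TABLE statement for a Kudu table.

    columns is a list of (name, type) pairs. When hash_partitions is None,
    the PARTITION BY clause is left out, giving a single-partition table.
    """
    col_defs = ", ".join(f"{name} {ctype}" for name, ctype in columns)
    parts = [f"CREATE TABLE {table} ({col_defs}, PRIMARY KEY ({primary_key}))"]
    if hash_partitions is not None:
        parts.append(f"PARTITION BY HASH ({primary_key}) PARTITIONS {hash_partitions}")
    parts.append("STORED AS KUDU")
    return " ".join(parts)


# Hypothetical table and columns, for illustration only.
cols = [("id", "BIGINT"), ("name", "STRING")]
print(kudu_create_table_ddl("small_lookup", cols, "id"))
print(kudu_create_table_ddl("big_events", cols, "id", hash_partitions=4))
```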
06-22-2018
11:39 AM
That error generally means that another Kudu process is already running and using that directory. Do you perhaps have both a Kudu master and a tablet server running on that machine? If so, you'll need to change your configurations so that their directories don't overlap.
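If it helps, a quick way to spot the overlap is to compare the directory lists of the two roles. The Python sketch below is just that, a sketch: the paths are hypothetical, you would paste in the WAL and data directories from your own master and tablet server configurations, and it only catches identical paths, not one directory nested inside another.

```python
import posixpath


def overlapping_dirs(master_dirs, tserver_dirs):
    """Return directories configured for both roles, after normalizing paths.

    Only exact matches are reported; nested directories are not detected.
    """
    master = {posixpath.normpath(p) for p in master_dirs}
    tserver = {posixpath.normpath(p) for p in tserver_dirs}
    return sorted(master & tserver)


# Hypothetical directory lists for a master and a tablet server on one host.
master_dirs = ["/data/kudu/wal", "/data/kudu/data"]
tserver_dirs = ["/data/kudu/wal/", "/data/1/kudu"]
print(overlapping_dirs(master_dirs, tserver_dirs))
```

An empty list means the two roles don't share a directory, at least by exact path.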
06-08-2018
07:16 PM
AFAIK there isn't any official guidance for using PySpark with Kudu. That said, have you taken a look at this repo that has some info on using Kudu with PySpark?
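For what it's worth, the pattern I've usually seen is to go through the Spark DataFrame API with the kudu-spark data source. The sketch below assumes the kudu-spark jar is on Spark's classpath and that the data source name and option keys shown are the ones your version accepts, so treat those as assumptions to verify. The helper names, master addresses, and table name are made up.

```python
# Assumption: the kudu-spark integration jar is on the Spark classpath and
# registers this data source name with these option keys.
KUDU_FORMAT = "org.apache.kudu.spark.kudu"


def kudu_options(master_addresses, table_name):
    """Build the reader options for a Kudu table."""
    return {
        "kudu.master": ",".join(master_addresses),
        "kudu.table": table_name,
    }


def read_kudu_table(spark, master_addresses, table_name):
    """Load a Kudu table as a DataFrame using an existing SparkSession."""
    reader = spark.read.format(KUDU_FORMAT)
    return reader.options(**kudu_options(master_addresses, table_name)).load()


# Hypothetical master addresses and table name.
print(kudu_options(["master-1:7051", "master-2:7051"], "impala::default.my_table"))
```

In a PySpark shell you would then call something like `read_kudu_table(spark, [...], "...")` and work with the resulting DataFrame as usual.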
04-05-2018
10:23 AM
1 Kudo
That is up to your workload and how much storage you need per node. It's common to see anywhere from 6 to 12 disks per tablet server. Check out the limitations documentation for some guidance there.
04-04-2018
03:12 PM
Yep, that would be ideal in that background flushes/compactions would not affect write performance and Raft elections.
04-03-2018
03:17 PM
1 Kudo
> Should multiple directories be used for storing the Kudu master data?

The master nodes generally don't see a huge amount of disk IO, as their role is primarily focused on tablet placement rather than data storage. The reason fs_data_dirs is plural for the master is that tablet servers and master nodes leverage the same FS configuration code. Feel free to use a single directory. I wouldn't expect it to bottleneck your cluster.

> Are there significant benefits of having multiple Kudu master data
> directories or inherent risks with just a single master data directory?

Not really. The masters aren't a bottleneck for the most part, and they only store a few GBs on disk. Also, disk failures are not handled for masters as they are on tablet servers, so the extra disks don't provide any added fault tolerance either.

> I've read that SSDs are recommended for the WAL directories. Is there a major
> performance impact if the WAL directory is on the same mount point as one of
> the data directories?

It's not uncommon to see this, where fs_wal_dir is the same as the first entry of fs_data_dirs. The caveat is that in Kudu 1.5 and below, the first data directory also stored tablet-specific metadata that is used for the Raft consensus protocol, and we've seen this lead to occasional dips in performance when tablet server ingest workloads coincide with periods of high Raft election traffic. This is less relevant for masters, which generally don't get bottlenecked by disk IO.
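To show what the single-directory setup looks like in flag form, here is a small Python sketch that renders the two flags discussed above. The helper name and the path are hypothetical, and the data directories simply default to the WAL directory when none are given.

```python
def master_flagfile(wal_dir, data_dirs=None):
    """Render fs_wal_dir and fs_data_dirs as flag lines.

    If data_dirs is omitted, the WAL directory doubles as the only data
    directory, which is the single-directory setup described above.
    """
    data_dirs = data_dirs or [wal_dir]
    return "\n".join([
        f"--fs_wal_dir={wal_dir}",
        f"--fs_data_dirs={','.join(data_dirs)}",
    ])


# Hypothetical path for a master with one directory.
print(master_flagfile("/data/kudu/master"))
```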
02-20-2018
10:16 AM
There might be a better answer, but I don't think it's too uncommon to periodically write all of your Kudu data to HDFS in Parquet. That way, if your Kudu cluster somehow becomes irrecoverably broken, you'll at least be able to bring back any lost tables from the HDFS copy.
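A minimal sketch of what such a periodic export could look like with PySpark is below. It assumes you already have the Kudu table loaded as a Spark DataFrame, and the base directory, table name, and dated directory layout are all made up for illustration.

```python
from datetime import date


def backup_path(base_dir, table_name, day):
    """Build a dated output directory, e.g. <base>/<table>/dt=2018-02-20."""
    return f"{base_dir.rstrip('/')}/{table_name}/dt={day.isoformat()}"


def backup_to_parquet(df, base_dir, table_name, day=None):
    """Write a DataFrame to a dated Parquet directory and return the path."""
    path = backup_path(base_dir, table_name, day or date.today())
    # Overwrite so that re-running the same day's backup is safe.
    df.write.mode("overwrite").parquet(path)
    return path


# Hypothetical HDFS location and table name.
print(backup_path("hdfs:///backups/kudu/", "events", date(2018, 2, 20)))
```

Scheduling this daily (or however often you can tolerate losing data) gives you a copy to reload tables from.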
02-20-2018
10:11 AM
How did you add the master? Did you follow the steps outlined in the master migration guide? https://kudu.apache.org/docs/administration.html#migrate_to_multi_master