Member since: 02-23-2017
Posts: 34
Kudos Received: 15
Solutions: 9

My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5383 | 02-28-2019 10:37 AM |
 | 7968 | 02-27-2019 03:34 PM |
 | 4129 | 01-09-2019 07:39 PM |
 | 4216 | 01-08-2019 10:46 AM |
 | 10099 | 04-03-2018 03:17 PM |
08-25-2021
01:07 AM
Hi Adar: if both the WAL segments and the CFiles are copied during a tablet copy, then the follower tablet will also flush WAL data to disk when it grows to 8 MB. In my opinion there is no difference between the leader tablet and the follower tablet during reading and writing. Is that right?
02-18-2021
06:03 AM
Please add kudu-spark2-tools_2.11-1.6.0.jar, matching your Spark and Scala versions, and use the Spark context instead of Hive. This was done with Apache Spark 2.3. Below is the code for your reference.

Read a Kudu table from PySpark:

kuduDF = spark.read.format('org.apache.kudu.spark.kudu').option('kudu.master', "IP of master").option('kudu.table', "impala::TABLE: name").load()
kuduDF.show(5)

Write to a Kudu table:

DF.write.format('org.apache.kudu.spark.kudu').option('kudu.master', "IP of master").option('kudu.table', "impala::TABLE: name").mode("append").save()

Reference link: https://medium.com/@sciencecommitter/how-to-read-from-and-write-to-kudu-tables-in-pyspark-via-impala...

If you want to use Scala instead, see: https://kudu.apache.org/docs/developing.html
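As a variation, here is a minimal, self-contained sketch that pulls the Kudu-Spark integration in through spark.jars.packages instead of a manually copied jar. The Maven coordinates, master address, and table name below are illustrative and must be adjusted to your cluster:

from pyspark.sql import SparkSession

# Build a session with the kudu-spark2 artifact on the classpath
# (coordinates are illustrative; match your Scala/Kudu versions).
spark = (SparkSession.builder
         .appName("kudu-read-example")
         .config("spark.jars.packages", "org.apache.kudu:kudu-spark2_2.11:1.6.0")
         .getOrCreate())

# Read a Kudu table; the master address and table name are hypothetical.
df = (spark.read
      .format("org.apache.kudu.spark.kudu")
      .option("kudu.master", "kudu-master.example.com:7051")
      .option("kudu.table", "impala::default.my_table")
      .load())
df.show(5)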
11-28-2019
12:24 PM
Kudu's HMS integration is a CDH 6.3 feature. If you would like to use it end-to-end, you'll have to use CDH 6.3. That said, even without Kudu's native HMS integration, in all versions of CDH, when using Impala "internal" or "managed" Kudu tables (see here for more details), Impala will create the HMS metadata on behalf of Kudu.
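For illustration, a minimal sketch of creating such a managed Kudu table through Impala from Python. The host, port, and table/column names are hypothetical, and impyla is assumed as the client library:

from impala.dbapi import connect

# Connect to an Impala daemon (hypothetical host; 21050 is the usual HiveServer2 port).
conn = connect(host='impala-daemon.example.com', port=21050)
cur = conn.cursor()

# Creating a managed Kudu table; Impala writes the HMS entry on Kudu's behalf.
cur.execute("""
    CREATE TABLE my_kudu_table (
      id BIGINT,
      name STRING,
      PRIMARY KEY (id)
    )
    PARTITION BY HASH (id) PARTITIONS 3
    STORED AS KUDU
""")
cur.close()
conn.close()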
11-18-2019
12:51 AM
I see all my Tablet Servers as Healthy, and the Summary by Table also shows them as Healthy. Nothing appears under 'Recovering / Under-Replicated / Unavailable'.
03-01-2019
07:21 AM
Thanks again, this seems to have fixed the issue. I am going through most of the tables, as it looks like a few of them are still trying to talk to the old UUID.
03-01-2019
12:45 AM
You have mentioned that NTP is not related to the problem. Let's consider this scenario:

1. An Impala daemon is working with the READ_AT_SNAPSHOT setting enabled. It performs a read operation in Kudu and sets the read timestamp T1 immediately after the preceding write operation.
2. Kudu dispatches the read request to some replica R1. R1 is running on a machine with poorly configured NTP, so the local time on that machine is 1 minute behind.
3. R1 waits for the timeout specified by '--safe_time_max_lag_ms': 30 seconds. After the timeout, its local time is still (at best) 30 seconds behind T1.

Does this lead to the problem under discussion: 'Tablet is lagging too much to be able to serve snapshot scan'?
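To make the arithmetic in this scenario concrete, here is a small sketch of the timeline as described above. It assumes, as in the scenario, that no new writes arrive to advance R1's safe time, so safe time moves only with R1's local clock; all numbers are hypothetical:

# Timeline sketch of the scenario above (hypothetical numbers).
SAFE_TIME_MAX_LAG_S = 30          # --safe_time_max_lag_ms = 30000
CLOCK_SKEW_S = 60                 # R1's clock is 1 minute behind true time

t1 = 1_000_000.0                  # snapshot timestamp chosen right after the write (true time)
r1_safe_time = t1 - CLOCK_SKEW_S  # R1's notion of "now" when the scan arrives

# R1 waits out the configured lag timeout; its clock advances normally meanwhile.
r1_safe_time_after_wait = r1_safe_time + SAFE_TIME_MAX_LAG_S

remaining_lag = t1 - r1_safe_time_after_wait
print(f"safe time still lags T1 by {remaining_lag:.0f} s")   # -> 30 s

# Since the lag is still positive after the wait, R1 cannot serve the
# snapshot at T1, which matches the 'lagging too much' error in question.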
01-09-2019
07:39 PM
1 Kudo
In terms of resources, five masters wouldn't strain the cluster much. The big change is that the state that lives on the master (e.g. catalog metadata) would need to be replicated with a replication factor of 5 in mind (i.e. at least 3 copies to be considered "written"). While this is possible, the recommended configuration is 3 masters: it is the most well-tested and commonly used.
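As a quick check of the quorum arithmetic behind that statement (a Raft-style majority is assumed; the function is just illustrative):

# Majority quorum size for a replicated configuration.
def majority(replication_factor: int) -> int:
    return replication_factor // 2 + 1

for rf in (3, 5):
    print(f"replication factor {rf}: {majority(rf)} copies needed to ack a write")
# replication factor 3: 2 copies needed to ack a write
# replication factor 5: 3 copies needed to ack a write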
12-10-2018
10:35 AM
This has come up a few times on mailing lists and on the Apache Kudu slack, so I'll post here too; it's worth noting that if you want a single-partition table, you can omit the PARTITION BY clause entirely. See the "Note" here: https://www.cloudera.com/documentation/enterprise/latest/topics/kudu_impala.html#concept_g51_5vk_ft
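For illustration, a minimal sketch of that single-partition case from Python via impyla (host and table/column names are hypothetical); note that the PARTITION BY clause is simply left out:

from impala.dbapi import connect

conn = connect(host='impala-daemon.example.com', port=21050)  # hypothetical Impala daemon
cur = conn.cursor()

# No PARTITION BY clause: the whole table ends up as a single partition,
# which is generally only advisable for small tables.
cur.execute("""
    CREATE TABLE single_partition_table (
      id BIGINT,
      val STRING,
      PRIMARY KEY (id)
    )
    STORED AS KUDU
""")
cur.close()
conn.close()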
06-22-2018
11:39 AM
That error generally means that another Kudu node may already be running out of that directory. Do you perhaps have both a Kudu master and a tablet server running on that machine? If so, you'll need to change your configurations so their directories don't overlap.
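For example, the two daemons' flag files might point at disjoint directories, along these lines (the paths are hypothetical; --fs_wal_dir and --fs_data_dirs are the relevant flags):

# kudu-master flag file (hypothetical paths)
--fs_wal_dir=/data/kudu/master/wal
--fs_data_dirs=/data/kudu/master/data

# kudu-tserver flag file (hypothetical paths)
--fs_wal_dir=/data/kudu/tserver/wal
--fs_data_dirs=/data/kudu/tserver/data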
04-05-2018
10:23 AM
1 Kudo
That depends on your workload and how much storage you need per node. It's common to see anywhere from 6 to 12 disks per tablet server. Check out the limitations documentation for some guidance there.