Member since: 02-23-2017
Posts: 34
Kudos Received: 15
Solutions: 9

My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5383 | 02-28-2019 10:37 AM |
 | 7968 | 02-27-2019 03:34 PM |
 | 4129 | 01-09-2019 07:39 PM |
 | 4216 | 01-08-2019 10:46 AM |
 | 10099 | 04-03-2018 03:17 PM |
08-25-2021
01:07 AM
Hi Adar: if both the WAL segments and the CFiles are copied during a tablet copy, then the follower tablet will also flush WAL data to disk when it grows to 8 MB. In my opinion there is no difference between the leader tablet and the follower tablet during reading and writing. Is that right?
02-18-2021
06:03 AM
Please add kudu-spark2-tools_2.11-1.6.0.jar, matching your Spark and Scala versions, and use the Spark context instead of Hive. This was done with Apache Spark 2.3. Below is the code for your reference.

Read a Kudu table from PySpark:

kuduDF = spark.read.format('org.apache.kudu.spark.kudu').option('kudu.master', "IP of master").option('kudu.table', "impala::TABLE: name").load()
kuduDF.show(5)

Write to a Kudu table:

DF.write.format('org.apache.kudu.spark.kudu').option('kudu.master', "IP of master").option('kudu.table', "impala::TABLE: name").mode("append").save()

Reference link: https://medium.com/@sciencecommitter/how-to-read-from-and-write-to-kudu-tables-in-pyspark-via-impala...

If you want to use Scala instead, see: https://kudu.apache.org/docs/developing.html
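As a variation, here is a minimal, self-contained sketch that pulls the Kudu-Spark integration in through spark.jars.packages instead of a manually copied jar. The Maven coordinates, master address, and table name below are illustrative and must be adjusted to your cluster:

from pyspark.sql import SparkSession

# Build a session with the kudu-spark2 artifact on the classpath
# (coordinates are illustrative; match your Scala/Kudu versions).
spark = (SparkSession.builder
         .appName("kudu-read-example")
         .config("spark.jars.packages", "org.apache.kudu:kudu-spark2_2.11:1.6.0")
         .getOrCreate())

# Read a Kudu table; the master address and table name are hypothetical.
df = (spark.read
      .format("org.apache.kudu.spark.kudu")
      .option("kudu.master", "kudu-master.example.com:7051")
      .option("kudu.table", "impala::default.my_table")
      .load())
df.show(5)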
11-28-2019
12:24 PM
Kudu's HMS integration is a CDH 6.3 feature. If you would like to use it end-to-end, you'll have to use CDH 6.3. That said, even without Kudu's native HMS integration, in all versions of CDH, when using Impala "internal" or "managed" Kudu tables (see here for more details), Impala will create the HMS metadata on behalf of Kudu.
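For illustration, a minimal sketch of creating such a managed Kudu table through Impala from Python. The host, port, and table/column names are hypothetical, and impyla is assumed as the client library:

from impala.dbapi import connect

# Connect to an Impala daemon (hypothetical host; 21050 is the usual HiveServer2 port).
conn = connect(host='impala-daemon.example.com', port=21050)
cur = conn.cursor()

# Creating a managed Kudu table; Impala writes the HMS entry on Kudu's behalf.
cur.execute("""
    CREATE TABLE my_kudu_table (
      id BIGINT,
      name STRING,
      PRIMARY KEY (id)
    )
    PARTITION BY HASH (id) PARTITIONS 3
    STORED AS KUDU
""")
cur.close()
conn.close()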
11-18-2019
12:51 AM
I see all my Tablet Servers as Healthy, and the Summary by Table also shows them as Healthy. Nothing appears under 'Recovering / Under-Replicated / Unavailable'.
03-01-2019
07:21 AM
Thanks again, this seems to have fixed the issue. I am going through most of the tables, as it looks like a few of them are still trying to talk to the old UUID.
03-01-2019
12:45 AM
You have mentioned that NTP is not related to the problem. Let's consider this scenario:

1. An Impala daemon is working with the READ_AT_SNAPSHOT setting enabled. It performs a read operation in Kudu and sets the read timestamp T1 immediately after the preceding write operation.
2. Kudu dispatches the read request to some replica R1. R1 is running on a machine with poorly configured NTP, so the local time on that machine is 1 minute behind.
3. R1 waits for the timeout specified by '--safe_time_max_lag_ms': 30 seconds. After the timeout, its local time is still (at best) 30 seconds behind T1.

Does this lead to the problem under discussion: 'Tablet is lagging too much to be able to serve snapshot scan'?
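To make the arithmetic in this scenario concrete, here is a small sketch of the timeline as described above. It assumes, as in the scenario, that no new writes arrive to advance R1's safe time, so safe time moves only with R1's local clock; all numbers are hypothetical:

# Timeline sketch of the scenario above (hypothetical numbers).
SAFE_TIME_MAX_LAG_S = 30          # --safe_time_max_lag_ms = 30000
CLOCK_SKEW_S = 60                 # R1's clock is 1 minute behind true time

t1 = 1_000_000.0                  # snapshot timestamp chosen right after the write (true time)
r1_safe_time = t1 - CLOCK_SKEW_S  # R1's notion of "now" when the scan arrives

# R1 waits out the configured lag timeout; its clock advances normally meanwhile.
r1_safe_time_after_wait = r1_safe_time + SAFE_TIME_MAX_LAG_S

remaining_lag = t1 - r1_safe_time_after_wait
print(f"safe time still lags T1 by {remaining_lag:.0f} s")   # -> 30 s

# Since the lag is still positive after the wait, R1 cannot serve the
# snapshot at T1, which matches the 'lagging too much' error in question.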
01-09-2019
07:39 PM
1 Kudo
In terms of resources, five masters wouldn't strain the cluster much. The big change is that the state that lives on the master (e.g. catalog metadata) would need to be replicated with a replication factor of 5 in mind (i.e. at least 3 copies to be considered "written"). While this is possible, the recommended configuration is 3 masters: it is the most well-tested and commonly used.
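As a quick check of the quorum arithmetic behind that statement (a Raft-style majority is assumed; the function is just illustrative):

# Majority quorum size for a replicated configuration.
def majority(replication_factor: int) -> int:
    return replication_factor // 2 + 1

for rf in (3, 5):
    print(f"replication factor {rf}: {majority(rf)} copies needed to ack a write")
# replication factor 3: 2 copies needed to ack a write
# replication factor 5: 3 copies needed to ack a write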
12-10-2018
10:35 AM
This has come up a few times on mailing lists and on the Apache Kudu slack, so I'll post here too; it's worth noting that if you want a single-partition table, you can omit the PARTITION BY clause entirely. See the "Note" here: https://www.cloudera.com/documentation/enterprise/latest/topics/kudu_impala.html#concept_g51_5vk_ft
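For illustration, a minimal sketch of that single-partition case from Python via impyla (host and table/column names are hypothetical); note that the PARTITION BY clause is simply left out:

from impala.dbapi import connect

conn = connect(host='impala-daemon.example.com', port=21050)  # hypothetical Impala daemon
cur = conn.cursor()

# No PARTITION BY clause: the whole table ends up as a single partition,
# which is generally only advisable for small tables.
cur.execute("""
    CREATE TABLE single_partition_table (
      id BIGINT,
      val STRING,
      PRIMARY KEY (id)
    )
    STORED AS KUDU
""")
cur.close()
conn.close()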
06-22-2018
11:39 AM
That error generally means that another Kudu node may already be running out of that directory. Do you perhaps have both a Kudu master and a tablet server running on that machine? If so, you'll need to change your configurations so their directories don't overlap.
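For example, the two daemons' flag files might point at disjoint directories, along these lines (the paths are hypothetical; --fs_wal_dir and --fs_data_dirs are the relevant flags):

# kudu-master flag file (hypothetical paths)
--fs_wal_dir=/data/kudu/master/wal
--fs_data_dirs=/data/kudu/master/data

# kudu-tserver flag file (hypothetical paths)
--fs_wal_dir=/data/kudu/tserver/wal
--fs_data_dirs=/data/kudu/tserver/data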
04-05-2018
10:23 AM
1 Kudo
That depends on your workload and how much storage you need per node. It's common to see anywhere from 6 to 12 disks per tablet server. Check out the limitations documentation for some guidance there.