Member since: 09-13-2017
Posts: 20
Kudos Received: 0
Solutions: 0
01-03-2018
05:23 PM
@dalves, thanks for your quick reply.
01-02-2018
05:49 PM
Can I summarize it as follows: NTP is used so that MVCC and READ_AT_SNAPSHOT scans are accurate. If max_clock_sync_error_usec is large, the deviation in scan results will be larger too, and vice versa?
01-02-2018
05:09 PM
Hi adar, I think I already got the answer from your post. Maybe the following explanation clarifies my original question. When I insert data into Kudu, the write only has to reach a majority of the replicas' write-ahead logs (WALs). Internal flushing and/or compaction for each tablet then generates a set of CFiles that make up the replica. A scan only needs to read the replica (a set of CFiles containing the base data and the delta data) and the MemRowSet to return the query result. Is this right? And for tablet copying, is only the WAL copied, or are both the WAL and the replica copied? Best regards, Tony
01-01-2018
06:53 PM
Hi Awong,
"Right, Kudu replicates data logically to multiple tservers based on each table's replication factor (typically 3), and in doing so, writes are only considered successful once durably written to a majority's write-ahead logs. From then on, each tserver can maintain the data via flushing and compactions, "decoupled" from the writes to the log."
After flushing and compaction on the tservers, each tablet will have 2 physical replicas. And a subsequent CLOSEST_REPLICA scan doesn't have to compact the WAL, is this right? Best regards, Tony
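The majority rule quoted above can be sketched as simple arithmetic. This is a minimal illustration, not Kudu code: the function names `majority` and `write_succeeds` are made up for this example, and the only fact taken from the thread is that a write counts as successful once a majority of replicas have it in their write-ahead logs.

```python
def majority(replication_factor):
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def write_succeeds(wal_acks, replication_factor=3):
    """True once enough replicas report the write as durable in their WAL.

    Hypothetical helper for illustration; flushing and compaction happen
    later on each replica and play no part in this decision.
    """
    return wal_acks >= majority(replication_factor)


# With the typical replication factor of 3, two WAL acknowledgements suffice.
print(majority(3))
print(write_succeeds(wal_acks=2))
print(write_succeeds(wal_acks=1))
```

With a replication factor of 3 the write is acknowledged after 2 replicas have logged it, so the third replica may briefly be behind.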
12-14-2017
09:55 PM
Hi, Kudu now uses Raft for consensus, so why does it still need NTP (as far as I know, Raft itself doesn't need NTP)? What is NTP responsible for in Kudu? Is it used to ensure scan consistency? Our tservers and masters keep crashing because NTP is unsynchronized, so I have changed max_clock_sync_error_usec to 30000000. Will this affect the cluster? I also saw the commit for KUDU-1578, which says:
"In the case that the clock is out of sync for a significantly long time, the max error will grow large enough to eclipse the 10-second default, at which point it will still crash as before. But, if NTP is properly restored within a few minutes, the server should remain operational."
What does "max error" mean? Best regards, Tony
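As a rough illustration of the commit message, the arithmetic below estimates how long an unsynchronized clock could run before its error bound passes the configured limit. It is a sketch under stated assumptions, not Kudu code: the 500 ppm growth rate (500 microseconds of added uncertainty per second) is an assumed worst-case drift, and `seconds_until_exceeded` is a hypothetical helper. Only the 10-second default and the 30000000 value come from this post.

```python
# Assumed worst-case drift while unsynchronized: 500 ppm, i.e. the error
# bound grows by 500 microseconds for every second without NTP sync.
ASSUMED_DRIFT_USEC_PER_SEC = 500


def seconds_until_exceeded(max_clock_sync_error_usec,
                           initial_error_usec=0,
                           drift_usec_per_sec=ASSUMED_DRIFT_USEC_PER_SEC):
    """Seconds an unsynchronized clock can run before its error bound
    grows past the limit (hypothetical helper for illustration)."""
    remaining = max_clock_sync_error_usec - initial_error_usec
    if remaining <= 0:
        return 0.0
    return remaining / drift_usec_per_sec


default_limit = 10_000_000  # the 10-second default named in the commit message
raised_limit = 30_000_000   # the value set in this post

print(seconds_until_exceeded(default_limit))
print(seconds_until_exceeded(raised_limit))
```

Under the assumed rate, raising the limit does not remove the crash condition; it only lengthens the window in which NTP can recover.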
Labels:
- Apache Kudu
12-14-2017
05:20 PM
@awong, thanks for your quick reply.
12-13-2017
05:27 PM
Hi awong,
"READ_LATEST doesn't guarantee consistency because when a scan gets sent to a replica (not necessarily the leader), that replica will respond with the latest data it has available (rather than at a specific timestamp). If that replica is being caught up or is behind in terms of replication for some reason, this will be a stale result."
1. Do you mean that Impala 2.10 always chooses ReplicaSelection.CLOSEST_REPLICA when building the scanner? I ask because I only use Impala to insert and select.
2. If the scanner is built with ReplicaSelection.LEADER_ONLY and READ_LATEST, is a leadership change the only thing that can cause scan inconsistency?
Best regards, Tony
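The stale-read behaviour described in the quote can be modelled with a toy example. Everything here is hypothetical (the `Replica` class and its methods are not the Kudu client API), and the model simplifies one point: it raises an error when a replica has not yet reached the snapshot timestamp, where a real system would more likely wait.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica holding the (timestamp, value) writes it has applied."""
    name: str
    writes: list = field(default_factory=list)

    def applied_up_to(self):
        # Highest timestamp this replica has applied so far.
        return max((ts for ts, _ in self.writes), default=0)

    def read_latest(self):
        # Returns whatever this replica has, with no timestamp guarantee.
        return [value for _, value in self.writes]

    def read_at_snapshot(self, snapshot_ts):
        # Toy simplification: fail instead of waiting for the replica to catch up.
        if self.applied_up_to() < snapshot_ts:
            raise RuntimeError(f"{self.name} has not reached timestamp {snapshot_ts}")
        return [value for ts, value in self.writes if ts <= snapshot_ts]


leader = Replica("leader", [(1, "a"), (2, "b"), (3, "c")])
lagging = Replica("lagging", [(1, "a"), (2, "b")])  # one write behind

# "Latest" reads disagree depending on which replica answers.
latest_counts = (len(leader.read_latest()), len(lagging.read_latest()))
# Snapshot reads at a timestamp both replicas have reached agree.
snapshot_rows = (leader.read_at_snapshot(2), lagging.read_at_snapshot(2))

print(latest_counts)
print(snapshot_rows)
```

In this model a count taken through the lagging replica right after a large insert comes up short, while a second attempt after it catches up would match the leader.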
12-01-2017
12:54 AM
When I use Impala to transfer a massive amount of data (about 100G) in one go and then immediately run select count(1), I get the wrong number. When I execute the same SQL again, the total count is correct. Besides a leader change, are there any other internal operations that can cause scan inconsistency? If I change the Impala setting from kudu_read_mode: READ_LATEST to kudu_read_mode: READ_AT_SNAPSHOT, what timestamp will Impala transmit? Would READ_AT_SNAPSHOT resolve the issue? I am using Impala 2.10.0 + Kudu 1.5.0.
Labels:
- Apache Impala
- Apache Kudu