Member since: 11-12-2013
Posts: 41
Kudos Received: 11
Solutions: 7
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4464 | 10-18-2019 02:11 PM |
| | 4753 | 07-10-2019 03:20 PM |
| | 3465 | 03-24-2019 02:52 PM |
| | 5330 | 03-20-2019 09:01 AM |
| | 1705 | 12-13-2018 05:06 PM |
03-23-2022
08:28 AM
You can also try the statement below:

CREATE TABLE my_first_table (
  id BIGINT,
  name STRING,
  PRIMARY KEY(id)
)
PARTITION BY HASH PARTITIONS 16
STORED AS KUDU
TBLPROPERTIES (
  'kudu.master_addresses' = '<master1>[:port],<master2>[:port],<master3>[:port]'
);
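If it helps, here is a minimal sketch of exercising the new table from impala-shell; the host name, the SQL file name, and the test values are placeholders, not part of the original answer:

```
# Create the table from a file containing the DDL above, then write and read back a test row.
# IMPALAD_HOST and create_my_first_table.sql are hypothetical placeholders.
impala-shell -i IMPALAD_HOST -f create_my_first_table.sql
impala-shell -i IMPALAD_HOST -q "INSERT INTO my_first_table VALUES (1, 'test')"
impala-shell -i IMPALAD_HOST -q "SELECT * FROM my_first_table"
```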
08-25-2021
01:07 AM
Hi adar: if both the WAL segments and the CFiles are copied during a tablet copy, then the follower tablet will also flush WAL data to disk once it grows to 8 MB. In my opinion there is no difference between the leader tablet and the follower tablet during reading and writing. Is that right?
09-21-2020
08:44 AM
I have tested the backup/restore solution and it seems to work like a charm with Spark:

- First, check and record the table names as listed by the Kudu master (or the elected leader master in a multi-master setup): http://Master1:8051/tables
- Download the kudu-backupX.X.jar if you can't find it in /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/ and put it there.
- In kuduMasterAddresses, put the name of your Kudu master, or the names of your three masters separated by ','.
- Backup: sudo -u hdfs spark2-submit --class org.apache.kudu.backup.KuduBackup /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/kudu-backup2_2.11-1.13.0.jar --kuduMasterAddresses MASTER1(,MASTER2,..) --rootPath hdfs:///PATH_HDFS impala::DB.TABLE
- Copy: sudo -u hdfs hadoop distcp -i hdfs:///PATH_HDFS/DB.TABLE hdfs://XXX:8020/kudu_backups/
- Restore: sudo -u hdfs spark2-submit --class org.apache.kudu.backup.KuduRestore /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/kudu-backup2_2.11-1.13.0.jar --kuduMasterAddresses MASTER1(,MASTER2,..) --rootPath hdfs:///PATH_HDFS impala::DB.TABLE
- Finally, run INVALIDATE METADATA so Impala picks up the restored table (see the sketch after this list).
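For the last step, a minimal sketch of refreshing Impala's metadata from the command line; the impalad host and DB.TABLE are placeholders carried over from the steps above:

```
# Make the restored table visible to Impala (IMPALAD_HOST and DB.TABLE are placeholders).
impala-shell -i IMPALAD_HOST -q "INVALIDATE METADATA DB.TABLE"
```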
07-17-2019
09:28 AM
Kudu is often bottlenecked by the speed at which it can flush data to disk. This usually corresponds to the number of data directories (and to maintenance_manager_num_threads). So certainly the more disks (and thus disk bandwidth) that Kudu has access to, the faster it can ingest data. If you reduce the number of partitions, you'll generally be reducing the overall ingest speed because you're reducing write parallelism. If your goal is to reduce ingest speed, then by all means explore reducing the number of partitions.
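For reference, a minimal sketch of the tablet server settings this refers to; the directory paths and thread count below are illustrative, not recommendations:

```
# kudu-tserver gflagfile entries (illustrative values)
--fs_data_dirs=/data/1/kudu,/data/2/kudu,/data/3/kudu   # one data directory per physical disk
--maintenance_manager_num_threads=3                     # background flush/compaction threads
```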
05-16-2019
09:52 PM
You can most certainly project more than one column at a time in an Impala query, be it from a table in Kudu or from HDFS. Based on your problem description, it almost sounds like a problem with your terminal, or with the impala-shell configuration. Have you looked at the impala-shell configuration options? Maybe something there can help solve the problem.
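As a quick sanity check, something along these lines can be run from impala-shell; the host, database, table, and column names are placeholders, and -B switches to plain delimited output, which sometimes avoids terminal wrapping problems with wide result sets:

```
# Project several columns at once in delimited output mode (names are placeholders).
impala-shell -i IMPALAD_HOST -B --output_delimiter=',' \
  -q "SELECT id, name, created_at FROM my_db.my_kudu_table LIMIT 10"
```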
03-24-2019
02:52 PM
No, the rebalancer doesn't fix leader skew. It may in a future release. Leaders can cluster onto one tserver when individual tservers are restarted; if you restart the entire cluster all at once, you might be able to redistribute leadership more evenly. You're right that if you're only using one host to initiate reads, the reads will go to the local tserver rather than round-robin across the cluster. The master doesn't directly tell clients where to scan; it just provides them with enough information to make that decision based on their replica selection policy. There's also no way to do round-robin (or randomized) replica selection.
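For context, invoking the replica rebalancer discussed here looks roughly like the sketch below (the master addresses are placeholders); as noted above, it balances replica placement across tablet servers, not leadership:

```
# Rebalance tablet replica placement across the cluster (addresses are placeholders).
kudu cluster rebalance master-1:7051,master-2:7051,master-3:7051
```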
03-20-2019
09:01 AM
1 Kudo
Indeed, there's going to be a significant amount of memory consumed just as overhead to support that number of tablets. So you should either reduce the number of tablets per tserver, or increase the amount of RAM available to Kudu on those heavily-loaded machines.
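If you take the second route, the relevant tablet server flag looks roughly like this; the value is illustrative (16 GiB), not a recommendation:

```
# kudu-tserver gflagfile entry: raise the hard memory limit (illustrative value: 16 GiB).
--memory_limit_hard_bytes=17179869184
```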
12-13-2018
05:06 PM
1 Kudo
Yes, unfortunately this was an oversight, and it will be corrected in a future release. For now you can either do the workaround you suggested (manually underallocating for the others, then manually configuring Kudu's memory limit), or, if you want to roll up your sleeves a bit (a rough command sketch follows the steps):

1. ssh to the CM server machine.
2. For each file named KUDU*.jar (in /usr/share/cmf/common_jars, I believe), do steps 3-5:
3. Extract the JAR file using the 'jar' utility.
4. Modify descriptor/service.sdl by finding the memory_limit_hard_bytes entry, adding a comma after "default" : 4294967296, and adding a new line with the contents "autoConfigShare" : 100.
5. Recreate the JAR around the extracted files (including the modified service.sdl).
6. Overwrite the existing KUDU*.jar files with the new ones you created in step #5.
7. Restart the CM server.

You should now see an entry in the Static Service Pools UI for Kudu's Tablet Server Hard Memory Limit.
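As a rough sketch of steps 3-5, assuming a single jar for illustration (the exact KUDU*.jar file name is hypothetical), and using 'jar uf' to update the archive in place rather than fully recreating it:

```
cd /usr/share/cmf/common_jars
# Pull the service descriptor out of one of the KUDU jars (file name is illustrative).
jar xf KUDU-5.14.0.jar descriptor/service.sdl
# Edit descriptor/service.sdl: after "default" : 4294967296, add a line "autoConfigShare" : 100
# Put the modified descriptor back into the JAR.
jar uf KUDU-5.14.0.jar descriptor/service.sdl
```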
08-03-2018
12:05 PM
2 Kudos
This error indicates that a Kudu tablet server has closed a scanner because it hadn't been accessed in some time. It could manifest on a particularly slow query that very infrequently fetches new data from Kudu. Can you share the Impala query profile for one of the failed queries, as well as the Impala daemon coordinator log? You can also work around the issue by reconfiguring Kudu's --scanner_ttl_ms flag to a much higher value (the default is 60s), though this will come at the potential cost of Kudu memory if any clients are orphaning their scanners.
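For reference, the workaround amounts to a tablet server flag change along these lines; the value below (5 minutes) is illustrative:

```
# kudu-tserver gflagfile entry: keep idle scanners alive for 5 minutes instead of the 60-second default.
--scanner_ttl_ms=300000
```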