About Alexey1c

Alexey1c · ‎05-26-2021

Hi!

Those warning messages about RPC requests dropped due to backpressure are a sign that the particular tablet server is likely overloaded. Consider the following remedies:

1. Upgrade to a recent version of Kudu (1.14 as of now). Since Kudu 1.9.0 there have been many fixes that might help reduce memory pressure for write-intensive workloads (e.g. see KUDU-2727, KUDU-2929) and read-only workloads (KUDU-2836), along with a bunch of other improvements. BTW, if you are using CDH, upgrading to CDH6.3.4 is a good first step in that direction: it contains back-ported fixes for KUDU-2727, KUDU-2929, and KUDU-2836.
2. Make sure the tablet replica distribution is even across tablet servers: run the 'kudu cluster rebalance' CLI tool.
3. If you suspect replica hot-spotting, consider re-creating the table in question to fan out the write stream across multiple tablets. Reading this guide might be useful: https://kudu.apache.org/docs/schema_design.html
4. If none of the above helps, consider adding a few more tablet server nodes to your cluster. Once the new nodes are added, don't forget to run the 'kudu cluster rebalance' CLI tool.

Kind regards,
Alexey
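To illustrate the hot-spotting point in the post above, here is a toy Python sketch (not Kudu code) comparing how a burst of monotonically increasing keys lands on tablets under range partitioning versus hash partitioning. The split points, key values, and the use of CRC32 as the hash are all assumptions made for the illustration; Kudu's real partitioning uses its own hash function and is configured in the table schema.

```python
import bisect
import zlib
from collections import Counter

NUM_TABLETS = 4  # hypothetical number of tablets for this toy example


def range_tablet(key, split_points):
    """Range partitioning: pick the tablet by where the key falls between split points."""
    return bisect.bisect_right(split_points, key)


def hash_tablet(key, num_buckets=NUM_TABLETS):
    """Hash partitioning: pick the tablet by a hash of the key.

    CRC32 is only a stand-in here; it is NOT the hash function Kudu uses.
    """
    return zlib.crc32(str(key).encode()) % num_buckets


# A burst of monotonically increasing keys (e.g. timestamps of incoming rows).
recent_keys = range(3_000_000, 3_001_000)
split_points = [1_000_000, 2_000_000, 3_000_000]  # made-up range boundaries

range_load = Counter(range_tablet(k, split_points) for k in recent_keys)
hash_load = Counter(hash_tablet(k) for k in recent_keys)

print("writes per tablet, range partitioning:", dict(range_load))
print("writes per tablet, hash partitioning: ", dict(hash_load))
```

With range partitioning every key in the burst falls into the last range, so a single tablet takes all the writes; with hash partitioning the same keys are spread over several tablets.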

Alexey1c · ‎09-30-2019

Hi, I think you will need Impala to make Superset work with Kudu. At http://superset.apache.org/#databases it's mentioned that a database engine needs '... proper DB-API driver and SQLAlchemy dialect ...' to be usable by Superset. I guess the '...proper DB-API driver ...' is based on JDBC, and there isn't a JDBC driver for Kudu as of now. As far as I know, there isn't a native Superset Kudu connector either. However, contributions are always welcome! Kind regards, Alexey

Alexey1c · ‎08-22-2019

Hi, Kudu requires the machine clocks of master and tablet server nodes to be synchronized using NTP: https://kudu.apache.org/docs/troubleshooting.html#ntp Kudu is tested with ntpd, but I guess chronyd might work as well. Whether using ntpd or chronyd, it's necessary to make sure the machine's clock is synchronized so that the ntp_adjtime() Linux system call doesn't return an error (see http://man7.org/linux/man-pages/man2/adjtimex.2.html for more technical details). It's not enough just to have ntpd (or chronyd) running: the clock itself must be synchronized. I would verify that the NTP daemon is properly configured and tracks the clocks of the reference servers. For instructions on checking the sync status of the machine's clock, see https://kudu.apache.org/docs/troubleshooting.html#ntp if using ntpd or https://docs.fedoraproject.org/en-US/Fedora/18/html/System_Administrators_Guide/sect-Checking_if_chrony_is_synchronized.html for chronyd. Hope this helps, Alexey
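As a rough illustration of the "running is not the same as synchronized" point, the following Python sketch interprets the two values that ntp_adjtime()/adjtimex() report: the return code and the status bit mask. The constants come from the Linux <sys/timex.h> header; the function name is hypothetical, it assumes the caller has already obtained the two values from the system call, and it is not a copy of the check Kudu performs internally.

```python
# Constants as defined in <sys/timex.h> on Linux.
TIME_ERROR = 5        # return code: clock is not synchronized
STA_UNSYNC = 0x0040   # status bit: clock is unsynchronized


def clock_is_synchronized(return_code: int, status: int) -> bool:
    """Interpret the result of ntp_adjtime()/adjtimex().

    return_code: value returned by the system call (-1 means the call failed).
    status:      the 'status' field of the timex structure filled in by the call.
    """
    if return_code < 0:
        return False  # the system call itself failed
    if return_code == TIME_ERROR:
        return False  # kernel reports an unsynchronized clock
    return not (status & STA_UNSYNC)


# An NTP daemon can be running while the kernel still reports
# an unsynchronized clock:
print(clock_is_synchronized(TIME_ERROR, STA_UNSYNC))  # False
print(clock_is_synchronized(0, 0))                    # True
```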

Alexey1c · ‎08-22-2019

Whoops, the correct link to the WIP patch for PySpark integration work is http://gerrit.cloudera.org:8080/13088

Alexey1c · ‎08-22-2019

Hi, I'm not sure there is full-fledged documentation on the Kudu PySpark API: the connector is still in an early development phase, if I'm not mistaken. However, the following in-flight patch has a few examples that might be helpful: https://gerrit.cloudera.org/#/c/13102/2/docs/developing.adoc But it doesn't answer your question about KuduContext: I'm not sure that functionality is implemented at this point. There was a WIP patch posted some time ago: https://gerrit.cloudera.org/#/c/13086/ However, I don't know what the status of that work is at this point, unfortunately.

Alexey1c · ‎05-30-2019

Hi, I don't know much about Kudu+PySpark except that there is a lot of room for improvement there, but maybe a couple of examples in the following patch-in-flight could be useful: https://gerrit.cloudera.org/#/c/13102/

Alexey1c · ‎10-17-2018

Oh, sorry -- it seems you are at 5.13.0 and that flag is not available in that version yet (but it's present starting 5.14.0). I'm afraid you need either to introduce custom mappings for those kudu service principals (so they would be mapped into 'kudu') or upgrade to 5.14 or higher to get access to that flag. Setting superuser ACL to '*' would not allow tablet servers to register with masters anyway because of the following: https://github.com/apache/kudu/blob/master/src/kudu/master/master_service.cc#L122

Alexey1c · ‎10-17-2018

Hi Christophe, It seems that in your case the kudu service principals (like 'kudu/XXX1119.krj.gie@REALM') are not mapped into 'kudu' as expected, but into the names of local users (like 'm-zhdp-s-hwefzjneur'). If I'm not mistaken, that's exactly https://issues.apache.org/jira/browse/KUDU-2198. As a workaround, I suggest adding --use_system_auth_to_local=false to the Kudu flags (both masters and tservers). If using CM, add that flag into the 'Kudu Service Advanced Configuration Snippet (Safety Valve) for gflagfile'. Hope this helps. Regards, Alexey
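For readers unfamiliar with principal mapping, this small Python sketch shows the kind of result the post describes as expected: a service principal of the form 'primary/instance@REALM' reduced to its first component. The function name is hypothetical and the logic is deliberately simplified; real Kerberos auth_to_local rules are considerably richer, and this is not Kudu's implementation.

```python
def short_name(principal: str) -> str:
    """Reduce 'primary/instance@REALM' to 'primary' (simplified illustration)."""
    name, _, _realm = principal.partition("@")      # drop the realm
    primary, _, _instance = name.partition("/")     # drop the host/instance part
    return primary


# The principal from the post maps to the 'kudu' user under this scheme.
print(short_name("kudu/XXX1119.krj.gie@REALM"))
```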

Alexey1c · ‎10-01-2018

The 'ps' sample output from one of your servers looks fine. Just another question: I assume the 'superuser_acl' property in your CM configuration (that's blurred out) contains 'kudu' (or whatever you have for the Kudu service principal), right? If not, add that into the list. Anyway, it's hard to say what's wrong looking at the configuration snippets and playing the 'guess what?' game. I would highly recommend following Will's advice on looking into the logs of master(s) and tablet servers for the error details. I think that will give you a firm starting point in troubleshooting the issue and save some time for everybody. Regards, Alexey

Alexey1c · ‎09-19-2018

Some additional points, just in case:

1. Using nscd (https://linux.die.net/man/8/nscd) might help with slow DNS resolution, but if /etc/hosts-based resolution isn't helping (as reported in one of the earlier posts), maybe it's worth verifying that those files are actually used (i.e. check nsswitch.conf, etc.).
2. Sometimes transparent hugepages support might be a drag: https://alexandrnikitin.github.io/blog/transparent-hugepages-measuring-the-performance-impact/ Maybe try disabling it on one of your servers (e.g., on a machine that runs kudu-tserver and as few other processes as possible) and collect some performance metrics, comparing THP enabled vs. disabled.
3. If the distribution of tablet replicas is not balanced, the tablet servers hosting a greater number of replicas might experience RPC queue overflows more often than others. In the master branch of the Kudu git repo a new rebalancer tool was introduced recently: https://github.com/apache/kudu/blob/master/docs/administration.adoc#running-the-tablet-rebalancing-tool You could try to build it from source and run it against your cluster with the --report_only option first to see whether it makes sense to rebalance your cluster. If the rebalancer's report shows a big imbalance in the tablet replica distribution, running the rebalancer tool might help.

Thanks,
Alexey
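The "report first, rebalance later" approach from the last point could be scripted roughly as follows. This is a minimal Python sketch that only builds the command line; the master addresses are made-up placeholders, the helper name is hypothetical, and it assumes the 'kudu' binary with the rebalancer tool is available on the PATH if you choose to actually execute the result.

```python
def rebalance_command(master_addresses, report_only=True):
    """Build the argv list for the 'kudu cluster rebalance' CLI tool.

    master_addresses: list of master RPC addresses (placeholders in this example).
    report_only:      if True, add --report_only so the tool only reports the
                      replica distribution without moving any replicas.
    """
    cmd = ["kudu", "cluster", "rebalance", ",".join(master_addresses)]
    if report_only:
        cmd.append("--report_only")
    return cmd


# Hypothetical master addresses; replace with the ones of your cluster.
masters = ["master-1.example.com:7051", "master-2.example.com:7051"]
print(" ".join(rebalance_command(masters)))

# To actually run it (not executed here):
#   import subprocess
#   subprocess.run(rebalance_command(masters), check=True)
```

Keeping report_only as the default means the script cannot start moving replicas by accident; the actual rebalancing has to be requested explicitly.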

Member Since	‎03-16-2017 02:07 PM
Last Visited	‎10-12-2023 05:20 PM
Posts	37
Kudos received	6

Cloudera Community

Re: Apache Superset with Apache Kudu

Re: PySpark - Kudu API/command reference

Re: KUDU Couldn't send request to peer Status: Re...

Re: Apache Superset with Apache Kudu

Re: KUDU time sync error

Re: PySpark - Kudu API/command reference

Re: PySpark - Kudu API/command reference

Re: Apache kudu

Re: Kudu and Kerberos

Re: Kudu and Kerberos

Re: Kudu and Kerberos

Re: Kudu backpressure and service queue is full