<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Low Performance with Kudu and potential network error in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Low-Performance-with-Kudu-and-potential-network-error/m-p/69898#M79803</link>
    <description>&lt;P&gt;Sounds like good news. Thanks for the update!&lt;/P&gt;</description>
    <pubDate>Wed, 11 Jul 2018 21:54:09 GMT</pubDate>
    <dc:creator>mpercy</dc:creator>
    <dc:date>2018-07-11T21:54:09Z</dc:date>
    <item>
      <title>Low Performance with Kudu and potential network error</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Low-Performance-with-Kudu-and-potential-network-error/m-p/69319#M79800</link>
      <description>&lt;P&gt;I&amp;nbsp;recently built a real-time PoC using Kafka, Spark2 and Kudu, and I'm&amp;nbsp;facing severe performance problems with Kudu.&lt;/P&gt;&lt;P&gt;To sum up the case, some devices send short messages to a Kafka topic; these are collected by a Spark2 Streaming job (5-second batches) and persisted to Kudu through upsert operations.&lt;BR /&gt;Volume is up to 600,000 messages per 5 seconds, at 40 bytes per message. My&amp;nbsp;Kudu cluster has 6 Tablet Servers (Hard Memory = 20GB per Kudu Tablet Server), each with 3 SSD disks, plus 3 Masters; the table is defined with hash partitioning into 100 buckets.&lt;BR /&gt;Kudu is deployed with Cloudera CDH Express Edition 5.14 in Azure for the PoC, and all the nodes are in the same vnet.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When I&amp;nbsp;start the Spark2 job (execution time remains between 1 and 3 seconds), everything seems to go fine for the first 24 hours. After that, a bottleneck appears when upserting rows into Kudu, and the delay varies between 5 and more than 600 seconds.&lt;/P&gt;&lt;P&gt;Analyzing the Kudu TS logs, I&amp;nbsp;can see warning and error messages I&amp;nbsp;can't interpret.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For instance:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0622 08:21:50.793898 2051 negotiation.cc:313] Failed RPC negotiation. 
Trace:&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;0622 08:21:50.793636 (+ 0us) reactor.cc:579] Submitting negotiation task for server connection from 10.164.3.101:45662&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;0622 08:21:50.793817 (+ 181us) server_negotiation.cc:176] Beginning negotiation&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;0622 08:21:50.793819 (+ 2us) server_negotiation.cc:365] Waiting for connection header&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;0622 08:21:50.793862 (+ 43us) negotiation.cc:304] Negotiation complete: Network error: Server connection negotiation failed: server connection from xxx.xxx.xxx.xxx:45662: BlockingRecv error: Recv() got EOF from remote (error 108)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;Metrics: {"server-negotiator.queue_time_us":151,"thread_start_us":86,"threads_started":1}&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This warning appears several times on several TS, but I&amp;nbsp;didn't notice any trouble with our network.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT face="andale mono,times" size="2"&gt;E0622 11:24:43.982810 28825 hybrid_clock.cc:395] Unable to read clock for last 0.774s: Service unavailable: Error reading clock. 
Clock considered unsynchronized&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif" size="3"&gt;This error began&amp;nbsp;after 5 days of running Kudu day and night.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0622 11:24:47.979626 66209 connection.cc:657] server connection from xxx.xxx.xxx.xxx:50238 send error: Network error: failed to write to TLS socket: Connection reset by peer&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0622 11:24:47.979672 66209 connection.cc:422] Connection torn down before Call kudu.tserver.TabletServerService.Write from &lt;SPAN&gt;xxx&lt;/SPAN&gt;&lt;SPAN&gt;.xxx.xxx.xxx&lt;/SPAN&gt;:50238 (ReqId={client: afb1a862dcca4b3494e27a011fe31d63, seq_no=4637933, attempt_no=1}) could send its response&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0622 11:24:47.979729 66209 connection.cc:188] Error closing socket: Network error: TlsSocket::Close: Success&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0622 11:24:47.981634 66209 connection.cc:511] server connection from &lt;SPAN&gt;xxx&lt;/SPAN&gt;&lt;SPAN&gt;.xxx.xxx.xxx&lt;/SPAN&gt;:33178 recv error: Network error: failed to read from TLS socket: Connection reset by peer (error 104)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0622 11:24:47.981720 66209 connection.cc:188] Error closing socket: Network error: TlsSocket::Close: Broken pipe&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0622 11:24:48.057577 66210 connection.cc:657] server connection from &lt;SPAN&gt;xxx&lt;/SPAN&gt;&lt;SPAN&gt;.xxx.xxx.xxx&lt;/SPAN&gt;:41744 send error: Network error: failed to write to TLS socket: Broken pipe&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0622 11:24:48.057612 66210 connection.cc:422] Connection torn down before 
Call kudu.tserver.TabletServerService.Write from &lt;SPAN&gt;xxx&lt;/SPAN&gt;&lt;SPAN&gt;.xxx.xxx.xxx&lt;/SPAN&gt;:41744 (ReqId={client: 47ee414afec74fcb86f86a19b62d8f7d, seq_no=4631450, attempt_no=1}) could send its response&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0622 11:24:48.057677 66210 connection.cc:188] Error closing socket: Network error: TlsSocket::Close: Success&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0622 11:24:48.057819 66210 connection.cc:422] Connection torn down before Call kudu.tserver.TabletServerService.Write from &lt;SPAN&gt;xxx&lt;/SPAN&gt;&lt;SPAN&gt;.xxx.xxx.xxx&lt;/SPAN&gt;:41744 (ReqId={client: 47ee414afec74fcb86f86a19b62d8f7d, seq_no=4631470, attempt_no=1}) could send its response&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif" size="3"&gt;In this case, I&amp;nbsp;see this warning on several TS, but I can't identify any problem with my&amp;nbsp;network (an independent vnet in Azure).&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0621 14:34:19.201867 125277 env_posix.cc:862] Time spent sync call for /data/kudu/0/data/data/008809b1a384414b93b9d95b6275217a.metadata: real 2.234s user 0.000s sys 0.000s&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0621 14:34:32.088975 125277 env_posix.cc:862] Time spent sync call for /data/kudu/0/data/data/56153c373b084500b3b1fd9310c30c53.data: real 1.786s user 0.000s sys 0.000s&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0621 14:37:23.982235 125277 env_posix.cc:862] Time spent sync call for /data/kudu/0/data/data/df02bb85c15c4652ae653d68e7b8d0ae.metadata: real 7.751s user 0.000s sys 0.000s&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;W0621 14:37:45.769942 125275 kernel_stack_watchdog.cc:191] Thread 118579 stuck at 
/data/jenkins/workspace/generic-package-centos64-7-0-impala/topdir/BUILD/kudu-1.6.0-cdh5.14.0/src/kudu/consensus/log.cc:664 for 557ms:&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;Kernel stack:&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffffc03eecc5&amp;gt;] do_get_write_access+0x285/0x4c0 [jbd2]&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffffc03eef27&amp;gt;] jbd2_journal_get_write_access+0x27/0x40 [jbd2]&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffffc0442d4b&amp;gt;] __ext4_journal_get_write_access+0x3b/0x80 [ext4]&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffffc04144a0&amp;gt;] ext4_reserve_inode_write+0x70/0xa0 [ext4]&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffffc0414523&amp;gt;] ext4_mark_inode_dirty+0x53/0x210 [ext4]&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffffc0417af0&amp;gt;] ext4_dirty_inode+0x40/0x60 [ext4]&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffff8122ffba&amp;gt;] __mark_inode_dirty+0x16a/0x270&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffff8121ee01&amp;gt;] update_time+0x81/0xd0&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffff8121eef0&amp;gt;] file_update_time+0xa0/0xf0&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffff811865a8&amp;gt;] __generic_file_aio_write+0x198/0x400&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffff81186869&amp;gt;] generic_file_aio_write+0x59/0xa0&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffffc040aeeb&amp;gt;] ext4_file_write+0xdb/0x470 [ext4]&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" 
size="2"&gt;[&amp;lt;ffffffff812020e9&amp;gt;] do_sync_readv_writev+0x79/0xd0&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffff81203cce&amp;gt;] do_readv_writev+0xce/0x260&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffff81203ef5&amp;gt;] vfs_writev+0x35/0x60&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffff812042f2&amp;gt;] SyS_pwritev+0xc2/0xf0&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffff816b89fd&amp;gt;] system_call_fastpath+0x16/0x1b&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;[&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="andale mono,times" size="2"&gt;User stack:&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="andale mono,times" size="2"&gt;(thread did not respond: maybe it is blocking signals)&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In this case, I&amp;nbsp;just can't understand what the message means.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm sure something is wrong with my global configuration (Kudu or Network), but I can't understand what.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help would be great!&lt;/P&gt;&lt;P&gt;Thanks&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 13:22:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Low-Performance-with-Kudu-and-potential-network-error/m-p/69319#M79800</guid>
      <dc:creator>ChelouteMS</dc:creator>
      <dc:date>2022-09-16T13:22:33Z</dc:date>
    </item>
    <item>
      <title>Re: Low Performance with Kudu and potential network error</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Low-Performance-with-Kudu-and-potential-network-error/m-p/69370#M79801</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The warning you saw from the watchdog looks like a kernel bug&amp;nbsp;&lt;SPAN&gt;on EL 6 machines using EXT4; fixing it requires either&amp;nbsp;upgrading to RHEL7 or using XFS instead of EXT4.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Also, would you mind sharing the master log from when you see the&amp;nbsp;slowness? From the symptoms you described, one possibility could be KUDU-2264. Thanks!&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Best,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Hao&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 25 Jun 2018 22:07:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Low-Performance-with-Kudu-and-potential-network-error/m-p/69370#M79801</guid>
      <dc:creator>Hao Hao</dc:creator>
      <dc:date>2018-06-25T22:07:50Z</dc:date>
    </item>
    <item>
      <title>Re: Low Performance with Kudu and potential network error</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Low-Performance-with-Kudu-and-potential-network-error/m-p/69873#M79802</link>
      <description>Hi, well... my fault... The problem wasn't Kudu but the Azure ELB in front of the whole platform: traffic was being blocked by a DDoS policy... Everything else became unstable once traffic started to be blocked.</description>
      <pubDate>Wed, 11 Jul 2018 16:32:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Low-Performance-with-Kudu-and-potential-network-error/m-p/69873#M79802</guid>
      <dc:creator>ChelouteMS</dc:creator>
      <dc:date>2018-07-11T16:32:26Z</dc:date>
    </item>
    <item>
      <title>Re: Low Performance with Kudu and potential network error</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Low-Performance-with-Kudu-and-potential-network-error/m-p/69898#M79803</link>
      <description>&lt;P&gt;Sounds like good news. Thanks for the update!&lt;/P&gt;</description>
      <pubDate>Wed, 11 Jul 2018 21:54:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Low-Performance-with-Kudu-and-potential-network-error/m-p/69898#M79803</guid>
      <dc:creator>mpercy</dc:creator>
      <dc:date>2018-07-11T21:54:09Z</dc:date>
    </item>
  </channel>
</rss>

