
Thrift Buffer Size Received from Impala


While investigating the performance of data retrieval from Impala, we have encountered behaviour that is unexpected (to us, at least).

 

We have a C++ application that connects to Impala’s HiveServer2 using Thrift directly. Since we know we will be retrieving a large number of rows (> 1 million) from Impala, we want to retrieve as much data as possible in each network round trip. To achieve this, we increase the Thrift buffer size from its default of 512 bytes.
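For reference, this is roughly how we raise the buffer size (a simplified sketch: the shared_ptr flavour depends on the Thrift version, any SASL/SSL wrapping required by the cluster is omitted, and the 1 MB value is just one of the sizes we tested):

#include <memory>
#include <string>
#include <thrift/transport/TSocket.h>
#include <thrift/transport/TBufferTransports.h>
#include <thrift/protocol/TBinaryProtocol.h>

using apache::thrift::transport::TSocket;
using apache::thrift::transport::TBufferedTransport;
using apache::thrift::transport::TTransport;
using apache::thrift::protocol::TBinaryProtocol;

// Open a buffered connection to HiveServer2 with a 1 MB read/write buffer
// instead of TBufferedTransport's 512-byte default. Older Thrift releases
// use boost::shared_ptr rather than std::shared_ptr.
std::shared_ptr<TBinaryProtocol> OpenHs2Protocol(const std::string& host, int port) {
  std::shared_ptr<TTransport> socket(new TSocket(host, port));
  // The second constructor argument is the buffer size in bytes.
  std::shared_ptr<TTransport> transport(new TBufferedTransport(socket, 1024 * 1024));
  std::shared_ptr<TBinaryProtocol> protocol(new TBinaryProtocol(transport));
  transport->open();
  return protocol;  // handed to the generated TCLIService client
}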

 

Using strace, we can see that the non-default buffer size is being requested, but as the requested size is increased, the number of bytes actually returned falls short of the request in a growing proportion of cases. To quantify this I ran simple tests with the following buffer sizes:

 

  • 512 (matching the default)
  • 1024
  • 8192
  • 1048576 (expected optimum for throughput)

For each buffer size I executed the same query and ran strace on the client process (using a script). Extracting only the recvfrom calls from each strace output file showed the following (histograms available here):

 

  • 512 bytes requested resulted in 98.55% of the responses matching the requested size
  • 1024 bytes requested resulted in 96.00% of the responses matching the requested size
  • 8192 bytes requested resulted in 55.73% of the responses matching the requested size
  • 1048576 bytes requested resulted in none of the responses returning the full 1M, with a broad range of sizes returned instead

My questions are:

 

  1. Is this expected behaviour?
  2. Are we missing some configuration option that would allow us to minimise network round trips when retrieving a significant amount of data? (A sketch of the kind of fetch loop we mean follows this list.)
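The sketch below shows the HiveServer2 fetch path we are referring to. It is a minimal illustration only: the class and field names assume a client generated from TCLIService.thrift (header and namespace names depend on which copy of the IDL is compiled), and the 100000-row batch size is illustrative rather than our actual setting.

// Generated from HiveServer2's TCLIService.thrift; the header and namespace
// names below assume the Hive layout of the IDL.
#include "TCLIService.h"

using namespace apache::hive::service::cli::thrift;

// Fetch the whole result set in large batches. Each FetchResults call is one
// client/server round trip; maxRows bounds the number of rows returned per
// trip (100000 here is illustrative).
void FetchAllRows(TCLIServiceClient& client, const TOperationHandle& op) {
  while (true) {
    TFetchResultsReq req;
    req.__set_operationHandle(op);
    req.__set_orientation(TFetchOrientation::FETCH_NEXT);
    req.__set_maxRows(100000);

    TFetchResultsResp resp;
    client.FetchResults(resp, req);

    // ... consume resp.results (a TRowSet) here ...

    // Stop when the server reports no more rows; some servers instead signal
    // completion by returning an empty row set.
    if (!(resp.__isset.hasMoreRows && resp.hasMoreRows)) break;
  }
}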

Comparing the query profiles for the 512-byte and 8192-byte requests reveals the scale of the performance benefit:

 

512 byte requests

 

 Query Timeline: 34s612ms
      - Start execution: 64.337us (64.337us)
      - Planning finished: 4.871ms (4.807ms)
      - Ready to start remote fragments: 155.137ms (150.266ms)
      - Remote fragments started: 156.321ms (1.184ms)
      - Rows available: 368.767ms (212.445ms)
      - First row fetched: 368.962ms (195.516us)
      - Unregister query: 34s610ms (34s241ms)
 ImpalaServer:
    - ClientFetchWaitTimer: 30s435ms
    - RowMaterializationTimer: 3s497ms

 

8192 byte requests

 

Query Timeline: 14s755ms
      - Start execution: 110.916us (110.916us)
      - Planning finished: 4.852ms (4.741ms)
      - Ready to start remote fragments: 154.449ms (149.597ms)
      - Remote fragments started: 156.79ms (1.629ms)
      - Rows available: 363.807ms (207.727ms)
      - First row fetched: 365.63ms (1.255ms)
      - Unregister query: 14s754ms (14s389ms)
 ImpalaServer:
    - ClientFetchWaitTimer: 10s603ms
    - RowMaterializationTimer: 3s495ms

 

Note that, as shown in the histograms, the 8k requests are only fully satisfied 55.73% of the time, with 14.94% of responses still returning only 512 bytes, so there is scope for a further performance increase.

 

The following shows an excerpt from the strace output for 1M requests:

 

recvfrom(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\036028\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 0, NULL, NULL) = 12288

recvfrom(5, "\036028\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 0, NULL, NULL) = 6144

recvfrom(5, "\0\0\0\0\0\0\0\0\0\0\036028\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 0, NULL, NULL) = 8704

recvfrom(5, "\0\0\0\0\0\0\0\0\0\0\036028\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 0, NULL, NULL) = 6879
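For reference, recvfrom() on a TCP socket returns as soon as any data is available, so a single call can deliver far fewer bytes than the length argument; a caller that needs the full amount has to loop. The snippet below is a minimal standalone sketch of that pattern (a hypothetical helper, not code from our application or from Thrift):

#include <sys/types.h>
#include <sys/socket.h>
#include <cerrno>
#include <cstddef>

// Keep calling recv() until 'want' bytes have arrived or the peer closes the
// connection. A single recv()/recvfrom() returns whatever is currently in the
// socket receive buffer, which may be much less than the requested length.
ssize_t recv_fully(int fd, char* buf, size_t want) {
  size_t got = 0;
  while (got < want) {
    ssize_t n = recv(fd, buf + got, want - got, 0);
    if (n == 0) break;                  // connection closed by peer
    if (n < 0) {
      if (errno == EINTR) continue;     // interrupted by a signal; retry
      return -1;                        // real error
    }
    got += static_cast<size_t>(n);
  }
  return static_cast<ssize_t>(got);
}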

 

Thanks in advance for any guidance.