08-04-2016 10:17 PM - edited 08-05-2016 12:41 AM
For broadcast transmission, is it correct to say that total_network_send_timer - thrift_transmit_timer is the time spent on locking and waiting?
The following snippet is obtained from our Impala cluster.
The data stream sender broadcasts data to 38 Impalad nodes.
Since the TransmitDataRPCTime (the aggregated network transmission time across all sender threads) is only 15s677ms, my understanding is that the rest of the time, i.e., 9m4s - 15s677ms, is spent on waiting/locking/updating the network send time counter.
Fragment F23: Instance e040e6c84d494ed6:3255ca5dbe354ce4 (host=cdh-datanode-104.lufax.storage:22000):(Total: 9m12s, non-child: 8m52s, % non-child: 96.43%)
  Hdfs split stats (<volume id>:<# splits>/<split lengths>): 7:4/140.02 MB 4:4/340.69 MB 0:10/972.22 MB 1:7/776.15 MB 6:5/265.73 MB 9:9/964.64 MB 5:6/431.62 MB 8:3/328.13 MB 3:4/445.24 MB 2:9/742.39 MB
   - AverageThreadTokens: 47.69
   - BloomFilterBytes: 0
   - PeakMemoryUsage: 320.12 MB (335674080)
   - PerHostPeakMemUsage: 14.78 GB (15868058032)
   - PrepareTime: 176.046us
   - RowsProduced: 32.75M (32749654)
   - TotalCpuTime: 7h10m
   - TotalNetworkReceiveTime: 0.000ns
   - TotalNetworkSendTime: 9m4s
   - TotalStorageWaitTime: 1s531ms
  DataStreamSender (dst_id=55):(Total: 19s452ms, non-child: 19s452ms, % non-child: 100.00%)
   - BytesSent: 18.81 GB (20198277370)
   - NetworkThroughput(*): 1.20 GB/sec
   - OverallThroughput: 990.22 MB/sec
   - PeakMemoryUsage: 202.47 KB (207328)
   - RowsReturned: 32.75M (32749654)
   - SerializeBatchTime: 3s768ms
   - TransmitDataRPCTime: 15s677ms
   - UncompressedRowBatchSize: 32.46 GB (34850497072)
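To make the arithmetic above concrete, here is a small sketch (not part of Impala; the helper name `parse_duration` is mine) that parses Impala-profile-style durations like "9m4s" and "15s677ms" and computes the gap between TotalNetworkSendTime and TransmitDataRPCTime:

```python
import re

# Multipliers (in seconds) for the duration suffixes Impala profiles use.
_UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}

def parse_duration(text: str) -> float:
    """Parse a profile duration such as '9m4s' or '15s677ms' into seconds."""
    total = 0.0
    # Multi-letter units must come first so 'ms' is not split into 'm' + 's'.
    for value, unit in re.findall(r"(\d+(?:\.\d+)?)(ms|us|ns|h|m|s)", text):
        total += float(value) * _UNITS[unit]
    return total

send_time = parse_duration("9m4s")      # TotalNetworkSendTime from the profile
rpc_time = parse_duration("15s677ms")   # TransmitDataRPCTime from the profile
wait_time = send_time - rpc_time        # time not accounted for by the RPCs
print(f"time outside the RPC itself: {wait_time:.3f} s")  # 528.323 s
```

So roughly 8m48s of the 9m4s send time is spent somewhere other than the actual Thrift transmit calls.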
08-05-2016 05:35 PM
I'm not the most knowledgeable person about this part of the code, but what you're saying is correct. One likely cause of long wait times is the receiver consuming data more slowly than the sender is sending it.