Member since 12-21-2020
91 Posts
8 Kudos Received
13 Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1987 | 08-12-2021 05:16 AM |
 | 2210 | 06-29-2021 06:21 AM |
 | 2664 | 06-16-2021 07:15 AM |
 | 1879 | 06-14-2021 12:08 AM |
 | 6239 | 05-14-2021 06:03 AM |
08-18-2022
12:33 AM
Hi @Juanes, I think I downgraded glibc as well, though I'm not exactly sure as it was some time back. I think rpm has an option to downgrade an individual package without touching its dependencies, but I would still advise you to be cautious. Maybe try it out on a test system first? Thanks, Megh
08-18-2022
12:06 AM
Hi @Juanes, try downgrading the packages in question to the required version and then retry. I faced a similar issue earlier, and downgrading worked for me. Thanks, Megh
03-11-2022
03:45 AM
We've set this property in the Advanced hive-site section.
03-11-2022
03:41 AM
Hi @na2_koihey11, we're also using HDP 3.1, and we are able to create non-transactional tables by default by setting hive.default.fileformat.managed to TextFile. Thanks, Megh
03-08-2022
04:49 AM
Check the hive.default.fileformat and hive.default.fileformat.managed properties.
08-24-2021
12:07 AM
Extremely grateful for this feature, @VidyaSargur! Looking forward to continuing to contribute to the community! 🙂
08-12-2021
05:16 AM
Hi @vciampa, for the first question, go into the HDFS configuration in CDP and search for "SSL Client". Add the properties given in the link you shared under "HDFS Advanced Configuration Snippet (Safety Valve) for ssl-client.xml". You will also see "Cluster-Wide Default TLS/SSL Client Truststore Location" and "Cluster-Wide Default TLS/SSL Client Truststore Password"; set these to your truststore location and truststore password. I'm not sure about question 2 at the moment, and will let you know if I find some info. Thanks, Megh
08-12-2021
05:01 AM
Hi @VidyaSargur, we have an open Cloudera support case where @asish and I are discussing this same issue. Once we get a resolution, we'll update this question and close it. Thanks, Megh
08-02-2021
03:56 AM
Hello @asish,
Apologies for not replying earlier. I recently compared the performance of a simple count(*) query on a single partition on my old cluster against the current cluster, to see how the query counters differ. What I discovered is that the GC_TIME_MILLIS counter is significantly higher on the current cluster. As an example, this is the query I ran on both clusters (both have identical hardware and the same amount of resources):
select count(*) from mydb.tbl1 where date_partition_col='2021-07-30';
Following is the trace from the old cluster:
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 64 64 0 0 0 0
Reducer 2 ...... container SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 79.65 s
----------------------------------------------------------------------------------------------
INFO : Status: DAG finished successfully in 79.64 seconds
INFO :
INFO : Query Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO : OPERATION DURATION
INFO : ----------------------------------------------------------------------------------------------
INFO : Compile Query 0.23s
INFO : Prepare Plan 7.90s
INFO : Get Query Coordinator (AM) 0.21s
INFO : Submit Plan 0.24s
INFO : Start DAG 0.96s
INFO : Run DAG 79.64s
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : Task Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO : VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS OUTPUT_RECORDS
INFO : ----------------------------------------------------------------------------------------------
INFO : Map 1 73845.00 1,642,830 10,289 127,396,873 95
INFO : Reducer 2 31853.00 4,120 71 64 0
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : org.apache.tez.common.counters.DAGCounter:
INFO : NUM_SUCCEEDED_TASKS: 65
INFO : TOTAL_LAUNCHED_TASKS: 65
INFO : DATA_LOCAL_TASKS: 7
INFO : RACK_LOCAL_TASKS: 57
INFO : AM_CPU_MILLISECONDS: 14500
INFO : AM_GC_TIME_MILLIS: 0
INFO : File System Counters:
INFO : FILE_BYTES_READ: 300
INFO : FILE_BYTES_WRITTEN: 3840
INFO : HDFS_BYTES_READ: 39636387430
INFO : HDFS_BYTES_WRITTEN: 248
INFO : HDFS_READ_OPS: 130
INFO : HDFS_WRITE_OPS: 2
INFO : HDFS_OP_CREATE: 1
INFO : HDFS_OP_GET_FILE_STATUS: 3
INFO : HDFS_OP_OPEN: 127
INFO : HDFS_OP_RENAME: 1
INFO : org.apache.tez.common.counters.TaskCounter:
INFO : SPILLED_RECORDS: 0
INFO : NUM_SHUFFLED_INPUTS: 64
INFO : NUM_FAILED_SHUFFLE_INPUTS: 0
INFO : GC_TIME_MILLIS: 10360
INFO : TASK_DURATION_MILLIS: 1505522
INFO : CPU_MILLISECONDS: 1646950
INFO : PHYSICAL_MEMORY_BYTES: 139586437120
INFO : VIRTUAL_MEMORY_BYTES: 607444127744
INFO : COMMITTED_HEAP_BYTES: 139586437120
INFO : INPUT_RECORDS_PROCESSED: 127429957
INFO : INPUT_SPLIT_LENGTH_BYTES: 39636387430
INFO : OUTPUT_RECORDS: 64
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_BYTES: 384
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 896
INFO : OUTPUT_BYTES_PHYSICAL: 3328
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILLS_BYTES_READ: 0
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : SHUFFLE_BYTES: 1792
INFO : SHUFFLE_BYTES_DECOMPRESSED: 896
INFO : SHUFFLE_BYTES_TO_MEM: 1652
INFO : SHUFFLE_BYTES_TO_DISK: 0
INFO : SHUFFLE_BYTES_DISK_DIRECT: 140
INFO : SHUFFLE_PHASE_TIME: 30751
INFO : FIRST_EVENT_RECEIVED: 89
INFO : LAST_EVENT_RECEIVED: 30749
INFO : HIVE:
INFO : CREATED_FILES: 1
INFO : DESERIALIZE_ERRORS: 0
INFO : RECORDS_IN_Map_1: 127396873
INFO : RECORDS_OUT_0: 1
INFO : RECORDS_OUT_INTERMEDIATE_Map_1: 95
INFO : RECORDS_OUT_INTERMEDIATE_Reducer_2: 0
INFO : RECORDS_OUT_OPERATOR_FS_13: 1
INFO : RECORDS_OUT_OPERATOR_GBY_10: 64
INFO : RECORDS_OUT_OPERATOR_GBY_12: 1
INFO : RECORDS_OUT_OPERATOR_MAP_0: 0
INFO : RECORDS_OUT_OPERATOR_RS_11: 95
INFO : RECORDS_OUT_OPERATOR_SEL_9: 127429893
INFO : RECORDS_OUT_OPERATOR_TS_0: 127429893
INFO : TaskCounter_Map_1_INPUT_urcs_transactions:
INFO : INPUT_RECORDS_PROCESSED: 127429893
INFO : INPUT_SPLIT_LENGTH_BYTES: 39636387430
INFO : TaskCounter_Map_1_OUTPUT_Reducer_2:
INFO : ADDITIONAL_SPILLS_BYTES_READ: 0
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : OUTPUT_BYTES: 384
INFO : OUTPUT_BYTES_PHYSICAL: 3328
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 896
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_RECORDS: 64
INFO : SPILLED_RECORDS: 0
INFO : TaskCounter_Reducer_2_INPUT_Map_1:
INFO : FIRST_EVENT_RECEIVED: 89
INFO : INPUT_RECORDS_PROCESSED: 64
INFO : LAST_EVENT_RECEIVED: 30749
INFO : NUM_FAILED_SHUFFLE_INPUTS: 0
INFO : NUM_SHUFFLED_INPUTS: 64
INFO : SHUFFLE_BYTES: 1792
INFO : SHUFFLE_BYTES_DECOMPRESSED: 896
INFO : SHUFFLE_BYTES_DISK_DIRECT: 140
INFO : SHUFFLE_BYTES_TO_DISK: 0
INFO : SHUFFLE_BYTES_TO_MEM: 1652
INFO : SHUFFLE_PHASE_TIME: 30751
INFO : TaskCounter_Reducer_2_OUTPUT_out_Reducer_2:
INFO : OUTPUT_RECORDS: 0
INFO : org.apache.hadoop.hive.ql.exec.tez.HiveInputCounters:
INFO : GROUPED_INPUT_SPLITS_Map_1: 64
INFO : INPUT_DIRECTORIES_Map_1: 1
INFO : INPUT_FILES_Map_1: 127
INFO : RAW_INPUT_SPLITS_Map_1: 127
INFO : Completed executing command(queryId=hive_20210802162145_c6dc5748-ef38-42fb-becd-46dd33d2dd16); Time taken: 88.954 seconds
INFO : OK
+------------+
| _c0 |
+------------+
| 127429893 |
+------------+
1 row selected (89.234 seconds)
Same query trace from the new cluster:
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 64 64 0 0 0 0
Reducer 2 ...... container SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 176.45 s
----------------------------------------------------------------------------------------------
INFO : Status: DAG finished successfully in 176.42 seconds
INFO :
INFO : Query Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO : OPERATION DURATION
INFO : ----------------------------------------------------------------------------------------------
INFO : Compile Query 0.14s
INFO : Prepare Plan 0.04s
INFO : Get Query Coordinator (AM) 0.00s
INFO : Submit Plan 4.79s
INFO : Start DAG 0.04s
INFO : Run DAG 176.42s
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : Task Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO : VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS OUTPUT_RECORDS
INFO : ----------------------------------------------------------------------------------------------
INFO : Map 1 173773.00 9,458,900 151,977 127,398,948 95
INFO : Reducer 2 53863.00 2,490 0 64 0
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : org.apache.tez.common.counters.DAGCounter:
INFO : NUM_SUCCEEDED_TASKS: 65
INFO : TOTAL_LAUNCHED_TASKS: 65
INFO : DATA_LOCAL_TASKS: 11
INFO : RACK_LOCAL_TASKS: 53
INFO : AM_CPU_MILLISECONDS: 22160
INFO : AM_GC_TIME_MILLIS: 0
INFO : File System Counters:
INFO : FILE_BYTES_READ: 480
INFO : FILE_BYTES_WRITTEN: 3840
INFO : HDFS_BYTES_READ: 39636387430
INFO : HDFS_BYTES_WRITTEN: 109
INFO : HDFS_READ_OPS: 129
INFO : HDFS_WRITE_OPS: 2
INFO : HDFS_OP_CREATE: 1
INFO : HDFS_OP_GET_FILE_STATUS: 2
INFO : HDFS_OP_OPEN: 127
INFO : HDFS_OP_RENAME: 1
INFO : org.apache.tez.common.counters.TaskCounter:
INFO : SPILLED_RECORDS: 0
INFO : NUM_SHUFFLED_INPUTS: 64
INFO : NUM_FAILED_SHUFFLE_INPUTS: 0
INFO : GC_TIME_MILLIS: 151977
INFO : TASK_DURATION_MILLIS: 4806095
INFO : CPU_MILLISECONDS: 9461390
INFO : PHYSICAL_MEMORY_BYTES: 318765006848
INFO : VIRTUAL_MEMORY_BYTES: 607499304960
INFO : COMMITTED_HEAP_BYTES: 318765006848
INFO : INPUT_RECORDS_PROCESSED: 127429957
INFO : INPUT_SPLIT_LENGTH_BYTES: 39636387430
INFO : OUTPUT_RECORDS: 64
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_BYTES: 384
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 896
INFO : OUTPUT_BYTES_PHYSICAL: 3328
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILLS_BYTES_READ: 0
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : SHUFFLE_BYTES: 1792
INFO : SHUFFLE_BYTES_DECOMPRESSED: 896
INFO : SHUFFLE_BYTES_TO_MEM: 1568
INFO : SHUFFLE_BYTES_TO_DISK: 0
INFO : SHUFFLE_BYTES_DISK_DIRECT: 224
INFO : SHUFFLE_PHASE_TIME: 54159
INFO : FIRST_EVENT_RECEIVED: 92
INFO : LAST_EVENT_RECEIVED: 54149
INFO : DATA_BYTES_VIA_EVENT: 0
INFO : HIVE:
INFO : CREATED_FILES: 1
INFO : DESERIALIZE_ERRORS: 0
INFO : RECORDS_IN_Map_1: 127398948
INFO : RECORDS_OUT_0: 1
INFO : RECORDS_OUT_INTERMEDIATE_Map_1: 95
INFO : RECORDS_OUT_INTERMEDIATE_Reducer_2: 0
INFO : RECORDS_OUT_OPERATOR_FS_13: 1
INFO : RECORDS_OUT_OPERATOR_GBY_10: 64
INFO : RECORDS_OUT_OPERATOR_GBY_12: 1
INFO : RECORDS_OUT_OPERATOR_MAP_0: 0
INFO : RECORDS_OUT_OPERATOR_RS_11: 95
INFO : RECORDS_OUT_OPERATOR_SEL_9: 127429893
INFO : RECORDS_OUT_OPERATOR_TS_0: 127429893
INFO : TaskCounter_Map_1_INPUT_urcs_transactions:
INFO : INPUT_RECORDS_PROCESSED: 127429893
INFO : INPUT_SPLIT_LENGTH_BYTES: 39636387430
INFO : TaskCounter_Map_1_OUTPUT_Reducer_2:
INFO : ADDITIONAL_SPILLS_BYTES_READ: 0
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : DATA_BYTES_VIA_EVENT: 0
INFO : OUTPUT_BYTES: 384
INFO : OUTPUT_BYTES_PHYSICAL: 3328
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 896
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_RECORDS: 64
INFO : SPILLED_RECORDS: 0
INFO : TaskCounter_Reducer_2_INPUT_Map_1:
INFO : FIRST_EVENT_RECEIVED: 92
INFO : INPUT_RECORDS_PROCESSED: 64
INFO : LAST_EVENT_RECEIVED: 54149
INFO : NUM_FAILED_SHUFFLE_INPUTS: 0
INFO : NUM_SHUFFLED_INPUTS: 64
INFO : SHUFFLE_BYTES: 1792
INFO : SHUFFLE_BYTES_DECOMPRESSED: 896
INFO : SHUFFLE_BYTES_DISK_DIRECT: 224
INFO : SHUFFLE_BYTES_TO_DISK: 0
INFO : SHUFFLE_BYTES_TO_MEM: 1568
INFO : SHUFFLE_PHASE_TIME: 54159
INFO : TaskCounter_Reducer_2_OUTPUT_out_Reducer_2:
+------------+
| _c0 |
+------------+
| 127429893 |
+------------+
1 row selected (181.488 seconds)
The only significant difference I see between the two traces is in GC_TIME_MILLIS. I tried tuning multiple Hive and YARN configuration settings related to heap memory and GC, but observed no difference. Any ideas?
Thanks, Megh
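To make the comparison between the two traces less manual, here is a minimal Python sketch that pulls the "INFO : NAME: value" counter lines out of each trace and lists the counters whose values differ by at least a chosen factor. The helper names (parse_counters, compare) and the 2x threshold are assumptions made for illustration, not part of Hive or Tez, and the two sample strings hold only a handful of counters copied from the traces above. On those figures, CPU_MILLISECONDS (about 5.7x), TASK_DURATION_MILLIS (about 3.2x) and COMMITTED_HEAP_BYTES (about 2.3x) also stand out alongside GC_TIME_MILLIS (about 14.7x), which may be worth considering when looking for the cause.

```python
import re
from typing import Dict, List, Tuple

# Matches counter lines such as "INFO : GC_TIME_MILLIS: 10360".
# Group headers (e.g. "INFO : org.apache.tez.common.counters.TaskCounter:")
# carry no numeric value and are therefore skipped.
COUNTER_RE = re.compile(r"^INFO\s*:\s*([A-Za-z0-9_]+):\s*(\d+)\s*$")


def parse_counters(trace: str) -> Dict[str, int]:
    """Return the first occurrence of each numeric counter in a trace."""
    counters: Dict[str, int] = {}
    for line in trace.splitlines():
        match = COUNTER_RE.match(line.strip())
        if match:
            # Keep the first value seen: in the traces above the aggregate
            # TaskCounter block comes before the per-vertex blocks.
            counters.setdefault(match.group(1), int(match.group(2)))
    return counters


def compare(old: Dict[str, int], new: Dict[str, int],
            min_ratio: float = 2.0) -> List[Tuple[str, int, int, float]]:
    """List counters present in both traces whose new/old ratio is
    at least min_ratio or at most 1/min_ratio, largest ratio first."""
    rows = []
    for name in sorted(old.keys() & new.keys()):
        a, b = old[name], new[name]
        if a == 0 or b == 0:
            continue  # a ratio is not meaningful when either side is zero
        ratio = b / a
        if ratio >= min_ratio or ratio <= 1 / min_ratio:
            rows.append((name, a, b, ratio))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# A few counters copied from the old-cluster trace above.
OLD_TRACE = """
INFO : HDFS_BYTES_READ: 39636387430
INFO : GC_TIME_MILLIS: 10360
INFO : TASK_DURATION_MILLIS: 1505522
INFO : CPU_MILLISECONDS: 1646950
INFO : COMMITTED_HEAP_BYTES: 139586437120
INFO : SHUFFLE_PHASE_TIME: 30751
"""

# The same counters copied from the new-cluster trace above.
NEW_TRACE = """
INFO : HDFS_BYTES_READ: 39636387430
INFO : GC_TIME_MILLIS: 151977
INFO : TASK_DURATION_MILLIS: 4806095
INFO : CPU_MILLISECONDS: 9461390
INFO : COMMITTED_HEAP_BYTES: 318765006848
INFO : SHUFFLE_PHASE_TIME: 54159
"""

for name, old_value, new_value, ratio in compare(
        parse_counters(OLD_TRACE), parse_counters(NEW_TRACE)):
    print(f"{name}: {old_value} -> {new_value} ({ratio:.1f}x)")
```

To use it on full traces, save each console output to a text file and pass the file contents to parse_counters in place of the sample strings.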
07-22-2021
05:50 AM
Hi @Anyy, what is the FROM table in the second query? The first query can work as is. Thanks, Megh