Member since
09-16-2021
315
Posts
46
Kudos Received
23
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
116 | 11-10-2024 11:19 PM | |
272 | 10-25-2024 05:02 AM | |
1476 | 09-10-2024 07:50 AM | |
583 | 09-04-2024 05:35 AM | |
1449 | 08-28-2024 12:40 AM |
10-08-2024
10:34 PM
1 Kudo
According to the requirement, please find below the insert statement.
INSERT INTO TAB_SCAPI_DIGNASMBP_1728048648203 (C1, C2) VALUES (1,"|VALUE ONE;"), (2,"|VALUE TWO;"), (3,"|NULL;"), (4,"|VALUE FOUR;"), (5,"|VALUE FIVE;");
Kindly review the results after executing the above insert statement.
INFO : Compiling command(queryId=hive_20241009052425_caec4881-3c5a-4a18-9599-b20c368de25d): INSERT INTO TAB_SCAPI_DIGNASMBP_1728048648203 (C1, C2) VALUES
(1,"|VALUE ONE;"),
(2,"|VALUE TWO;"),
(3,"|NULL;"),
(4,"|VALUE FOUR;"),
(5,"|VALUE FIVE;")
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_col0, type:int, comment:null), FieldSchema(name:_col1, type:varchar(20), comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20241009052425_caec4881-3c5a-4a18-9599-b20c368de25d); Time taken: 1.655 seconds
INFO : Executing command(queryId=hive_20241009052425_caec4881-3c5a-4a18-9599-b20c368de25d): INSERT INTO TAB_SCAPI_DIGNASMBP_1728048648203 (C1, C2) VALUES
(1,"|VALUE ONE;"),
(2,"|VALUE TWO;"),
(3,"|NULL;"),
(4,"|VALUE FOUR;"),
(5,"|VALUE FIVE;")
INFO : Query ID = hive_20241009052425_caec4881-3c5a-4a18-9599-b20c368de25d
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: hive_20241009052425_caec4881-3c5a-4a18-9599-b20c368de25d
INFO : Session is already open
INFO : Dag name: INSERT INTO TAB_SCAPI_DIGNASMBP_17...FIVE;") (Stage-1)
INFO : Tez session was closed. Reopening...
INFO : Session re-established.
INFO : Session re-established.
INFO : Status: Running (Executing on YARN cluster with App id application_1728390353038_0009)
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 1 1 0 0 0 0
Reducer 2 ...... container SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 9.37 s
----------------------------------------------------------------------------------------------
INFO : Status: DAG finished successfully in 8.10 seconds
INFO :
INFO : Query Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO : OPERATION DURATION
INFO : ----------------------------------------------------------------------------------------------
INFO : Compile Query 1.66s
INFO : Prepare Plan 0.36s
INFO : Get Query Coordinator (AM) 0.01s
INFO : Submit Plan 7.02s
INFO : Start DAG 0.11s
INFO : Run DAG 8.10s
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : Task Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO : VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS OUTPUT_RECORDS
INFO : ----------------------------------------------------------------------------------------------
INFO : Map 1 3598.00 5,920 102 3 1
INFO : Reducer 2 365.00 830 0 1 0
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : org.apache.tez.common.counters.DAGCounter:
INFO : NUM_SUCCEEDED_TASKS: 2
INFO : TOTAL_LAUNCHED_TASKS: 2
INFO : RACK_LOCAL_TASKS: 1
INFO : AM_CPU_MILLISECONDS: 3450
INFO : AM_GC_TIME_MILLIS: 21
INFO : File System Counters:
INFO : FILE_BYTES_READ: 141
INFO : FILE_BYTES_WRITTEN: 141
INFO : HDFS_BYTES_WRITTEN: 373
INFO : HDFS_READ_OPS: 5
INFO : HDFS_WRITE_OPS: 5
INFO : HDFS_OP_CREATE: 3
INFO : HDFS_OP_GET_FILE_STATUS: 5
INFO : HDFS_OP_RENAME: 2
INFO : org.apache.tez.common.counters.TaskCounter:
INFO : SPILLED_RECORDS: 0
INFO : NUM_SHUFFLED_INPUTS: 1
INFO : NUM_FAILED_SHUFFLE_INPUTS: 0
INFO : GC_TIME_MILLIS: 102
INFO : TASK_DURATION_MILLIS: 3845
INFO : CPU_MILLISECONDS: 6750
INFO : PHYSICAL_MEMORY_BYTES: 4227858432
INFO : VIRTUAL_MEMORY_BYTES: 10972979200
INFO : COMMITTED_HEAP_BYTES: 4227858432
INFO : INPUT_RECORDS_PROCESSED: 5
INFO : INPUT_SPLIT_LENGTH_BYTES: 1
INFO : OUTPUT_RECORDS: 1
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_BYTES: 88
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 96
INFO : OUTPUT_BYTES_PHYSICAL: 133
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILLS_BYTES_READ: 0
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : SHUFFLE_BYTES: 109
INFO : SHUFFLE_BYTES_DECOMPRESSED: 96
INFO : SHUFFLE_BYTES_TO_MEM: 0
INFO : SHUFFLE_BYTES_TO_DISK: 0
INFO : SHUFFLE_BYTES_DISK_DIRECT: 109
INFO : SHUFFLE_PHASE_TIME: 65
INFO : FIRST_EVENT_RECEIVED: 39
INFO : LAST_EVENT_RECEIVED: 39
INFO : DATA_BYTES_VIA_EVENT: 0
INFO : HIVE:
INFO : CREATED_FILES: 2
INFO : DESERIALIZE_ERRORS: 0
INFO : RECORDS_IN_Map_1: 3
INFO : RECORDS_OUT_0: 1
INFO : RECORDS_OUT_1_default.tab_scapi_dignasmbp_1728048648203: 5
INFO : RECORDS_OUT_INTERMEDIATE_Map_1: 1
INFO : RECORDS_OUT_INTERMEDIATE_Reducer_2: 0
INFO : RECORDS_OUT_OPERATOR_FS_12: 1
INFO : RECORDS_OUT_OPERATOR_FS_5: 5
INFO : RECORDS_OUT_OPERATOR_GBY_10: 1
INFO : RECORDS_OUT_OPERATOR_GBY_8: 1
INFO : RECORDS_OUT_OPERATOR_MAP_0: 0
INFO : RECORDS_OUT_OPERATOR_RS_9: 1
INFO : RECORDS_OUT_OPERATOR_SEL_1: 1
INFO : RECORDS_OUT_OPERATOR_SEL_3: 5
INFO : RECORDS_OUT_OPERATOR_SEL_7: 5
INFO : RECORDS_OUT_OPERATOR_TS_0: 1
INFO : RECORDS_OUT_OPERATOR_UDTF_2: 5
INFO : TOTAL_TABLE_ROWS_WRITTEN: 5
INFO : TaskCounter_Map_1_INPUT__dummy_table:
INFO : INPUT_RECORDS_PROCESSED: 4
INFO : INPUT_SPLIT_LENGTH_BYTES: 1
INFO : TaskCounter_Map_1_OUTPUT_Reducer_2:
INFO : ADDITIONAL_SPILLS_BYTES_READ: 0
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : DATA_BYTES_VIA_EVENT: 0
INFO : OUTPUT_BYTES: 88
INFO : OUTPUT_BYTES_PHYSICAL: 133
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 96
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_RECORDS: 1
INFO : SPILLED_RECORDS: 0
INFO : TaskCounter_Reducer_2_INPUT_Map_1:
INFO : FIRST_EVENT_RECEIVED: 39
INFO : INPUT_RECORDS_PROCESSED: 1
INFO : LAST_EVENT_RECEIVED: 39
INFO : NUM_FAILED_SHUFFLE_INPUTS: 0
INFO : NUM_SHUFFLED_INPUTS: 1
INFO : SHUFFLE_BYTES: 109
INFO : SHUFFLE_BYTES_DECOMPRESSED: 96
INFO : SHUFFLE_BYTES_DISK_DIRECT: 109
INFO : SHUFFLE_BYTES_TO_DISK: 0
INFO : SHUFFLE_BYTES_TO_MEM: 0
INFO : SHUFFLE_PHASE_TIME: 65
INFO : TaskCounter_Reducer_2_OUTPUT_out_Reducer_2:
INFO : OUTPUT_RECORDS: 0
INFO : Starting task [Stage-2:DEPENDENCY_COLLECTION] in serial mode
INFO : Starting task [Stage-0:MOVE] in serial mode
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 1 1 0 0 0 0
Reducer 2 ...... container SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 9.40 s
----------------------------------------------------------------------------------------------
5 rows affected (18.56 seconds)
0: jdbc:hive2://node2.playground-ggangadharan> select * from TAB_SCAPI_DIGNASMBP_1728048648203;
INFO : Compiling command(queryId=hive_20241009052444_2f65b80f-2ad3-412e-8ac4-03d8987a02db): select * from TAB_SCAPI_DIGNASMBP_1728048648203
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_scapi_dignasmbp_1728048648203.c1, type:int, comment:null), FieldSchema(name:tab_scapi_dignasmbp_1728048648203.c2, type:varchar(20), comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20241009052444_2f65b80f-2ad3-412e-8ac4-03d8987a02db); Time taken: 0.352 seconds
INFO : Executing command(queryId=hive_20241009052444_2f65b80f-2ad3-412e-8ac4-03d8987a02db): select * from TAB_SCAPI_DIGNASMBP_1728048648203
INFO : Completed executing command(queryId=hive_20241009052444_2f65b80f-2ad3-412e-8ac4-03d8987a02db); Time taken: 0.013 seconds
INFO : OK
+---------------------------------------+---------------------------------------+
| tab_scapi_dignasmbp_1728048648203.c1 | tab_scapi_dignasmbp_1728048648203.c2 |
+---------------------------------------+---------------------------------------+
| 1 | |VALUE ONE; |
| 2 | |VALUE TWO; |
| 3 | |NULL; |
| 4 | |VALUE FOUR; |
| 5 | |VALUE FIVE; |
+---------------------------------------+---------------------------------------+
5 rows selected (0.587 seconds)
0: jdbc:hive2://node2.playground-ggangadharan> desc formatted TAB_SCAPI_DIGNASMBP_1728048648203;
INFO : Compiling command(queryId=hive_20241009052507_76fa8fa5-9105-4cc5-adb8-6f93a6050c9c): desc formatted TAB_SCAPI_DIGNASMBP_1728048648203
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, type:string, comment:from deserializer), FieldSchema(name:data_type, type:string, comment:from deserializer), FieldSchema(name:comment, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20241009052507_76fa8fa5-9105-4cc5-adb8-6f93a6050c9c); Time taken: 0.106 seconds
INFO : Executing command(queryId=hive_20241009052507_76fa8fa5-9105-4cc5-adb8-6f93a6050c9c): desc formatted TAB_SCAPI_DIGNASMBP_1728048648203
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20241009052507_76fa8fa5-9105-4cc5-adb8-6f93a6050c9c); Time taken: 0.169 seconds
INFO : OK
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| col_name | data_type | comment |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| c1 | int | |
| c2 | varchar(20) | |
| | NULL | NULL |
| # Detailed Table Information | NULL | NULL |
| Database: | default | NULL |
| OwnerType: | USER | NULL |
| Owner: | hive | NULL |
| CreateTime: | Wed Oct 09 05:18:20 UTC 2024 | NULL |
| LastAccessTime: | UNKNOWN | NULL |
| Retention: | 0 | NULL |
| Location: | hdfs://node4.playground-ggangadharan.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/tab_scapi_dignasmbp_1728048648203 | NULL |
| Table Type: | EXTERNAL_TABLE | NULL |
| Table Parameters: | NULL | NULL |
| | COLUMN_STATS_ACCURATE | {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"c1\":\"true\",\"c2\":\"true\"}} |
| | EXTERNAL | TRUE |
| | bucketing_version | 2 |
| | numFiles | 1 |
| | numRows | 5 |
| | rawDataSize | 62 |
| | totalSize | 67 |
| | transient_lastDdlTime | 1728451483 |
| | NULL | NULL |
| # Storage Information | NULL | NULL |
| SerDe Library: | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL |
| InputFormat: | org.apache.hadoop.mapred.TextInputFormat | NULL |
| OutputFormat: | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL |
| Compressed: | No | NULL |
| Num Buckets: | -1 | NULL |
| Bucket Columns: | [] | NULL |
| Sort Columns: | [] | NULL |
| Storage Desc Params: | NULL | NULL |
| | field.delim | , |
| | line.delim | \n |
| | serialization.format | , |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
34 rows selected (0.363 seconds)
0: jdbc:hive2://node2.playground-ggangadharan> dfs -ls hdfs://node4.playground-ggangadharan.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/tab_scapi_dignasmbp_1728048648203
. . . . . . . . . . . . . . . . . . . . . . .> ;
Error: Error while processing statement: Permission denied: user [hive] does not have privilege for [DFS] command (state=,code=1)
0: jdbc:hive2://node2.playground-ggangadharan> !sh hdfs dfs -ls dfs -ls hdfs://node4.playground-ggangadharan.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/tab_scapi_dignasmbp_1728048648203
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p0.44702451/jars/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p0.44702451/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
ls: `dfs': No such file or directory
ls: `-ls': No such file or directory
Found 1 items
-rw-r--r-- 3 hive hive 67 2024-10-09 05:24 hdfs://node4.playground-ggangadharan.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/tab_scapi_dignasmbp_1728048648203/000000_0
Command failed with exit code = 1
0: jdbc:hive2://node2.playground-ggangadharan> !sh hdfs dfs -cat hdfs://node4.playground-ggangadharan.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/tab_scapi_dignasmbp_1728048648203/000000_0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p0.44702451/jars/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p0.44702451/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
1,|VALUE ONE;
2,|VALUE TWO;
3,|NULL;
4,|VALUE FOUR;
5,|VALUE FIVE;
If you prefer not to include the "|" or ";" symbols, please modify the insert statement accordingly. Additionally, if you are reading a text file generated by an external process, adjust the table's delimiter to match that file so that only the values are displayed in the result.
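To illustrate the delimiter point, here is a minimal Python sketch (plain Python, not Hive code) of how a comma-delimited row like the ones in the 000000_0 file splits into two fields, and what the second field looks like once the "|" and ";" wrapper symbols are stripped. The names parse_row and FIELD_DELIM are hypothetical helpers introduced only for this illustration; the sample rows are copied from the file contents shown above.

```python
# Hypothetical helper for illustration only; it mimics splitting a text row
# on the table's field delimiter, it is not how Hive itself is implemented.
FIELD_DELIM = ","  # same value as field.delim in the table's storage parameters


def parse_row(line, strip_chars=""):
    """Split one text-file row into (c1, c2).

    strip_chars lists wrapper symbols to remove from both ends of c2;
    an empty string leaves c2 exactly as stored.
    """
    c1, c2 = line.rstrip("\n").split(FIELD_DELIM, 1)
    return int(c1), c2.strip(strip_chars)


rows = ["1,|VALUE ONE;", "2,|VALUE TWO;", "3,|NULL;"]

# Values as the table currently returns them, symbols included.
as_stored = [parse_row(r) for r in rows]

# Values with the leading "|" and trailing ";" removed.
cleaned = [parse_row(r, strip_chars="|;") for r in rows]

print(as_stored)
print(cleaned)
```

Note that in this sketch the third row still holds the literal text NULL after stripping; it is a string, not a missing value.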
10-07-2024
01:02 AM
1 Kudo
@IanWilloughby Would it be possible for you to provide some sample records for the 'URL' column to help me gain a clearer understanding? Additionally, could you please share the specific versions of HIVE2 and HIVE3?
10-07-2024
12:29 AM
1 Kudo
We recommend using Cloudera Data Warehouse (CDW) to run Hive on Kubernetes. Based on the description, it seems that you are currently using the apache-hive library. Upstream Apache Hive images have already been pushed to Docker Hub, so you can use those. The relevant documents are linked below for your reference.
https://hive.apache.org/development/quickstart/
https://docs.cloudera.com/data-warehouse/cloud/overview/topics/dw-service-architecture.html
10-07-2024
12:17 AM
1 Kudo
The query seems to have failed during the compilation phase:
Error while compiling statement
At the same time, note the following error, which typically occurs when one of the child tasks fails:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
To investigate further, please provide the beeline console output and the complete stack trace from the HS2 logs if the failure occurred during the compilation phase. If the failure occurred on the YARN side, please share the complete stack trace from the failed child task along with the beeline console output. Additionally, please include the DDL of the associated tables.
09-10-2024
07:50 AM
2 Kudos
The error you are encountering, "java.lang.ArithmeticException: Decimal precision 45 exceeds max precision 38", occurs because Spark automatically infers the schema for the Oracle NUMBER type. When the data has a very large precision, such as 35 digits in your case, Spark may overestimate the precision due to how it handles floating-point and decimal values.
To explain the issue further: Oracle's NUMBER data type is highly flexible and can store values with a very large precision. However, Spark's Decimal type has a maximum precision of 38, which limits the number of digits it can accurately represent. According to the documentation, Spark's decimal data type can have a precision of up to 38, and the scale can also be up to 38 (but must be less than or equal to the precision).
To resolve this issue, ensure that your Oracle database does not hold values larger than the maximum precision and scale allowed by Spark. You can verify this by running the following query in Oracle:
SELECT MAX(LENGTH(large_number)) FROM example_table
If the result is greater than 38, you can try using the following query to read the data as a string instead of a decimal data type:
SELECT TO_CHAR(large_number) AS large_number FROM example_table
Spark schema with TO_CHAR:
>>> df=spark.read.format("jdbc").option("url", oracle_url).option("query", "SELECT TO_CHAR(large_number) as large_number FROM example_table_with_decimal").option("user", "user1").option("password", "password").option("driver", "oracle.jdbc.driver.OracleDriver").load()
>>> df.printSchema()
root
|-- LARGE_NUMBER: string (nullable = true)
Spark schema without TO_CHAR:
>>> df=spark.read.format("jdbc").option("url", oracle_url).option("query", "SELECT large_number FROM example_table_with_decimal").option("user", "user1").option("password", "password").option("driver", "oracle.jdbc.driver.OracleDriver").load()
>>> df.printSchema()
root
|-- LARGE_NUMBER: decimal(35,5) (nullable = true)
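As a rough way to reason about the precision limit outside of Spark, the sketch below uses only Python's standard decimal module to compute the precision and scale of a value and compare it against the limit of 38 mentioned above. The helper names (precision_and_scale, fits_spark_decimal) and the sample values are assumptions made for illustration; this does not call Spark or Oracle and is not how Spark itself infers the schema.

```python
from decimal import Decimal

SPARK_MAX_PRECISION = 38  # maximum precision of Spark's decimal type, as stated above


def precision_and_scale(value):
    """Return (precision, scale) for a finite Decimal, keeping precision >= scale."""
    _sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):
        # NaN and Infinity report a non-integer exponent marker.
        raise ValueError("NaN and Infinity have no precision")
    if exponent >= 0:
        # Whole number, possibly with trailing zeros implied by the exponent.
        return len(digits) + exponent, 0
    scale = -exponent
    return max(len(digits), scale), scale


def fits_spark_decimal(value):
    """True if the value's precision does not exceed the limit of 38."""
    precision, _scale = precision_and_scale(value)
    return precision <= SPARK_MAX_PRECISION


# Hypothetical sample values: 30 and 40 integer digits, each with 5 fractional digits.
small = Decimal("1234567890" * 3 + ".12345")
large = Decimal("1234567890" * 4 + ".12345")

print(precision_and_scale(small), fits_spark_decimal(small))
print(precision_and_scale(large), fits_spark_decimal(large))
```

The first value has the same shape as the decimal(35,5) column in the schema output above, while the second needs a precision of 45 and would have to be read as a string.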
09-10-2024
04:02 AM
1 Kudo
@zhuodongLi Upon reviewing the screenshot, we observed that the child tasks failed due to too many output errors. Please check the failed attempts and determine whether the "blamed for read error" messages all point to the same NodeManager host. Review the NodeManager logs for that node during the relevant time period. If feasible, stop the NodeManager on that host and rerun the query. Additionally, please follow the instructions provided in the KB to remove the usercache directories from YARN, and then rerun the query.
09-04-2024
05:35 AM
If setting the proper queue name resolves the problem, the query may have been submitted to the default queue, where it competes for resources with other queries and fails due to a timeout error.
09-02-2024
10:22 PM
1 Kudo
Since the failure occurred within the Tez job's child tasks, please share the YARN logs or the complete stack trace from one of the failed child task attempts. This will help us identify the root cause of the failure and recommend a fix.
08-29-2024
11:09 PM
1 Kudo
You need to use Hive Warehouse Connector (HWC) to query Hive managed tables from Spark. Ref - https://docs.cloudera.com/cdp-private-cloud-base/7.1.9/integrating-hive-and-bi/topics/hive_hivewarehouseconnector_for_handling_apache_spark_data.html
08-28-2024
01:40 AM
Unfortunately, it is not possible to change the Application-Name of an already started Application Master in Apache Hadoop YARN. The Application-Name is set when the application is submitted, typically as a parameter to the spark-submit command or the YARN REST API, and cannot be modified at runtime. If you need a different Application-Name, stop the existing application and submit a new one with the desired name.
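As a small illustration of the stop-and-resubmit approach, the Python sketch below only builds the two command lines involved: a yarn application -kill call for the running application and a spark-submit call that sets the new name with --name. The helper names, the application id placeholder, the application name and job.py are all hypothetical; the sketch prints the commands rather than running them, and the master and deploy-mode values are assumptions about your setup.

```python
import shlex


def build_kill_command(application_id):
    """Command that stops a running YARN application by its id."""
    return ["yarn", "application", "-kill", application_id]


def build_submit_command(app_name, app_file):
    """spark-submit command with the application name fixed at submission time."""
    return [
        "spark-submit",
        "--master", "yarn",           # assumption: the job runs on YARN
        "--deploy-mode", "cluster",   # assumption: cluster deploy mode
        "--name", app_name,           # the name is set here and stays fixed afterwards
        app_file,
    ]


# Placeholder values for illustration only.
kill_cmd = build_kill_command("application_<cluster-timestamp>_<sequence>")
submit_cmd = build_submit_command("nightly-etl-v2", "job.py")

print(shlex.join(kill_cmd))
print(shlex.join(submit_cmd))
```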