Member since: 03-16-2020
Posts: 337
Kudos Received: 3
Solutions: 1

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5238 | 02-18-2022 12:59 AM
08-13-2024
04:33 AM
1 Kudo
Could you please try the steps detailed at https://hub.docker.com/r/apache/hive. If you want to use your own core-site.xml/hdfs-site.xml/yarn-site.xml or hive-site.xml for the service, you can point the container at them with the HIVE_CUSTOM_CONF_DIR environment variable. For example, put the custom configuration files under /opt/hive/conf and run:

docker run -d -p 9083:9083 --env SERVICE_NAME=metastore \
  --env DB_DRIVER=postgres \
  -v /opt/hive/conf:/hive_custom_conf \
  --env HIVE_CUSTOM_CONF_DIR=/hive_custom_conf \
  --name metastore apache/hive:${HIVE_VERSION}
04-24-2023
06:12 AM
Please check https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/migrate-hive-workloads/topics/hive-repl-load-overview.html and try Hive REPL between HDP 3 and CDP 7 to replicate HMS metadata, with or without the actual data.
04-24-2023
06:00 AM
Please check:
- the number of open connections for each HiveServer2/HiveMetastore instance in the CM UI, to see if the HS2/HMS instances are overloaded with a high number of client connections
- whether the HMS backend database is performing optimally
- the performance of the KDC or AD server, if authentication is enabled
- whether there are high JVM pauses in the CM charts for HS2/HMS, or traces in the logs matching "Detected pause in JVM or host machine (eg GC)"

If there is no abnormality in any of the above, we might need to collect jstacks for all three processes (beeline, HS2, and HMS) to confirm where the slowness that hangs the connection is coming from.
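As a quick illustration of the JVM-pause check above, here is a minimal sketch that scans a role log for the JvmPauseMonitor message and reports long pauses. The exact log format is an assumption based on the typical message shape; adjust the pattern to your actual logs:

```python
import re

# Typical JvmPauseMonitor message (format assumed; adjust to your logs):
#   Detected pause in JVM or host machine (eg GC): pause of approximately 4123ms
PAUSE_RE = re.compile(
    r"Detected pause in JVM or host machine \(eg GC\): "
    r"pause of approximately (\d+)ms"
)

def find_jvm_pauses(log_lines, threshold_ms=1000):
    """Return the pause durations (in ms) at or above threshold_ms found in the log."""
    pauses = []
    for line in log_lines:
        m = PAUSE_RE.search(line)
        if m:
            ms = int(m.group(1))
            if ms >= threshold_ms:
                pauses.append(ms)
    return pauses
```

Feed it the HS2/HMS role logs; frequent multi-second pauses usually point at GC pressure or an overloaded host rather than a Hive-level problem.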
01-23-2023
08:41 AM
java.lang.NullPointerException
at java.lang.System.arraycopy(Native Method)
at org.apache.hadoop.io.Text.set(Text.java:225)
at org.apache.orc.impl.StringRedBlackTree.add(StringRedBlackTree.java:59)
at org.apache.orc.impl.writer.StringTreeWriter.writeBatch(StringTreeWriter.java:70)
at org.apache.orc.impl.writer.StructTreeWriter.writeFields(StructTreeWriter.java:64)
at org.apache.orc.impl.writer.StructTreeWriter.writeBatch(StructTreeWriter.java:78)
at org.apache.orc.impl.writer.StructTreeWriter.writeRootBatch(StructTreeWriter.java:56)
at org.apache.orc.impl.WriterImpl.addRowBatch(WriterImpl.java:557)

The above error is thrown when there is a schema mismatch between the table metadata and the ORC file. For example, the table metadata is

create table test(str string);

but the ORC file dump shows

Type: struct<str:int>

Please correct the schema and try again.
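To confirm such a mismatch, you can dump the ORC file schema (for example with `hive --orcfiledump <path>`) and compare it against the table DDL. A minimal sketch of that comparison; the column dictionaries here are illustrative, not read from real files:

```python
def find_schema_mismatches(table_cols, file_cols):
    """Compare Hive table column types against ORC file column types.

    Both arguments are {column_name: type_name} dicts; returns a list of
    (column, table_type, file_type) tuples for every mismatching column.
    """
    mismatches = []
    for col, table_type in table_cols.items():
        file_type = file_cols.get(col)
        if file_type is not None and file_type != table_type:
            mismatches.append((col, table_type, file_type))
    return mismatches

# The table declares `str string`, but the ORC file was written as struct<str:int>
print(find_schema_mismatches({"str": "string"}, {"str": "int"}))
# [('str', 'string', 'int')]
```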
01-20-2023
08:52 AM
In case of HWC, the user query is processed by the HWC API connecting to the HS2 server, where HS2 executes the query either within HS2 itself or in Tez/LLAP daemons. In case of the Spark API, Spark's own framework executes the query after fetching the necessary table metadata from HMS. Please refer to the articles below to know more about HWC:
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_hivewarehouseconnector_for_handling_apache_spark_data.html
https://community.cloudera.com/t5/Community-Articles/Integrating-Apache-Hive-with-Apache-Spark-Hive-Warehouse/ta-p/249035
01-20-2023
08:06 AM
Invalid OperationHandle: OperationHandle

This exception occurs when there are multiple HiveServer2 instances and clients access them through Zookeeper/Knox with failover configured. When a query (irrespective of the number of rows) takes too long and HS2 is not able to respond within the defined timeout, ZK/Knox fails over to the next available HS2. Since the other HS2 is unaware of the query/OperationHandle, it throws the Invalid OperationHandle exception.

To solve this problem:
- Check if the query can be optimized to run faster, e.g. by adding a filter, or by splitting the data into multiple tables and querying them in separate queries
- Check if HS2 is utilized beyond its capacity, e.g. 200 connections at a given point in time against a 24 GB heap for HS2/HMS, or an HMS backend database not able to keep up with requests from HMS
- Check that the YARN queue has enough capacity to serve the query; otherwise the query will be in a waiting state
- Check that HDFS is healthy and the Namenode is able to respond to requests without delays; sometimes Ranger needs to check too many files/directories in HDFS before the query gets executed
- If a load balancer is used, enable sticky sessions so that a one-to-one relationship is established for opened connections, avoiding failover to another HS2 instance

The above explanation holds good for any version of Hive.
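When sticky sessions are not an option, one client-side mitigation is to treat an Invalid OperationHandle error as a signal to open a fresh session and rerun the statement, since the handle is useless on the new HS2 instance anyway. A minimal sketch with a hypothetical client; the `connect`/`execute` interface here is an assumption for illustration, not a real Hive driver API:

```python
class InvalidOperationHandleError(Exception):
    """Raised when HS2 no longer recognizes the operation handle (e.g. after failover)."""

def run_with_reconnect(connect, sql, retries=2):
    """Run `sql`, reconnecting and retrying if the handle is invalidated by a failover.

    `connect` is a callable returning a session object with an `execute(sql)` method.
    """
    session = connect()
    for attempt in range(retries + 1):
        try:
            return session.execute(sql)
        except InvalidOperationHandleError:
            if attempt == retries:
                raise
            # The new HS2 instance does not know our handle; open a fresh session.
            session = connect()
```

This only masks the symptom; the root causes listed above (slow queries, overloaded HS2, queue capacity) still need to be addressed.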
03-11-2022
05:00 AM
@useryy2 I have tried the same query in Hive-3.1.3000.7.1.7.0-551 and did not get any error:

+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
| CREATE EXTERNAL TABLE `testexplode`( |
| `name` string, |
| `childs` array<string>, |
| `amap` map<string,string>) |
| ROW FORMAT SERDE |
| 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' |
| WITH SERDEPROPERTIES ( |
| 'collection.delim'=',', |
| 'field.delim'='\t', |
| 'line.delim'='\n', |
| 'mapkey.delim'=':', |
| 'serialization.format'='\t') |
| STORED AS INPUTFORMAT |
| 'org.apache.hadoop.mapred.TextInputFormat' |
| OUTPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
| LOCATION |
| 'hdfs://c1448-xxxx.cloudera.com:8020/warehouse/tablespace/external/hive/testexplode' |
| TBLPROPERTIES ( |
| 'bucketing_version'='2', |
| 'transient_lastDdlTime'='1646831327') |
+----------------------------------------------------+ And the query result is 0: jdbc:hive2://c1448-xxxx.coelab.cloudera.c> select explode(childs) from testexplode;
INFO : Compiling command(queryId=hive_20220311124936_b4346a2e-121e-4d4c-8c8f-b05ce3cfc4c9): select explode(childs) from testexplode
INFO : No Stats for default@testexplode, Columns: amap, name, childs
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:col, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20220311124936_b4346a2e-121e-4d4c-8c8f-b05ce3cfc4c9); Time taken: 2.262 seconds
INFO : Executing command(queryId=hive_20220311124936_b4346a2e-121e-4d4c-8c8f-b05ce3cfc4c9): select explode(childs) from testexplode
INFO : Completed executing command(queryId=hive_20220311124936_b4346a2e-121e-4d4c-8c8f-b05ce3cfc4c9); Time taken: 0.062 seconds
INFO : OK
+---------+
| col |
+---------+
| child1 |
| child2 |
| child3 |
| child4 |
| child5 |
| child6 |
| child7 |
| child8 |
+---------+
8 rows selected (2.626 seconds)

Could you please try connecting beeline with the --verbose=true flag and share the complete stack trace so we can see the exact cause of the problem. Also, please share the Hive version being used.
02-18-2022
12:59 AM
1 Kudo
@vladenache The issue seems to be with the field

FieldSchema(name:80t_lab.fan_glo (q5), type:tinyint, comment:null)

Please check the table schema for the above field and correct its name. From the attached stack trace, Hive is trying to identify q5 as a data type and failing because no such type exists.
02-02-2022
12:06 AM
Hi @vladenache Could you please share the next 50 lines after the line below:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.parquet.schema.OriginalType.q5

or the full output of the beeline console, which includes the full stack trace.
01-31-2022
10:20 PM
1 Kudo
Hi @vladenache From the pasted stack trace, we can see that the enum constant q5 does not exist in org.apache.parquet.schema.OriginalType; for the code, please refer to https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/schema/OriginalType.java

So the problem could be with:
- Incompatible versions of Parquet/Hive in use; please let us know which versions of Hive and Parquet are in use (the Hadoop distribution version also helps)
- Source data format issues; please share the schema of the source table, and sample data would also help

Please share the full stack trace to understand more about the code execution path while the query is executed.
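For context, the failure comes from Java's Enum.valueOf lookup: the Parquet metadata names a logical type q5 that is not among the OriginalType constants. A tiny Python sketch of the same check; the constant set below is an abbreviated, illustrative subset of the real enum, not the full list:

```python
# Abbreviated, illustrative subset of org.apache.parquet.schema.OriginalType constants
ORIGINAL_TYPES = {
    "UTF8", "MAP", "LIST", "ENUM", "DECIMAL", "DATE",
    "INT_8", "INT_16", "INT_32", "INT_64",
}

def lookup_original_type(name):
    """Mimic Enum.valueOf: fail loudly when metadata names an unknown logical type."""
    if name not in ORIGINAL_TYPES:
        raise ValueError(
            f"No enum constant org.apache.parquet.schema.OriginalType.{name}"
        )
    return name
```

`lookup_original_type("q5")` raises with the same message seen in the stack trace, which is why a corrupt or incompatible type annotation in the file or table schema produces this error.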