Member since: 03-16-2020
Posts: 337
Kudos Received: 3
Solutions: 1

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5238 | 02-18-2022 12:59 AM
08-13-2024
04:33 AM
1 Kudo
Could you please try the steps detailed at https://hub.docker.com/r/apache/hive. If you want to use your own core-site.xml/hdfs-site.xml/yarn-site.xml or hive-site.xml for the service, you can point the container at them with the HIVE_CUSTOM_CONF_DIR environment variable. For example, put the custom configuration files under /opt/hive/conf and run:

docker run -d -p 9083:9083 --env SERVICE_NAME=metastore \
  --env DB_DRIVER=postgres \
  -v /opt/hive/conf:/hive_custom_conf \
  --env HIVE_CUSTOM_CONF_DIR=/hive_custom_conf \
  --name metastore apache/hive:${HIVE_VERSION}
04-24-2023
06:12 AM
Please check https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/migrate-hive-workloads/topics/hive-repl-load-overview.html and try Hive REPL between HDP 3 and CDP 7 to replicate HMS metadata, with or without the actual data.
04-24-2023
06:00 AM
Please check:
- the number of open connections for each HiveServer2/HiveMetastore instance in the CM UI, to see if the HS2/HMS instances are overloaded with a high number of client connections
- whether the HMS backend database is performing optimally
- the performance of the KDC or AD server, if authentication is enabled
- whether there are high JVM pauses in the CM charts for HS2/HMS, or traces in the logs matching "Detected pause in JVM or host machine (eg GC)"

If there is no abnormality in any of the above, we might need to collect jstacks for all three processes (beeline, HS2, and HMS) to confirm where the slowness that hangs the connection is coming from.
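As a quick illustration of the JVM-pause check above, here is a minimal sketch that scans a role log for the JvmPauseMonitor message and reports long pauses. The exact log format is an assumption based on the typical message shape; adjust the pattern to your actual logs:

```python
import re

# Typical JvmPauseMonitor message (format assumed; adjust to your logs):
#   Detected pause in JVM or host machine (eg GC): pause of approximately 4123ms
PAUSE_RE = re.compile(
    r"Detected pause in JVM or host machine \(eg GC\): "
    r"pause of approximately (\d+)ms"
)

def find_jvm_pauses(log_lines, threshold_ms=1000):
    """Return the pause durations (in ms) at or above threshold_ms found in the log."""
    pauses = []
    for line in log_lines:
        m = PAUSE_RE.search(line)
        if m:
            ms = int(m.group(1))
            if ms >= threshold_ms:
                pauses.append(ms)
    return pauses
```

Feed it the HS2/HMS role logs; frequent multi-second pauses usually point at GC pressure or an overloaded host rather than a Hive-level problem.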
01-23-2023
08:41 AM
java.lang.NullPointerException
at java.lang.System.arraycopy(Native Method)
at org.apache.hadoop.io.Text.set(Text.java:225)
at org.apache.orc.impl.StringRedBlackTree.add(StringRedBlackTree.java:59)
at org.apache.orc.impl.writer.StringTreeWriter.writeBatch(StringTreeWriter.java:70)
at org.apache.orc.impl.writer.StructTreeWriter.writeFields(StructTreeWriter.java:64)
at org.apache.orc.impl.writer.StructTreeWriter.writeBatch(StructTreeWriter.java:78)
at org.apache.orc.impl.writer.StructTreeWriter.writeRootBatch(StructTreeWriter.java:56)
at org.apache.orc.impl.WriterImpl.addRowBatch(WriterImpl.java:557)

The above error is thrown when there is a schema mismatch between the table metadata and the ORC file. For example, the table metadata is

create table test(str string);

but the ORC file dump shows

Type: struct<str:int>

Please correct the schema and try again.
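To confirm such a mismatch, you can dump the ORC file schema (for example with `hive --orcfiledump <path>`) and compare it against the table DDL. A minimal sketch of that comparison; the column dictionaries here are illustrative, not read from real files:

```python
def find_schema_mismatches(table_cols, file_cols):
    """Compare Hive table column types against ORC file column types.

    Both arguments are {column_name: type_name} dicts; returns a list of
    (column, table_type, file_type) tuples for every mismatching column.
    """
    mismatches = []
    for col, table_type in table_cols.items():
        file_type = file_cols.get(col)
        if file_type is not None and file_type != table_type:
            mismatches.append((col, table_type, file_type))
    return mismatches

# The table declares `str string`, but the ORC file was written as struct<str:int>
print(find_schema_mismatches({"str": "string"}, {"str": "int"}))
# [('str', 'string', 'int')]
```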
01-20-2023
08:52 AM
In case of HWC, the user query is processed by the HWC API connecting to the HS2 server, where HS2 executes the query either within HS2 itself or in Tez/LLAP daemons. In case of the Spark API, Spark's own framework executes the query after fetching the necessary table metadata from HMS. Please refer to the articles below to know more about HWC:
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_hivewarehouseconnector_for_handling_apache_spark_data.html
https://community.cloudera.com/t5/Community-Articles/Integrating-Apache-Hive-with-Apache-Spark-Hive-Warehouse/ta-p/249035
01-20-2023
08:06 AM
Invalid OperationHandle: OperationHandle

This exception occurs when there are multiple HiveServer2 instances and clients access them through Zookeeper/Knox with failover configured. When a query (irrespective of the number of rows) takes too long and HS2 is not able to respond within the defined timeout, ZK/Knox fails over to the next available HS2. Since the other HS2 is unaware of the query/OperationHandle, it throws the Invalid OperationHandle exception.

To solve this problem:
- Check if the query can be optimized to run faster, e.g. by adding a filter, or by splitting the data into multiple tables and querying them in separate queries
- Check if HS2 is utilized beyond its capacity, e.g. 200 connections at a given point in time against a 24 GB heap for HS2/HMS, or an HMS backend database not able to keep up with requests from HMS
- Check that the YARN queue has enough capacity to serve the query; otherwise the query will be in a waiting state
- Check that HDFS is healthy and the Namenode is able to respond to requests without delays; sometimes Ranger needs to check too many files/directories in HDFS before the query gets executed
- If a load balancer is used, enable sticky sessions so that a one-to-one relationship is established for opened connections, avoiding failover to another HS2 instance

The above explanation holds good for any version of Hive.
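When sticky sessions are not an option, one client-side mitigation is to treat an Invalid OperationHandle error as a signal to open a fresh session and rerun the statement, since the handle is useless on the new HS2 instance anyway. A minimal sketch with a hypothetical client; the `connect`/`execute` interface here is an assumption for illustration, not a real Hive driver API:

```python
class InvalidOperationHandleError(Exception):
    """Raised when HS2 no longer recognizes the operation handle (e.g. after failover)."""

def run_with_reconnect(connect, sql, retries=2):
    """Run `sql`, reconnecting and retrying if the handle is invalidated by a failover.

    `connect` is a callable returning a session object with an `execute(sql)` method.
    """
    session = connect()
    for attempt in range(retries + 1):
        try:
            return session.execute(sql)
        except InvalidOperationHandleError:
            if attempt == retries:
                raise
            # The new HS2 instance does not know our handle; open a fresh session.
            session = connect()
```

This only masks the symptom; the root causes listed above (slow queries, overloaded HS2, queue capacity) still need to be addressed.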
03-11-2022
05:00 AM
@useryy2 I have tried the same query in Hive-3.1.3000.7.1.7.0-551 and did not get any error:

+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
| CREATE EXTERNAL TABLE `testexplode`( |
| `name` string, |
| `childs` array<string>, |
| `amap` map<string,string>) |
| ROW FORMAT SERDE |
| 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' |
| WITH SERDEPROPERTIES ( |
| 'collection.delim'=',', |
| 'field.delim'='\t', |
| 'line.delim'='\n', |
| 'mapkey.delim'=':', |
| 'serialization.format'='\t') |
| STORED AS INPUTFORMAT |
| 'org.apache.hadoop.mapred.TextInputFormat' |
| OUTPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
| LOCATION |
| 'hdfs://c1448-xxxx.cloudera.com:8020/warehouse/tablespace/external/hive/testexplode' |
| TBLPROPERTIES ( |
| 'bucketing_version'='2', |
| 'transient_lastDdlTime'='1646831327') |
+----------------------------------------------------+ And the query result is 0: jdbc:hive2://c1448-xxxx.coelab.cloudera.c> select explode(childs) from testexplode;
INFO : Compiling command(queryId=hive_20220311124936_b4346a2e-121e-4d4c-8c8f-b05ce3cfc4c9): select explode(childs) from testexplode
INFO : No Stats for default@testexplode, Columns: amap, name, childs
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:col, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20220311124936_b4346a2e-121e-4d4c-8c8f-b05ce3cfc4c9); Time taken: 2.262 seconds
INFO : Executing command(queryId=hive_20220311124936_b4346a2e-121e-4d4c-8c8f-b05ce3cfc4c9): select explode(childs) from testexplode
INFO : Completed executing command(queryId=hive_20220311124936_b4346a2e-121e-4d4c-8c8f-b05ce3cfc4c9); Time taken: 0.062 seconds
INFO : OK
+---------+
| col |
+---------+
| child1 |
| child2 |
| child3 |
| child4 |
| child5 |
| child6 |
| child7 |
| child8 |
+---------+
8 rows selected (2.626 seconds)

Could you please try connecting beeline with the --verbose=true flag and share the complete stack trace so we can see the exact cause of the problem. Also, please share the Hive version being used.
02-18-2022
12:59 AM
1 Kudo
@vladenache The issue seems to be with the field

FieldSchema(name:80t_lab.fan_glo (q5), type:tinyint, comment:null)

Please check the table schema for the above field and correct its name. From the attached stack trace, Hive is trying to identify q5 as a data type and failing because no such type exists.
02-02-2022
12:06 AM
Hi @vladenache Could you please share the next 50 lines after the line below:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.parquet.schema.OriginalType.q5

or the full output of the beeline console, which includes the full stack trace.
01-31-2022
10:20 PM
1 Kudo
Hi @vladenache From the pasted stack trace, we can see that the enum constant q5 does not exist in org.apache.parquet.schema.OriginalType; for the code, please refer to https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/schema/OriginalType.java

So the problem could be with:
- Incompatible versions of Parquet/Hive in use; please let us know which versions of Hive and Parquet are in use (the Hadoop distribution version also helps)
- Source data format issues; please share the schema of the source table, and sample data would also help

Please share the full stack trace to understand more about the code execution path while the query is executed.
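For context, the failure comes from Java's Enum.valueOf lookup: the Parquet metadata names a logical type q5 that is not among the OriginalType constants. A tiny Python sketch of the same check; the constant set below is an abbreviated, illustrative subset of the real enum, not the full list:

```python
# Abbreviated, illustrative subset of org.apache.parquet.schema.OriginalType constants
ORIGINAL_TYPES = {
    "UTF8", "MAP", "LIST", "ENUM", "DECIMAL", "DATE",
    "INT_8", "INT_16", "INT_32", "INT_64",
}

def lookup_original_type(name):
    """Mimic Enum.valueOf: fail loudly when metadata names an unknown logical type."""
    if name not in ORIGINAL_TYPES:
        raise ValueError(
            f"No enum constant org.apache.parquet.schema.OriginalType.{name}"
        )
    return name
```

`lookup_original_type("q5")` raises with the same message seen in the stack trace, which is why a corrupt or incompatible type annotation in the file or table schema produces this error.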