Member since
08-15-2019
29
Posts
111
Kudos Received
5
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1607 | 09-08-2017 11:30 PM
 | 2484 | 06-08-2017 07:24 PM
 | 6315 | 03-28-2017 05:20 PM
 | 3706 | 03-17-2017 04:27 AM
 | 3135 | 03-09-2017 11:48 PM
09-08-2024
10:36 PM
With Hive (newer than Hive 2.2), you can use MERGE INTO:
MERGE INTO target_table AS target
USING source_table AS source
ON target.id = source.id
WHEN MATCHED THEN
UPDATE SET
target.name = source.name,
target.age = source.age
WHEN NOT MATCHED THEN
INSERT (id, name, age)
VALUES (source.id, source.name, source.age);
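Note that Hive MERGE requires the target table to be transactional (ACID), i.e. stored as ORC with transactions enabled. A minimal sketch of a compatible target table, with illustrative table and column names matching the statement above:

```sql
-- Illustrative only: names are placeholders taken from the MERGE example.
-- MERGE requires an ACID target: ORC storage with transactions enabled.
CREATE TABLE target_table (
  id   INT,
  name STRING,
  age  INT
)
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');
```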
07-23-2020
07:28 AM
Run the msck command for the table you want to truncate in the hive shell:
hive> use <database-name>;
hive> msck repair table <table-name>;
If it shows any error, rectify it first. In our case one of the partitions was missing, so we created that partition directory on the HDFS location and re-ran the msck repair command, after which it reported no issue. The truncate command will then run successfully:
hive> truncate table <table-name>;
[NOTE: Please update the database and table names as per your requirement]
03-13-2018
09:49 PM
5 Kudos
For the full stack trace of the error message from Hive, you could look into the HiveServer2 log, as @Slim mentioned. Since Hive Interactive is enabled, I believe you should look into the hsihiveserver log on the node where HiveServer2 Interactive is running.
09-08-2017
11:41 PM
1 Kudo
That explains it, thanks!
07-23-2017
06:50 AM
4 Kudos
Consolidating below some of the errors thrown by Spark Thrift Server during SQL execution that can be worked around by configuring certain parameters in spark-thrift-sparkconf.conf and hive-site.xml.
Error 1:
Join condition is missing or trivial.Use the CROSS JOIN syntax to allow cartesian products between these relations.;
Resolution: spark.sql.crossJoin.enabled: true
Error 2:
Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method "eval(Lorg/apache/spark/sql/catalyst/InternalRow;)Z" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate" grows beyond 64 KB
Resolution: spark.sql.codegen.wholeStage : false
Error 3:
java.lang.OutOfMemoryError: Java heap space
Resolution: spark.driver.memory: 10g (or a higher value)
spark.sql.ui.retainedExecutions: 5 (or a lower value)
Error 4:
org.apache.spark.SparkException: Exception thrown in awaitResult: (state=,code=0)
Resolution: hive.metastore.try.direct.sql: false (in hive-site.xml)
To enable heap dump collection for the Spark driver and executors (useful for debugging out-of-memory errors):
spark.driver.extraJavaOptions: '-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<path-to-dump-file-location>'
spark.executor.extraJavaOptions: '-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<path-to-dump-file-location>'
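Put together, the workarounds above would look like the following in spark-thrift-sparkconf.conf (the values are illustrative; tune them for your workload, and leave the dump path placeholder for your own location):

```
spark.sql.crossJoin.enabled       true
spark.sql.codegen.wholeStage      false
spark.driver.memory               10g
spark.sql.ui.retainedExecutions   5
spark.driver.extraJavaOptions     -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<path-to-dump-file-location>
spark.executor.extraJavaOptions   -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<path-to-dump-file-location>
```

Remember that hive.metastore.try.direct.sql belongs in hive-site.xml, not in the Spark configuration file.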
07-13-2017
10:39 PM
2 Kudos
If the issue is that the number of rows is too high, starting beeline with beeline --incremental=true will help.
06-29-2017
09:05 PM
Thanks @Kshitij Badani, I will configure a local user as I am about to try some Spark queries.
04-28-2017
11:12 PM
12 Kudos
Hive provides an improved explain plan when hive.explain.user is set to true. The indentation and arrows now clearly depict the execution flow. Retrieving the join order used to require a lot of back-and-forth references in the old style; now it is as easy as following the arrows, even with the details of the mappers and reducers.
Let us take a look at one sample explain plan and retrieve the join order.
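A quick sketch of producing such a plan in the Hive shell (the query itself is whatever statement you want to analyze):

```
hive> set hive.explain.user=true;
hive> explain <your-query>;
```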
Plan optimized by CBO.
Vertex dependency in root stage
Map 2 <- Map 1 (BROADCAST_EDGE), Map 4 (BROADCAST_EDGE), Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE), Map 8 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE)
Reducer 3 <- Map 2 (SIMPLE_EDGE)
Stage-0
Fetch Operator
limit:-1
Stage-1
Reducer 3 vectorized, llap
File Output Operator [FS_101]
Group By Operator [GBY_100] (rows=1 width=736)
Output:["_col0","_col1","_col2","_col3"],aggregations:["min(VALUE._col0)","min(VALUE._col1)","min(VALUE._col2)","min(VALUE._col3)"]
<-Map 2 [SIMPLE_EDGE] llap
SHUFFLE [RS_45]
Group By Operator [GBY_44] (rows=1 width=736)
Output:["_col0","_col1","_col2","_col3"],aggregations:["min(_col1)","min(_col3)","min(_col14)","min(_col19)"]
Map Join Operator [MAPJOIN_92] (rows=80573328660391072 width=114)
Conds:MAPJOIN_91._col11, _col7=RS_41._col0, _col0(Inner),Output:["_col1","_col3","_col14","_col19"]
<-Map 9 [BROADCAST_EDGE] llap
BROADCAST [RS_41]
PartitionCols:_col0, _col0
Select Operator [SEL_23] (rows=1960358 width=104)
Output:["_col0","_col1"]
Filter Operator [FIL_86] (rows=1960358 width=104)
predicate:id is not null
TableScan [TS_21] (rows=1960358 width=104)
default@title,t,Tbl:COMPLETE,Col:NONE,Output:["id","title"]
<-Map Join Operator [MAPJOIN_91] (rows=73248479012737488 width=125)
Conds:MAPJOIN_90._col10=RS_38._col0(Inner),Output:["_col1","_col3","_col7","_col11","_col14"]
<-Map 8 [BROADCAST_EDGE] llap
BROADCAST [RS_38]
PartitionCols:_col0
Select Operator [SEL_20] (rows=2 width=40)
Output:["_col0"]
Filter Operator [FIL_85] (rows=2 width=40)
predicate:((UDFToString(role) = 'actress') and id is not null)
TableScan [TS_18] (rows=4 width=40)
default@role_type,rt,Tbl:COMPLETE,Col:NONE,Output:["id","role"]
<-Map Join Operator [MAPJOIN_90] (rows=66589524931926816 width=138)
Conds:MAPJOIN_89._col0, _col6=RS_35._col0, _col0(Inner),Output:["_col1","_col3","_col7","_col10","_col11","_col14"]
<-Map 7 [BROADCAST_EDGE] llap
BROADCAST [RS_35]
PartitionCols:_col0, _col0
Select Operator [SEL_17] (rows=764741 width=105)
Output:["_col0","_col1"]
Filter Operator [FIL_84] (rows=764741 width=105)
predicate:((UDFToString(gender) = 'f') and (name like '%An%') and id is not null)
TableScan [TS_15] (rows=3058966 width=105)
default@name,n,Tbl:COMPLETE,Col:NONE,Output:["id","name","gender"]
<-Map Join Operator [MAPJOIN_89] (rows=60535930444216152 width=152)
Conds:MAPJOIN_88._col7, _col4=RS_32._col0, _col1(Inner),Output:["_col0","_col1","_col3","_col6","_col7","_col10","_col11"]
<-Map 6 [BROADCAST_EDGE] llap
BROADCAST [RS_32]
PartitionCols:_col0, _col1
Select Operator [SEL_14] (rows=11636372 width=8)
Output:["_col0","_col1"]
Filter Operator [FIL_83] (rows=11636372 width=8)
predicate:(movie_id is not null and company_id is not null)
TableScan [TS_12] (rows=11636372 width=8)
default@movie_companies,mc,Tbl:COMPLETE,Col:NONE,Output:["movie_id","company_id"]
<-Map Join Operator [MAPJOIN_88] (rows=55032662847395656 width=167)
Conds:MAPJOIN_87._col0, _col2=RS_29._col0, _col2(Inner),Output:["_col0","_col1","_col3","_col4","_col6","_col7","_col10"]
<-Map 4 [BROADCAST_EDGE] llap
BROADCAST [RS_29]
PartitionCols:_col0, _col2
Select Operator [SEL_8] (rows=6110483 width=116)
Output:["_col0","_col1","_col2","_col4"]
Filter Operator [FIL_81] (rows=6110483 width=116)
predicate:((note) IN ('(voice)', '(voice: Japanese version)', '(voice) (uncredited)', '(voice: English version)') and person_id is not null and person_role_id is not null and movie_id is not null and role_id is not null)
TableScan [TS_6] (rows=12220966 width=116)
default@cast_info,ci,Tbl:COMPLETE,Col:NONE,Output:["person_id","movie_id","person_role_id","note","role_id"]
<-Map Join Operator [MAPJOIN_87] (rows=50029692413268030 width=184)
Conds:(Inner),(Inner),Output:["_col0","_col1","_col2","_col3","_col4"]
<-Map 1 [BROADCAST_EDGE] llap
BROADCAST [RS_24]
Select Operator [SEL_2] (rows=701938 width=104)
Output:["_col0","_col1"]
Filter Operator [FIL_79] (rows=701938 width=104)
predicate:person_id is not null
TableScan [TS_0] (rows=701938 width=104)
default@aka_name,an,Tbl:COMPLETE,Col:NONE,Output:["person_id","name"]
<-Map 5 [BROADCAST_EDGE] llap
BROADCAST [RS_26]
Select Operator [SEL_11] (rows=34365 width=259)
Output:["_col0"]
Filter Operator [FIL_82] (rows=34365 width=259)
predicate:((UDFToString(country_code) = '[us]') and id is not null)
TableScan [TS_9] (rows=68731 width=259)
default@company_name,cn,Tbl:COMPLETE,Col:NONE,Output:["id","country_code"]
<-Select Operator [SEL_5] (rows=2074019 width=104)
Output:["_col0","_col1"]
Filter Operator [FIL_80] (rows=2074019 width=104)
predicate:id is not null
TableScan [TS_3] (rows=2074019 width=104)
default@char_name,chn,Tbl:COMPLETE,Col:NONE,Output:["id","name"]
1. Start reading from the bottom.
2. Look for the keyword 'TableScan'; the row below it has the details of the table being scanned.
3. Look for the keyword 'Join Operator'; the indentation of this line tells us which tables are part of this join.
4. This joined table group becomes a candidate for another join if the line starts with '<-'.
5. Keep going up in the same way to get the complete join order.
The join order in the above example boils down to (title(t),(((movie_companies(mc),(cast_info(ci),(company_name(cn),aka_name(an),char_name(chn)))),name(n)),role_type(rt)))
04-28-2017
05:10 PM
7 Kudos
Looks like the data was removed from HDFS, but no drop partition query was executed on the Hive side. The output of 'show partitions <table_name>' comes from the Hive metastore DB, which holds the information about table partitions but not the actual data, which is stored in HDFS. As for the data missing from HDFS, it might be worth checking hdfs-audit.log to trace the lifecycle of the directory/file.
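To bring the metastore back in sync with HDFS, one option is to drop the stale partition explicitly. A sketch, where the partition column and value are placeholders for your own spec:

```sql
-- Placeholder partition spec; adjust to your table's partition column.
ALTER TABLE <table_name> DROP IF EXISTS PARTITION (dt='2017-04-28');
```

On newer Hive releases, MSCK REPAIR TABLE with SYNC PARTITIONS can also remove metastore partitions whose directories no longer exist on HDFS; on older versions, MSCK only adds missing partitions, so the explicit DROP above is the safer route.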
02-22-2019
10:32 PM
I am also facing the same issue in HDP 3. Has it been resolved?