HIVE LLAP BUG IN HDP2.6

Explorer

select * from table_a a left join table_b b on a.id = b.id and a.name = 'aaa';

This query throws the following exception:

Caused by: java.lang.RuntimeException: cannot find field _col7 from [0:key, 1:value]
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:485)
    at

.................

I think it tries to look up the column "name" in table b. Table b has no such column, so this runtime exception is thrown.

Could you check whether this is a bug or something else?

How can I fix it?

By the way, this query runs without exceptions in Hive 1.2.
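One way to narrow down a failure like this is to inspect the plan and then turn off one optimizer feature at a time in the session. The sketch below uses standard Hive session properties, but nothing in this thread establishes that any of them is involved in the exception; they are diagnostic guesses, to be tried one at a time and reset afterwards.

```sql
-- Show the plan Hive builds for the failing query (no data is read).
explain
select * from table_a a left join table_b b on a.id = b.id and a.name = 'aaa';

-- Diagnostic toggles (assumptions, not a confirmed fix). Change one, rerun the query,
-- then restore it before trying the next.
set hive.auto.convert.join=false;            -- disable automatic map-join conversion
set hive.vectorized.execution.enabled=false; -- disable vectorized execution
set hive.optimize.ppd=false;                 -- disable predicate pushdown
set hive.cbo.enable=false;                   -- disable the cost-based optimizer

select * from table_a a left join table_b b on a.id = b.id and a.name = 'aaa';
```

If the query succeeds with exactly one of these settings changed, that setting and the two EXPLAIN outputs are useful details to include in a bug report.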

4 REPLIES

Re: HIVE LLAP BUG IN HDP2.6

I verified a similar scenario on HDP 2.6.2 and it works as expected:

0: jdbc:hive2://p:21> select * from table_a a left join table_b b on a.id = b.id and a.name = 'aaa';

INFO  : Compiling command(queryId=hive_20180404182111_e6a8b2f1-4c8f-458a-a731-4707f04b8156): select * from table_a a left join table_b b on a.id = b.id and a.name = 'aaa'

INFO  : We are setting the hadoop caller context from HIVE_SSN_ID:78c96242-f9ed-49cc-a06b-f3bcd9ef9010 to hive_20180404182111_e6a8b2f1-4c8f-458a-a731-4707f04b8156

INFO  : Semantic Analysis Completed

INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:a.id, type:int, comment:null), FieldSchema(name:a.name, type:string, comment:null), FieldSchema(name:b.id, type:int, comment:null), FieldSchema(name:b.name, type:string, comment:null)], properties:null)

INFO  : Completed compiling command(queryId=hive_20180404182111_e6a8b2f1-4c8f-458a-a731-4707f04b8156); Time taken: 4.431 seconds

INFO  : We are resetting the hadoop caller context to HIVE_SSN_ID:78c96242-f9ed-49cc-a06b-f3bcd9ef9010

INFO  : Concurrency mode is disabled, not creating a lock manager

INFO  : Setting caller context to query id hive_20180404182111_e6a8b2f1-4c8f-458a-a731-4707f04b8156

INFO  : Executing command(queryId=hive_20180404182111_e6a8b2f1-4c8f-458a-a731-4707f04b8156): select * from table_a a left join table_b b on a.id = b.id and a.name = 'aaa'

INFO  : Query ID = hive_20180404182111_e6a8b2f1-4c8f-458a-a731-4707f04b8156

INFO  : Total jobs = 1

INFO  : Launching Job 1 out of 1

INFO  : Starting task [Stage-1:MAPRED] in serial mode

INFO  : Session is already open

INFO  : Dag name: select * from table_a a left join ta...'aaa'(Stage-1)

INFO  : Setting tez.task.scale.memory.reserve-fraction to 0.30000001192092896

INFO  : Status: Running (Executing on YARN cluster with App id application_1521722626997_0006)
INFO  : Status: DAG finished successfully in 0.06 seconds

INFO  : 

INFO  : Query Execution Summary

INFO  : ----------------------------------------------------------------------------------------------

INFO  : OPERATION                            DURATION

INFO  : ----------------------------------------------------------------------------------------------

INFO  : Compile Query                           4.43s

INFO  : Prepare Plan                            0.76s

INFO  : Submit Plan                             1.10s

INFO  : Start DAG                               0.66s

INFO  : Run DAG                                 0.06s

INFO  : ----------------------------------------------------------------------------------------------

INFO  : 

INFO  : Task Execution Summary

INFO  : ----------------------------------------------------------------------------------------------

INFO  :   VERTICES      DURATION(ms)   CPU_TIME(ms)    GC_TIME(ms)   INPUT_RECORDS   OUTPUT_RECORDS

INFO  : ----------------------------------------------------------------------------------------------

INFO  :      Map 1              0.00              0              0               0                0

INFO  :      Map 2              0.00              0              0               0                0

INFO  : ----------------------------------------------------------------------------------------------

INFO  : 

INFO  : org.apache.tez.common.counters.DAGCounter:

INFO  :    AM_CPU_MILLISECONDS: 2480

INFO  :    AM_GC_TIME_MILLIS: 14

INFO  : Resetting the caller context to HIVE_SSN_ID:78c96242-f9ed-49cc-a06b-f3bcd9ef9010

INFO  : Completed executing command(queryId=hive_20180404182111_e6a8b2f1-4c8f-458a-a731-4707f04b8156); Time taken: 2.691 seconds

INFO  : OK

--------------------------------------------------------------------------------

        VERTICES        MODE  STATUS      TOTAL  COMPLETED  RUNNING  PENDING  FAILED

--------------------------------------------------------------------------------

Map 1                   llap  SUCCEEDED          0        0        0       0       0

Map 2                   llap  SUCCEEDED          0        0        0       0       0

--------------------------------------------------------------------------------

VERTICES: 00/02  [>>--------------------------] 0%    ELAPSED TIME: 2.13 s     

--------------------------------------------------------------------------------

+-------+---------+-------+---------+--+

| a.id  | a.name  | b.id  | b.name  |

+-------+---------+-------+---------+--+

+-------+---------+-------+---------+--+

No rows selected (9.252 seconds)

0: jdbc:hive2://p:21>  

Can you share the DDL of the tables?
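For reference, the DDL can be printed with Hive's SHOW CREATE TABLE statement. The table names below are the placeholders used in the query above, not the poster's real tables.

```sql
-- Print the full CREATE TABLE statement, including SerDe, storage format, and table properties.
show create table table_a;
show create table table_b;
```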

Re: HIVE LLAP BUG IN HDP2.6

Explorer

Here are the table structures.

CREATE TABLE `default.dual`(
  `id` int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'xxxxxxxxxxxxxxxxxxxxxxxxxxxx'
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"id\":\"true\"}}',
  'numFiles'='1',
  'numRows'='1',
  'rawDataSize'='4',
  'totalSize'='187',
  'transient_lastDdlTime'='1512564294')

CREATE TABLE `dafy_sales.ca_collect`(
  `id_credit` bigint,
  `creditee_name` string ,
  `creditee_ident` string ,
  `creditee_mobile` string ,
  `write_status` bigint ,
  `collected` bigint ',
  `json_report` string,
  `update_user` bigint,
  `update_time` timestamp,
  `update_ip` string,
  `last_code` string ,
  `collect_count` bigint,
  `credit_model` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'xxxxxxxxxxxxxxxxxxxxxxxxxxxx'
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}',
  'last_modified_by'='sys_admin',
  'last_modified_time'='1512014346',
  'numFiles'='1',
  'numRows'='125772',
  'rawDataSize'='6012679932',
  'totalSize'='639777965',
  'transient_lastDdlTime'='1523120303')

67557-capture.png

Re: HIVE LLAP BUG IN HDP2.6

Expert Contributor

You are mixing ` and ' in your post. Is that really what you are using for the definition, or is it just a copy-paste artifact?

I'd start with a simple table definition and keep adding features until you reproduce the error with LLAP. As @Sindhu showed, the simplest case does work, so it's likely that a specific feature you are using is triggering the error.
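A minimal sketch of that incremental approach follows. The two tables are hypothetical: their columns (id int, name string) come from the schema in the log output above, and ORC matches the posted DDL. The sample rows and the order in which features are added are assumptions.

```sql
-- Step 0: hypothetical minimal tables, stored as ORC like the tables in the posted DDL.
create table table_a (id int, name string) stored as orc;
create table table_b (id int, name string) stored as orc;
insert into table table_a values (1, 'aaa'), (2, 'bbb');
insert into table table_b values (1, 'x');

-- Step 1: the simplest case, a left join with a left-side filter in the ON clause.
select * from table_a a
left join table_b b on a.id = b.id and a.name = 'aaa';

-- Step 2: replace the plain filter with a window-function column computed in a subquery.
select * from (
  select t.*, row_number() over (partition by id order by name desc) as ranking
  from table_a t
) a
left join table_b b on a.id = b.id and a.ranking = 1;

-- Step 3: add the aggregation on top of the join.
select a.id, count(1) as cnt, count(distinct b.name) as names
from (
  select t.*, row_number() over (partition by id order by name desc) as ranking
  from table_a t
) a
left join table_b b on a.id = b.id and a.ranking = 1
group by a.id;
```

The first step that fails under LLAP points to the feature worth reporting; a union inside the innermost subquery would be the next thing to add if all three steps pass.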

Re: HIVE LLAP BUG IN HDP2.6

Explorer

Here is my HQL.

select trunc(detect_date,"MM") as detect_month,
       nvl(id_sa,a.contract_no) as id_sa,
       count(1) as hrisk_num,
       count(distinct case when instr(a.trigger_source,"WC")=0 then a.contract_no end) as alert_num
from (
  select b.*,
         row_number() over(partition by contract_no,detect_date,source order by row_1 desc) as ranking
  from (
    select cast(to_date(common_survey_time) as timestamp) as detect_date,
           cast(case_message_source as string) as source,
           c2.TRIGGER_SOURCE,
           c2.contract_no,
           cast(province as string) province,
           C2.source_result,
           1 as row_1
    from dafy_sales.case_basic_info c1
    join dafy_sales.case_detail_list c2 on c1.case_no=c2.case_no
    where instr(c1.STATUS,"xxxxxxxx")=0
    union
    select detect_date,source,trigger_source,contract_no,province,source_result,2 as row_1
    from risk_control.df_hrisk_list_hp
  ) b
) a
left join risk_control.df_contract_gl b on a.contract_no=b.contract_no and a.ranking=1
group by trunc(detect_date,"MM"),nvl(id_sa,a.contract_no);

If the condition "and a.ranking=1" is removed, this HQL runs; with it, the query throws the runtime exception.

68381-capture1.png

This HQL runs in Hive 1.2 with no errors.
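Since the failure depends on the left-side filter in the ON clause, one thing to try is expressing the same condition without a separate filter predicate. The sketch below folds the filter into the join key with a CASE expression. In standard SQL this returns the same rows, because a NULL key never matches and the left row is still kept with NULLs on the right. Whether it avoids the LLAP exception is an untested assumption, and it is shown on the simplified tables from the first post rather than on the full query.

```sql
-- Original form: filter on the left table inside the ON clause.
select * from table_a a
left join table_b b on a.id = b.id and a.name = 'aaa';

-- Rewritten form: the join key is NULL unless the filter holds, so rows of table_a
-- that fail the filter find no match and are returned with NULL columns from table_b.
select * from table_a a
left join table_b b
  on (case when a.name = 'aaa' then a.id end) = b.id;

-- Applied to the real query, only the join condition would change:
--   left join risk_control.df_contract_gl b
--     on (case when a.ranking = 1 then a.contract_no end) = b.contract_no
```

Moving the filter to a WHERE clause is not an equivalent alternative, because it would drop the left rows that fail the filter instead of keeping them with NULLs.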