Tez throwing NullPointerException on Group by query

I am trying to run the queries below in Hive using Tez, and they fail with a NullPointerException, whereas the same queries run fine with the MR execution engine. We are using HDP 2.3.2.

select count(*) from (select serial_number from hive_demo.gdwi_test group by serial_number) q;

select serial_number from hive_demo.gdwi_test group by serial_number;

Both of these queries throw a NullPointerException:

Vertex failed, vertexName=Map 1, vertexId=vertex_1468116141308_1525_2_00, diagnostics=[Vertex vertex_1468116141308_1525_2_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: gdwi_test initializer failed, vertex=vertex_1468116141308_1525_2_00 [Map 1], java.lang.NullPointerException
    at org.apache.hadoop.io.Text.encode(Text.java:450)
    at org.apache.hadoop.io.Text.encode(Text.java:431)
    at org.apache.hadoop.io.Text.writeString(Text.java:480)
    at org.apache.hadoop.mapred.split.TezGroupedSplit.write(TezGroupedSplit.java:101)
    at org.apache.tez.mapreduce.hadoop.MRInputHelpers.createSplitProto(MRInputHelpers.java:249)
    at org.apache.tez.mapreduce.hadoop.InputSplitInfoMem.createSplitsProto(InputSplitInfoMem.java:168)
    at org.apache.tez.mapreduce.hadoop.InputSplitInfoMem.getSplitsProto(InputSplitInfoMem.java:117)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.createEventList(HiveSplitGenerator.java:200)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:180)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:246)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:240)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:240)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:227)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Re: Tez throwing NullPointerException on Group by query

Hi @Kashif Khan. In both of these queries the 'GROUP BY serial_number' clause is not necessary. It is redundant processing, since you are not aggregating by serial_number. If you remove the GROUP BY it should work fine.

(I do agree that the Optimizer should just ignore the redundant code). ;-)

Re: Tez throwing NullPointerException on Group by query

The GROUP BY is used to dedup the rows and count the distinct serial_number values.

Re: Tez throwing NullPointerException on Group by query

Aha. My apologies @Kashif Khan. I've always used 'SELECT DISTINCT' and 'SELECT count(distinct <col>)' for that:

select count(distinct serial_number) from hive_demo.gdwi_test;  
select distinct serial_number from hive_demo.gdwi_test; 

Re: Tez throwing NullPointerException on Group by query

Hi @Kashif Khan

Is the table gdwi_test stored in ORC format? Are there any indexes on the table?

Re: Tez throwing NullPointerException on Group by query

The table is using TextInputFormat.

Re: Tez throwing NullPointerException on Group by query

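The InputFormat name matters in this case.

The NPE comes from Text.writeString(out, wrappedInputFormatName) in TezGroupedSplit.write, which suggests that wrappedInputFormatName is null when the grouped split is serialized.

The output of DESCRIBE FORMATTED for the table (in particular its InputFormat entry) is more relevant than the query pattern.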
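
To illustrate why a missing input format name would surface as this kind of NPE, here is a minimal sketch in plain Java. It assumes a simplified, hypothetical writeString helper that only mirrors the failure mode; it is not Hadoop's actual Text.writeString code, and the class and method names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

class NullNameDemo {

    // Hypothetical stand-in for a length-prefixed UTF-8 string writer.
    // It is NOT Hadoop's implementation; it only reproduces the same failure mode.
    static void writeString(DataOutput out, String s) throws IOException {
        // s.toCharArray() throws NullPointerException when s is null.
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(s.toCharArray()));
        byte[] bytes = new byte[encoded.remaining()];
        encoded.get(bytes);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Returns true if serializing the given input format name fails with an NPE.
    static boolean throwsNpe(String inputFormatName) {
        try {
            writeString(new DataOutputStream(new ByteArrayOutputStream()), inputFormatName);
            return false;
        } catch (NullPointerException e) {
            return true;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A null name fails while encoding, before anything is written.
        System.out.println(throwsNpe(null)); // prints true
        // A real class name serializes without error.
        System.out.println(throwsNpe("org.apache.hadoop.mapred.TextInputFormat")); // prints false
    }
}
```

If the same thing is happening here, the query shape (GROUP BY versus DISTINCT) would not matter, because the failure occurs while the splits are serialized, before any rows are read.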
