Support Questions

itsssudheer · ‎09-22-2017

I am unable to query a Hive ACID table from Spark. I get the following exception:

Caused by: java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "0012050_0000"
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:998)
    ... 118 more
Caused by: java.lang.NumberFormatException: For input string: "0012050_0000"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:589)
    at java.lang.Long.parseLong(Long.java:631)
    at org.apache.hadoop.hive.ql.io.AcidUtils.parseDelta(AcidUtils.java:310)
    at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:379)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:634)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:620)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)

njayakumar · ‎09-22-2017

@sudheer

Could you please run a major compaction after the ETL ingestion?

Please find the ALTER statement below for reference:

alter table <<table_name>> compact 'MAJOR';

deepesh1 · ‎09-22-2017

Reading Hive ACID ORC data from Spark has known issues. The data does not appear to be visible unless the table has been compacted at least once; see SPARK-16996. More generally, operations on Hive ACID ORC tables are currently not supported from Spark; see SPARK-15348.
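For anyone trying to make sense of the NumberFormatException in the question: the trace shows AcidUtils.parseDelta handing the string "0012050_0000" to Long.parseLong. One plausible reading, which is an assumption on my part and not something confirmed in this thread, is that the reader expects delta directories named delta_<minTxn>_<maxTxn>, while the writer produced a name with an extra trailing part. The Java sketch below is not Hive's code. The class name, both method names, and the full directory name are hypothetical; only the "0012050_0000" tail comes from the trace.

```java
// Hypothetical illustration only; this is not the Hive AcidUtils source.
class DeltaNameDemo {
    static final String DELTA_PREFIX = "delta_";

    // Simplified two-part parser: assumes the name is delta_<minTxn>_<maxTxn>.
    static long[] parseTwoPart(String dirName) {
        String rest = dirName.substring(DELTA_PREFIX.length());
        int split = rest.indexOf('_');
        long min = Long.parseLong(rest.substring(0, split));
        // Everything after the first underscore is treated as one number,
        // so a third "_<suffix>" part makes parseLong throw.
        long max = Long.parseLong(rest.substring(split + 1));
        return new long[] {min, max};
    }

    // Tolerant variant: also accepts an optional third numeric part.
    // Returns {min, max, suffix}, with suffix = -1 when it is absent.
    static long[] parseWithOptionalSuffix(String dirName) {
        String[] parts = dirName.substring(DELTA_PREFIX.length()).split("_");
        long min = Long.parseLong(parts[0]);
        long max = Long.parseLong(parts[1]);
        long suffix = parts.length > 2 ? Long.parseLong(parts[2]) : -1L;
        return new long[] {min, max, suffix};
    }

    public static void main(String[] args) {
        // Assumed directory name; the leading transaction id is made up.
        String threePart = "delta_0012050_0012050_0000";
        try {
            parseTwoPart(threePart);
        } catch (NumberFormatException e) {
            System.out.println("two-part parser failed: " + e.getMessage());
        }
        long[] parsed = parseWithOptionalSuffix(threePart);
        System.out.println(parsed[0] + " " + parsed[1] + " " + parsed[2]);
    }
}
```

If this reading is right, it would also fit the compaction advice in this thread: once a major compaction has run and the old delta directories have been cleaned up, the reader no longer has such names to parse.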

thirun66 · ‎07-29-2018

Did this work?

maciek1 · ‎11-20-2017

@deepesh
I think there is also a problem with the Hive modifications made by Hortonworks.
Could you please look at my comments on the Spark JIRA issue SPARK-16996?

thirun66 · ‎07-29-2018

alter table <<table_name>> compact 'MAJOR';

Hi, did this approach work for anyone?

If it worked for you, can you help me understand which table needs this compaction?

In my case I have a master table mt1 and lookup tables t2, t3, t4, and t5. I am inserting the result of this select into the target table t6. All of these are ORC tables.

Thanks

Thirupathi Nerella

Cloudera Community

Query hive acid with spark