
Error while inserting data into Hive from Spark in HDP 3



Hi,

I am using HDP 3.1 and I am aware of the Hive Warehouse Connector, its configuration, and its usage with Spark, as described here: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_configure_a_s...
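
For context, the HWC write path as I understand it from those docs is roughly the following. This is only a sketch: the SparkSession setup, the DataFrame, and the table name ("my_db.my_table") are placeholders, not my actual job.

import org.apache.spark.sql.SparkSession
import com.hortonworks.hwc.HiveWarehouseSession

val spark = SparkSession.builder().appName("hwc-sketch").getOrCreate()

// Build an HWC session on top of the SparkSession; HWC writes go through
// HiveServer2 rather than Spark's built-in Hive catalog client.
val hive = HiveWarehouseSession.session(spark).build()

// Stand-in for one batch of the streaming data.
val df = spark.range(10).toDF("id")

// Write the batch into a managed (transactional) Hive table.
df.write
  .format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR)
  .option("table", "my_db.my_table")
  .save()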

I am getting the error below while inserting data into a Hive table from a Spark Streaming job. I have made the required configuration changes in Ambari and pass the remaining properties when submitting the Spark application to YARN; a rough sketch of those properties follows, and the full error is after that.
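
The properties are along these lines, expressed here as SparkConf settings rather than --conf flags on spark-submit. The property names are the ones from the linked documentation; all hostnames, ports, and paths are placeholders for my environment.

import org.apache.spark.SparkConf

// HWC-related properties, normally passed as --conf on spark-submit.
// All values below are placeholders.
val conf = new SparkConf()
  .set("spark.sql.hive.hiveserver2.jdbc.url", "jdbc:hive2://hs2-host:10000/")   // HiveServer2 JDBC URL
  .set("spark.datasource.hive.warehouse.metastoreUri", "thrift://ms-host:9083") // Hive metastore URI
  .set("spark.datasource.hive.warehouse.load.staging.dir", "/tmp/hwc-staging")  // HDFS staging dir for batch writes
  .set("spark.hadoop.hive.zookeeper.quorum", "zk1:2181,zk2:2181,zk3:2181")      // ZooKeeper quorum for HS2 discovery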

org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. Cannot change stats state for a transactional table without providing the transactional write state for verification (new write ID -1, valid write IDs null; current state {"BASIC_STATS":"true","COLUMN_STATS":{"application_id":"true","business_date":"true","cash_drawer_id":"true","cashier_actioncode":"true","cashier_enddttm":"true","cashier_lineno":"true","cashier_linetype":"true","cashier_seqno":"true","cashier_signoffdrawerid":"true","cashier_signondrawerid":"true","customer_actioncode":"true","customer_custidnum":"true","customer_custidtype":"true","customer_displayprompt":"true","customer_dtofenrol":"true","customer_enddttm":"true","customer_firstnm":"true","customer_grpno":"true","customer_lineno":"true","customer_linetype":"true","customer_membershipno":"true","customer_membershiptype":"true","customer_online":"true","customer_onlineval":"true","customer_paxaddress1":"true","customer_paxaddress2":"true","customer_paxautoconsumo":"true","customer_paxcity":"true","customer_paxcntryorigin":"true","customer_paxcountry":"true","customer_paxethn":"true","customer_paxgender":"true","customer_paxgrpnum":"true","customer_paxoffline":"true","customer_paxpassportno":"true","customer_paxpecemail":"true","customer_paxpostalcode":"true","customer_paxpremierid":"true","customer_paxshipto":"true","customer_paxvatcode":"true","customer_phnum":"true","customer_refno":"true","customer_seqno":"true","customer_surname":"true","customer_udtid":"true","dhtxnmbr":"true","division":"true","hdr_srrgt_id":"true","native_division_id":"true","operator_id":"true","organization_id":"true","otc_flg":"true","pos_id":"true","process_date":"true","remote_print":"true","report_group":"true","retail_store_id":"true","site_id":"true","supervisor_id":"true","tax_actioncode":"true","tax_enddttm":"true","tax_lineno":"true","tax_linetype":"true","tax_seqno":"true","tax_taxableamt":"true","tax_taxamt":"true","trans_type":"true","txn_end_time":"true","txn_grand_amount":"true","txn_net_amount":"true","txn_nmbr":"true","txn_start_time":"true","txn_type":"true","udt_desc":"true","udt_id":"true","user_defined_seq_no":"true","version_no":"true"}}; new state null;
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
    at org.apache.spark.sql.hive.HiveExternalCatalog.doAlterTableStats(HiveExternalCatalog.scala:662)
    at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.alterTableStats(ExternalCatalog.scala:187)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.alterTableStats(SessionCatalog.scala:376)
    at org.apache.spark.sql.execution.command.CommandUtils$.updateTableStats(CommandUtils.scala:45)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:177)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:115)
    at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
    at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
    at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
    at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
    at com.dfs.datahub.V4POSStreamEDPDev$$anonfun$createContext$1.apply(V4POSStreamEDPDev.scala:644)
    at com.dfs.datahub.V4POSStreamEDPDev$$anonfun$createContext$1.apply(V4POSStreamEDPDev.scala:519)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:628)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:628)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
    at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
    at scala.util.Try$.apply(Try.scala:192)
    at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. Cannot change stats state for a transactional table without providing the transactional write state for verification (new write ID -1, valid write IDs null; current state {"BASIC_STATS":"true","COLUMN_STATS":{"application_id":"true","business_date":"true","cash_drawer_id":"true","cashier_actioncode":"true","cashier_enddttm":"true","cashier_lineno":"true","cashier_linetype":"true","cashier_seqno":"true","cashier_signoffdrawerid":"true","cashier_signondrawerid":"true","customer_actioncode":"true","customer_custidnum":"true","customer_custidtype":"true","customer_displayprompt":"true","customer_dtofenrol":"true","customer_enddttm":"true","customer_firstnm":"true","customer_grpno":"true","customer_lineno":"true","customer_linetype":"true","customer_membershipno":"true","customer_membershiptype":"true","customer_online":"true","customer_onlineval":"true","customer_paxaddress1":"true","customer_paxaddress2":"true","customer_paxautoconsumo":"true","customer_paxcity":"true","customer_paxcntryorigin":"true","customer_paxcountry":"true","customer_paxethn":"true","customer_paxgender":"true","customer_paxgrpnum":"true","customer_paxoffline":"true","customer_paxpassportno":"true","customer_paxpecemail":"true","customer_paxpostalcode":"true","customer_paxpremierid":"true","customer_paxshipto":"true","customer_paxvatcode":"true","customer_phnum":"true","customer_refno":"true","customer_seqno":"true","customer_surname":"true","customer_udtid":"true","dhtxnmbr":"true","division":"true","hdr_srrgt_id":"true","native_division_id":"true","operator_id":"true","organization_id":"true","otc_flg":"true","pos_id":"true","process_date":"true","remote_print":"true","report_group":"true","retail_store_id":"true","site_id":"true","supervisor_id":"true","tax_actioncode":"true","tax_enddttm":"true","tax_lineno":"true","tax_linetype":"true","tax_seqno":"true","tax_taxableamt":"true","tax_taxamt":"true","trans_type":"true","txn_end_time":"true","txn_grand_amount":"true","txn_net_amount":"true","txn_nmbr":"true","txn_start_time":"true","txn_type":"true","udt_desc":"true","udt_id":"true","user_defined_seq_no":"true","version_no":"true"}}; new state null
    at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:770)
    at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:725)
    at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:713)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.sql.hive.client.Shim_v3_0.alterTable(HiveShim.scala:1320)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$alterTable$1.apply$mcV$sp(HiveClientImpl.scala:498)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$alterTable$1.apply(HiveClientImpl.scala:488)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$alterTable$1.apply(HiveClientImpl.scala:488)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:278)
    at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:216)
    at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:215)
    at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:261)
    at org.apache.spark.sql.hive.client.HiveClientImpl.alterTable(HiveClientImpl.scala:488)
    at org.apache.spark.sql.hive.client.HiveClient$class.alterTable(HiveClient.scala:95)
    at org.apache.spark.sql.hive.client.HiveClientImpl.alterTable(HiveClientImpl.scala:83)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doAlterTableStats$1.apply$mcV$sp(HiveExternalCatalog.scala:680)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doAlterTableStats$1.apply(HiveExternalCatalog.scala:662)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doAlterTableStats$1.apply(HiveExternalCatalog.scala:662)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
    ... 37 more
Caused by: MetaException(message:Cannot change stats state for a transactional table without providing the transactional write state for verification (new write ID -1, valid write IDs null; current state {"BASIC_STATS":"true","COLUMN_STATS":{"application_id":"true","business_date":"true","cash_drawer_id":"true","cashier_actioncode":"true","cashier_enddttm":"true","cashier_lineno":"true","cashier_linetype":"true","cashier_seqno":"true","cashier_signoffdrawerid":"true","cashier_signondrawerid":"true","customer_actioncode":"true","customer_custidnum":"true","customer_custidtype":"true","customer_displayprompt":"true","customer_dtofenrol":"true","customer_enddttm":"true","customer_firstnm":"true","customer_grpno":"true","customer_lineno":"true","customer_linetype":"true","customer_membershipno":"true","customer_membershiptype":"true","customer_online":"true","customer_onlineval":"true","customer_paxaddress1":"true","customer_paxaddress2":"true","customer_paxautoconsumo":"true","customer_paxcity":"true","customer_paxcntryorigin":"true","customer_paxcountry":"true","customer_paxethn":"true","customer_paxgender":"true","customer_paxgrpnum":"true","customer_paxoffline":"true","customer_paxpassportno":"true","customer_paxpecemail":"true","customer_paxpostalcode":"true","customer_paxpremierid":"true","customer_paxshipto":"true","customer_paxvatcode":"true","customer_phnum":"true","customer_refno":"true","customer_seqno":"true","customer_surname":"true","customer_udtid":"true","dhtxnmbr":"true","division":"true","hdr_srrgt_id":"true","native_division_id":"true","operator_id":"true","organization_id":"true","otc_flg":"true","pos_id":"true","process_date":"true","remote_print":"true","report_group":"true","retail_store_id":"true","site_id":"true","supervisor_id":"true","tax_actioncode":"true","tax_enddttm":"true","tax_lineno":"true","tax_linetype":"true","tax_seqno":"true","tax_taxableamt":"true","tax_taxamt":"true","trans_type":"true","txn_end_time":"true","txn_grand_amount":"true","txn_net_amount":"true","txn_nmbr":"true","txn_start_time":"true","txn_type":"true","udt_desc":"true","udt_id":"true","user_defined_seq_no":"true","version_no":"true"}}; new state null)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result.read(ThriftHiveMetastore.java)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table_req(ThriftHiveMetastore.java:2379)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table_req(ThriftHiveMetastore.java:2366)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:434)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table(SessionHiveMetaStoreClient.java:415)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
    at com.sun.proxy.$Proxy38.alter_table(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2956)
    at com.sun.proxy.$Proxy38.alter_table(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:766)
    ... 58 more

I am not able to find anything about the statistics that Hive and Spark use here; I only know that Hive uses them for query optimization. Why is Spark trying to alter the Hive table's metadata at all?

I tried making the target Hive table insert-only and repairing it (roughly the statements below), but no luck with that; I get the same error.
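
What I ran was along these lines, reusing the hive session from the sketch above; the table name is a placeholder, and the same statements can equally be issued from beeline:

// Switch the table to insert-only ACID and repair its partition metadata.
// "my_db.my_table" is a placeholder.
hive.executeUpdate(
  "ALTER TABLE my_db.my_table SET TBLPROPERTIES (" +
  "'transactional'='true', 'transactional_properties'='insert_only')")
hive.executeUpdate("MSCK REPAIR TABLE my_db.my_table")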

Please help.

Thanks in advance.