Error while inserting data into Hive from Spark on HDP 3

Hi,

I am using HDP 3.1 and am aware of the Hive Warehouse Connector, its configuration, and its usage with Spark, as described here: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_configure_a_s...
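
For reference, the write pattern from that doc, as I understand it, looks roughly like this (a minimal sketch; the SparkSession, DataFrame, and table/database names are placeholders for my job's actual values):

    import com.hortonworks.hwc.HiveWarehouseSession
    import com.hortonworks.hwc.HiveWarehouseSession._
    import org.apache.spark.sql.{DataFrame, SparkSession}

    // Build an HWC session on top of the running SparkSession; HWC routes
    // reads and writes through HiveServer2 instead of Spark's built-in
    // Hive metastore client.
    def writeBatch(spark: SparkSession, batch: DataFrame): Unit = {
      val hive = HiveWarehouseSession.session(spark).build()
      hive.setDatabase("mydb") // placeholder database

      // "pos_txn" is a placeholder for the actual managed (ACID) target table.
      batch.write
        .format(HIVE_WAREHOUSE_CONNECTOR) // com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector
        .option("table", "pos_txn")
        .save()
    }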

I am getting the error below while inserting data into a Hive table from a Spark Streaming job. I have made the required configuration changes in Ambari, and I pass the remaining properties when submitting the Spark application to YARN.
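
For context, my submit command looks roughly like this (the property names are from the HDP 3.1 HWC documentation; the JDBC URL, metastore URI, ZooKeeper quorum, jar version, and application jar are placeholders for my cluster's values):

    spark-submit \
      --master yarn \
      --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar \
      --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://<hs2-host>:<port>/" \
      --conf spark.datasource.hive.warehouse.metastoreUri="thrift://<hms-host>:9083" \
      --conf spark.datasource.hive.warehouse.load.staging.dir="/tmp" \
      --conf spark.hadoop.hive.zookeeper.quorum="<zk1>:2181,<zk2>:2181,<zk3>:2181" \
      --class com.dfs.datahub.V4POSStreamEDPDev \
      <application-jar>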

org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. Cannot change stats state for a transactional table without providing the transactional write state for verification (new write ID -1, valid write IDs null; current state {"BASIC_STATS":"true","COLUMN_STATS":{"application_id":"true","business_date":"true","cash_drawer_id":"true","cashier_actioncode":"true","cashier_enddttm":"true","cashier_lineno":"true","cashier_linetype":"true","cashier_seqno":"true","cashier_signoffdrawerid":"true","cashier_signondrawerid":"true","customer_actioncode":"true","customer_custidnum":"true","customer_custidtype":"true","customer_displayprompt":"true","customer_dtofenrol":"true","customer_enddttm":"true","customer_firstnm":"true","customer_grpno":"true","customer_lineno":"true","customer_linetype":"true","customer_membershipno":"true","customer_membershiptype":"true","customer_online":"true","customer_onlineval":"true","customer_paxaddress1":"true","customer_paxaddress2":"true","customer_paxautoconsumo":"true","customer_paxcity":"true","customer_paxcntryorigin":"true","customer_paxcountry":"true","customer_paxethn":"true","customer_paxgender":"true","customer_paxgrpnum":"true","customer_paxoffline":"true","customer_paxpassportno":"true","customer_paxpecemail":"true","customer_paxpostalcode":"true","customer_paxpremierid":"true","customer_paxshipto":"true","customer_paxvatcode":"true","customer_phnum":"true","customer_refno":"true","customer_seqno":"true","customer_surname":"true","customer_udtid":"true","dhtxnmbr":"true","division":"true","hdr_srrgt_id":"true","native_division_id":"true","operator_id":"true","organization_id":"true","otc_flg":"true","pos_id":"true","process_date":"true","remote_print":"true","report_group":"true","retail_store_id":"true","site_id":"true","supervisor_id":"true","tax_actioncode":"true","tax_enddttm":"true","tax_lineno":"true","tax_linetype":"true","tax_seqno":"true","tax_taxableamt":"true","tax_taxamt":"true","trans_type":"true","txn_end_time":"true","txn_grand_amount":"true","txn_net_amount":"true","txn_nmbr":"true","txn_start_time":"true","txn_type":"true","udt_desc":"true","udt_id":"true","user_defined_seq_no":"true","version_no":"true"}}; new state null;
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
    at org.apache.spark.sql.hive.HiveExternalCatalog.doAlterTableStats(HiveExternalCatalog.scala:662)
    at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.alterTableStats(ExternalCatalog.scala:187)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.alterTableStats(SessionCatalog.scala:376)
    at org.apache.spark.sql.execution.command.CommandUtils$.updateTableStats(CommandUtils.scala:45)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:177)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:115)
    at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
    at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
    at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
    at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
    at com.dfs.datahub.V4POSStreamEDPDev$$anonfun$createContext$1.apply(V4POSStreamEDPDev.scala:644)
    at com.dfs.datahub.V4POSStreamEDPDev$$anonfun$createContext$1.apply(V4POSStreamEDPDev.scala:519)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:628)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:628)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
    at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
    at scala.util.Try$.apply(Try.scala:192)
    at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. Cannot change stats state for a transactional table without providing the transactional write state for verification (new write ID -1, valid write IDs null; current state {"BASIC_STATS":"true","COLUMN_STATS":{"application_id":"true","business_date":"true","cash_drawer_id":"true","cashier_actioncode":"true","cashier_enddttm":"true","cashier_lineno":"true","cashier_linetype":"true","cashier_seqno":"true","cashier_signoffdrawerid":"true","cashier_signondrawerid":"true","customer_actioncode":"true","customer_custidnum":"true","customer_custidtype":"true","customer_displayprompt":"true","customer_dtofenrol":"true","customer_enddttm":"true","customer_firstnm":"true","customer_grpno":"true","customer_lineno":"true","customer_linetype":"true","customer_membershipno":"true","customer_membershiptype":"true","customer_online":"true","customer_onlineval":"true","customer_paxaddress1":"true","customer_paxaddress2":"true","customer_paxautoconsumo":"true","customer_paxcity":"true","customer_paxcntryorigin":"true","customer_paxcountry":"true","customer_paxethn":"true","customer_paxgender":"true","customer_paxgrpnum":"true","customer_paxoffline":"true","customer_paxpassportno":"true","customer_paxpecemail":"true","customer_paxpostalcode":"true","customer_paxpremierid":"true","customer_paxshipto":"true","customer_paxvatcode":"true","customer_phnum":"true","customer_refno":"true","customer_seqno":"true","customer_surname":"true","customer_udtid":"true","dhtxnmbr":"true","division":"true","hdr_srrgt_id":"true","native_division_id":"true","operator_id":"true","organization_id":"true","otc_flg":"true","pos_id":"true","process_date":"true","remote_print":"true","report_group":"true","retail_store_id":"true","site_id":"true","supervisor_id":"true","tax_actioncode":"true","tax_enddttm":"true","tax_lineno":"true","tax_linetype":"true","tax_seqno":"true","tax_taxableamt":"true","tax_taxamt":"true","trans_type":"true","txn_end_time":"true","txn_grand_amount":"true","txn_net_amount":"true","txn_nmbr":"true","txn_start_time":"true","txn_type":"true","udt_desc":"true","udt_id":"true","user_defined_seq_no":"true","version_no":"true"}}; new state null
    at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:770)
    at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:725)
    at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:713)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.sql.hive.client.Shim_v3_0.alterTable(HiveShim.scala:1320)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$alterTable$1.apply$mcV$sp(HiveClientImpl.scala:498)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$alterTable$1.apply(HiveClientImpl.scala:488)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$alterTable$1.apply(HiveClientImpl.scala:488)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:278)
    at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:216)
    at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:215)
    at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:261)
    at org.apache.spark.sql.hive.client.HiveClientImpl.alterTable(HiveClientImpl.scala:488)
    at org.apache.spark.sql.hive.client.HiveClient$class.alterTable(HiveClient.scala:95)
    at org.apache.spark.sql.hive.client.HiveClientImpl.alterTable(HiveClientImpl.scala:83)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doAlterTableStats$1.apply$mcV$sp(HiveExternalCatalog.scala:680)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doAlterTableStats$1.apply(HiveExternalCatalog.scala:662)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doAlterTableStats$1.apply(HiveExternalCatalog.scala:662)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
    ... 37 more
Caused by: MetaException(message:Cannot change stats state for a transactional table without providing the transactional write state for verification (new write ID -1, valid write IDs null; current state {"BASIC_STATS":"true","COLUMN_STATS":{"application_id":"true","business_date":"true","cash_drawer_id":"true","cashier_actioncode":"true","cashier_enddttm":"true","cashier_lineno":"true","cashier_linetype":"true","cashier_seqno":"true","cashier_signoffdrawerid":"true","cashier_signondrawerid":"true","customer_actioncode":"true","customer_custidnum":"true","customer_custidtype":"true","customer_displayprompt":"true","customer_dtofenrol":"true","customer_enddttm":"true","customer_firstnm":"true","customer_grpno":"true","customer_lineno":"true","customer_linetype":"true","customer_membershipno":"true","customer_membershiptype":"true","customer_online":"true","customer_onlineval":"true","customer_paxaddress1":"true","customer_paxaddress2":"true","customer_paxautoconsumo":"true","customer_paxcity":"true","customer_paxcntryorigin":"true","customer_paxcountry":"true","customer_paxethn":"true","customer_paxgender":"true","customer_paxgrpnum":"true","customer_paxoffline":"true","customer_paxpassportno":"true","customer_paxpecemail":"true","customer_paxpostalcode":"true","customer_paxpremierid":"true","customer_paxshipto":"true","customer_paxvatcode":"true","customer_phnum":"true","customer_refno":"true","customer_seqno":"true","customer_surname":"true","customer_udtid":"true","dhtxnmbr":"true","division":"true","hdr_srrgt_id":"true","native_division_id":"true","operator_id":"true","organization_id":"true","otc_flg":"true","pos_id":"true","process_date":"true","remote_print":"true","report_group":"true","retail_store_id":"true","site_id":"true","supervisor_id":"true","tax_actioncode":"true","tax_enddttm":"true","tax_lineno":"true","tax_linetype":"true","tax_seqno":"true","tax_taxableamt":"true","tax_taxamt":"true","trans_type":"true","txn_end_time":"true","txn_grand_amount":"true","txn_net_amount":"true","txn_nmbr":"true","txn_start_time":"true","txn_type":"true","udt_desc":"true","udt_id":"true","user_defined_seq_no":"true","version_no":"true"}}; new state null)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result.read(ThriftHiveMetastore.java)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table_req(ThriftHiveMetastore.java:2379)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table_req(ThriftHiveMetastore.java:2366)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:434)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table(SessionHiveMetaStoreClient.java:415)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
    at com.sun.proxy.$Proxy38.alter_table(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2956)
    at com.sun.proxy.$Proxy38.alter_table(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:766)
    ... 58 more

I am not able to find anything about the statistics that Hive and Spark use; I only know that Hive uses them for query optimization. But why is Spark altering Hive's metadata?
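
From the stack trace, the alter appears to come from Spark's automatic statistics update after the INSERT (CommandUtils.updateTableStats calling through to Hive.alterTable), not from the data write itself. A hypothetical minimal reproduction of that path, with a placeholder table name, would be:

    import org.apache.spark.sql.SparkSession

    object StatsRepro {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("acid-stats-repro")
          .enableHiveSupport() // Spark's own Hive catalog client, not HWC
          .getOrCreate()

        spark.range(10).createOrReplaceTempView("src")

        // "mydb.acid_tgt" stands in for the real transactional table. The data
        // write appears to succeed; the follow-up automatic stats update is
        // what throws "Cannot change stats state...".
        spark.sql("INSERT INTO mydb.acid_tgt SELECT id FROM src")
      }
    }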

I tried making the target Hive table insert-only and repairing it, but no luck with that; I get the same error.
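
Roughly the shape of what I tried, via HWC's executeUpdate in spark-shell (table name and column definitions are placeholders):

    import com.hortonworks.hwc.HiveWarehouseSession

    val hive = HiveWarehouseSession.session(spark).build()

    // Recreate the target as an insert-only ACID table ...
    hive.executeUpdate(
      """CREATE TABLE mydb.pos_txn (txn_nmbr BIGINT, business_date STRING)
        |STORED AS ORC
        |TBLPROPERTIES ('transactional'='true',
        |               'transactional_properties'='insert_only')""".stripMargin)

    // ... and repair it; the streaming job still fails the same way.
    hive.executeUpdate("MSCK REPAIR TABLE mydb.pos_txn")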

Please help.

Thanks in advance.
