<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Hive metastore lost connection while executing alter table command in spark sql in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Hive-metastore-lost-connection-while-executing-alter-table/m-p/368927#M240314</link>
    <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/104681"&gt;@DataEngAa&lt;/a&gt;&amp;nbsp;Welcome to the Cloudera Community!&lt;BR /&gt;&lt;BR /&gt;To help you get the best possible solution, I have tagged our Hive experts&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/70785"&gt;@Shmoo&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/12885"&gt;@mszurap&lt;/a&gt;&amp;nbsp; who may be able to assist you further.&lt;BR /&gt;&lt;BR /&gt;Please keep us updated on your post, and we hope you find a satisfactory solution to your query.&lt;/P&gt;</description>
    <pubDate>Wed, 19 Apr 2023 18:39:51 GMT</pubDate>
    <dc:creator>DianaTorres</dc:creator>
    <dc:date>2023-04-19T18:39:51Z</dc:date>
    <item>
      <title>Hive metastore lost connection while executing alter table command in spark sql</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Hive-metastore-lost-connection-while-executing-alter-table/m-p/368904#M240305</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;&lt;BR /&gt;I am trying to run an "alter table ... drop partition" command using Spark SQL, which prints the following metastore lost-connection stack trace:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;23/04/19 06:47:53 WARN metastore.RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.&lt;BR /&gt;org.apache.thrift.transport.TTransportException: SASL authentication not complete&lt;BR /&gt;at org.apache.thrift.transport.TSaslTransport.write(TSaslTransport.java:472)&lt;BR /&gt;at org.apache.thrift.transport.TSaslClientTransport.write(TSaslClientTransport.java:37)&lt;BR /&gt;at org.apache.hadoop.hive.thrift.TFilterTransport.write(TFilterTransport.java:72)&lt;BR /&gt;at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:178)&lt;BR /&gt;at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:106)&lt;BR /&gt;at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:70)&lt;BR /&gt;at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)&lt;BR /&gt;at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_partitions_ps_with_auth(ThriftHiveMetastore.java:2444)&lt;BR /&gt;at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions_ps_with_auth(ThriftHiveMetastore.java:2431)&lt;BR /&gt;at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsWithAuthInfo(HiveMetaStoreClient.java:1427)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)&lt;BR /&gt;at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;BR /&gt;at java.lang.reflect.Method.invoke(Method.java:498)&lt;BR /&gt;at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)&lt;BR /&gt;at com.sun.proxy.$Proxy35.listPartitionsWithAuthInfo(Unknown 
Source)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)&lt;BR /&gt;at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;BR /&gt;at java.lang.reflect.Method.invoke(Method.java:498)&lt;BR /&gt;at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2562)&lt;BR /&gt;at com.sun.proxy.$Proxy35.listPartitionsWithAuthInfo(Unknown Source)&lt;BR /&gt;at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:2700)&lt;BR /&gt;at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:2726)&lt;BR /&gt;at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1$$anonfun$16.apply(HiveClientImpl.scala:568)&lt;BR /&gt;at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1$$anonfun$16.apply(HiveClientImpl.scala:563)&lt;BR /&gt;at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)&lt;BR /&gt;at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)&lt;BR /&gt;at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)&lt;BR /&gt;at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)&lt;BR /&gt;at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)&lt;BR /&gt;at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)&lt;BR /&gt;at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1.apply$mcV$sp(HiveClientImpl.scala:563)&lt;BR /&gt;at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1.apply(HiveClientImpl.scala:558)&lt;BR /&gt;at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1.apply(HiveClientImpl.scala:558)&lt;BR /&gt;at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:283)&lt;BR /&gt;at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:221)&lt;BR /&gt;at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:220)&lt;BR /&gt;at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:266)&lt;BR /&gt;at org.apache.spark.sql.hive.client.HiveClientImpl.dropPartitions(HiveClientImpl.scala:558)&lt;BR /&gt;at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$dropPartitions$1.apply$mcV$sp(HiveExternalCatalog.scala:1009)&lt;BR /&gt;at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$dropPartitions$1.apply(HiveExternalCatalog.scala:1007)&lt;BR /&gt;at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$dropPartitions$1.apply(HiveExternalCatalog.scala:1007)&lt;BR /&gt;at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:99)&lt;BR /&gt;at org.apache.spark.sql.hive.HiveExternalCatalog.dropPartitions(HiveExternalCatalog.scala:1007)&lt;BR /&gt;at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.dropPartitions(ExternalCatalogWithListener.scala:211)&lt;BR /&gt;at org.apache.spark.sql.catalyst.catalog.SessionCatalog.dropPartitions(SessionCatalog.scala:846)&lt;BR /&gt;at org.apache.spark.sql.execution.command.AlterTableDropPartitionCommand.run(ddl.scala:545)&lt;BR /&gt;at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)&lt;BR /&gt;at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)&lt;BR /&gt;at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)&lt;BR /&gt;at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)&lt;BR /&gt;at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)&lt;BR /&gt;at 
org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364)&lt;BR /&gt;at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)&lt;BR /&gt;at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)&lt;BR /&gt;at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)&lt;BR /&gt;at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363)&lt;BR /&gt;at org.apache.spark.sql.Dataset.&amp;lt;init&amp;gt;(Dataset.scala:194)&lt;BR /&gt;at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)&lt;BR /&gt;at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)&lt;BR /&gt;at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;BR /&gt;at java.lang.reflect.Method.invoke(Method.java:498)&lt;BR /&gt;at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)&lt;BR /&gt;at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)&lt;BR /&gt;at py4j.Gateway.invoke(Gateway.java:282)&lt;BR /&gt;at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)&lt;BR /&gt;at py4j.commands.CallCommand.execute(CallCommand.java:79)&lt;BR /&gt;at py4j.GatewayConnection.run(GatewayConnection.java:238)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:748)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Below is the command :&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;spark.sql("alter table &amp;lt;table name&amp;gt; drop if exists partition(year= 2023)")&lt;/P&gt;</description>
      <pubDate>Wed, 19 Apr 2023 10:58:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Hive-metastore-lost-connection-while-executing-alter-table/m-p/368904#M240305</guid>
      <dc:creator>DataEngAa</dc:creator>
      <dc:date>2023-04-19T10:58:23Z</dc:date>
    </item>
    <item>
      <title>Re: Hive metastore lost connection while executing alter table command in spark sql</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Hive-metastore-lost-connection-while-executing-alter-table/m-p/368927#M240314</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/104681"&gt;@DataEngAa&lt;/a&gt;&amp;nbsp;Welcome to the Cloudera Community!&lt;BR /&gt;&lt;BR /&gt;To help you get the best possible solution, I have tagged our Hive experts&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/70785"&gt;@Shmoo&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/12885"&gt;@mszurap&lt;/a&gt;&amp;nbsp; who may be able to assist you further.&lt;BR /&gt;&lt;BR /&gt;Please keep us updated on your post, and we hope you find a satisfactory solution to your query.&lt;/P&gt;</description>
      <pubDate>Wed, 19 Apr 2023 18:39:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Hive-metastore-lost-connection-while-executing-alter-table/m-p/368927#M240314</guid>
      <dc:creator>DianaTorres</dc:creator>
      <dc:date>2023-04-19T18:39:51Z</dc:date>
    </item>
    <item>
      <title>Re: Hive metastore lost connection while executing alter table command in spark sql</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Hive-metastore-lost-connection-while-executing-alter-table/m-p/368985#M240321</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/104681"&gt;@DataEngAa&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;The stack trace shows that Spark SQL was trying to list the partitions first ("&lt;SPAN&gt;HiveMetaStoreClient.listPartitionsWithAuthInfo") when the connection was lost. The attached snippet does not show timing information, but most likely the request simply timed out after the predefined timeout. It is also likely that the table you are trying to manipulate has a lot of partitions in the Hive metastore (the partition directory count on HDFS is a separate question).&lt;BR /&gt;The timeout is defined both on the client (Spark) side and on the server (Hive metastore) side. To increase the timeout and let the request run for longer:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;1. Set "hive.metastore.client.socket.timeout=1800" in hive-site.xml service-wide for Hive AND in the Hive gateway safety valves.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;2. If this is CDP, set it on the Hive on Tez service side as well, so that HS2 picks up the value too.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;3. Start your Spark application with&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;--conf spark.hadoop.hive.metastore.client.socket.timeout=1800&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;The above increases the timeout to 30 minutes from the default 5 minutes, which is usually too low for big tables.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Best regards&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;Miklos&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 20 Apr 2023 07:18:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Hive-metastore-lost-connection-while-executing-alter-table/m-p/368985#M240321</guid>
      <dc:creator>mszurap</dc:creator>
      <dc:date>2023-04-20T07:18:03Z</dc:date>
    </item>
    <item>
      <title>Re: Hive metastore lost connection while executing alter table command in spark sql</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Hive-metastore-lost-connection-while-executing-alter-table/m-p/368990#M240323</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/12885"&gt;@mszurap&lt;/a&gt;&amp;nbsp;for your response. I have already tried the suggested workaround, and the issue still persists.&lt;BR /&gt;I agree the table has a lot of partitions, but I am pretty sure the code times out before 5 minutes.&lt;BR /&gt;I have also tried enforcing hive-site.xml with the updated timeout, which did not help much either.&lt;BR /&gt;&lt;BR /&gt;The only thing that worked was adding spark.catalog.recoverPartitions(table) before issuing the drop partition command. I am really not sure why recovering the partitions in the catalog eliminated the metastore warning.&lt;BR /&gt;&lt;BR /&gt;Below is the updated code, which runs without any warning:&lt;/P&gt;&lt;PRE&gt;spark.catalog.recoverPartitions("orders")&lt;BR /&gt;spark.sql("alter table orders drop if exists partition(year=2023)")&lt;BR /&gt;data.write.mode(&lt;SPAN&gt;'overwrite'&lt;/SPAN&gt;).parquet(hdfsPath)&lt;/PRE&gt;&lt;P&gt;Any help in understanding the problem will be much appreciated.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 20 Apr 2023 08:23:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Hive-metastore-lost-connection-while-executing-alter-table/m-p/368990#M240323</guid>
      <dc:creator>DataEngAa</dc:creator>
      <dc:date>2023-04-20T08:23:48Z</dc:date>
    </item>
  </channel>
</rss>

