<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Create Hive table using pyspark: Mkdirs failed to create file in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Create-Hive-table-using-pyspark-Mkdirs-failed-to-create-file/m-p/349151#M235539</link>
    <description>&lt;P&gt;Hello,&lt;BR /&gt;We would like to create a Hive table using a pyspark dataframe on the cluster.&lt;/P&gt;&lt;P&gt;We have the script below, which has run well several times in the past on the same cluster. After some configuration changes in the cluster, the same script now shows the error below.&lt;BR /&gt;We were unable to identify which changes to the cluster triggered this error in this script (we rearranged some services in the cluster, etc.).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The simple script is:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;U&gt;# pyspark --&lt;STRONG&gt;master=yarn&lt;/STRONG&gt;&lt;/U&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;data = [("Java", "20000"), ("Python", "100000"), ("Scala", "3000")]&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;columns = ["language", "users_count"]  # assumed column names; missing from the original post&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;rdd = spark.sparkContext.parallelize(data)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;dfFromRDD1 = rdd.toDF(columns)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;dfFromRDD1.printSchema()&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;dfFromRDD1.show()&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;from pyspark.sql import SQLContext&lt;/EM&gt;&lt;BR /&gt;&lt;STRONG&gt;&lt;EM&gt;from pyspark.sql import HiveContext&lt;/EM&gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&lt;EM&gt;sqlContext = HiveContext(sc)&lt;/EM&gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;EM&gt;dfFromRDD1.registerTempTable("evento_temp")&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;sqlContext.sql("use default").show()&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;ERROR:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Hive Session ID = bd9c459e-1ec8-483e-9543-c1527b33feec&lt;BR /&gt;22/07/30 13:55:45 WARN metastore.PersistenceManagerProvider: datanucleus.autoStartMechanismMode is set to unsupported value null . 
Setting it to value: ignored&lt;BR /&gt;22/07/30 13:55:45 WARN util.DriverDataSource: Registered driver with driverClassName=org.apache.derby.jdbc.EmbeddedDriver was not found, trying direct instantiation.&lt;BR /&gt;22/07/30 13:55:46 WARN util.DriverDataSource: Registered driver with driverClassName=org.apache.derby.jdbc.EmbeddedDriver was not found, trying direct instantiation.&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;22/07/30 13:55:46 WARN metastore.MetaStoreDirectSql: Self-test query [select "DB_ID" from "DBS"] failed; direct SQL is disabled&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;javax.jdo.JDODataStoreException: Error executing SQL query "select "DB_ID" from "DBS"".&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;.....&lt;/P&gt;&lt;P&gt;at java.base/java.lang.Thread.run(Thread.java:829)&lt;BR /&gt;NestedThrowablesStackTrace:&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;java.sql.SQLSyntaxErrorException: Table/View 'DBS' does not exist.&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)&lt;BR /&gt;at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)&lt;BR /&gt;at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;&lt;EM&gt;sqlContext.sql("CREATE TABLE IF NOT EXISTS evento STORED AS parquet as SELECT * from evento_temp").show()&lt;/EM&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;ERROR:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;22/07/29 17:07:08 WARN Datastore.Schema: The MetaData for "org.apache.hadoop.hive.metastore.model.MStorageDescriptor" is specified with a foreign-key at class level yet no "table" is defined. 
All foreign-keys at this level must have a table that the FK goes to.&lt;BR /&gt;22/07/29 17:07:08 WARN Datastore.Schema: The MetaData for "org.apache.hadoop.hive.metastore.model.MStorageDescriptor" is specified with a foreign-key at class level yet no "table" is defined. All foreign-keys at this level must have a table that the FK goes to.&lt;BR /&gt;22/07/29 17:07:08 WARN metastore.PersistenceManagerProvider: datanucleus.autoStartMechanismMode is set to unsupported value null . Setting it to value: ignored&lt;BR /&gt;22/07/29 17:07:08 WARN metastore.PersistenceManagerProvider: datanucleus.autoStartMechanismMode is set to unsupported value null . Setting it to value: ignored&lt;BR /&gt;22/07/29 17:07:08 WARN metastore.HiveMetaStore: Location: file:/home/usr_cmteste3/spark-warehouse/evento specified for non-external table:evento&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;22/07/29 17:07:09 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 3.0 (TID 4, &amp;lt;&amp;lt;HOST&amp;gt;&amp;gt;, executor 2): org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Mkdirs failed to create file:/home/usr_cmteste3/spark-warehouse/evento/.hive-staging_hive_2022-07-29_17-07-08_935_7404207232723330868-1/-ext-10000/_temporary/0/_temporary/attempt_202207291707093395760670811853018_0003_m_000001_4 (exists=false, cwd=file:/data05/yarn/nm/usercache/usr_cmteste3/appcache/application_1659116901602_0017/container_e67_1659116901602_0017_01_000003)&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 21 Apr 2026 07:51:51 GMT</pubDate>
    <dc:creator>paulo_klein</dc:creator>
    <dc:date>2026-04-21T07:51:51Z</dc:date>
    <item>
      <title>Create Hive table using pyspark:  Mkdirs failed to create file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Create-Hive-table-using-pyspark-Mkdirs-failed-to-create-file/m-p/349151#M235539</link>
      <description>&lt;P&gt;Hello,&lt;BR /&gt;We would like to create a Hive table using a pyspark dataframe on the cluster.&lt;/P&gt;&lt;P&gt;We have the script below, which has run well several times in the past on the same cluster. After some configuration changes in the cluster, the same script now shows the error below.&lt;BR /&gt;We were unable to identify which changes to the cluster triggered this error in this script (we rearranged some services in the cluster, etc.).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The simple script is:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;U&gt;# pyspark --&lt;STRONG&gt;master=yarn&lt;/STRONG&gt;&lt;/U&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;data = [("Java", "20000"), ("Python", "100000"), ("Scala", "3000")]&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;columns = ["language", "users_count"]  # assumed column names; missing from the original post&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;rdd = spark.sparkContext.parallelize(data)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;dfFromRDD1 = rdd.toDF(columns)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;dfFromRDD1.printSchema()&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;dfFromRDD1.show()&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;from pyspark.sql import SQLContext&lt;/EM&gt;&lt;BR /&gt;&lt;STRONG&gt;&lt;EM&gt;from pyspark.sql import HiveContext&lt;/EM&gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&lt;EM&gt;sqlContext = HiveContext(sc)&lt;/EM&gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;EM&gt;dfFromRDD1.registerTempTable("evento_temp")&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;sqlContext.sql("use default").show()&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;ERROR:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Hive Session ID = bd9c459e-1ec8-483e-9543-c1527b33feec&lt;BR /&gt;22/07/30 13:55:45 WARN metastore.PersistenceManagerProvider: datanucleus.autoStartMechanismMode is set to unsupported value null . 
Setting it to value: ignored&lt;BR /&gt;22/07/30 13:55:45 WARN util.DriverDataSource: Registered driver with driverClassName=org.apache.derby.jdbc.EmbeddedDriver was not found, trying direct instantiation.&lt;BR /&gt;22/07/30 13:55:46 WARN util.DriverDataSource: Registered driver with driverClassName=org.apache.derby.jdbc.EmbeddedDriver was not found, trying direct instantiation.&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;22/07/30 13:55:46 WARN metastore.MetaStoreDirectSql: Self-test query [select "DB_ID" from "DBS"] failed; direct SQL is disabled&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;javax.jdo.JDODataStoreException: Error executing SQL query "select "DB_ID" from "DBS"".&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;.....&lt;/P&gt;&lt;P&gt;at java.base/java.lang.Thread.run(Thread.java:829)&lt;BR /&gt;NestedThrowablesStackTrace:&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;java.sql.SQLSyntaxErrorException: Table/View 'DBS' does not exist.&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)&lt;BR /&gt;at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)&lt;BR /&gt;at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;&lt;EM&gt;sqlContext.sql("CREATE TABLE IF NOT EXISTS evento STORED AS parquet as SELECT * from evento_temp").show()&lt;/EM&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;ERROR:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;22/07/29 17:07:08 WARN Datastore.Schema: The MetaData for "org.apache.hadoop.hive.metastore.model.MStorageDescriptor" is specified with a foreign-key at class level yet no "table" is defined. 
All foreign-keys at this level must have a table that the FK goes to.&lt;BR /&gt;22/07/29 17:07:08 WARN Datastore.Schema: The MetaData for "org.apache.hadoop.hive.metastore.model.MStorageDescriptor" is specified with a foreign-key at class level yet no "table" is defined. All foreign-keys at this level must have a table that the FK goes to.&lt;BR /&gt;22/07/29 17:07:08 WARN metastore.PersistenceManagerProvider: datanucleus.autoStartMechanismMode is set to unsupported value null . Setting it to value: ignored&lt;BR /&gt;22/07/29 17:07:08 WARN metastore.PersistenceManagerProvider: datanucleus.autoStartMechanismMode is set to unsupported value null . Setting it to value: ignored&lt;BR /&gt;22/07/29 17:07:08 WARN metastore.HiveMetaStore: Location: file:/home/usr_cmteste3/spark-warehouse/evento specified for non-external table:evento&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;22/07/29 17:07:09 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 3.0 (TID 4, &amp;lt;&amp;lt;HOST&amp;gt;&amp;gt;, executor 2): org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Mkdirs failed to create file:/home/usr_cmteste3/spark-warehouse/evento/.hive-staging_hive_2022-07-29_17-07-08_935_7404207232723330868-1/-ext-10000/_temporary/0/_temporary/attempt_202207291707093395760670811853018_0003_m_000001_4 (exists=false, cwd=file:/data05/yarn/nm/usercache/usr_cmteste3/appcache/application_1659116901602_0017/container_e67_1659116901602_0017_01_000003)&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;&lt;STRONG&gt;at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Apr 2026 07:51:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Create-Hive-table-using-pyspark-Mkdirs-failed-to-create-file/m-p/349151#M235539</guid>
      <dc:creator>paulo_klein</dc:creator>
      <dc:date>2026-04-21T07:51:51Z</dc:date>
    </item>
    <item>
      <title>Re: Create Hive table using pyspark:  Mkdirs failed to create file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Create-Hive-table-using-pyspark-Mkdirs-failed-to-create-file/m-p/349294#M235596</link>
      <description>&lt;P&gt;I fixed this issue by copying the Hive client configuration into the Spark configuration directory, so Spark connects to the cluster's Hive metastore instead of falling back to a local embedded Derby metastore:&lt;/P&gt;&lt;P&gt;cp /etc/hive/conf/&lt;STRONG&gt;hive-site.xml&lt;/STRONG&gt; &lt;STRONG&gt;/etc/spark/conf&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 01 Aug 2022 22:24:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Create-Hive-table-using-pyspark-Mkdirs-failed-to-create-file/m-p/349294#M235596</guid>
      <dc:creator>paulo_klein</dc:creator>
      <dc:date>2022-08-01T22:24:36Z</dc:date>
    </item>
  </channel>
</rss>