User class threw exception: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.io.IOException: Unable to create directory /tmp/hive/
Labels: Apache Hadoop, Apache Spark
Created 03-23-2021 03:53 AM
Hi community,
We run Spark 2.3.2 on Hadoop 3.1.1.
We use external ORC tables stored on HDFS.
We are hitting an issue in a job run under cron when issuing the command `sql("msck repair table db.some_table")`. The table is partitioned, and the failure is the following:
21/03/22 22:44:13 WARN HiveConf: HiveConf of name hive.heapsize does not exist
21/03/22 22:44:13 WARN HiveConf: HiveConf of name hive.stats.fetch.partition.stats does not exist
21/03/22 22:44:13 WARN HiveConf: HiveConf of name hive.plan.serialization.format does not exist
Hive Session ID = 2625af79-e021-4b57-9435-e0fea4f00803
21/03/22 22:44:13 INFO SessionState: Hive Session ID = 2625af79-e021-4b57-9435-e0fea4f00803
21/03/22 22:44:13 ERROR ApplicationMaster: User class threw exception: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.io.IOException: Unable to create directory /tmp/hive/2625af79-e021-4b57-9435-e0fea4f00803_resources;
org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.io.IOException: Unable to create directory /tmp/hive/2625af79-e021-4b57-9435-e0fea4f00803_resources;
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:194)
at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114)
at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.org$apache$spark$sql$hive$HiveSessionStateBuilder$$externalCatalog(HiveSessionStateBuilder.scala:39)
at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anonfun$1.apply(HiveSessionStateBuilder.scala:53)
at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anonfun$1.apply(HiveSessionStateBuilder.scala:53)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog$lzycompute(SessionCatalog.scala:90)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog(SessionCatalog.scala:90)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.databaseExists(SessionCatalog.scala:237)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.org$apache$spark$sql$catalyst$catalog$SessionCatalog$$requireDbExists(SessionCatalog.scala:176)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.getTableMetadata(SessionCatalog.scala:400)
at org.apache.spark.sql.catalyst.catalog.CatalogUtils$.getMetaData(ExternalCatalogUtils.scala:265)
at org.apache.spark.sql.catalyst.catalog.CatalogUtils$.throwIfRO(ExternalCatalogUtils.scala:310)
at org.apache.spark.sql.hive.HiveTranslationLayerCheck$$anonfun$apply$1.applyOrElse(HiveTranslationLayerStrategies.scala:117)
at org.apache.spark.sql.hive.HiveTranslationLayerCheck$$anonfun$apply$1.applyOrElse(HiveTranslationLayerStrategies.scala:85)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:288)
at org.apache.spark.sql.hive.HiveTranslationLayerCheck.apply(HiveTranslationLayerStrategies.scala:85)
at org.apache.spark.sql.hive.HiveTranslationLayerCheck.apply(HiveTranslationLayerStrategies.scala:83)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:87)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:84)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:76)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:76)
at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:124)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:118)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:103)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:74)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
The same code does not produce any error in another environment, and none of the other flows that use this command have issues with it. As a side effect, the table that was populated just before the partition repair also seems to be producing duplicate entries for each new record.
I'm not sure it's a permissions problem (in case the command needs temporary files to store, e.g., metastore information); that would be quite unusual, since none of the other flows have ever had problems with the same command.
Could it be a dependency problem? HBase is involved initially to read from some sources, and LLAP usage is avoided.
The code looks like:
df.write
  .format("orc")
  .mode("append")
  .partitionBy(singleColumn)            // the table's partition column
  .option("compression", "snappy")
  .save(hdfsPath)                       // external table location on HDFS

sql(s"msck repair table $tableOfInterest") // $tableOfInterest = db.some_table
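For reference, the same repair step can also be expressed through Spark's Catalog API (a minimal sketch, assuming Spark >= 2.1; it scans the table location and registers any partitions missing from the metastore, equivalent to MSCK REPAIR TABLE):
// Equivalent to sql("msck repair table db.some_table"):
spark.catalog.recoverPartitions("db.some_table")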
Thanks a lot in advance!
Cheers
Created 03-23-2021 06:31 AM
Hi @zampJeri
Could you please let me know which user you are running the Spark application as? Check that this user has permission to create files/directories under the /tmp/hive directory.
Created 03-23-2021 06:47 AM
Hi @RangaReddy ,
Thanks for the reply.
If I do a simple
hdfs dfs -ls /tmp/hive
I see:
ls: Permission denied: user={myUser} access=READ_EXECUTE, inode="/tmp/hive":hive:hdfs:drwx-wx-wx
I guess msck repair uses that folder to store temporary files. Is it because the spark-submit sets
--conf spark.datasource.hive.warehouse.load.staging.dir="/tmp"
?
Thanks
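(Side note: drwx-wx-wx grants write and execute to everyone, so creating files under the HDFS /tmp/hive can still succeed even though listing it is denied. A quick sketch of a check, with an arbitrary file name:)
# Listing needs the 'r' bit, but creating a file only needs 'w' and 'x' on the
# directory, so this should succeed if HDFS permissions are the only concern:
hdfs dfs -touchz /tmp/hive/perm_check_$USER
hdfs dfs -rm /tmp/hive/perm_check_$USER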
Created on 03-23-2021 10:29 AM - edited 03-23-2021 10:30 AM
@zampJeri This /tmp is on the local OS file system, not HDFS. Hive wants to create the _resources files there and is unable to. Does the user have permissions on the local /tmp/hive?
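A quick way to check is on the host where the driver/ApplicationMaster actually runs (a sketch; chmod 1777 mirrors the usual permissions of /tmp itself and is a suggestion, not a confirmed fix):
# On the local OS file system of the node running the driver/AM:
ls -ld /tmp/hive
# If the directory exists but isn't world-writable, widening it like /tmp
# (sticky bit + rwx for all) is a common remedy:
sudo chmod 1777 /tmp/hive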
Created 03-24-2021 08:56 AM
Yes, either the write operation or the msck repair command is using a temp directory, and the current user does not have permission to create a directory there. Could you please grant the proper permissions and re-run the job?
Created 03-24-2021 09:08 AM
Hi, thanks for the replies.
The user has all the permissions to write to /tmp and subfolders.
We are currently investigating other parts of the code, even though the exception points to the specific line with the msck repair command. As far as I knew, that command would throw an exception when dealing with non-partitioned tables, but the table in question is partitioned. I'm not sure whether an empty table could cause trouble, but then other jobs should occasionally break in the same way (especially the same code in a different environment, which should behave identically given the authentication files passed to the submit).
In the beginning we were using the Hive Warehouse Connector, by means of
hive.execute("msck repair table etc...")
but we were told to stay away from triggering unnecessary LLAP (which was generally giving us a lot of trouble), so we removed all instances of HWC, and all jobs run just fine with spark.sql.
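For context, the switch looked roughly like this (a sketch; class and method names follow the Hortonworks HWC documentation, and tableOfInterest is the variable from the snippet above):
// Before: Hive Warehouse Connector, whose build() opens an LLAP-backed session.
// import com.hortonworks.hwc.HiveWarehouseSession
// val hive = HiveWarehouseSession.session(spark).build()
// hive.execute(s"msck repair table $tableOfInterest")

// After: plain Spark SQL through the Hive metastore, no LLAP involved.
spark.sql(s"msck repair table $tableOfInterest")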
Cheers!
Created 03-31-2021 02:37 AM
OK, we found the (embarrassingly simple) issue.
This specific job, which runs standalone, was passing hive-site.xml as a file to spark-submit, whereas all the other jobs run under Oozie and use a generic spark-submit that doesn't pass hive-site.xml. That file specifies /tmp/hive as the default directory for dumping temporary resources, and it turned out that our user still has issues with that folder (issues that are being investigated). The workaround so far is not to pass hive-site.xml, so the default directory is /tmp instead, where we can live happily without issues.
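For context, the hive-site.xml entries that typically point session resources at /tmp/hive look like the following (a hypothetical excerpt; the property values in our actual file may differ):
<!-- Hypothetical hive-site.xml excerpt -->
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/hive</value>
</property>
<property>
  <!-- The default is ${system:java.io.tmpdir}/${hive.session.id}_resources;
       pointing it under /tmp/hive yields exactly the path in the error. -->
  <name>hive.downloaded.resources.dir</name>
  <value>/tmp/hive/${hive.session.id}_resources</value>
</property>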
All in all, it was a silly mistake that made us aware of other issues with our current system.
Cheers and thanks to all for the support!
