Member since
01-17-2020
6
Posts
0
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
20480 | 06-02-2020 02:03 PM |
01-09-2021
03:57 PM
Cool, thanks for the help! Just another question, if I just were to remove the directory I don't want to use for storing staging data and then restarted Yarn service, would there be any problem? The "yarn.nodemanager.local-dirs" property already includes three other directories for storing so it shouldn't be a problem, right?
... View more
01-08-2021
11:58 PM
We made a mistake while configuring the storage of our worker nodes and we ended up adding a data mount point of a really small disk to the yarn.nodemanager.local-dirs property. This causes some of our very data-intensive spark jobs to fail as this disk runs out of space pretty quickly as these jobs generate a lot of staging data. Right now the value of this property is the following: /u01/hadoop/yarn/nm,/u02/hadoop/yarn/nm,/u03/hadoop/yarn/nm,/u04/hadoop/yarn/nm We just want to remove /u04/hadoop/yarn/nm from the property, all the other directories are mounted on disks with enough capacity. So, is it safe to just remove this directory from the yarn.nodemanager.local-dirs property? Or do we have to back up the data in /u04/hadoop/yarn/nm and create another data mount point using a disk with enough capacity and put that backup there?
... View more
Labels:
- Labels:
-
Apache Spark
-
Apache YARN
-
Cloudera Manager
06-02-2020
02:03 PM
This indeed made it work. Thanks!
... View more
06-01-2020
06:08 PM
Hi I'm using pyspark and currently I am encountering an issue that I had not seen before when trying to write a dataframe to HDFS as a table. This is an example of the code I'm running: df.write.mode('overwrite').format('parquet').saveAsTable('{}.{}'.format(target_schema, table_name)) And the stacktrace of the error I'm getting is the following: Py4JJavaError: An error occurred while calling o1042.saveAsTable.
: org.apache.spark.SparkException: Job aborted.
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159)
at org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:502)
at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:217)
at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:176)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:668)
at org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:465)
at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:444)
at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:400)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure:
Aborting TaskSet 3.0 because task 3 (partition 3)
cannot run anywhere due to node and executor blacklist.
Most recent failure:
Lost task 3.0 in stage 3.0 (TID 5, <host_name>.com, executor 3): org.apache.spark.SparkException: Task failed while writing rows.
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:257)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:170)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:169)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$11.apply(Executor.scala:407)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1363)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:413)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Decompression error: Unknown frame descriptor
at com.github.luben.zstd.ZstdInputStream.readInternal(ZstdInputStream.java:147)
at com.github.luben.zstd.ZstdInputStream.read(ZstdInputStream.java:107)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at org.apache.spark.io.ReadAheadInputStream$1.run(ReadAheadInputStream.java:168)
... 3 more The dataframes are actually oracle database tables that I read using JDBC. This has worked fine for a lot of tables, small and big. I'm not using any customSchema to change the source datatypes (which are all VARCHAR2s). The code used to run without issues in a small cluster made of vms running in a single server with somewhat enough resources, this problem started appearing when I migrated it to a cloud based cluster, where I have dedicated resources for each node (still vms though). Both of these "environments" use CDH 6.2 with the same packaging, though the cluster based one has Kerberos enabled. Any suggestions/guidance will be greatly appreciated
... View more
Labels:
- Labels:
-
Apache Hive
-
Apache Spark