Created 03-31-2018 03:46 PM
I am running a Spark JDBC process to extract data from Teradata. The target files are written to HDFS as ORC, using jdbcDF.write.format("orc").save(file). I run 8 parallel threads, each with a different WHERE clause on the same table. Most of the time the process succeeds, but sometimes 4-5 of the 8 parallel threads fail with the above error. Even so, the files are committed to HDFS and the counts match the source. The log says the rename of the temporary files failed, yet a permanent file is created, and the _SUCCESS file is missing from the target folders of the failed threads.
Created 04-02-2018 01:22 PM
What's the full stack?
If the job doesn't create a _SUCCESS file then the overall job failed. A destination directory will have been created, because job attempts are created underneath it. When tasks and jobs are committed they rename things; if there's a failure in any of those operations then something has gone wrong.
As I said, post the stack trace and I'll try to make sense of it.
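The _SUCCESS check described above can be sketched as a small script. This is only an illustration under assumptions: the function name inspect_output_dir is hypothetical, and it reads a local directory as a stand-in for the HDFS target folder (on a real cluster you would list the folder with the HDFS tools instead). It reports whether the _SUCCESS marker is present, whether a _temporary directory was left behind, and which visible data files exist.

```python
import os


def inspect_output_dir(path):
    """Summarise a job output directory (local stand-in for an HDFS folder).

    Hypothetical helper for illustration; not part of Spark or Hadoop.
    """
    entries = os.listdir(path)
    return {
        # The job-level commit writes _SUCCESS; its absence means the job
        # did not report a successful commit.
        "committed": "_SUCCESS" in entries,
        # Task and job attempts are staged under _temporary before rename.
        "leftover_temporary": "_temporary" in entries,
        # Names starting with "_" or "." are treated as hidden, not as data.
        "data_files": sorted(
            e for e in entries if not e.startswith(("_", "."))
        ),
    }
```

A folder that has data files but reports committed as False matches the situation in the question: files are visible, but the job itself did not finish its commit, so the output should not be trusted without further checks.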
Created 05-18-2018 01:26 PM
Hi Stevel,
I am also facing the same issue. I run 5 parallel threads that write records to the same table (inside HDFS) concurrently using Java Spark SQL. Please find attached the full stack trace: java-spark-sql-error-logs.txt
Thanks in advance.